* [PATCH V9 00/22] arch: Add basic LoongArch support
@ 2022-04-30  9:04 Huacai Chen
  2022-04-30  9:04 ` [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations Huacai Chen
                   ` (24 more replies)
  0 siblings, 25 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:04 UTC
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V.
LoongArch includes a reduced 32-bit version (LA32R), a standard 32-bit
version (LA32S) and a 64-bit version (LA64). LoongArch uses ACPI as its
boot protocol. LoongArch-specific interrupt controllers (similar to APIC)
have already been added to the next revision of the ACPI Specification
(the current revision is 6.4).

This patchset adds basic LoongArch support to the mainline kernel. A
complete snapshot is available here:
https://github.com/loongson/linux/tree/loongarch-next

Cross-compile tool chain for building the kernel:
https://github.com/loongson/build-tools/releases/download/2021.12.21/loongarch64-clfs-2022-03-03-cross-tools-gcc-glibc.tar.xz
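
As a rough sketch only (not part of this patchset), a cross build with the
tool chain above might look like the following. The CROSS_COMPILE prefix and
the kmake helper are assumptions for illustration; loongson3_defconfig is the
config file added by this series. The helper prints the commands by default;
set DRY_RUN=0 inside a kernel tree to actually run them.

```shell
#!/bin/sh
# Hypothetical helper: wraps make with the LoongArch cross-build variables.
ARCH=loongarch
# Assumed tool chain prefix; adjust to match the extracted cross tools.
CROSS_COMPILE="${CROSS_COMPILE:-loongarch64-unknown-linux-gnu-}"

kmake() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run (default): only print the command that would be executed.
        echo make ARCH="$ARCH" CROSS_COMPILE="$CROSS_COMPILE" "$@"
    else
        make ARCH="$ARCH" CROSS_COMPILE="$CROSS_COMPILE" "$@"
    fi
}

kmake loongson3_defconfig   # generate .config from the Loongson-3 defconfig
kmake -j4                   # build the kernel; adjust the job count as needed
```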

A CLFS-based Linux distro:
https://github.com/loongson/build-tools/releases/download/2021.12.21/loongarch64-clfs-system-2022-03-03.tar.bz2

Open-source tool chain, which is under review (Binutils and GCC are already upstream):
https://github.com/loongson/binutils-gdb/tree/upstream_v3.1
https://github.com/loongson/gcc/tree/loongarch_upstream_v6.3
https://github.com/loongson/glibc/tree/loongarch_2_35_dev_v2.2

Loongson and LoongArch documentation:
https://github.com/loongson/LoongArch-Documentation

LoongArch-specific interrupt controllers:
https://mantis.uefi.org/mantis/view.php?id=2203
https://mantis.uefi.org/mantis/view.php?id=2313

V1 -> V2:
1, Add documentation patches;
2, Restore copyright statements;
3, Split the big header patch;
4, Cleanup signal-related headers;
5, Cleanup incomplete 32-bit support;
6, Move the major PCI work to drivers/pci;
7, Rework Loongson64 platform support;
8, Rework lpj and __udelay()/__ndelay();
9, Rework page table layout config options;
10, Rework syscall/exception/interrupt with generic entry framework;
11, Simplify the VDSO/VSYSCALL implementation;
12, Use generic I/O access macros and functions;
13, Remove unaligned access emulation at present;
14, Keep clocksource code in arch since it is the "native clocksource";
15, Some other minor fixes and improvements.

V2 -> V3:
1, Rebased on 5.15-rc1;
2, Cleanup PCI code on V2;
3, Support multiple MSI domains;
4, Support cacheable ioremap();
5, Use irq stack for interrupt handling;
6, Adjust struct ucontext and rt_sigframe;
7, Some other minor fixes and improvements.

V3 -> V4:
1, Rebased on 5.15-rc3;
2, Rework SMP support and remove legacy IPI;
3, Rework signal support and remove loongarch_abi;
4, Simplify phys_to_dma() and dma_to_phys();
5, Remove unused sys_mmap2() implementation;
6, Remove unused strncpy_user.S and strnlen_user.S;
7, Some other minor fixes and improvements.

V4 -> V5:
1, Rebased on 5.15-rc4;
2, Fix a _PAGE_CHG_MASK bug;
3, Fix vdso_base() calculation;
4, Adjust syscall and ptrace code;
5, Use generic bitops implementation;
6, Avoid syscall restart handling in sys_rt_sigreturn();
7, Update commit messages.

V5 -> V6:
1, Rebased on 5.17-rc4;
2, Use GENERIC_IRQ_MIGRATION;
3, Improve sigcontext definition;
4, Improve numa_default_distance();
5, Increase MINSIGSTKSZ and SIGSTKSZ;
6, Restructure pt_regs and user_pt_regs;
7, Fix a corner case of protection_map;
8, Fix some corner cases of system calls;
9, Separate module region and vmalloc region;
10, Rename registers to match official documents.

V6 -> V7:
1, Rebased on 5.17-rc6;
2, Refactor do_page_fault();
3, Adjust memblock initialization;
4, Use -mstrict-align to build kernel;
5, Reimplement elf_read_implies_exec() as other archs;
6, Some other minor fixes and improvements.

V7 -> V8:
1, Rebased on 5.17-rc8;
2, Remove useless abidefs.h;
3, Remove useless HAVE_ARCH_NODEDATA_EXTENSION;
4, Fix and simplify uaccess.h;
5, Fix bugs after pt_regs restructuring;
6, Use generic copy_from_user_page()/copy_to_user_page();
7, Some other minor fixes and improvements.

V8 -> V9:
1, Rebased on 5.18-rc4;
2, Fix 4-level page tables;
3, Always use 16KB kernel stack;
4, Add efistub and zboot support;
5, Remove useless HAVE_FUTEX_CMPXCHG;
6, Optimize module mechanism for size;
7, Define copy_user_page() to avoid build errors;
8, Some other minor fixes and improvements.

Huacai Chen(24):
 Documentation: LoongArch: Add basic documentations.
 Documentation/zh_CN: Add basic LoongArch documentations.
 LoongArch: Add elf-related definitions.
 LoongArch: Add writecombine support for drm.
 LoongArch: Add build infrastructure.
 LoongArch: Add CPU definition headers.
 LoongArch: Add atomic/locking headers.
 LoongArch: Add other common headers.
 LoongArch: Add boot and setup routines.
 LoongArch: Add exception/interrupt handling.
 LoongArch: Add process management.
 LoongArch: Add memory management.
 LoongArch: Add system call support.
 LoongArch: Add signal handling support.
 LoongArch: Add elf and module support.
 LoongArch: Add misc common routines.
 LoongArch: Add some library functions.
 LoongArch: Add PCI controller support.
 LoongArch: Add VDSO and VSYSCALL support.
 LoongArch: Add efistub booting support.
 LoongArch: Add zboot (compressed kernel) support.
 LoongArch: Add multi-processor (SMP) support.
 LoongArch: Add Non-Uniform Memory Access (NUMA) support.
 LoongArch: Add Loongson-3 default config file.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 Documentation/arch.rst                             |    1 +
 Documentation/loongarch/features.rst               |    3 +
 Documentation/loongarch/index.rst                  |   21 +
 Documentation/loongarch/introduction.rst           |  345 +++++
 Documentation/loongarch/irq-chip-model.rst         |  168 +++
 Documentation/translations/zh_CN/index.rst         |    1 +
 .../translations/zh_CN/loongarch/features.rst      |    8 +
 .../translations/zh_CN/loongarch/index.rst         |   26 +
 .../translations/zh_CN/loongarch/introduction.rst  |  318 ++++
 .../zh_CN/loongarch/irq-chip-model.rst             |  167 +++
 arch/loongarch/.gitignore                          |    9 +
 arch/loongarch/Kbuild                              |    6 +
 arch/loongarch/Kconfig                             |  434 ++++++
 arch/loongarch/Kconfig.debug                       |    0
 arch/loongarch/Makefile                            |  131 ++
 arch/loongarch/boot/Makefile                       |   78 +
 arch/loongarch/boot/boot.lds.S                     |   64 +
 arch/loongarch/boot/decompress.c                   |   98 ++
 arch/loongarch/boot/string.c                       |  166 +++
 arch/loongarch/boot/zheader.S                      |  100 ++
 arch/loongarch/boot/zkernel.S                      |   99 ++
 arch/loongarch/configs/loongson3_defconfig         |  770 ++++++++++
 arch/loongarch/include/asm/Kbuild                  |   29 +
 arch/loongarch/include/asm/acenv.h                 |   17 +
 arch/loongarch/include/asm/acpi.h                  |   38 +
 arch/loongarch/include/asm/addrspace.h             |  110 ++
 arch/loongarch/include/asm/asm-offsets.h           |    5 +
 arch/loongarch/include/asm/asm-prototypes.h        |    7 +
 arch/loongarch/include/asm/asm.h                   |  190 +++
 arch/loongarch/include/asm/asmmacro.h              |  294 ++++
 arch/loongarch/include/asm/atomic.h                |  362 +++++
 arch/loongarch/include/asm/barrier.h               |  159 ++
 arch/loongarch/include/asm/bitops.h                |   33 +
 arch/loongarch/include/asm/bitrev.h                |   34 +
 arch/loongarch/include/asm/boot_param.h            |   97 ++
 arch/loongarch/include/asm/bootinfo.h              |   30 +
 arch/loongarch/include/asm/branch.h                |   21 +
 arch/loongarch/include/asm/bug.h                   |   23 +
 arch/loongarch/include/asm/cache.h                 |   13 +
 arch/loongarch/include/asm/cacheflush.h            |   80 +
 arch/loongarch/include/asm/cacheops.h              |   37 +
 arch/loongarch/include/asm/clocksource.h           |   12 +
 arch/loongarch/include/asm/cmpxchg.h               |  136 ++
 arch/loongarch/include/asm/compiler.h              |   15 +
 arch/loongarch/include/asm/cpu-features.h          |   69 +
 arch/loongarch/include/asm/cpu-info.h              |  136 ++
 arch/loongarch/include/asm/cpu.h                   |  127 ++
 arch/loongarch/include/asm/cpufeature.h            |   24 +
 arch/loongarch/include/asm/delay.h                 |   26 +
 arch/loongarch/include/asm/dma-direct.h            |   11 +
 arch/loongarch/include/asm/dma.h                   |   13 +
 arch/loongarch/include/asm/dmi.h                   |   24 +
 arch/loongarch/include/asm/efi.h                   |   33 +
 arch/loongarch/include/asm/elf.h                   |  299 ++++
 arch/loongarch/include/asm/entry-common.h          |   13 +
 arch/loongarch/include/asm/exec.h                  |   10 +
 arch/loongarch/include/asm/fb.h                    |   23 +
 arch/loongarch/include/asm/fixmap.h                |   13 +
 arch/loongarch/include/asm/fpregdef.h              |   49 +
 arch/loongarch/include/asm/fpu.h                   |  129 ++
 arch/loongarch/include/asm/futex.h                 |  108 ++
 arch/loongarch/include/asm/fw.h                    |   18 +
 arch/loongarch/include/asm/hardirq.h               |   26 +
 arch/loongarch/include/asm/hugetlb.h               |   79 +
 arch/loongarch/include/asm/hw_irq.h                |   17 +
 arch/loongarch/include/asm/idle.h                  |    9 +
 arch/loongarch/include/asm/inst.h                  |   63 +
 arch/loongarch/include/asm/io.h                    |  129 ++
 arch/loongarch/include/asm/irq.h                   |  133 ++
 arch/loongarch/include/asm/irq_regs.h              |   27 +
 arch/loongarch/include/asm/irqflags.h              |   78 +
 arch/loongarch/include/asm/kdebug.h                |   23 +
 arch/loongarch/include/asm/linkage.h               |   36 +
 arch/loongarch/include/asm/local.h                 |  138 ++
 arch/loongarch/include/asm/loongarch.h             | 1528 ++++++++++++++++++++
 arch/loongarch/include/asm/loongson.h              |  159 ++
 arch/loongarch/include/asm/mmu.h                   |   16 +
 arch/loongarch/include/asm/mmu_context.h           |  152 ++
 arch/loongarch/include/asm/mmzone.h                |   18 +
 arch/loongarch/include/asm/module.h                |   80 +
 arch/loongarch/include/asm/module.lds.h            |    7 +
 arch/loongarch/include/asm/numa.h                  |   69 +
 arch/loongarch/include/asm/page.h                  |  113 ++
 arch/loongarch/include/asm/pci.h                   |   40 +
 arch/loongarch/include/asm/percpu.h                |  222 +++
 arch/loongarch/include/asm/perf_event.h            |   10 +
 arch/loongarch/include/asm/pgalloc.h               |  103 ++
 arch/loongarch/include/asm/pgtable-bits.h          |  139 ++
 arch/loongarch/include/asm/pgtable.h               |  572 ++++++++
 arch/loongarch/include/asm/prefetch.h              |   29 +
 arch/loongarch/include/asm/processor.h             |  207 +++
 arch/loongarch/include/asm/ptrace.h                |  152 ++
 arch/loongarch/include/asm/reboot.h                |   10 +
 arch/loongarch/include/asm/regdef.h                |   43 +
 arch/loongarch/include/asm/seccomp.h               |   20 +
 arch/loongarch/include/asm/serial.h                |   11 +
 arch/loongarch/include/asm/setup.h                 |   21 +
 arch/loongarch/include/asm/shmparam.h              |   12 +
 arch/loongarch/include/asm/smp.h                   |  124 ++
 arch/loongarch/include/asm/sparsemem.h             |   23 +
 arch/loongarch/include/asm/spinlock.h              |   12 +
 arch/loongarch/include/asm/spinlock_types.h        |   11 +
 arch/loongarch/include/asm/stackframe.h            |  219 +++
 arch/loongarch/include/asm/stacktrace.h            |   74 +
 arch/loongarch/include/asm/string.h                |   17 +
 arch/loongarch/include/asm/switch_to.h             |   37 +
 arch/loongarch/include/asm/syscall.h               |   74 +
 arch/loongarch/include/asm/thread_info.h           |  106 ++
 arch/loongarch/include/asm/time.h                  |   50 +
 arch/loongarch/include/asm/timex.h                 |   31 +
 arch/loongarch/include/asm/tlb.h                   |  216 +++
 arch/loongarch/include/asm/tlbflush.h              |   48 +
 arch/loongarch/include/asm/topology.h              |   41 +
 arch/loongarch/include/asm/types.h                 |   33 +
 arch/loongarch/include/asm/uaccess.h               |  270 ++++
 arch/loongarch/include/asm/unistd.h                |   11 +
 arch/loongarch/include/asm/vdso.h                  |   38 +
 arch/loongarch/include/asm/vdso/clocksource.h      |    8 +
 arch/loongarch/include/asm/vdso/gettimeofday.h     |   99 ++
 arch/loongarch/include/asm/vdso/processor.h        |   14 +
 arch/loongarch/include/asm/vdso/vdso.h             |   30 +
 arch/loongarch/include/asm/vdso/vsyscall.h         |   27 +
 arch/loongarch/include/asm/vermagic.h              |   19 +
 arch/loongarch/include/asm/vmalloc.h               |    4 +
 arch/loongarch/include/uapi/asm/Kbuild             |    2 +
 arch/loongarch/include/uapi/asm/auxvec.h           |   17 +
 arch/loongarch/include/uapi/asm/bitfield.h         |   15 +
 arch/loongarch/include/uapi/asm/bitsperlong.h      |    9 +
 arch/loongarch/include/uapi/asm/break.h            |   23 +
 arch/loongarch/include/uapi/asm/byteorder.h        |   13 +
 arch/loongarch/include/uapi/asm/hwcap.h            |   20 +
 arch/loongarch/include/uapi/asm/inst.h             |   57 +
 arch/loongarch/include/uapi/asm/ptrace.h           |   52 +
 arch/loongarch/include/uapi/asm/reg.h              |   59 +
 arch/loongarch/include/uapi/asm/sigcontext.h       |   63 +
 arch/loongarch/include/uapi/asm/signal.h           |   13 +
 arch/loongarch/include/uapi/asm/swab.h             |   52 +
 arch/loongarch/include/uapi/asm/ucontext.h         |   35 +
 arch/loongarch/include/uapi/asm/unistd.h           |    6 +
 arch/loongarch/kernel/Makefile                     |   26 +
 arch/loongarch/kernel/access-helper.h              |   13 +
 arch/loongarch/kernel/acpi.c                       |  501 +++++++
 arch/loongarch/kernel/asm-offsets.c                |  262 ++++
 arch/loongarch/kernel/cacheinfo.c                  |  122 ++
 arch/loongarch/kernel/cmdline.c                    |   31 +
 arch/loongarch/kernel/cmpxchg.c                    |  105 ++
 arch/loongarch/kernel/cpu-probe.c                  |  305 ++++
 arch/loongarch/kernel/dma.c                        |   40 +
 arch/loongarch/kernel/efi-header.S                 |  100 ++
 arch/loongarch/kernel/efi.c                        |  235 +++
 arch/loongarch/kernel/elf.c                        |   30 +
 arch/loongarch/kernel/entry.S                      |   89 ++
 arch/loongarch/kernel/env.c                        |  176 +++
 arch/loongarch/kernel/fpu.S                        |  264 ++++
 arch/loongarch/kernel/genex.S                      |   95 ++
 arch/loongarch/kernel/head.S                       |  144 ++
 arch/loongarch/kernel/idle.c                       |   16 +
 arch/loongarch/kernel/image-vars.h                 |   30 +
 arch/loongarch/kernel/inst.c                       |   40 +
 arch/loongarch/kernel/io.c                         |   94 ++
 arch/loongarch/kernel/irq.c                        |  140 ++
 arch/loongarch/kernel/mem.c                        |   83 ++
 arch/loongarch/kernel/module-sections.c            |  121 ++
 arch/loongarch/kernel/module.c                     |  385 +++++
 arch/loongarch/kernel/numa.c                       |  461 ++++++
 arch/loongarch/kernel/proc.c                       |  127 ++
 arch/loongarch/kernel/process.c                    |  267 ++++
 arch/loongarch/kernel/ptrace.c                     |  431 ++++++
 arch/loongarch/kernel/reset.c                      |  102 ++
 arch/loongarch/kernel/rtc.c                        |   36 +
 arch/loongarch/kernel/setup.c                      |  464 ++++++
 arch/loongarch/kernel/signal.c                     |  568 ++++++++
 arch/loongarch/kernel/smp.c                        |  762 ++++++++++
 arch/loongarch/kernel/switch.S                     |   35 +
 arch/loongarch/kernel/syscall.c                    |   63 +
 arch/loongarch/kernel/time.c                       |  220 +++
 arch/loongarch/kernel/topology.c                   |   52 +
 arch/loongarch/kernel/traps.c                      |  753 ++++++++++
 arch/loongarch/kernel/vdso.c                       |  138 ++
 arch/loongarch/kernel/vmlinux.lds.S                |  121 ++
 arch/loongarch/lib/Makefile                        |    7 +
 arch/loongarch/lib/clear_user.S                    |   43 +
 arch/loongarch/lib/copy_user.S                     |   47 +
 arch/loongarch/lib/delay.c                         |   43 +
 arch/loongarch/lib/dump_tlb.c                      |  111 ++
 arch/loongarch/lib/memcpy.S                        |   32 +
 arch/loongarch/lib/memmove.S                       |   45 +
 arch/loongarch/lib/memset.S                        |   30 +
 arch/loongarch/mm/Makefile                         |    9 +
 arch/loongarch/mm/cache.c                          |  140 ++
 arch/loongarch/mm/extable.c                        |   22 +
 arch/loongarch/mm/fault.c                          |  261 ++++
 arch/loongarch/mm/hugetlbpage.c                    |   87 ++
 arch/loongarch/mm/init.c                           |  178 +++
 arch/loongarch/mm/ioremap.c                        |   27 +
 arch/loongarch/mm/maccess.c                        |   10 +
 arch/loongarch/mm/mmap.c                           |  125 ++
 arch/loongarch/mm/page.S                           |   84 ++
 arch/loongarch/mm/pgtable.c                        |  130 ++
 arch/loongarch/mm/tlb.c                            |  303 ++++
 arch/loongarch/mm/tlbex.S                          |  546 +++++++
 arch/loongarch/pci/Makefile                        |    7 +
 arch/loongarch/pci/acpi.c                          |  175 +++
 arch/loongarch/pci/pci.c                           |   98 ++
 arch/loongarch/tools/Makefile                      |   15 +
 arch/loongarch/tools/calc_vmlinuz_load_addr.c      |   51 +
 arch/loongarch/tools/elf-entry.c                   |   66 +
 arch/loongarch/vdso/Makefile                       |   96 ++
 arch/loongarch/vdso/elf.S                          |   15 +
 arch/loongarch/vdso/gen_vdso_offsets.sh            |   13 +
 arch/loongarch/vdso/sigreturn.S                    |   24 +
 arch/loongarch/vdso/vdso.S                         |   22 +
 arch/loongarch/vdso/vdso.lds.S                     |   72 +
 arch/loongarch/vdso/vgettimeofday.c                |   25 +
 drivers/firmware/efi/Kconfig                       |    4 +-
 drivers/firmware/efi/libstub/Makefile              |   14 +-
 drivers/firmware/efi/libstub/loongarch-stub.c      |  425 ++++++
 drivers/gpu/drm/drm_vm.c                           |    2 +-
 drivers/gpu/drm/ttm/ttm_module.c                   |    2 +-
 include/drm/drm_cache.h                            |    8 +
 include/linux/cpuhotplug.h                         |    1 +
 include/linux/efi.h                                |    1 +
 include/linux/pe.h                                 |    1 +
 include/uapi/linux/audit.h                         |    2 +
 include/uapi/linux/elf-em.h                        |    1 +
 include/uapi/linux/elf.h                           |    5 +
 include/uapi/linux/kexec.h                         |    1 +
 scripts/sorttable.c                                |    5 +
 scripts/subarch.include                            |    2 +-
 tools/include/uapi/asm/bitsperlong.h               |    2 +
 230 files changed, 23918 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/loongarch/features.rst
 create mode 100644 Documentation/loongarch/index.rst
 create mode 100644 Documentation/loongarch/introduction.rst
 create mode 100644 Documentation/loongarch/irq-chip-model.rst
 create mode 100644 Documentation/translations/zh_CN/loongarch/features.rst
 create mode 100644 Documentation/translations/zh_CN/loongarch/index.rst
 create mode 100644 Documentation/translations/zh_CN/loongarch/introduction.rst
 create mode 100644 Documentation/translations/zh_CN/loongarch/irq-chip-model.rst
 create mode 100644 arch/loongarch/.gitignore
 create mode 100644 arch/loongarch/Kbuild
 create mode 100644 arch/loongarch/Kconfig
 create mode 100644 arch/loongarch/Kconfig.debug
 create mode 100644 arch/loongarch/Makefile
 create mode 100644 arch/loongarch/boot/Makefile
 create mode 100644 arch/loongarch/boot/boot.lds.S
 create mode 100644 arch/loongarch/boot/decompress.c
 create mode 100644 arch/loongarch/boot/string.c
 create mode 100644 arch/loongarch/boot/zheader.S
 create mode 100644 arch/loongarch/boot/zkernel.S
 create mode 100644 arch/loongarch/configs/loongson3_defconfig
 create mode 100644 arch/loongarch/include/asm/Kbuild
 create mode 100644 arch/loongarch/include/asm/acenv.h
 create mode 100644 arch/loongarch/include/asm/acpi.h
 create mode 100644 arch/loongarch/include/asm/addrspace.h
 create mode 100644 arch/loongarch/include/asm/asm-offsets.h
 create mode 100644 arch/loongarch/include/asm/asm-prototypes.h
 create mode 100644 arch/loongarch/include/asm/asm.h
 create mode 100644 arch/loongarch/include/asm/asmmacro.h
 create mode 100644 arch/loongarch/include/asm/atomic.h
 create mode 100644 arch/loongarch/include/asm/barrier.h
 create mode 100644 arch/loongarch/include/asm/bitops.h
 create mode 100644 arch/loongarch/include/asm/bitrev.h
 create mode 100644 arch/loongarch/include/asm/boot_param.h
 create mode 100644 arch/loongarch/include/asm/bootinfo.h
 create mode 100644 arch/loongarch/include/asm/branch.h
 create mode 100644 arch/loongarch/include/asm/bug.h
 create mode 100644 arch/loongarch/include/asm/cache.h
 create mode 100644 arch/loongarch/include/asm/cacheflush.h
 create mode 100644 arch/loongarch/include/asm/cacheops.h
 create mode 100644 arch/loongarch/include/asm/clocksource.h
 create mode 100644 arch/loongarch/include/asm/cmpxchg.h
 create mode 100644 arch/loongarch/include/asm/compiler.h
 create mode 100644 arch/loongarch/include/asm/cpu-features.h
 create mode 100644 arch/loongarch/include/asm/cpu-info.h
 create mode 100644 arch/loongarch/include/asm/cpu.h
 create mode 100644 arch/loongarch/include/asm/cpufeature.h
 create mode 100644 arch/loongarch/include/asm/delay.h
 create mode 100644 arch/loongarch/include/asm/dma-direct.h
 create mode 100644 arch/loongarch/include/asm/dma.h
 create mode 100644 arch/loongarch/include/asm/dmi.h
 create mode 100644 arch/loongarch/include/asm/efi.h
 create mode 100644 arch/loongarch/include/asm/elf.h
 create mode 100644 arch/loongarch/include/asm/entry-common.h
 create mode 100644 arch/loongarch/include/asm/exec.h
 create mode 100644 arch/loongarch/include/asm/fb.h
 create mode 100644 arch/loongarch/include/asm/fixmap.h
 create mode 100644 arch/loongarch/include/asm/fpregdef.h
 create mode 100644 arch/loongarch/include/asm/fpu.h
 create mode 100644 arch/loongarch/include/asm/futex.h
 create mode 100644 arch/loongarch/include/asm/fw.h
 create mode 100644 arch/loongarch/include/asm/hardirq.h
 create mode 100644 arch/loongarch/include/asm/hugetlb.h
 create mode 100644 arch/loongarch/include/asm/hw_irq.h
 create mode 100644 arch/loongarch/include/asm/idle.h
 create mode 100644 arch/loongarch/include/asm/inst.h
 create mode 100644 arch/loongarch/include/asm/io.h
 create mode 100644 arch/loongarch/include/asm/irq.h
 create mode 100644 arch/loongarch/include/asm/irq_regs.h
 create mode 100644 arch/loongarch/include/asm/irqflags.h
 create mode 100644 arch/loongarch/include/asm/kdebug.h
 create mode 100644 arch/loongarch/include/asm/linkage.h
 create mode 100644 arch/loongarch/include/asm/local.h
 create mode 100644 arch/loongarch/include/asm/loongarch.h
 create mode 100644 arch/loongarch/include/asm/loongson.h
 create mode 100644 arch/loongarch/include/asm/mmu.h
 create mode 100644 arch/loongarch/include/asm/mmu_context.h
 create mode 100644 arch/loongarch/include/asm/mmzone.h
 create mode 100644 arch/loongarch/include/asm/module.h
 create mode 100644 arch/loongarch/include/asm/module.lds.h
 create mode 100644 arch/loongarch/include/asm/numa.h
 create mode 100644 arch/loongarch/include/asm/page.h
 create mode 100644 arch/loongarch/include/asm/pci.h
 create mode 100644 arch/loongarch/include/asm/percpu.h
 create mode 100644 arch/loongarch/include/asm/perf_event.h
 create mode 100644 arch/loongarch/include/asm/pgalloc.h
 create mode 100644 arch/loongarch/include/asm/pgtable-bits.h
 create mode 100644 arch/loongarch/include/asm/pgtable.h
 create mode 100644 arch/loongarch/include/asm/prefetch.h
 create mode 100644 arch/loongarch/include/asm/processor.h
 create mode 100644 arch/loongarch/include/asm/ptrace.h
 create mode 100644 arch/loongarch/include/asm/reboot.h
 create mode 100644 arch/loongarch/include/asm/regdef.h
 create mode 100644 arch/loongarch/include/asm/seccomp.h
 create mode 100644 arch/loongarch/include/asm/serial.h
 create mode 100644 arch/loongarch/include/asm/setup.h
 create mode 100644 arch/loongarch/include/asm/shmparam.h
 create mode 100644 arch/loongarch/include/asm/smp.h
 create mode 100644 arch/loongarch/include/asm/sparsemem.h
 create mode 100644 arch/loongarch/include/asm/spinlock.h
 create mode 100644 arch/loongarch/include/asm/spinlock_types.h
 create mode 100644 arch/loongarch/include/asm/stackframe.h
 create mode 100644 arch/loongarch/include/asm/stacktrace.h
 create mode 100644 arch/loongarch/include/asm/string.h
 create mode 100644 arch/loongarch/include/asm/switch_to.h
 create mode 100644 arch/loongarch/include/asm/syscall.h
 create mode 100644 arch/loongarch/include/asm/thread_info.h
 create mode 100644 arch/loongarch/include/asm/time.h
 create mode 100644 arch/loongarch/include/asm/timex.h
 create mode 100644 arch/loongarch/include/asm/tlb.h
 create mode 100644 arch/loongarch/include/asm/tlbflush.h
 create mode 100644 arch/loongarch/include/asm/topology.h
 create mode 100644 arch/loongarch/include/asm/types.h
 create mode 100644 arch/loongarch/include/asm/uaccess.h
 create mode 100644 arch/loongarch/include/asm/unistd.h
 create mode 100644 arch/loongarch/include/asm/vdso.h
 create mode 100644 arch/loongarch/include/asm/vdso/clocksource.h
 create mode 100644 arch/loongarch/include/asm/vdso/gettimeofday.h
 create mode 100644 arch/loongarch/include/asm/vdso/processor.h
 create mode 100644 arch/loongarch/include/asm/vdso/vdso.h
 create mode 100644 arch/loongarch/include/asm/vdso/vsyscall.h
 create mode 100644 arch/loongarch/include/asm/vermagic.h
 create mode 100644 arch/loongarch/include/asm/vmalloc.h
 create mode 100644 arch/loongarch/include/uapi/asm/Kbuild
 create mode 100644 arch/loongarch/include/uapi/asm/auxvec.h
 create mode 100644 arch/loongarch/include/uapi/asm/bitfield.h
 create mode 100644 arch/loongarch/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/loongarch/include/uapi/asm/break.h
 create mode 100644 arch/loongarch/include/uapi/asm/byteorder.h
 create mode 100644 arch/loongarch/include/uapi/asm/hwcap.h
 create mode 100644 arch/loongarch/include/uapi/asm/inst.h
 create mode 100644 arch/loongarch/include/uapi/asm/ptrace.h
 create mode 100644 arch/loongarch/include/uapi/asm/reg.h
 create mode 100644 arch/loongarch/include/uapi/asm/sigcontext.h
 create mode 100644 arch/loongarch/include/uapi/asm/signal.h
 create mode 100644 arch/loongarch/include/uapi/asm/swab.h
 create mode 100644 arch/loongarch/include/uapi/asm/ucontext.h
 create mode 100644 arch/loongarch/include/uapi/asm/unistd.h
 create mode 100644 arch/loongarch/kernel/Makefile
 create mode 100644 arch/loongarch/kernel/access-helper.h
 create mode 100644 arch/loongarch/kernel/acpi.c
 create mode 100644 arch/loongarch/kernel/asm-offsets.c
 create mode 100644 arch/loongarch/kernel/cacheinfo.c
 create mode 100644 arch/loongarch/kernel/cmdline.c
 create mode 100644 arch/loongarch/kernel/cmpxchg.c
 create mode 100644 arch/loongarch/kernel/cpu-probe.c
 create mode 100644 arch/loongarch/kernel/dma.c
 create mode 100644 arch/loongarch/kernel/efi-header.S
 create mode 100644 arch/loongarch/kernel/efi.c
 create mode 100644 arch/loongarch/kernel/elf.c
 create mode 100644 arch/loongarch/kernel/entry.S
 create mode 100644 arch/loongarch/kernel/env.c
 create mode 100644 arch/loongarch/kernel/fpu.S
 create mode 100644 arch/loongarch/kernel/genex.S
 create mode 100644 arch/loongarch/kernel/head.S
 create mode 100644 arch/loongarch/kernel/idle.c
 create mode 100644 arch/loongarch/kernel/image-vars.h
 create mode 100644 arch/loongarch/kernel/inst.c
 create mode 100644 arch/loongarch/kernel/io.c
 create mode 100644 arch/loongarch/kernel/irq.c
 create mode 100644 arch/loongarch/kernel/mem.c
 create mode 100644 arch/loongarch/kernel/module-sections.c
 create mode 100644 arch/loongarch/kernel/module.c
 create mode 100644 arch/loongarch/kernel/numa.c
 create mode 100644 arch/loongarch/kernel/proc.c
 create mode 100644 arch/loongarch/kernel/process.c
 create mode 100644 arch/loongarch/kernel/ptrace.c
 create mode 100644 arch/loongarch/kernel/reset.c
 create mode 100644 arch/loongarch/kernel/rtc.c
 create mode 100644 arch/loongarch/kernel/setup.c
 create mode 100644 arch/loongarch/kernel/signal.c
 create mode 100644 arch/loongarch/kernel/smp.c
 create mode 100644 arch/loongarch/kernel/switch.S
 create mode 100644 arch/loongarch/kernel/syscall.c
 create mode 100644 arch/loongarch/kernel/time.c
 create mode 100644 arch/loongarch/kernel/topology.c
 create mode 100644 arch/loongarch/kernel/traps.c
 create mode 100644 arch/loongarch/kernel/vdso.c
 create mode 100644 arch/loongarch/kernel/vmlinux.lds.S
 create mode 100644 arch/loongarch/lib/Makefile
 create mode 100644 arch/loongarch/lib/clear_user.S
 create mode 100644 arch/loongarch/lib/copy_user.S
 create mode 100644 arch/loongarch/lib/delay.c
 create mode 100644 arch/loongarch/lib/dump_tlb.c
 create mode 100644 arch/loongarch/lib/memcpy.S
 create mode 100644 arch/loongarch/lib/memmove.S
 create mode 100644 arch/loongarch/lib/memset.S
 create mode 100644 arch/loongarch/mm/Makefile
 create mode 100644 arch/loongarch/mm/cache.c
 create mode 100644 arch/loongarch/mm/extable.c
 create mode 100644 arch/loongarch/mm/fault.c
 create mode 100644 arch/loongarch/mm/hugetlbpage.c
 create mode 100644 arch/loongarch/mm/init.c
 create mode 100644 arch/loongarch/mm/ioremap.c
 create mode 100644 arch/loongarch/mm/maccess.c
 create mode 100644 arch/loongarch/mm/mmap.c
 create mode 100644 arch/loongarch/mm/page.S
 create mode 100644 arch/loongarch/mm/pgtable.c
 create mode 100644 arch/loongarch/mm/tlb.c
 create mode 100644 arch/loongarch/mm/tlbex.S
 create mode 100644 arch/loongarch/pci/Makefile
 create mode 100644 arch/loongarch/pci/acpi.c
 create mode 100644 arch/loongarch/pci/pci.c
 create mode 100644 arch/loongarch/tools/Makefile
 create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
 create mode 100644 arch/loongarch/tools/elf-entry.c
 create mode 100644 arch/loongarch/vdso/Makefile
 create mode 100644 arch/loongarch/vdso/elf.S
 create mode 100755 arch/loongarch/vdso/gen_vdso_offsets.sh
 create mode 100644 arch/loongarch/vdso/sigreturn.S
 create mode 100644 arch/loongarch/vdso/vdso.S
 create mode 100644 arch/loongarch/vdso/vdso.lds.S
 create mode 100644 arch/loongarch/vdso/vgettimeofday.c
 create mode 100644 drivers/firmware/efi/libstub/loongarch-stub.c
--
2.27.0



* [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
@ 2022-04-30  9:04 ` Huacai Chen
  2022-05-01  7:48   ` Bagas Sanjaya
  2022-05-01  9:32   ` WANG Xuerui
  2022-04-30  9:04 ` [PATCH V9 02/24] Documentation/zh_CN: Add basic LoongArch documentations Huacai Chen
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:04 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

Add some basic documentation for LoongArch. LoongArch is a new RISC ISA,
which is a bit like MIPS or RISC-V. LoongArch includes a reduced 32-bit
version (LA32R), a standard 32-bit version (LA32S) and a 64-bit version
(LA64).

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 Documentation/arch.rst                     |   1 +
 Documentation/loongarch/features.rst       |   3 +
 Documentation/loongarch/index.rst          |  21 ++
 Documentation/loongarch/introduction.rst   | 345 +++++++++++++++++++++
 Documentation/loongarch/irq-chip-model.rst | 168 ++++++++++
 5 files changed, 538 insertions(+)
 create mode 100644 Documentation/loongarch/features.rst
 create mode 100644 Documentation/loongarch/index.rst
 create mode 100644 Documentation/loongarch/introduction.rst
 create mode 100644 Documentation/loongarch/irq-chip-model.rst

diff --git a/Documentation/arch.rst b/Documentation/arch.rst
index 14bcd8294b93..41a66a8b38e4 100644
--- a/Documentation/arch.rst
+++ b/Documentation/arch.rst
@@ -13,6 +13,7 @@ implementation.
    arm/index
    arm64/index
    ia64/index
+   loongarch/index
    m68k/index
    mips/index
    nios2/index
diff --git a/Documentation/loongarch/features.rst b/Documentation/loongarch/features.rst
new file mode 100644
index 000000000000..ebacade3ea45
--- /dev/null
+++ b/Documentation/loongarch/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features loongarch
diff --git a/Documentation/loongarch/index.rst b/Documentation/loongarch/index.rst
new file mode 100644
index 000000000000..d127e07a7ed3
--- /dev/null
+++ b/Documentation/loongarch/index.rst
@@ -0,0 +1,21 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================================
+LoongArch-specific Documentation
+================================
+
+.. toctree::
+   :maxdepth: 2
+   :numbered:
+
+   introduction
+   irq-chip-model
+
+   features
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/loongarch/introduction.rst b/Documentation/loongarch/introduction.rst
new file mode 100644
index 000000000000..420c0d2ebcfb
--- /dev/null
+++ b/Documentation/loongarch/introduction.rst
@@ -0,0 +1,345 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
+Introduction of LoongArch
+=========================
+
+LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch
+includes a reduced 32-bit version (LA32R), a standard 32-bit version (LA32S)
+and a 64-bit version (LA64). LoongArch has 4 privilege levels (PLV0~PLV3):
+PLV0 is the highest level, which is used by the kernel, and PLV3 is the lowest
+level, which is used by applications. This document introduces the registers,
+basic instruction set, virtual memory and some other topics of LoongArch.
+
+Registers
+=========
+
+LoongArch registers include general purpose registers (GPRs), floating point
+registers (FPRs), vector registers (VRs) and control status registers (CSRs)
+used in privileged mode (PLV0).
+
+GPRs
+----
+
+LoongArch has 32 GPRs ($r0 - $r31); each one is 32 bits wide in LA32 and 64
+bits wide in LA64. $r0 is always zero, and the other registers have no special
+features, but there is an ABI register convention as shown below.
+
+================= =============== =================== ============
+Name              Alias           Usage               Preserved
+                                                      across calls
+================= =============== =================== ============
+``$r0``           ``$zero``       Constant zero       Unused
+``$r1``           ``$ra``         Return address      No
+``$r2``           ``$tp``         TLS                 Unused
+``$r3``           ``$sp``         Stack pointer       Yes
+``$r4``-``$r11``  ``$a0``-``$a7`` Argument registers  No
+``$r4``-``$r5``   ``$v0``-``$v1`` Return value        No
+``$r12``-``$r20`` ``$t0``-``$t8`` Temp registers      No
+``$r21``          ``$x``          Reserved            Unused
+``$r22``          ``$fp``         Frame pointer       Yes
+``$r23``-``$r31`` ``$s0``-``$s8`` Static registers    Yes
+================= =============== =================== ============
+
+FPRs
+----
+
+LoongArch has 32 FPRs ($f0 - $f31), each of which is 64 bits wide. There is
+also an ABI register convention as shown below.
+
+================= ================== =================== ============
+Name              Alias              Usage               Preserved
+                                                         across calls
+================= ================== =================== ============
+``$f0``-``$f7``   ``$fa0``-``$fa7``  Argument registers  No
+``$f0``-``$f1``   ``$fv0``-``$fv1``  Return value        No
+``$f8``-``$f23``  ``$ft0``-``$ft15`` Temp registers      No
+``$f24``-``$f31`` ``$fs0``-``$fs7``  Static registers    Yes
+================= ================== =================== ============
+
+VRs
+----
+
+LoongArch has a 128-bit vector extension (LSX, short for Loongson SIMD eXtension)
+and a 256-bit vector extension (LASX, short for Loongson Advanced SIMD eXtension).
+There are also 32 vector registers: $v0 - $v31 for LSX and $x0 - $x31 for LASX.
+The FPRs and VRs overlap, e.g. the lower 128 bits of $x0 are $v0, and the lower
+64 bits of $v0 are $f0.
+
+CSRs
+----
+
+CSRs can only be used in privileged mode (PLV0):
+
+================= ===================================== ==============
+Address           Full Name                             Abbrev Name
+================= ===================================== ==============
+0x0               Current Mode information              CRMD
+0x1               Pre-exception Mode information        PRMD
+0x2               Extended Unit Enable                  EUEN
+0x3               Miscellaneous controller              MISC
+0x4               Exception Configuration               ECFG
+0x5               Exception Status                      ESTAT
+0x6               Exception Return Address              ERA
+0x7               Bad Virtual Address                   BADV
+0x8               Bad Instruction                       BADI
+0xC               Exception Entry Base address          EENTRY
+0x10              TLB Index                             TLBIDX
+0x11              TLB Entry High-order bits             TLBEHI
+0x12              TLB Entry Low-order bits 0            TLBELO0
+0x13              TLB Entry Low-order bits 1            TLBELO1
+0x18              Address Space Identifier              ASID
+0x19              Page Global Directory address for     PGDL
+                  Lower half address space
+0x1A              Page Global Directory address for     PGDH
+                  Higher half address space
+0x1B              Page Global Directory address         PGD
+0x1C              Page Walk Controller for Lower        PWCL
+                  half address space
+0x1D              Page Walk Controller for Higher       PWCH
+                  half address space
+0x1E              STLB Page Size                        STLBPS
+0x1F              Reduced Virtual Address Configuration RVACFG
+0x20              CPU Identifier                        CPUID
+0x21              Privileged Resource Configuration 1   PRCFG1
+0x22              Privileged Resource Configuration 2   PRCFG2
+0x23              Privileged Resource Configuration 3   PRCFG3
+0x30+n (0≤n≤15)   Data Save register                    SAVEn
+0x40              Timer Identifier                      TID
+0x41              Timer Configuration                   TCFG
+0x42              Timer Value                           TVAL
+0x43              Compensation of Timer Count           CNTC
+0x44              Timer Interrupt Clearing              TICLR
+0x60              LLBit Controller                      LLBCTL
+0x80              Implementation-specific Controller 1  IMPCTL1
+0x81              Implementation-specific Controller 2  IMPCTL2
+0x88              TLB Refill Exception Entry Base       TLBRENTRY
+                  address
+0x89              TLB Refill Exception BAD Virtual      TLBRBADV
+                  address
+0x8A              TLB Refill Exception Return Address   TLBRERA
+0x8B              TLB Refill Exception data SAVE        TLBRSAVE
+                  register
+0x8C              TLB Refill Exception Entry Low-order  TLBRELO0
+                  bits 0
+0x8D              TLB Refill Exception Entry Low-order  TLBRELO1
+                  bits 1
+0x8E              TLB Refill Exception Entry High-order TLBREHI
+                  bits
+0x8F              TLB Refill Exception Pre-exception    TLBRPRMD
+                  Mode information
+0x90              Machine Error Controller              MERRCTL
+0x91              Machine Error Information 1           MERRINFO1
+0x92              Machine Error Information 2           MERRINFO2
+0x93              Machine Error Exception Entry Base    MERRENTRY
+                  address
+0x94              Machine Error Exception Return        MERRERA
+                  address
+0x95              Machine Error Exception data SAVE     MERRSAVE
+                  register
+0x98              Cache TAGs                            CTAG
+0x180+n (0≤n≤3)   Direct Mapping configuration Window n DMWn
+0x200+2n (0≤n≤31) Performance Monitor Configuration n   PMCFGn
+0x201+2n (0≤n≤31) Performance Monitor overall Counter n PMCNTn
+0x300             Memory load/store WatchPoint          MWPC
+                  overall Controller
+0x301             Memory load/store WatchPoint          MWPS
+                  overall Status
+0x310+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG1
+                  Configuration 1
+0x311+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG2
+                  Configuration 2
+0x312+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG3
+                  Configuration 3
+0x313+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG4
+                  Configuration 4
+0x380             Fetch WatchPoint overall Controller   FWPC
+0x381             Fetch WatchPoint overall Status       FWPS
+0x390+8n (0≤n≤7)  Fetch WatchPoint n Configuration 1    FWPnCFG1
+0x391+8n (0≤n≤7)  Fetch WatchPoint n Configuration 2    FWPnCFG2
+0x392+8n (0≤n≤7)  Fetch WatchPoint n Configuration 3    FWPnCFG3
+0x393+8n (0≤n≤7)  Fetch WatchPoint n Configuration 4    FWPnCFG4
+0x500             Debug register                        DBG
+0x501             Debug Exception Return address        DERA
+0x502             Debug data SAVE register              DSAVE
+================= ===================================== ==============
+
+ERA, TLBRERA, MERRERA and DERA are sometimes also called EPC, TLBREPC,
+MERREPC and DEPC.
+
+Basic Instruction Set
+=====================
+
+Instruction formats
+-------------------
+
+LoongArch has 32-bit wide instructions, and there are 9 instruction formats::
+
+  2R-type:    Opcode + Rj + Rd
+  3R-type:    Opcode + Rk + Rj + Rd
+  4R-type:    Opcode + Ra + Rk + Rj + Rd
+  2RI8-type:  Opcode + I8 + Rj + Rd
+  2RI12-type: Opcode + I12 + Rj + Rd
+  2RI14-type: Opcode + I14 + Rj + Rd
+  2RI16-type: Opcode + I16 + Rj + Rd
+  1RI21-type: Opcode + I21L + Rj + I21H
+  I26-type:   Opcode + I26L + I26H
+
+Rj and Rk are source operands (registers), Rd is the destination operand
+(register), and Ra is the additional operand (register) in 4R-type.
+I8/I12/I14/I16/I21/I26 are 8-bit/12-bit/14-bit/16-bit/21-bit/26-bit immediate
+data. The 21-bit/26-bit immediates are split into higher bits and lower bits
+in an instruction word, hence I21L/I21H and I26L/I26H here.
+
+Instruction names (Mnemonics)
+-----------------------------
+
+Only the instruction names are listed here; for details please read the references.
+
+Arithmetic Operation Instructions::
+
+  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
+  SLT SLTU SLTI SLTUI
+  AND OR NOR XOR ANDN ORN ANDI ORI XORI
+  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
+  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
+  PCADDI PCADDU12I PCADDU18I
+  LU12I.W LU32I.D LU52I.D ADDU16I.D
+
+Bit-shift Instructions::
+
+  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
+  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
+
+Bit-manipulation Instructions::
+
+  EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
+  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
+  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
+  MASKEQZ MASKNEZ
+
+Branch Instructions::
+
+  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
+
+Load/Store Instructions::
+
+  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
+  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
+  LDPTR.W LDPTR.D STPTR.W STPTR.D
+  PRELD PRELDX
+
+Atomic Operation Instructions::
+
+  LL.W SC.W LL.D SC.D
+  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
+  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
+
+Barrier Instructions::
+
+  IBAR DBAR
+
+Special Instructions::
+
+  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
+
+Privileged Instructions::
+
+  CSRRD CSRWR CSRXCHG
+  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
+  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
+
+Virtual Memory
+==============
+
+LoongArch can use direct-mapped virtual memory and page-mapped virtual memory.
+
+Direct-mapped virtual memory is configured by CSR.DMWn (n=0~3) and has a simple
+relationship between virtual address (VA) and physical address (PA)::
+
+ VA = PA + FixedOffset
+
+Page-mapped virtual memory has an arbitrary relationship between VA and PA, which
+is recorded in the TLB and page tables. LoongArch's TLB includes a fully-associative
+MTLB (Multiple Page Size TLB) and a set-associative STLB (Single Page Size TLB).
+
+By default, the whole virtual address space of LA32 is configured like this:
+
+============ =========================== =============================
+Name         Address Range               Attributes
+============ =========================== =============================
+``UVRANGE``  ``0x00000000 - 0x7FFFFFFF`` Page-mapped, Cached, PLV0~3
+``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` Direct-mapped, Uncached, PLV0
+``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` Direct-mapped, Cached, PLV0
+``KVRANGE``  ``0xC0000000 - 0xFFFFFFFF`` Page-mapped, Cached, PLV0
+============ =========================== =============================
+
+User mode (PLV3) can only access UVRANGE. For direct-mapped KPRANGE0 and
+KPRANGE1, PA is equal to VA with bit29~31 cleared. For example, the uncached
+direct-mapped VA of PA 0x00001000 is 0x80001000, and the cached direct-mapped
+VA of PA 0x00001000 is 0xA0001000.
+
+By default, the whole virtual address space of LA64 is configured like this:
+
+============ ====================== ======================================
+Name         Address Range          Attributes
+============ ====================== ======================================
+``XUVRANGE`` ``0x0000000000000000 - Page-mapped, Cached, PLV0~3
+             0x3FFFFFFFFFFFFFFF``
+``XSPRANGE`` ``0x4000000000000000 - Direct-mapped, Cached / Uncached, PLV0
+             0x7FFFFFFFFFFFFFFF``
+``XKPRANGE`` ``0x8000000000000000 - Direct-mapped, Cached / Uncached, PLV0
+             0xBFFFFFFFFFFFFFFF``
+``XKVRANGE`` ``0xC000000000000000 - Page-mapped, Cached, PLV0
+             0xFFFFFFFFFFFFFFFF``
+============ ====================== ======================================
+
+User mode (PLV3) can only access XUVRANGE. For direct-mapped XSPRANGE and XKPRANGE,
+PA is equal to VA with bit60~63 cleared, and the cache attribute is configured by
+bit60~61 of the VA (0 is strongly-ordered uncached, 1 is coherent cached, and 2 is
+weakly-ordered uncached). Currently we only use XKPRANGE for direct mapping, and
+XSPRANGE is reserved. As an example, the strongly-ordered uncached direct-mapped VA
+(in XKPRANGE) of PA 0x00000000 00001000 is 0x80000000 00001000, the coherent cached
+direct-mapped VA (in XKPRANGE) of PA 0x00000000 00001000 is 0x90000000 00001000, and
+the weakly-ordered uncached direct-mapped VA (in XKPRANGE) of PA 0x00000000 00001000
+is 0xA0000000 00001000.
+
+Relationship of Loongson and LoongArch
+======================================
+
+LoongArch is a RISC ISA which is different from any other existing one, while
+Loongson is a family of processors. Loongson includes 3 series: Loongson-1 is
+the 32-bit processor series, Loongson-2 is the low-end 64-bit processor series,
+and Loongson-3 is the high-end 64-bit processor series. Old Loongson is based on
+MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example:
+Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while
+Loongson-3A5000 (and future revisions) are all based on LoongArch.
+
+References
+==========
+
+Official web site of Loongson and LoongArch (Loongson Technology Corp. Ltd.):
+
+  http://www.loongson.cn/index.html
+
+Developer web site of Loongson and LoongArch (Software and Documentation):
+
+  http://www.loongnix.cn/index.php
+
+  https://github.com/loongson
+
+Documentation of LoongArch ISA:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-CN.pdf (in Chinese)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-EN.pdf (in English)
+
+Documentation of LoongArch ELF ABI:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-CN.pdf (in Chinese)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-EN.pdf (in English)
+
+Linux kernel repository of Loongson and LoongArch:
+
+  https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git
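As a rough illustration of the LA64 direct-mapped addressing described in
introduction.rst above, here is a minimal C sketch. It relies only on what the
document states: XKPRANGE starts at 0x8000000000000000, VA bits 60~61 select
the cache attribute, and the PA is the VA with bits 60~63 cleared. The enum
and the helper names la64_pa_to_dmw_va() and la64_dmw_va_to_pa() are
hypothetical and are not taken from the kernel sources.

```c
#include <stdint.h>

/* Cache attribute encoded in VA bits 60~61, as listed in introduction.rst. */
enum la64_cache_attr {
	LA64_CACHE_SUC = 0,	/* strongly-ordered uncached */
	LA64_CACHE_CC  = 1,	/* coherent cached */
	LA64_CACHE_WUC = 2,	/* weakly-ordered uncached */
};

/* Base of XKPRANGE, the only direct-mapped range currently used. */
#define LA64_XKPRANGE_BASE	0x8000000000000000ULL

/* Hypothetical helper: build the XKPRANGE direct-mapped VA for a given PA. */
static uint64_t la64_pa_to_dmw_va(uint64_t pa, enum la64_cache_attr attr)
{
	return LA64_XKPRANGE_BASE | ((uint64_t)attr << 60) | pa;
}

/* Hypothetical helper: recover the PA by clearing VA bits 60~63. */
static uint64_t la64_dmw_va_to_pa(uint64_t va)
{
	return va & ~(0xFULL << 60);
}
```

For PA 0x1000 the three attributes give 0x8000000000001000,
0x9000000000001000 and 0xA000000000001000, which matches the examples in the
document.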
diff --git a/Documentation/loongarch/irq-chip-model.rst b/Documentation/loongarch/irq-chip-model.rst
new file mode 100644
index 000000000000..bde112b81ace
--- /dev/null
+++ b/Documentation/loongarch/irq-chip-model.rst
@@ -0,0 +1,168 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================================
+IRQ chip model (hierarchy) of LoongArch
+=======================================
+
+Currently, LoongArch based processors (e.g. Loongson-3A5000) can only work together
+with LS7A chipsets. The irq chips in LoongArch computers include CPUINTC (CPU Core
+Interrupt Controller), LIOINTC (Legacy I/O Interrupt Controller), EIOINTC (Extended
+I/O Interrupt Controller), HTVECINTC (Hyper-Transport Vector Interrupt Controller),
+PCH-PIC (Main Interrupt Controller in LS7A chipset), PCH-LPC (LPC Interrupt Controller
+in LS7A chipset) and PCH-MSI (MSI Interrupt Controller).
+
+CPUINTC is a per-core controller (in CPU), LIOINTC/EIOINTC/HTVECINTC are per-package
+controllers (in CPU), while PCH-PIC/PCH-LPC/PCH-MSI are controllers out of CPU (i.e.,
+in chipsets). These controllers (in other words, irqchips) are linked in a hierarchy,
+and there are two models of hierarchy (legacy model and extended model).
+
+Legacy IRQ model
+================
+
+In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupts go
+to CPUINTC directly, CPU UART interrupts go to LIOINTC, while all other device
+interrupts go to PCH-PIC/PCH-LPC/PCH-MSI and are gathered by HTVECINTC, then go
+to LIOINTC, and then to CPUINTC.
+
+ +---------------------------------------------+
+ |::                                           |
+ |                                             |
+ |    +-----+     +---------+     +-------+    |
+ |    | IPI | --> | CPUINTC | <-- | Timer |    |
+ |    +-----+     +---------+     +-------+    |
+ |                     ^                       |
+ |                     |                       |
+ |                +---------+     +-------+    |
+ |                | LIOINTC | <-- | UARTs |    |
+ |                +---------+     +-------+    |
+ |                     ^                       |
+ |                     |                       |
+ |               +-----------+                 |
+ |               | HTVECINTC |                 |
+ |               +-----------+                 |
+ |                ^         ^                  |
+ |                |         |                  |
+ |          +---------+ +---------+            |
+ |          | PCH-PIC | | PCH-MSI |            |
+ |          +---------+ +---------+            |
+ |            ^     ^           ^              |
+ |            |     |           |              |
+ |    +---------+ +---------+ +---------+      |
+ |    | PCH-LPC | | Devices | | Devices |      |
+ |    +---------+ +---------+ +---------+      |
+ |         ^                                   |
+ |         |                                   |
+ |    +---------+                              |
+ |    | Devices |                              |
+ |    +---------+                              |
+ |                                             |
+ |                                             |
+ +---------------------------------------------+
+
+Extended IRQ model
+==================
+
+In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupts go
+to CPUINTC directly, CPU UART interrupts go to LIOINTC, while all other device
+interrupts go to PCH-PIC/PCH-LPC/PCH-MSI and are gathered by EIOINTC, and then go
+to CPUINTC directly.
+
+ +--------------------------------------------------------+
+ |::                                                      |
+ |                                                        |
+ |         +-----+     +---------+     +-------+          |
+ |         | IPI | --> | CPUINTC | <-- | Timer |          |
+ |         +-----+     +---------+     +-------+          |
+ |                      ^       ^                         |
+ |                      |       |                         |
+ |               +---------+ +---------+     +-------+    |
+ |               | EIOINTC | | LIOINTC | <-- | UARTs |    |
+ |               +---------+ +---------+     +-------+    |
+ |                ^       ^                               |
+ |                |       |                               |
+ |         +---------+ +---------+                        |
+ |         | PCH-PIC | | PCH-MSI |                        |
+ |         +---------+ +---------+                        |
+ |           ^     ^           ^                          |
+ |           |     |           |                          |
+ |   +---------+ +---------+ +---------+                  |
+ |   | PCH-LPC | | Devices | | Devices |                  |
+ |   +---------+ +---------+ +---------+                  |
+ |        ^                                               |
+ |        |                                               |
+ |   +---------+                                          |
+ |   | Devices |                                          |
+ |   +---------+                                          |
+ |                                                        |
+ |                                                        |
+ +--------------------------------------------------------+
+
+ACPI-related definitions
+========================
+
+CPUINTC::
+
+  ACPI_MADT_TYPE_CORE_PIC;
+  struct acpi_madt_core_pic;
+  enum acpi_madt_core_pic_version;
+
+LIOINTC::
+
+  ACPI_MADT_TYPE_LIO_PIC;
+  struct acpi_madt_lio_pic;
+  enum acpi_madt_lio_pic_version;
+
+EIOINTC::
+
+  ACPI_MADT_TYPE_EIO_PIC;
+  struct acpi_madt_eio_pic;
+  enum acpi_madt_eio_pic_version;
+
+HTVECINTC::
+
+  ACPI_MADT_TYPE_HT_PIC;
+  struct acpi_madt_ht_pic;
+  enum acpi_madt_ht_pic_version;
+
+PCH-PIC::
+
+  ACPI_MADT_TYPE_BIO_PIC;
+  struct acpi_madt_bio_pic;
+  enum acpi_madt_bio_pic_version;
+
+PCH-MSI::
+
+  ACPI_MADT_TYPE_MSI_PIC;
+  struct acpi_madt_msi_pic;
+  enum acpi_madt_msi_pic_version;
+
+PCH-LPC::
+
+  ACPI_MADT_TYPE_LPC_PIC;
+  struct acpi_madt_lpc_pic;
+  enum acpi_madt_lpc_pic_version;
+
+References
+==========
+
+Documentation of Loongson-3A5000:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-CN.pdf (in Chinese)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-EN.pdf (in English)
+
+Documentation of Loongson's LS7A chipset:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-CN.pdf (in Chinese)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-EN.pdf (in English)
+
+Attention: CPUINTC is CSR.ECFG/CSR.ESTAT and its interrupt controller described
+in Section 7.4 of "LoongArch Reference Manual, Vol 1"; LIOINTC is "Legacy I/O
+Interrupts" described in Section 11.1 of "Loongson 3A5000 Processor Reference
+Manual"; EIOINTC is "Extended I/O Interrupts" described in Section 11.2 of
+"Loongson 3A5000 Processor Reference Manual"; HTVECINTC is "HyperTransport
+Interrupts" described in Section 14.3 of "Loongson 3A5000 Processor Reference
+Manual"; PCH-PIC/PCH-MSI is "Interrupt Controller" described in Section 5 of
+"Loongson 7A1000 Bridge User Manual"; PCH-LPC is "LPC Interrupts" described in
+Section 24.3 of "Loongson 7A1000 Bridge User Manual".
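The two hierarchies drawn in irq-chip-model.rst above can also be read as
parent tables. The following C sketch is only an illustration of that reading:
the enum, the two tables and hops_to_cpuintc() are hypothetical names, not
kernel interfaces, and the parent links are transcribed from the two diagrams.

```c
/* Interrupt controllers named in irq-chip-model.rst (identifiers are illustrative). */
enum irqchip {
	IRQCHIP_NONE = -1,	/* no parent, or not used in this model */
	CPUINTC = 0,
	LIOINTC,
	EIOINTC,
	HTVECINTC,
	PCH_PIC,
	PCH_MSI,
	PCH_LPC,
	IRQCHIP_COUNT
};

/* Legacy model: PCH-* -> HTVECINTC -> LIOINTC -> CPUINTC. */
static const enum irqchip legacy_parent[IRQCHIP_COUNT] = {
	[CPUINTC]   = IRQCHIP_NONE,
	[LIOINTC]   = CPUINTC,
	[EIOINTC]   = IRQCHIP_NONE,
	[HTVECINTC] = LIOINTC,
	[PCH_PIC]   = HTVECINTC,
	[PCH_MSI]   = HTVECINTC,
	[PCH_LPC]   = PCH_PIC,
};

/* Extended model: PCH-* -> EIOINTC -> CPUINTC, with LIOINTC -> CPUINTC kept for UARTs. */
static const enum irqchip extended_parent[IRQCHIP_COUNT] = {
	[CPUINTC]   = IRQCHIP_NONE,
	[LIOINTC]   = CPUINTC,
	[EIOINTC]   = CPUINTC,
	[HTVECINTC] = IRQCHIP_NONE,
	[PCH_PIC]   = EIOINTC,
	[PCH_MSI]   = EIOINTC,
	[PCH_LPC]   = PCH_PIC,
};

/*
 * Count the controllers an interrupt passes through on its way from
 * 'chip' up to CPUINTC. Returns -1 if 'chip' is not wired up in the
 * given model.
 */
static int hops_to_cpuintc(const enum irqchip *parent, enum irqchip chip)
{
	int hops = 0;

	while (chip != CPUINTC) {
		chip = parent[chip];
		if (chip == IRQCHIP_NONE)
			return -1;
		hops++;
	}
	return hops;
}
```

Under this reading, an interrupt entering at PCH-LPC takes four hops to reach
CPUINTC in the legacy model and three in the extended model, since the extended
model bypasses LIOINTC for chipset interrupts.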
-- 
2.27.0



* [PATCH V9 02/24] Documentation/zh_CN: Add basic LoongArch documentations
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
  2022-04-30  9:04 ` [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations Huacai Chen
@ 2022-04-30  9:04 ` Huacai Chen
  2022-05-01  9:38   ` WANG Xuerui
  2022-04-30  9:04 ` [PATCH V9 03/24] LoongArch: Add elf-related definitions Huacai Chen
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:04 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen,
	Alex Shi

Add some basic documentation (zh_CN version) for LoongArch. LoongArch is
a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch includes a
reduced 32-bit version (LA32R), a standard 32-bit version (LA32S) and a
64-bit version (LA64).

Reviewed-by: Alex Shi <alexs@kernel.org>
Reviewed-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 Documentation/translations/zh_CN/index.rst    |   1 +
 .../translations/zh_CN/loongarch/features.rst |   8 +
 .../translations/zh_CN/loongarch/index.rst    |  26 ++
 .../zh_CN/loongarch/introduction.rst          | 318 ++++++++++++++++++
 .../zh_CN/loongarch/irq-chip-model.rst        | 167 +++++++++
 5 files changed, 520 insertions(+)
 create mode 100644 Documentation/translations/zh_CN/loongarch/features.rst
 create mode 100644 Documentation/translations/zh_CN/loongarch/index.rst
 create mode 100644 Documentation/translations/zh_CN/loongarch/introduction.rst
 create mode 100644 Documentation/translations/zh_CN/loongarch/irq-chip-model.rst

diff --git a/Documentation/translations/zh_CN/index.rst b/Documentation/translations/zh_CN/index.rst
index 88d8df957a78..41c59950523c 100644
--- a/Documentation/translations/zh_CN/index.rst
+++ b/Documentation/translations/zh_CN/index.rst
@@ -171,6 +171,7 @@ TODOList:
    riscv/index
    openrisc/index
    parisc/index
+   loongarch/index
 
 TODOList:
 
diff --git a/Documentation/translations/zh_CN/loongarch/features.rst b/Documentation/translations/zh_CN/loongarch/features.rst
new file mode 100644
index 000000000000..3886e635ec06
--- /dev/null
+++ b/Documentation/translations/zh_CN/loongarch/features.rst
@@ -0,0 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/loongarch/features.rst
+:Translator: Huacai Chen <chenhuacai@loongson.cn>
+
+.. kernel-feat:: $srctree/Documentation/features loongarch
diff --git a/Documentation/translations/zh_CN/loongarch/index.rst b/Documentation/translations/zh_CN/loongarch/index.rst
new file mode 100644
index 000000000000..367dead02e3a
--- /dev/null
+++ b/Documentation/translations/zh_CN/loongarch/index.rst
@@ -0,0 +1,26 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/loongarch/index.rst
+:Translator: Huacai Chen <chenhuacai@loongson.cn>
+
+=================
+LoongArch特性文档
+=================
+
+.. toctree::
+   :maxdepth: 2
+   :numbered:
+
+   introduction
+   irq-chip-model
+
+   features
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/translations/zh_CN/loongarch/introduction.rst b/Documentation/translations/zh_CN/loongarch/introduction.rst
new file mode 100644
index 000000000000..432a6267f1f1
--- /dev/null
+++ b/Documentation/translations/zh_CN/loongarch/introduction.rst
@@ -0,0 +1,318 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/loongarch/introduction.rst
+:Translator: Huacai Chen <chenhuacai@loongson.cn>
+
+=============
+LoongArch介绍
+=============
+
+LoongArch是一种新的RISC ISA,在一定程度上类似于MIPS和RISC-V。LoongArch指令集
+包括一个精简32位版(LA32R)、一个标准32位版(LA32S)、一个64位版(LA64)。
+LoongArch有四个特权级(PLV0~PLV3),其中PLV0是最高特权级,用于内核;而PLV3是
+最低特权级,用于应用程序。本文档介绍了LoongArch的寄存器、基础指令集、虚拟内
+存以及其他一些主题。
+
+寄存器
+======
+
+LoongArch的寄存器包括通用寄存器(GPRs)、浮点寄存器(FPRs)、向量寄存器(VRs)
+和用于特权模式(PLV0)的控制状态寄存器(CSRs)。
+
+通用寄存器
+----------
+
+LoongArch包括32个通用寄存器($r0 - $r31),LA32中每个寄存器为32位宽,LA64中
+每个寄存器为64位宽。$r0的内容总是0,而其他寄存器没有特殊功能。然而,我们有
+如下所示的一套ABI寄存器使用约定。
+
+================= =============== =================== ==========
+寄存器名          别名            用途                跨调用保持
+================= =============== =================== ==========
+``$r0``           ``$zero``       常量0               不使用
+``$r1``           ``$ra``         返回地址            否
+``$r2``           ``$tp``         TLS(线程局部存储) 不使用
+``$r3``           ``$sp``         栈指针              是
+``$r4``-``$r11``  ``$a0``-``$a7`` 参数寄存器          否
+``$r4``-``$r5``   ``$v0``-``$v1`` 返回值              否
+``$r12``-``$r20`` ``$t0``-``$t8`` 临时寄存器          否
+``$r21``          ``$x``          保留                不使用
+``$r22``          ``$fp``         帧指针              是
+``$r23``-``$r31`` ``$s0``-``$s8`` 静态寄存器          是
+================= =============== =================== ==========
+
+浮点寄存器
+----------
+
+LoongArch有32个浮点寄存器($f0 - $f31),每个寄存器均为64位宽。我们同样
+有如下所示的一套ABI寄存器使用约定。
+
+================= ================== =================== ==========
+寄存器名          别名               用途                跨调用保持
+================= ================== =================== ==========
+``$f0``-``$f7``   ``$fa0``-``$fa7``  参数寄存器          否
+``$f0``-``$f1``   ``$fv0``-``$fv1``  返回值              否
+``$f8``-``$f23``  ``$ft0``-``$ft15`` 临时寄存器          否
+``$f24``-``$f31`` ``$fs0``-``$fs7``  静态寄存器          是
+================= ================== =================== ==========
+
+向量寄存器
+----------
+
+LoongArch拥有128位向量扩展(LSX,全称Loongson SIMD eXtension)和256位向量扩展
+(LASX,全称Loongson Advanced SIMD eXtension)。共有32个向量寄存器,对于LSX是
+$v0 - $v31,对于LASX是$x0 - $x31。浮点寄存器和向量寄存器是复用的,比如:$x0的
+低128位是$v0,而$v0的低64位又是$f0,以此类推。
+
+控制状态寄存器
+--------------
+
+控制状态寄存器只用于特权模式(PLV0):
+
+================= ==================================== ==========
+地址              全称描述                             简称
+================= ==================================== ==========
+0x0               当前模式信息                         CRMD
+0x1               异常前模式信息                       PRMD
+0x2               扩展部件使能                         EUEN
+0x3               杂项控制                             MISC
+0x4               异常配置                             ECFG
+0x5               异常状态                             ESTAT
+0x6               异常返回地址                         ERA
+0x7               出错虚拟地址                         BADV
+0x8               出错指令                             BADI
+0xC               异常入口地址                         EENTRY
+0x10              TLB索引                              TLBIDX
+0x11              TLB表项高位                          TLBEHI
+0x12              TLB表项低位0                         TLBELO0
+0x13              TLB表项低位1                         TLBELO1
+0x18              地址空间标识符                       ASID
+0x19              低半地址空间页全局目录基址           PGDL
+0x1A              高半地址空间页全局目录基址           PGDH
+0x1B              页全局目录基址                       PGD
+0x1C              页表遍历控制低半部分                 PWCL
+0x1D              页表遍历控制高半部分                 PWCH
+0x1E              STLB页大小                           STLBPS
+0x1F              缩减虚地址配置                       RVACFG
+0x20              CPU编号                              CPUID
+0x21              特权资源配置信息1                    PRCFG1
+0x22              特权资源配置信息2                    PRCFG2
+0x23              特权资源配置信息3                    PRCFG3
+0x30+n (0≤n≤15)   数据保存寄存器                       SAVEn
+0x40              定时器编号                           TID
+0x41              定时器配置                           TCFG
+0x42              定时器值                             TVAL
+0x43              计时器补偿                           CNTC
+0x44              定时器中断清除                       TICLR
+0x60              LLBit相关控制                        LLBCTL
+0x80              实现相关控制1                        IMPCTL1
+0x81              实现相关控制2                        IMPCTL2
+0x88              TLB重填异常入口地址                  TLBRENTRY
+0x89              TLB重填异常出错虚地址                TLBRBADV
+0x8A              TLB重填异常返回地址                  TLBRERA
+0x8B              TLB重填异常数据保存                  TLBRSAVE
+0x8C              TLB重填异常表项低位0                 TLBRELO0
+0x8D              TLB重填异常表项低位1                 TLBRELO1
+0x8E              TLB重填异常表项高位                  TLBREHI
+0x8F              TLB重填异常前模式信息                TLBRPRMD
+0x90              机器错误控制                         MERRCTL
+0x91              机器错误信息1                        MERRINFO1
+0x92              机器错误信息2                        MERRINFO2
+0x93              机器错误异常入口地址                 MERRENTRY
+0x94              机器错误异常返回地址                 MERRERA
+0x95              机器错误异常数据保存                 MERRSAVE
+0x98              高速缓存标签                         CTAG
+0x180+n (0≤n≤3)   直接映射配置窗口n                    DMWn
+0x200+2n (0≤n≤31) 性能监测配置n                        PMCFGn
+0x201+2n (0≤n≤31) 性能监测计数器n                      PMCNTn
+0x300             内存读写监视点整体控制               MWPC
+0x301             内存读写监视点整体状态               MWPS
+0x310+8n (0≤n≤7)  内存读写监视点n配置1                 MWPnCFG1
+0x311+8n (0≤n≤7)  内存读写监视点n配置2                 MWPnCFG2
+0x312+8n (0≤n≤7)  内存读写监视点n配置3                 MWPnCFG3
+0x313+8n (0≤n≤7)  内存读写监视点n配置4                 MWPnCFG4
+0x380             取指监视点整体控制                   FWPC
+0x381             取指监视点整体状态                   FWPS
+0x390+8n (0≤n≤7)  取指监视点n配置1                     FWPnCFG1
+0x391+8n (0≤n≤7)  取指监视点n配置2                     FWPnCFG2
+0x392+8n (0≤n≤7)  取指监视点n配置3                     FWPnCFG3
+0x393+8n (0≤n≤7)  取指监视点n配置4                     FWPnCFG4
+0x500             调试寄存器                           DBG
+0x501             调试异常返回地址                     DERA
+0x502             调试数据保存                         DSAVE
+================= ==================================== ==========
+
+ERA,TLBRERA,MERRERA和DERA有时也称为EPC,TLBREPC,MERREPC和DEPC。
+
+基础指令集
+==========
+
+指令格式
+--------
+
+LoongArch的指令字长为32位,一共有9种指令格式::
+
+  2R-type:    Opcode + Rj + Rd
+  3R-type:    Opcode + Rk + Rj + Rd
+  4R-type:    Opcode + Ra + Rk + Rj + Rd
+  2RI8-type:  Opcode + I8 + Rj + Rd
+  2RI12-type: Opcode + I12 + Rj + Rd
+  2RI14-type: Opcode + I14 + Rj + Rd
+  2RI16-type: Opcode + I16 + Rj + Rd
+  1RI21-type: Opcode + I21L + Rj + I21H
+  I26-type:   Opcode + I26L + I26H
+
+Opcode是指令操作码,Rj和Rk是源操作数(寄存器),Rd是目标操作数(寄存器),Ra是
+4R-type格式特有的附加操作数(寄存器)。I8/I12/I16/I21/I26分别是8位/12位/16位/
+21位/26位的立即数。其中21位和26位立即数在指令字中被分割为高位部分与低位部分,
+所以你们在这里的格式描述中能够看到I21L/I21H和I26L/I26H这样的表述。
+
+指令名称(助记符)
+------------------
+
+我们在此只简单罗列一下指令名称,详细信息请阅读参考文献中的文档。
+
+算术运算指令::
+
+  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
+  SLT SLTU SLTI SLTUI
+  AND OR NOR XOR ANDN ORN ANDI ORI XORI
+  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
+  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
+  PCADDI PCADDU12I PCADDU18I
+  LU12I.W LU32I.D LU52I.D ADDU16I.D
+
+移位运算指令::
+
+  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
+  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
+
+位域操作指令::
+
+  EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
+  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
+  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
+  MASKEQZ MASKNEZ
+
+分支转移指令::
+
+  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
+
+访存读写指令::
+
+  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
+  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
+  LDPTR.W LDPTR.D STPTR.W STPTR.D
+  PRELD PRELDX
+
+原子操作指令::
+
+  LL.W SC.W LL.D SC.D
+  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
+  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
+
+栅障指令::
+
+  IBAR DBAR
+
+特殊指令::
+
+  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
+
+特权指令::
+
+  CSRRD CSRWR CSRXCHG
+  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
+  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
+
+虚拟内存
+========
+
+LoongArch可以使用直接映射虚拟内存和分页映射虚拟内存。
+
+直接映射虚拟内存通过CSR.DMWn(n=0~3)来进行配置,虚拟地址(VA)和物理地址(PA)
+之间有简单的映射关系::
+
+ VA = PA + 固定偏移
+
+分页映射的虚拟地址(VA)和物理地址(PA)有任意的映射关系,这种关系记录在TLB和页
+表中。LoongArch的TLB包括一个全相联的MTLB(Multiple Page Size TLB,页大小可变)
+和一个组相联的STLB(Single Page Size TLB,页大小固定)。
+
+缺省状态下,LA32的整个虚拟地址空间配置如下:
+
+============ =========================== ===========================
+区段名       地址范围                    属性
+============ =========================== ===========================
+``UVRANGE``  ``0x00000000 - 0x7FFFFFFF`` 分页映射, 可缓存, PLV0~3
+``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` 直接映射, 非缓存, PLV0
+``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` 直接映射, 可缓存, PLV0
+``KVRANGE``  ``0xC0000000 - 0xFFFFFFFF`` 分页映射, 可缓存, PLV0
+============ =========================== ===========================
+
+用户态(PLV3)只能访问UVRANGE,对于直接映射的KPRANGE0和KPRANGE1,将虚拟地址的第
+29~31位清零就等于物理地址。例如:物理地址0x00001000对应的非缓存直接映射虚拟地址
+是0x80001000,而其可缓存直接映射虚拟地址是0xA0001000。
+
+缺省状态下,LA64的整个虚拟地址空间配置如下:
+
+============ ====================== ==================================
+区段名       地址范围               属性
+============ ====================== ==================================
+``XUVRANGE`` ``0x0000000000000000 - 分页映射, 可缓存, PLV0~3
+             0x3FFFFFFFFFFFFFFF``
+``XSPRANGE`` ``0x4000000000000000 - 直接映射, 可缓存 / 非缓存, PLV0
+             0x7FFFFFFFFFFFFFFF``
+``XKPRANGE`` ``0x8000000000000000 - 直接映射, 可缓存 / 非缓存, PLV0
+             0xBFFFFFFFFFFFFFFF``
+``XKVRANGE`` ``0xC000000000000000 - 分页映射, 可缓存, PLV0
+             0xFFFFFFFFFFFFFFFF``
+============ ====================== ==================================
+
+用户态(PLV3)只能访问XUVRANGE,对于直接映射的XSPRANGE和XKPRANGE,将虚拟地址的第
+60~63位清零就等于物理地址,而其缓存属性是通过虚拟地址的第60~61位配置的(0表示强序
+非缓存,1表示一致可缓存,2表示弱序非缓存)。目前,我们仅用XKPRANGE来进行直接映射,
+XSPRANGE保留给以后用。此处给出一个直接映射的例子:物理地址0x00000000 00001000的强
+序非缓存直接映射虚拟地址是0x80000000 00001000,其一致可缓存直接映射虚拟地址是
+0x90000000 00001000,而其弱序非缓存直接映射虚拟地址是0xA0000000 00001000。
+
+Loongson与LoongArch的关系
+=========================
+
+LoongArch是一种RISC指令集架构(ISA),不同于现存的任何一种ISA,而Loongson(即龙
+芯)是一个处理器家族。龙芯包括三个系列:Loongson-1(龙芯1号)是32位处理器系列,
+Loongson-2(龙芯2号)是低端64位处理器系列,而Loongson-3(龙芯3号)是高端64位处理
+器系列。旧的龙芯处理器基于MIPS架构,而新的龙芯处理器基于LoongArch架构。以龙芯3号
+为例:龙芯3A1000/3B1500/3A2000/3A3000/3A4000都是兼容MIPS的,而龙芯3A5000(以及将
+来的型号)都是基于LoongArch的。
+
+参考文献
+========
+
+Loongson与LoongArch的官方网站(龙芯中科技术股份有限公司):
+
+  http://www.loongson.cn/index.html
+
+Loongson与LoongArch的开发者网站(软件与文档资源):
+
+  http://www.loongnix.cn/index.php
+
+  https://github.com/loongson
+
+LoongArch指令集架构的文档:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-CN.pdf (中文版)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-EN.pdf (英文版)
+
+LoongArch的ELF ABI文档:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-CN.pdf (中文版)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-EN.pdf (英文版)
+
+Loongson与LoongArch的Linux内核源码仓库:
+
+  https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git
diff --git a/Documentation/translations/zh_CN/loongarch/irq-chip-model.rst b/Documentation/translations/zh_CN/loongarch/irq-chip-model.rst
new file mode 100644
index 000000000000..54c0c9ebac77
--- /dev/null
+++ b/Documentation/translations/zh_CN/loongarch/irq-chip-model.rst
@@ -0,0 +1,167 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/loongarch/irq-chip-model.rst
+:Translator: Huacai Chen <chenhuacai@loongson.cn>
+
+==================================
+LoongArch的IRQ芯片模型(层级关系)
+==================================
+
+目前,基于LoongArch的处理器(如龙芯3A5000)只能与LS7A芯片组配合工作。LoongArch计算机
+中的中断控制器(即IRQ芯片)包括CPUINTC(CPU Core Interrupt Controller)、LIOINTC(
+Legacy I/O Interrupt Controller)、EIOINTC(Extended I/O Interrupt Controller)、
+HTVECINTC(Hyper-Transport Vector Interrupt Controller)、PCH-PIC(LS7A芯片组的主中
+断控制器)、PCH-LPC(LS7A芯片组的LPC中断控制器)和PCH-MSI(MSI中断控制器)。
+
+CPUINTC是一种CPU内部的每个核本地的中断控制器,LIOINTC/EIOINTC/HTVECINTC是CPU内部的
+全局中断控制器(每个芯片一个,所有核共享),而PCH-PIC/PCH-LPC/PCH-MSI是CPU外部的中
+断控制器(在配套芯片组里面)。这些中断控制器(或者说IRQ芯片)以一种层次树的组织形式
+级联在一起,一共有两种层级关系模型(传统IRQ模型和扩展IRQ模型)。
+
+传统IRQ模型
+===========
+
+在这种模型里面,IPI(Inter-Processor Interrupt)和CPU本地时钟中断直接发送到CPUINTC,
+CPU串口(UARTs)中断发送到LIOINTC,而其他所有设备的中断则分别发送到所连接的PCH-PIC/
+PCH-LPC/PCH-MSI,然后被HTVECINTC统一收集,再发送到LIOINTC,最后到达CPUINTC。
+
+ +---------------------------------------------+
+ |::                                           |
+ |                                             |
+ |    +-----+     +---------+     +-------+    |
+ |    | IPI | --> | CPUINTC | <-- | Timer |    |
+ |    +-----+     +---------+     +-------+    |
+ |                     ^                       |
+ |                     |                       |
+ |                +---------+     +-------+    |
+ |                | LIOINTC | <-- | UARTs |    |
+ |                +---------+     +-------+    |
+ |                     ^                       |
+ |                     |                       |
+ |               +-----------+                 |
+ |               | HTVECINTC |                 |
+ |               +-----------+                 |
+ |                ^         ^                  |
+ |                |         |                  |
+ |          +---------+ +---------+            |
+ |          | PCH-PIC | | PCH-MSI |            |
+ |          +---------+ +---------+            |
+ |            ^     ^           ^              |
+ |            |     |           |              |
+ |    +---------+ +---------+ +---------+      |
+ |    | PCH-LPC | | Devices | | Devices |      |
+ |    +---------+ +---------+ +---------+      |
+ |         ^                                   |
+ |         |                                   |
+ |    +---------+                              |
+ |    | Devices |                              |
+ |    +---------+                              |
+ |                                             |
+ |                                             |
+ +---------------------------------------------+
+
+扩展IRQ模型
+===========
+
+在这种模型里面,IPI(Inter-Processor Interrupt)和CPU本地时钟中断直接发送到CPUINTC,
+CPU串口(UARTs)中断发送到LIOINTC,而其他所有设备的中断则分别发送到所连接的PCH-PIC/
+PCH-LPC/PCH-MSI,然后被EIOINTC统一收集,再直接到达CPUINTC。
+
+ +--------------------------------------------------------+
+ |::                                                      |
+ |                                                        |
+ |         +-----+     +---------+     +-------+          |
+ |         | IPI | --> | CPUINTC | <-- | Timer |          |
+ |         +-----+     +---------+     +-------+          |
+ |                      ^       ^                         |
+ |                      |       |                         |
+ |               +---------+ +---------+     +-------+    |
+ |               | EIOINTC | | LIOINTC | <-- | UARTs |    |
+ |               +---------+ +---------+     +-------+    |
+ |                ^       ^                               |
+ |                |       |                               |
+ |         +---------+ +---------+                        |
+ |         | PCH-PIC | | PCH-MSI |                        |
+ |         +---------+ +---------+                        |
+ |           ^     ^           ^                          |
+ |           |     |           |                          |
+ |   +---------+ +---------+ +---------+                  |
+ |   | PCH-LPC | | Devices | | Devices |                  |
+ |   +---------+ +---------+ +---------+                  |
+ |        ^                                               |
+ |        |                                               |
+ |   +---------+                                          |
+ |   | Devices |                                          |
+ |   +---------+                                          |
+ |                                                        |
+ |                                                        |
+ +--------------------------------------------------------+
+
+ACPI相关的定义
+==============
+
+CPUINTC::
+
+  ACPI_MADT_TYPE_CORE_PIC;
+  struct acpi_madt_core_pic;
+  enum acpi_madt_core_pic_version;
+
+LIOINTC::
+
+  ACPI_MADT_TYPE_LIO_PIC;
+  struct acpi_madt_lio_pic;
+  enum acpi_madt_lio_pic_version;
+
+EIOINTC::
+
+  ACPI_MADT_TYPE_EIO_PIC;
+  struct acpi_madt_eio_pic;
+  enum acpi_madt_eio_pic_version;
+
+HTVECINTC::
+
+  ACPI_MADT_TYPE_HT_PIC;
+  struct acpi_madt_ht_pic;
+  enum acpi_madt_ht_pic_version;
+
+PCH-PIC::
+
+  ACPI_MADT_TYPE_BIO_PIC;
+  struct acpi_madt_bio_pic;
+  enum acpi_madt_bio_pic_version;
+
+PCH-MSI::
+
+  ACPI_MADT_TYPE_MSI_PIC;
+  struct acpi_madt_msi_pic;
+  enum acpi_madt_msi_pic_version;
+
+PCH-LPC::
+
+  ACPI_MADT_TYPE_LPC_PIC;
+  struct acpi_madt_lpc_pic;
+  enum acpi_madt_lpc_pic_version;
+
+参考文献
+========
+
+龙芯3A5000的文档:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-CN.pdf (中文版)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-EN.pdf (英文版)
+
+龙芯LS7A芯片组的文档:
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-CN.pdf (中文版)
+
+  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-EN.pdf (英文版)
+
+注:CPUINTC即《龙芯架构参考手册卷一》第7.4节所描述的CSR.ECFG/CSR.ESTAT寄存器及其中断
+控制逻辑;LIOINTC即《龙芯3A5000处理器使用手册》第11.1节所描述的“传统I/O中断”;EIOINTC
+即《龙芯3A5000处理器使用手册》第11.2节所描述的“扩展I/O中断”;HTVECINTC即《龙芯3A5000
+处理器使用手册》第14.3节所描述的“HyperTransport中断”;PCH-PIC/PCH-MSI即《龙芯7A1000桥
+片用户手册》第5章所描述的“中断控制器”;PCH-LPC即《龙芯7A1000桥片用户手册》第24.3节所
+描述的“LPC中断”。
-- 
2.27.0
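
A note on the LA64 direct-mapping rule documented in introduction.rst
above (clear VA bits 60~63 to obtain the PA; VA bits 60~61 select the
cache attribute). The following is only a minimal user-space sketch of
that arithmetic, not kernel code: the helper names, the enum and the
macros are hypothetical and exist purely for illustration. The XKPRANGE
base and the attribute values 0/1/2 are taken from the document.

```c
#include <stdint.h>

/* Cache attribute carried in VA bits 60~61 (values from the document). */
enum la64_cache_attr {
	LA64_SUC = 0,	/* strongly-ordered uncached */
	LA64_CC  = 1,	/* coherent cached */
	LA64_WUC = 2,	/* weakly-ordered uncached */
};

/* Hypothetical names; XKPRANGE starts at 0x8000000000000000. */
#define LA64_XKPRANGE_BASE	0x8000000000000000ULL
#define LA64_PA_MASK		0x0FFFFFFFFFFFFFFFULL	/* VA bits 0~59 */

/* Build an XKPRANGE direct-mapped VA from a PA and a cache attribute. */
static inline uint64_t la64_dmw_va(uint64_t pa, enum la64_cache_attr attr)
{
	return LA64_XKPRANGE_BASE | ((uint64_t)attr << 60) |
	       (pa & LA64_PA_MASK);
}

/* Recover the PA by clearing VA bits 60~63. */
static inline uint64_t la64_dmw_pa(uint64_t va)
{
	return va & LA64_PA_MASK;
}
```

With PA 0x1000 this reproduces the three addresses given in the
document's example, one per cache attribute.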


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH V9 03/24] LoongArch: Add elf-related definitions
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
  2022-04-30  9:04 ` [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations Huacai Chen
  2022-04-30  9:04 ` [PATCH V9 02/24] Documentation/zh_CN: Add basic LoongArch documentations Huacai Chen
@ 2022-04-30  9:04 ` Huacai Chen
  2022-05-01  9:41   ` WANG Xuerui
  2022-04-30  9:04 ` [PATCH V9 04/24] LoongArch: Add writecombine support for drm Huacai Chen
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:04 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

Add elf-related definitions for LoongArch, including: EM_LOONGARCH,
KEXEC_ARCH_LOONGARCH, AUDIT_ARCH_LOONGARCH32, AUDIT_ARCH_LOONGARCH64
and NT_LOONGARCH_*.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
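
For reviewers who want to check the numeric values of the constants
named in the commit message, here is a small stand-alone sketch of how
they compose. It is not part of the patch. The SK_ prefixed names are
hypothetical stand-ins; the flag values mirror __AUDIT_ARCH_64BIT and
__AUDIT_ARCH_LE from include/uapi/linux/audit.h, and 258 is the
EM_LOONGARCH value added by this patch.

```c
#include <stdint.h>

/* Hypothetical local copies of the uapi values, for illustration only. */
#define SK_EM_LOONGARCH		258U		/* EM_LOONGARCH */
#define SK_AUDIT_ARCH_64BIT	0x80000000U	/* __AUDIT_ARCH_64BIT */
#define SK_AUDIT_ARCH_LE	0x40000000U	/* __AUDIT_ARCH_LE */

/* AUDIT_ARCH_LOONGARCH32: ELF machine number plus little-endian flag. */
static inline uint32_t sk_audit_arch_loongarch32(void)
{
	return SK_EM_LOONGARCH | SK_AUDIT_ARCH_LE;
}

/* AUDIT_ARCH_LOONGARCH64: additionally sets the 64-bit flag. */
static inline uint32_t sk_audit_arch_loongarch64(void)
{
	return SK_EM_LOONGARCH | SK_AUDIT_ARCH_64BIT | SK_AUDIT_ARCH_LE;
}

/* KEXEC_ARCH_LOONGARCH: ELF machine number in the upper 16 bits. */
static inline uint32_t sk_kexec_arch_loongarch(void)
{
	return SK_EM_LOONGARCH << 16;
}
```

The results are 0x40000102, 0xC0000102 and 0x01020000 respectively.
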
 include/uapi/linux/audit.h  | 2 ++
 include/uapi/linux/elf-em.h | 1 +
 include/uapi/linux/elf.h    | 5 +++++
 include/uapi/linux/kexec.h  | 1 +
 scripts/sorttable.c         | 5 +++++
 5 files changed, 14 insertions(+)

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 8eda133ca4c1..7c1dc818b1d5 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -439,6 +439,8 @@ enum {
 #define AUDIT_ARCH_UNICORE	(EM_UNICORE|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_X86_64	(EM_X86_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_XTENSA	(EM_XTENSA)
+#define AUDIT_ARCH_LOONGARCH32	(EM_LOONGARCH|__AUDIT_ARCH_LE)
+#define AUDIT_ARCH_LOONGARCH64	(EM_LOONGARCH|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
 
 #define AUDIT_PERM_EXEC		1
 #define AUDIT_PERM_WRITE	2
diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h
index f47e853546fa..ef38c2bc5ab7 100644
--- a/include/uapi/linux/elf-em.h
+++ b/include/uapi/linux/elf-em.h
@@ -51,6 +51,7 @@
 #define EM_RISCV	243	/* RISC-V */
 #define EM_BPF		247	/* Linux BPF - in-kernel virtual machine */
 #define EM_CSKY		252	/* C-SKY */
+#define EM_LOONGARCH	258	/* LoongArch */
 #define EM_FRV		0x5441	/* Fujitsu FR-V */
 
 /*
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index 7ce993e6786c..1e0ae3f554f6 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -436,6 +436,11 @@ typedef struct elf64_shdr {
 #define NT_MIPS_DSP	0x800		/* MIPS DSP ASE registers */
 #define NT_MIPS_FP_MODE	0x801		/* MIPS floating-point mode */
 #define NT_MIPS_MSA	0x802		/* MIPS SIMD registers */
+#define NT_LOONGARCH_CPUCFG	0xa00	/* LoongArch CPU config registers */
+#define NT_LOONGARCH_CSR	0xa01	/* LoongArch control and status registers */
+#define NT_LOONGARCH_LSX	0xa02	/* LoongArch Loongson SIMD Extension registers */
+#define NT_LOONGARCH_LASX	0xa03	/* LoongArch Loongson Advanced SIMD Extension registers */
+#define NT_LOONGARCH_LBT	0xa04	/* LoongArch Loongson Binary Translation registers */
 
 /* Note types with note name "GNU" */
 #define NT_GNU_PROPERTY_TYPE_0	5
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index fb7e2ef60825..981016e05cfa 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -43,6 +43,7 @@
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
 #define KEXEC_ARCH_AARCH64 (183 << 16)
 #define KEXEC_ARCH_RISCV   (243 << 16)
+#define KEXEC_ARCH_LOONGARCH	(258 << 16)
 
 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16
diff --git a/scripts/sorttable.c b/scripts/sorttable.c
index d00504c5f530..fba40e99f354 100644
--- a/scripts/sorttable.c
+++ b/scripts/sorttable.c
@@ -60,6 +60,10 @@
 #define EM_RISCV	243
 #endif
 
+#ifndef EM_LOONGARCH
+#define EM_LOONGARCH	258
+#endif
+
 static uint32_t (*r)(const uint32_t *);
 static uint16_t (*r2)(const uint16_t *);
 static uint64_t (*r8)(const uint64_t *);
@@ -313,6 +317,7 @@ static int do_file(char const *const fname, void *addr)
 	case EM_ARCOMPACT:
 	case EM_ARCV2:
 	case EM_ARM:
+	case EM_LOONGARCH:
 	case EM_MICROBLAZE:
 	case EM_MIPS:
 	case EM_XTENSA:
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH V9 04/24] LoongArch: Add writecombine support for drm
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (2 preceding siblings ...)
  2022-04-30  9:04 ` [PATCH V9 03/24] LoongArch: Add elf-related definitions Huacai Chen
@ 2022-04-30  9:04 ` Huacai Chen
  2022-04-30  9:04 ` [PATCH V9 05/24] LoongArch: Add build infrastructure Huacai Chen
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:04 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

LoongArch maintains cache coherency in hardware, but its WUC attribute
(Weak-ordered UnCached, which is similar to WC) is out of the scope of
the cache coherency mechanism. This means WUC can only be used for
write-only memory regions.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/gpu/drm/drm_vm.c         | 2 +-
 drivers/gpu/drm/ttm/ttm_module.c | 2 +-
 include/drm/drm_cache.h          | 8 ++++++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_vm.c b/drivers/gpu/drm/drm_vm.c
index e957d4851dc0..f024dc93939e 100644
--- a/drivers/gpu/drm/drm_vm.c
+++ b/drivers/gpu/drm/drm_vm.c
@@ -69,7 +69,7 @@ static pgprot_t drm_io_prot(struct drm_local_map *map,
 	pgprot_t tmp = vm_get_page_prot(vma->vm_flags);
 
 #if defined(__i386__) || defined(__x86_64__) || defined(__powerpc__) || \
-    defined(__mips__)
+    defined(__mips__) || defined(__loongarch__)
 	if (map->type == _DRM_REGISTERS && !(map->flags & _DRM_WRITE_COMBINING))
 		tmp = pgprot_noncached(tmp);
 	else
diff --git a/drivers/gpu/drm/ttm/ttm_module.c b/drivers/gpu/drm/ttm/ttm_module.c
index a3ad7c9736ec..b3fffe7b5062 100644
--- a/drivers/gpu/drm/ttm/ttm_module.c
+++ b/drivers/gpu/drm/ttm/ttm_module.c
@@ -74,7 +74,7 @@ pgprot_t ttm_prot_from_caching(enum ttm_caching caching, pgprot_t tmp)
 #endif /* CONFIG_UML */
 #endif /* __i386__ || __x86_64__ */
 #if defined(__ia64__) || defined(__arm__) || defined(__aarch64__) || \
-	defined(__powerpc__) || defined(__mips__)
+	defined(__powerpc__) || defined(__mips__) || defined(__loongarch__)
 	if (caching == ttm_write_combined)
 		tmp = pgprot_writecombine(tmp);
 	else
diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
index 22deb216b59c..08e0e3ffad13 100644
--- a/include/drm/drm_cache.h
+++ b/include/drm/drm_cache.h
@@ -67,6 +67,14 @@ static inline bool drm_arch_can_wc_memory(void)
 	 * optimization entirely for ARM and arm64.
 	 */
 	return false;
+#elif defined(CONFIG_LOONGARCH)
+	/*
+	 * LoongArch maintains cache coherency in hardware, but its WUC attribute
+	 * (Weak-ordered UnCached, which is similar to WC) is out of the scope of
+	 * the cache coherency mechanism. This means WUC can only be used for
+	 * write-only memory regions.
+	 */
+	return false;
 #else
 	return true;
 #endif
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH V9 05/24] LoongArch: Add build infrastructure
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (3 preceding siblings ...)
  2022-04-30  9:04 ` [PATCH V9 04/24] LoongArch: Add writecombine support for drm Huacai Chen
@ 2022-04-30  9:04 ` Huacai Chen
  2022-05-01 10:09   ` WANG Xuerui
  2022-04-30  9:05 ` [PATCH V9 06/24] LoongArch: Add CPU definition headers Huacai Chen
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:04 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds the Kbuild, Makefile, Kconfig and linker script files
that make up the LoongArch build infrastructure.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
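
A note on the "Page Table Layout" choice in arch/loongarch/Kconfig
below: the virtual address widths quoted in the help texts can be
reproduced with a simple formula. The sketch below is an inference that
happens to match all six help texts, assuming every table level occupies
exactly one page of 8-byte entries; the function name is hypothetical
and this is not code from the patch.

```c
/*
 * Virtual address bits covered by a page table layout, assuming each
 * level is one page holding 2^(page_shift - 3) eight-byte entries:
 *
 *   va_bits = page_shift + levels * (page_shift - 3)
 */
static inline unsigned int sk_va_bits(unsigned int page_shift,
				      unsigned int levels)
{
	return page_shift + levels * (page_shift - 3);
}
```

For example, 16KB pages (page_shift 14) with 3 levels give
14 + 3 * 11 = 47 bits, and 4KB pages (page_shift 12) with 4 levels give
12 + 4 * 9 = 48 bits.
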
 arch/loongarch/.gitignore              |   9 +
 arch/loongarch/Kbuild                  |   3 +
 arch/loongarch/Kconfig                 | 351 +++++++++++++++++++++++++
 arch/loongarch/Kconfig.debug           |   0
 arch/loongarch/Makefile                |  99 +++++++
 arch/loongarch/include/asm/Kbuild      |  29 ++
 arch/loongarch/include/uapi/asm/Kbuild |   2 +
 arch/loongarch/kernel/Makefile         |  22 ++
 arch/loongarch/kernel/vmlinux.lds.S    | 100 +++++++
 arch/loongarch/lib/Makefile            |   7 +
 arch/loongarch/mm/Makefile             |   9 +
 arch/loongarch/pci/Makefile            |   7 +
 scripts/subarch.include                |   2 +-
 13 files changed, 639 insertions(+), 1 deletion(-)
 create mode 100644 arch/loongarch/.gitignore
 create mode 100644 arch/loongarch/Kbuild
 create mode 100644 arch/loongarch/Kconfig
 create mode 100644 arch/loongarch/Kconfig.debug
 create mode 100644 arch/loongarch/Makefile
 create mode 100644 arch/loongarch/include/asm/Kbuild
 create mode 100644 arch/loongarch/include/uapi/asm/Kbuild
 create mode 100644 arch/loongarch/kernel/Makefile
 create mode 100644 arch/loongarch/kernel/vmlinux.lds.S
 create mode 100644 arch/loongarch/lib/Makefile
 create mode 100644 arch/loongarch/mm/Makefile
 create mode 100644 arch/loongarch/pci/Makefile

diff --git a/arch/loongarch/.gitignore b/arch/loongarch/.gitignore
new file mode 100644
index 000000000000..fd88d21e7172
--- /dev/null
+++ b/arch/loongarch/.gitignore
@@ -0,0 +1,9 @@
+*.lds
+*.raw
+calc_vmlinuz_load_addr
+elf-entry
+relocs
+vmlinux*
+vmlinuz*
+
+!kernel/vmlinux.lds.S
diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
new file mode 100644
index 000000000000..1ad35aabdd16
--- /dev/null
+++ b/arch/loongarch/Kbuild
@@ -0,0 +1,3 @@
+obj-y += kernel/
+obj-y += mm/
+obj-y += vdso/
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
new file mode 100644
index 000000000000..44b763046893
--- /dev/null
+++ b/arch/loongarch/Kconfig
@@ -0,0 +1,351 @@
+# SPDX-License-Identifier: GPL-2.0
+config LOONGARCH
+	bool
+	default y
+	select ACPI_MCFG if ACPI
+	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
+	select ARCH_BINFMT_ELF_STATE
+	select ARCH_ENABLE_MEMORY_HOTPLUG
+	select ARCH_ENABLE_MEMORY_HOTREMOVE
+	select ARCH_HAS_ACPI_TABLE_UPGRADE	if ACPI
+	select ARCH_HAS_PTE_SPECIAL
+	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
+	select ARCH_INLINE_READ_LOCK if !PREEMPTION
+	select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION
+	select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION
+	select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION
+	select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION
+	select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION
+	select ARCH_MIGHT_HAVE_PC_PARPORT
+	select ARCH_MIGHT_HAVE_PC_SERIO
+	select ARCH_SPARSEMEM_ENABLE
+	select ARCH_SUPPORTS_ACPI
+	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_HUGETLBFS
+	select ARCH_USE_BUILTIN_BSWAP
+	select ARCH_USE_CMPXCHG_LOCKREF
+	select ARCH_USE_QUEUED_RWLOCKS
+	select ARCH_USE_QUEUED_SPINLOCKS
+	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
+	select ARCH_WANTS_NO_INSTR
+	select BUILDTIME_TABLE_SORT
+	select COMMON_CLK
+	select GENERIC_CLOCKEVENTS
+	select GENERIC_CMOS_UPDATE
+	select GENERIC_CPU_AUTOPROBE
+	select GENERIC_ENTRY
+	select GENERIC_FIND_FIRST_BIT
+	select GENERIC_GETTIMEOFDAY
+	select GENERIC_IRQ_MULTI_HANDLER
+	select GENERIC_IRQ_PROBE
+	select GENERIC_IRQ_SHOW
+	select GENERIC_LIB_ASHLDI3
+	select GENERIC_LIB_ASHRDI3
+	select GENERIC_LIB_CMPDI2
+	select GENERIC_LIB_LSHRDI3
+	select GENERIC_LIB_UCMPDI2
+	select GENERIC_PCI_IOMAP
+	select GENERIC_SCHED_CLOCK
+	select GENERIC_TIME_VSYSCALL
+	select GPIOLIB
+	select HAVE_ARCH_AUDITSYSCALL
+	select HAVE_ARCH_COMPILER_H
+	select HAVE_ARCH_MMAP_RND_BITS if MMU
+	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_TRACEHOOK
+	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
+	select HAVE_ASM_MODVERSIONS
+	select HAVE_CONTEXT_TRACKING
+	select HAVE_COPY_THREAD_TLS
+	select HAVE_DEBUG_KMEMLEAK
+	select HAVE_DEBUG_STACKOVERFLOW
+	select HAVE_DMA_CONTIGUOUS
+	select HAVE_EXIT_THREAD
+	select HAVE_FAST_GUP
+	select HAVE_GENERIC_VDSO
+	select HAVE_IOREMAP_PROT
+	select HAVE_IRQ_EXIT_ON_IRQ_STACK
+	select HAVE_IRQ_TIME_ACCOUNTING
+	select HAVE_MEMBLOCK
+	select HAVE_MEMBLOCK_NODE_MAP
+	select HAVE_MOD_ARCH_SPECIFIC
+	select HAVE_NMI
+	select HAVE_PCI
+	select HAVE_PERF_EVENTS
+	select HAVE_REGS_AND_STACK_ACCESS_API
+	select HAVE_RSEQ
+	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_TIF_NOHZ
+	select HAVE_VIRT_CPU_ACCOUNTING_GEN
+	select IRQ_FORCED_THREADING
+	select IRQ_LOONGARCH_CPU
+	select MODULES_USE_ELF_RELA if MODULES
+	select PCI
+	select PCI_DOMAINS_GENERIC
+	select PCI_ECAM if ACPI
+	select PCI_MSI_ARCH_FALLBACKS
+	select PERF_USE_VMALLOC
+	select RTC_LIB
+	select SPARSE_IRQ
+	select SYSCTL_EXCEPTION_TRACE
+	select SWIOTLB
+	select TRACE_IRQFLAGS_SUPPORT
+	select ZONE_DMA32
+
+config 32BIT
+	bool
+
+config 64BIT
+	def_bool y
+
+config CPU_HAS_FPU
+	bool
+	default y
+
+config CPU_HAS_PREFETCH
+	bool
+	default y
+
+config GENERIC_CALIBRATE_DELAY
+	def_bool y
+
+config GENERIC_CSUM
+	def_bool y
+
+config GENERIC_HWEIGHT
+	def_bool y
+
+config L1_CACHE_SHIFT
+	int
+	default "6"
+
+config LOCKDEP_SUPPORT
+	bool
+	default y
+
+config MACH_LOONGSON32
+	def_bool 32BIT
+
+config MACH_LOONGSON64
+	def_bool 64BIT
+
+config PAGE_SIZE_4KB
+	bool
+
+config PAGE_SIZE_16KB
+	bool
+
+config PAGE_SIZE_64KB
+	bool
+
+config PGTABLE_2LEVEL
+	bool
+
+config PGTABLE_3LEVEL
+	bool
+
+config PGTABLE_4LEVEL
+	bool
+
+config PGTABLE_LEVELS
+	int
+	default 2 if PGTABLE_2LEVEL
+	default 3 if PGTABLE_3LEVEL
+	default 4 if PGTABLE_4LEVEL
+
+config SCHED_OMIT_FRAME_POINTER
+	bool
+	default y
+
+menu "Kernel type"
+
+source "kernel/Kconfig.hz"
+
+choice
+	prompt "Page Table Layout"
+	default 16KB_2LEVEL if 32BIT
+	default 16KB_3LEVEL if 64BIT
+	help
+	  Allows choosing the page table layout, which is a combination
+	  of page size and page table levels. The virtual memory address
+	  space bits are determined by the page table layout.
+
+config 4KB_3LEVEL
+	bool "4KB with 3 levels"
+	select PAGE_SIZE_4KB
+	select PGTABLE_3LEVEL
+	help
+	  This option selects 4KB page size with 3 level page tables, which
+	  support a maximum 39 bits of application virtual memory.
+
+config 4KB_4LEVEL
+	bool "4KB with 4 levels"
+	select PAGE_SIZE_4KB
+	select PGTABLE_4LEVEL
+	help
+	  This option selects 4KB page size with 4 level page tables, which
+	  support a maximum 48 bits of application virtual memory.
+
+config 16KB_2LEVEL
+	bool "16KB with 2 levels"
+	select PAGE_SIZE_16KB
+	select PGTABLE_2LEVEL
+	help
+	  This option selects 16KB page size with 2 level page tables, which
+	  support a maximum 36 bits of application virtual memory.
+
+config 16KB_3LEVEL
+	bool "16KB with 3 levels"
+	select PAGE_SIZE_16KB
+	select PGTABLE_3LEVEL
+	help
+	  This option selects 16KB page size with 3 level page tables, which
+	  support a maximum 47 bits of application virtual memory.
+
+config 64KB_2LEVEL
+	bool "64KB with 2 levels"
+	select PAGE_SIZE_64KB
+	select PGTABLE_2LEVEL
+	help
+	  This option selects 64KB page size with 2 level page tables, which
+	  support a maximum 42 bits of application virtual memory.
+
+config 64KB_3LEVEL
+	bool "64KB with 3 levels"
+	select PAGE_SIZE_64KB
+	select PGTABLE_3LEVEL
+	help
+	  This option selects 64KB page size with 3 level page tables, which
+	  support a maximum 55 bits of application virtual memory.
+
+endchoice
+
+config DMI
+	bool "Enable DMI scanning"
+	select DMI_SCAN_MACHINE_NON_EFI_FALLBACK
+	default y
+	help
+	  Enable scanning of DMI to identify machine quirks. Say Y
+	  here unless you have verified that your setup is not
+	  affected by entries in the DMI blacklist. Required by PNP
+	  BIOS code.
+
+config EFI
+	bool "EFI runtime service support"
+	select UCS2_STRING
+	select EFI_RUNTIME_WRAPPERS
+	help
+	  This enables the kernel to use EFI runtime services that are
+	  available (such as the EFI variable services).
+
+	  This option is only useful on systems that have EFI firmware.
+	  In addition, you should use the latest ELILO loader available
+	  at <http://elilo.sourceforge.net> in order to take advantage
+	  of EFI runtime services. However, even with this option, the
+	  resultant kernel should continue to boot on existing non-EFI
+	  platforms.
+
+config FORCE_MAX_ZONEORDER
+	int "Maximum zone order"
+	range 14 64 if PAGE_SIZE_64KB
+	default "14" if PAGE_SIZE_64KB
+	range 12 64 if PAGE_SIZE_16KB
+	default "12" if PAGE_SIZE_16KB
+	range 11 64
+	default "11"
+	help
+	  The kernel memory allocator divides physically contiguous memory
+	  blocks into "zones", where each zone is a power of two number of
+	  pages.  This option selects the largest power of two that the kernel
+	  keeps in the memory allocator.  If you need to allocate very large
+	  blocks of physically contiguous memory, then you may need to
+	  increase this value.
+
+	  This config option is actually maximum order plus one. For example,
+	  a value of 11 means that the largest free memory block is 2^10 pages.
+
+	  The page size is not necessarily 4KB.  Keep this in mind
+	  when choosing a value for this option.
+
+config SECCOMP
+	bool "Enable seccomp to safely compute untrusted bytecode"
+	depends on PROC_FS
+	default y
+	help
+	  This kernel feature is useful for number crunching applications
+	  that may need to compute untrusted bytecode during their
+	  execution. By using pipes or other transports made available to
+	  the process as file descriptors supporting the read/write
+	  syscalls, it's possible to isolate those applications in
+	  their own address space using seccomp. Once seccomp is
+	  enabled via /proc/<pid>/seccomp, it cannot be disabled
+	  and the task is only allowed to execute a few safe syscalls
+	  defined by each seccomp mode.
+
+	  If unsure, say Y. Only embedded systems should say N here.
+
+endmenu
+
+config ARCH_SELECT_MEMORY_MODEL
+	def_bool y
+
+config ARCH_FLATMEM_ENABLE
+	def_bool y
+
+config ARCH_SPARSEMEM_ENABLE
+	def_bool y
+	help
+	  Say Y to support efficient handling of sparse physical memory,
+	  for architectures which are either NUMA (Non-Uniform Memory Access)
+	  or have huge holes in the physical address space for other reasons.
+	  See <file:Documentation/vm/numa.rst> for more.
+
+config ARCH_ENABLE_THP_MIGRATION
+	def_bool y
+	depends on TRANSPARENT_HUGEPAGE
+
+config ARCH_MEMORY_PROBE
+	def_bool y
+	depends on MEMORY_HOTPLUG
+
+config MMU
+	bool
+	default y
+
+config ARCH_MMAP_RND_BITS_MIN
+	default 12
+
+config ARCH_MMAP_RND_BITS_MAX
+	default 18
+
+menu "Bus options"
+
+endmenu
+
+menu "Power management options"
+
+source "drivers/acpi/Kconfig"
+
+endmenu
+
+source "drivers/firmware/Kconfig"
diff --git a/arch/loongarch/Kconfig.debug b/arch/loongarch/Kconfig.debug
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
new file mode 100644
index 000000000000..0a40e79b3265
--- /dev/null
+++ b/arch/loongarch/Makefile
@@ -0,0 +1,99 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Author: Huacai Chen <chenhuacai@loongson.cn>
+# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+
+#
+# Select the object file format to substitute into the linker script.
+#
+64bit-tool-archpref	= loongarch64
+32bit-bfd		= elf32-loongarch
+64bit-bfd		= elf64-loongarch
+32bit-emul		= elf32loongarch
+64bit-emul		= elf64loongarch
+
+ifdef CONFIG_64BIT
+tool-archpref		= $(64bit-tool-archpref)
+UTS_MACHINE		:= loongarch64
+endif
+
+ifneq ($(SUBARCH),$(ARCH))
+  ifeq ($(CROSS_COMPILE),)
+    CROSS_COMPILE := $(call cc-cross-prefix, $(tool-archpref)-linux-  $(tool-archpref)-linux-gnu-  $(tool-archpref)-unknown-linux-gnu-)
+  endif
+endif
+
+cflags-y += $(call cc-option, -mno-check-zero-division)
+
+ifdef CONFIG_64BIT
+ld-emul			= $(64bit-emul)
+cflags-y		+= -mabi=lp64s
+endif
+
+all-y			:= vmlinux
+
+#
+# GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
+# code since it only slows down the whole thing.  At some point we might make
+# use of global pointer optimizations but their use of $r2 conflicts with
+# the current pointer optimization.
+#
+cflags-y			+= -G0 -pipe
+cflags-y			+= -msoft-float
+LDFLAGS_vmlinux			+= -G0 -static -n -nostdlib
+KBUILD_AFLAGS_KERNEL		+= -Wa,-mla-global-with-pcrel
+KBUILD_CFLAGS_KERNEL		+= -Wa,-mla-global-with-pcrel
+KBUILD_AFLAGS_MODULE		+= -Wa,-mla-global-with-abs
+KBUILD_CFLAGS_MODULE		+= -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
+
+cflags-y += -ffreestanding
+cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
+
+load-y		= 0x9000000000200000
+bootvars-y	= VMLINUX_LOAD_ADDRESS=$(load-y)
+
+drivers-$(CONFIG_PCI)		+= arch/loongarch/pci/
+
+KBUILD_AFLAGS	+= $(cflags-y)
+KBUILD_CFLAGS	+= $(cflags-y)
+KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y)
+
+# This is required to get dwarf unwinding tables into .debug_frame
+# instead of .eh_frame so we don't discard them.
+KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
+KBUILD_CFLAGS += -isystem $(shell $(CC) -print-file-name=include)
+KBUILD_CFLAGS += $(call cc-option,-mstrict-align)
+
+KBUILD_LDFLAGS	+= -m $(ld-emul)
+
+ifdef CONFIG_LOONGARCH
+CHECKFLAGS += $(shell $(CC) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
+	egrep -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \
+	sed -e "s/^\#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/\$$/&&/g')
+endif
+
+head-y := arch/loongarch/kernel/head.o
+
+libs-y += arch/loongarch/lib/
+
+prepare: vdso_prepare
+vdso_prepare: prepare0
+	$(Q)$(MAKE) $(build)=arch/loongarch/vdso include/generated/vdso-offsets.h
+
+PHONY += vdso_install
+vdso_install:
+	$(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
+
+all:	$(all-y)
+
+CLEAN_FILES += vmlinux
+
+install:
+	$(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
+	$(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
+	$(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
+
+define archhelp
+	echo '  install              - install kernel into $(INSTALL_PATH)'
+	echo
+endef
diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
new file mode 100644
index 000000000000..a0eed6076c79
--- /dev/null
+++ b/arch/loongarch/include/asm/Kbuild
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: GPL-2.0
+generic-y += dma-contiguous.h
+generic-y += export.h
+generic-y += mcs_spinlock.h
+generic-y += parport.h
+generic-y += early_ioremap.h
+generic-y += qrwlock.h
+generic-y += qspinlock.h
+generic-y += rwsem.h
+generic-y += segment.h
+generic-y += user.h
+generic-y += stat.h
+generic-y += fcntl.h
+generic-y += ioctl.h
+generic-y += ioctls.h
+generic-y += mman.h
+generic-y += msgbuf.h
+generic-y += sembuf.h
+generic-y += shmbuf.h
+generic-y += statfs.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += termios.h
+generic-y += termbits.h
+generic-y += poll.h
+generic-y += param.h
+generic-y += posix_types.h
+generic-y += resource.h
+generic-y += kvm_para.h
diff --git a/arch/loongarch/include/uapi/asm/Kbuild b/arch/loongarch/include/uapi/asm/Kbuild
new file mode 100644
index 000000000000..4aa680ca2e5f
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/Kbuild
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+generic-y += kvm_para.h
diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
new file mode 100644
index 000000000000..ead27a11e8e0
--- /dev/null
+++ b/arch/loongarch/kernel/Makefile
@@ -0,0 +1,22 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the Linux/LoongArch kernel.
+#
+
+extra-y		:= head.o vmlinux.lds
+
+obj-y		+= cpu-probe.o cacheinfo.o cmdline.o env.o setup.o entry.o genex.o \
+		   traps.o irq.o idle.o process.o dma.o mem.o io.o reset.o switch.o \
+		   elf.o rtc.o syscall.o signal.o time.o topology.o cmpxchg.o \
+		   inst.o ptrace.o vdso.o
+
+obj-$(CONFIG_ACPI)		+= acpi.o
+obj-$(CONFIG_EFI) 		+= efi.o
+
+obj-$(CONFIG_CPU_HAS_FPU)	+= fpu.o
+
+obj-$(CONFIG_MODULES)		+= module.o module-sections.o
+
+obj-$(CONFIG_PROC_FS)		+= proc.o
+
+CPPFLAGS_vmlinux.lds		:= $(KBUILD_CFLAGS)
diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
new file mode 100644
index 000000000000..02abfaaa4892
--- /dev/null
+++ b/arch/loongarch/kernel/vmlinux.lds.S
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/sizes.h>
+#include <asm/asm-offsets.h>
+#include <asm/thread_info.h>
+
+#define PAGE_SIZE _PAGE_SIZE
+
+/*
+ * Put .bss..swapper_pg_dir as the first thing in .bss. This will
+ * ensure that it has .bss alignment (64K).
+ */
+#define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
+
+#include <asm-generic/vmlinux.lds.h>
+
+OUTPUT_ARCH(loongarch)
+ENTRY(kernel_entry)
+PHDRS {
+	text PT_LOAD FLAGS(7);	/* RWX */
+	note PT_NOTE FLAGS(4);	/* R__ */
+}
+
+jiffies	 = jiffies_64;
+
+SECTIONS
+{
+	. = VMLINUX_LOAD_ADDRESS;
+
+	_text = .;
+	.text : {
+		TEXT_TEXT
+		SCHED_TEXT
+		CPUIDLE_TEXT
+		LOCK_TEXT
+		KPROBES_TEXT
+		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
+		*(.fixup)
+		*(.gnu.warning)
+	} :text = 0
+	_etext = .;
+
+	EXCEPTION_TABLE(16)
+
+	. = ALIGN(PAGE_SIZE);
+	__init_begin = .;
+	__inittext_begin = .;
+
+	INIT_TEXT_SECTION(PAGE_SIZE)
+	.exit.text : {
+		EXIT_TEXT
+	}
+
+	__inittext_end = .;
+
+	__initdata_begin = .;
+
+	INIT_DATA_SECTION(16)
+	.exit.data : {
+		EXIT_DATA
+	}
+
+	__initdata_end = .;
+
+	__init_end = .;
+
+	_sdata = .;
+	RO_DATA(4096)
+	RW_DATA(1 << CONFIG_L1_CACHE_SHIFT, PAGE_SIZE, THREAD_SIZE)
+
+	.sdata : {
+		*(.sdata)
+	}
+
+	. = ALIGN(SZ_64K);
+	_edata =  .;
+
+	BSS_SECTION(0, SZ_64K, 8)
+
+	_end = .;
+
+	STABS_DEBUG
+	DWARF_DEBUG
+
+	.gptab.sdata : {
+		*(.gptab.data)
+		*(.gptab.sdata)
+	}
+	.gptab.sbss : {
+		*(.gptab.bss)
+		*(.gptab.sbss)
+	}
+
+	DISCARDS
+	/DISCARD/ : {
+		*(.gnu.attributes)
+		*(.options)
+		*(.eh_frame)
+	}
+}
diff --git a/arch/loongarch/lib/Makefile b/arch/loongarch/lib/Makefile
new file mode 100644
index 000000000000..7f32f3e4a6ec
--- /dev/null
+++ b/arch/loongarch/lib/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for LoongArch-specific library files.
+#
+
+lib-y	+= delay.o memset.o memcpy.o memmove.o \
+	   clear_user.o copy_user.o dump_tlb.o
diff --git a/arch/loongarch/mm/Makefile b/arch/loongarch/mm/Makefile
new file mode 100644
index 000000000000..8ffc6383f836
--- /dev/null
+++ b/arch/loongarch/mm/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the Linux/LoongArch-specific parts of the memory manager.
+#
+
+obj-y				+= init.o cache.o tlb.o tlbex.o extable.o \
+				   fault.o ioremap.o maccess.o mmap.o pgtable.o page.o
+
+obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
diff --git a/arch/loongarch/pci/Makefile b/arch/loongarch/pci/Makefile
new file mode 100644
index 000000000000..8101ef3df71c
--- /dev/null
+++ b/arch/loongarch/pci/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the PCI specific kernel interface routines under Linux.
+#
+
+obj-y				+= pci.o
+obj-$(CONFIG_ACPI)		+= acpi.o
diff --git a/scripts/subarch.include b/scripts/subarch.include
index 776849a3c500..4bd327d0ae42 100644
--- a/scripts/subarch.include
+++ b/scripts/subarch.include
@@ -10,4 +10,4 @@ SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
 				  -e s/s390x/s390/ \
 				  -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
 				  -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ \
-				  -e s/riscv.*/riscv/)
+				  -e s/riscv.*/riscv/ -e s/loongarch.*/loongarch/)
-- 
2.27.0
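
Note on the "Page Table Layout" and FORCE_MAX_ZONEORDER help texts in the
Kconfig above: the quoted figures can be cross-checked with a little
arithmetic. The sketch below is not part of the patch. It assumes that every
page-table level occupies exactly one page of 8-byte entries, and the helper
names va_bits() and max_block_bytes() are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch.
 * Assumption: each table level is one page of 8-byte entries, so one
 * level resolves (page_shift - 3) bits and the page offset adds
 * page_shift more.
 */
static inline unsigned int va_bits(unsigned int page_shift, unsigned int levels)
{
	return page_shift + levels * (page_shift - 3);
}

/*
 * Hypothetical helper, not part of the patch.
 * FORCE_MAX_ZONEORDER is "maximum order plus one", so the largest free
 * block is 2^(max_zoneorder - 1) pages.
 */
static inline uint64_t max_block_bytes(unsigned int page_shift,
				       unsigned int max_zoneorder)
{
	return (uint64_t)1 << (page_shift + max_zoneorder - 1);
}
```

Under these assumptions va_bits(14, 3) gives the 47 bits quoted for
16KB_3LEVEL, and max_block_bytes(14, 12) gives 32MB for the default order
with 16KB pages.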


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH V9 06/24] LoongArch: Add CPU definition headers
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (4 preceding siblings ...)
  2022-04-30  9:04 ` [PATCH V9 05/24] LoongArch: Add build infrastructure Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-05-01 11:05   ` WANG Xuerui
  2022-04-30  9:05 ` [PATCH V9 07/24] LoongArch: Add atomic/locking headers Huacai Chen
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

Add common headers (CPU definitions and address space layout) for basic
LoongArch support.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
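The PRId layout documented in cpu.h (company ID in bits 23:16, processor ID
in bits 15:8, revision in bits 7:0) can be decoded as in the minimal sketch
below. The masks and IDs are copied from this patch; the helper names and any
sample PRId value passed to them are hypothetical and only illustrate the
field split.

```c
#include <stdint.h>

/* Masks and IDs as defined in cpu.h in this patch. */
#define PRID_COMP_MASK		0xff0000
#define PRID_IMP_MASK		0xff00
#define PRID_REV_MASK		0x00ff
#define PRID_COMP_LOONGSON	0x140000
#define PRID_IMP_LOONGSON_64G	0xc000

/* Hypothetical helper: nonzero if PRId names a generic Loongson 64-bit core. */
static inline int prid_is_loongson_64g(uint32_t prid)
{
	return (prid & PRID_COMP_MASK) == PRID_COMP_LOONGSON &&
	       (prid & PRID_IMP_MASK) == PRID_IMP_LOONGSON_64G;
}

/* Hypothetical helper: extract the revision field (bits 7:0). */
static inline uint32_t prid_revision(uint32_t prid)
{
	return prid & PRID_REV_MASK;
}
```

In the kernel the raw value would come from read_cpucfg(); comparing against
the masked constants, rather than shifting fields out, matches how the
PRID_* values are written in the header.
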
 arch/loongarch/include/asm/addrspace.h    |  110 ++
 arch/loongarch/include/asm/cpu-features.h |   69 +
 arch/loongarch/include/asm/cpu-info.h     |  136 ++
 arch/loongarch/include/asm/cpu.h          |  127 ++
 arch/loongarch/include/asm/fpregdef.h     |   49 +
 arch/loongarch/include/asm/loongarch.h    | 1528 +++++++++++++++++++++
 arch/loongarch/include/asm/loongson.h     |  159 +++
 arch/loongarch/include/asm/regdef.h       |   43 +
 8 files changed, 2221 insertions(+)
 create mode 100644 arch/loongarch/include/asm/addrspace.h
 create mode 100644 arch/loongarch/include/asm/cpu-features.h
 create mode 100644 arch/loongarch/include/asm/cpu-info.h
 create mode 100644 arch/loongarch/include/asm/cpu.h
 create mode 100644 arch/loongarch/include/asm/fpregdef.h
 create mode 100644 arch/loongarch/include/asm/loongarch.h
 create mode 100644 arch/loongarch/include/asm/loongson.h
 create mode 100644 arch/loongarch/include/asm/regdef.h
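
The TO_PHYS/TO_CAC/TO_UNCAC macros in addrspace.h only swap the window prefix
above the low DMW_PABITS (48) bits. A minimal user-space sketch follows. The
two base values are assumptions: CSR_DMW0_BASE and CSR_DMW1_BASE live in
loongarch.h, and the cached base used here is inferred from the load address
0x9000000000200000 in arch/loongarch/Makefile.

```c
#include <stdint.h>

#define DMW_PABITS	48
#define TO_PHYS_MASK	((1ULL << DMW_PABITS) - 1)

/* Assumed window bases; the real values come from CSR_DMW0/1_BASE. */
#define UNCAC_BASE	0x8000000000000000ULL
#define CAC_BASE	0x9000000000000000ULL

/* Same shape as the macros in addrspace.h. */
#define TO_PHYS(x)	((x) & TO_PHYS_MASK)
#define TO_CAC(x)	(CAC_BASE | ((x) & TO_PHYS_MASK))
#define TO_UNCAC(x)	(UNCAC_BASE | ((x) & TO_PHYS_MASK))

/* Hypothetical helper: physical address of the kernel load address. */
static inline uint64_t load_address_phys(void)
{
	return TO_PHYS(0x9000000000200000ULL);
}
```

With these assumed bases the kernel load address maps to physical 0x200000,
and the same physical address is reachable uncached through the other window.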

diff --git a/arch/loongarch/include/asm/addrspace.h b/arch/loongarch/include/asm/addrspace.h
new file mode 100644
index 000000000000..e92541629d25
--- /dev/null
+++ b/arch/loongarch/include/asm/addrspace.h
@@ -0,0 +1,110 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_ADDRSPACE_H
+#define _ASM_ADDRSPACE_H
+
+#include <linux/const.h>
+
+#include <asm/loongarch.h>
+
+/*
+ * This gives the physical RAM offset.
+ */
+#ifndef __ASSEMBLY__
+#ifndef PHYS_OFFSET
+#define PHYS_OFFSET	_AC(0, UL)
+#endif
+extern unsigned long vm_map_base;
+#endif /* __ASSEMBLY__ */
+
+#ifndef IO_BASE
+#define IO_BASE			CSR_DMW0_BASE
+#endif
+
+#ifndef CAC_BASE
+#define CAC_BASE		CSR_DMW1_BASE
+#endif
+
+#ifndef UNCAC_BASE
+#define UNCAC_BASE		CSR_DMW0_BASE
+#endif
+
+#define DMW_PABITS	48
+#define TO_PHYS_MASK	((1ULL << DMW_PABITS) - 1)
+
+/*
+ * Memory above this physical address will be considered highmem.
+ */
+#ifndef HIGHMEM_START
+#define HIGHMEM_START		(_AC(1, UL) << _AC(DMW_PABITS, UL))
+#endif
+
+#define TO_PHYS(x)		(	      ((x) & TO_PHYS_MASK))
+#define TO_CAC(x)		(CAC_BASE   | ((x) & TO_PHYS_MASK))
+#define TO_UNCAC(x)		(UNCAC_BASE | ((x) & TO_PHYS_MASK))
+
+/*
+ * This handles the memory map.
+ */
+#ifndef PAGE_OFFSET
+#define PAGE_OFFSET		(CAC_BASE + PHYS_OFFSET)
+#endif
+
+#ifndef FIXADDR_TOP
+#define FIXADDR_TOP		((unsigned long)(long)(int)0xfffe0000)
+#endif
+
+/*
+ *  Configure language
+ */
+#ifdef __ASSEMBLY__
+#define _ATYPE_
+#define _ATYPE32_
+#define _ATYPE64_
+#define _CONST64_(x)	x
+#else
+#define _ATYPE_		__PTRDIFF_TYPE__
+#define _ATYPE32_	int
+#define _ATYPE64_	__s64
+#ifdef CONFIG_64BIT
+#define _CONST64_(x)	x ## L
+#else
+#define _CONST64_(x)	x ## LL
+#endif
+#endif
+
+/*
+ *  32/64-bit LoongArch address spaces
+ */
+#ifdef __ASSEMBLY__
+#define _ACAST32_
+#define _ACAST64_
+#else
+#define _ACAST32_		(_ATYPE_)(_ATYPE32_)	/* widen if necessary */
+#define _ACAST64_		(_ATYPE64_)		/* do _not_ narrow */
+#endif
+
+#ifdef CONFIG_32BIT
+
+#define UVRANGE			0x00000000
+#define KPRANGE0		0x80000000
+#define KPRANGE1		0xa0000000
+#define KVRANGE			0xc0000000
+
+#else
+
+#define XUVRANGE		_CONST64_(0x0000000000000000)
+#define XSPRANGE		_CONST64_(0x4000000000000000)
+#define XKPRANGE		_CONST64_(0x8000000000000000)
+#define XKVRANGE		_CONST64_(0xc000000000000000)
+
+#endif
+
+/*
+ * Returns the physical address of a KPRANGEx / XKPRANGE address
+ */
+#define PHYSADDR(a)		((_ACAST64_(a)) & TO_PHYS_MASK)
+
+#endif /* _ASM_ADDRSPACE_H */
diff --git a/arch/loongarch/include/asm/cpu-features.h b/arch/loongarch/include/asm/cpu-features.h
new file mode 100644
index 000000000000..e29d446112e8
--- /dev/null
+++ b/arch/loongarch/include/asm/cpu-features.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_CPU_FEATURES_H
+#define __ASM_CPU_FEATURES_H
+
+#include <asm/cpu.h>
+#include <asm/cpu-info.h>
+
+#define cpu_opt(opt)			(cpu_data[0].options & (opt))
+#define cpu_has(feat)			(cpu_data[0].options & BIT_ULL(feat))
+
+#define cpu_has_loongarch		(cpu_has_loongarch32 | cpu_has_loongarch64)
+#define cpu_has_loongarch32		(cpu_data[0].isa_level & LOONGARCH_CPU_ISA_32BIT)
+#define cpu_has_loongarch64		(cpu_data[0].isa_level & LOONGARCH_CPU_ISA_64BIT)
+
+#define cpu_icache_line_size()		cpu_data[0].icache.linesz
+#define cpu_dcache_line_size()		cpu_data[0].dcache.linesz
+#define cpu_vcache_line_size()		cpu_data[0].vcache.linesz
+#define cpu_scache_line_size()		cpu_data[0].scache.linesz
+
+#ifdef CONFIG_32BIT
+# define cpu_has_64bits			(cpu_data[0].isa_level & LOONGARCH_CPU_ISA_64BIT)
+# define cpu_vabits			31
+# define cpu_pabits			31
+#endif
+
+#ifdef CONFIG_64BIT
+# define cpu_has_64bits			1
+# define cpu_vabits			cpu_data[0].vabits
+# define cpu_pabits			cpu_data[0].pabits
+# define __NEED_ADDRBITS_PROBE
+#endif
+
+/*
+ * SMP assumption: Options of CPU 0 are a superset of all processors.
+ * This is true for all known LoongArch systems.
+ */
+#define cpu_has_cpucfg		cpu_opt(LOONGARCH_CPU_CPUCFG)
+#define cpu_has_lam		cpu_opt(LOONGARCH_CPU_LAM)
+#define cpu_has_ual		cpu_opt(LOONGARCH_CPU_UAL)
+#define cpu_has_fpu		cpu_opt(LOONGARCH_CPU_FPU)
+#define cpu_has_lsx		cpu_opt(LOONGARCH_CPU_LSX)
+#define cpu_has_lasx		cpu_opt(LOONGARCH_CPU_LASX)
+#define cpu_has_complex		cpu_opt(LOONGARCH_CPU_COMPLEX)
+#define cpu_has_crypto		cpu_opt(LOONGARCH_CPU_CRYPTO)
+#define cpu_has_lvz		cpu_opt(LOONGARCH_CPU_LVZ)
+#define cpu_has_lbt_x86		cpu_opt(LOONGARCH_CPU_LBT_X86)
+#define cpu_has_lbt_arm		cpu_opt(LOONGARCH_CPU_LBT_ARM)
+#define cpu_has_lbt_mips	cpu_opt(LOONGARCH_CPU_LBT_MIPS)
+#define cpu_has_lbt		(cpu_has_lbt_x86|cpu_has_lbt_arm|cpu_has_lbt_mips)
+#define cpu_has_csr		cpu_opt(LOONGARCH_CPU_CSR)
+#define cpu_has_tlb		cpu_opt(LOONGARCH_CPU_TLB)
+#define cpu_has_watch		cpu_opt(LOONGARCH_CPU_WATCH)
+#define cpu_has_vint		cpu_opt(LOONGARCH_CPU_VINT)
+#define cpu_has_csripi		cpu_opt(LOONGARCH_CPU_CSRIPI)
+#define cpu_has_extioi		cpu_opt(LOONGARCH_CPU_EXTIOI)
+#define cpu_has_prefetch	cpu_opt(LOONGARCH_CPU_PREFETCH)
+#define cpu_has_pmp		cpu_opt(LOONGARCH_CPU_PMP)
+#define cpu_has_perf		cpu_opt(LOONGARCH_CPU_PMP)
+#define cpu_has_scalefreq	cpu_opt(LOONGARCH_CPU_SCALEFREQ)
+#define cpu_has_flatmode	cpu_opt(LOONGARCH_CPU_FLATMODE)
+#define cpu_has_eiodecode	cpu_opt(LOONGARCH_CPU_EIODECODE)
+#define cpu_has_guestid		cpu_opt(LOONGARCH_CPU_GUESTID)
+#define cpu_has_hypervisor	cpu_opt(LOONGARCH_CPU_HYPERVISOR)
+
+
+#endif /* __ASM_CPU_FEATURES_H */
diff --git a/arch/loongarch/include/asm/cpu-info.h b/arch/loongarch/include/asm/cpu-info.h
new file mode 100644
index 000000000000..8c173ee5650b
--- /dev/null
+++ b/arch/loongarch/include/asm/cpu-info.h
@@ -0,0 +1,136 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_CPU_INFO_H
+#define __ASM_CPU_INFO_H
+
+#include <linux/cache.h>
+#include <linux/types.h>
+
+#include <asm/loongarch.h>
+
+/*
+ * Descriptor for a cache
+ */
+struct cache_desc {
+	unsigned int waysize;	/* Bytes per way */
+	unsigned short sets;	/* Number of lines per set */
+	unsigned char ways;	/* Number of ways */
+	unsigned char linesz;	/* Size of line in bytes */
+	unsigned char waybit;	/* Bits to select in a cache set */
+	unsigned char flags;	/* Flags describing cache properties */
+};
+
+struct cpuinfo_loongarch {
+	u64			asid_cache;
+	unsigned long		asid_mask;
+
+	/*
+	 * Capability and feature descriptor structure for LoongArch CPU
+	 */
+	unsigned long		ases;
+	unsigned long long	options;
+	unsigned int		processor_id;
+	unsigned int		fpu_vers;
+	unsigned int		fpu_csr0;
+	unsigned int		fpu_mask;
+	unsigned int		cputype;
+	int			isa_level;
+	int			tlbsize;
+	int			tlbsizemtlb;
+	int			tlbsizestlbsets;
+	int			tlbsizestlbways;
+	struct cache_desc	icache; /* Primary I-cache */
+	struct cache_desc	dcache; /* Primary D or combined I/D cache */
+	struct cache_desc	vcache; /* Victim cache, between pcache and scache */
+	struct cache_desc	scache; /* Secondary cache */
+	struct cache_desc	tcache; /* Tertiary/split secondary cache */
+	int			package;/* physical package number */
+	unsigned int		globalnumber;
+	int			vabits; /* Virtual Address size in bits */
+	int			pabits; /* Physical Address size in bits */
+	void			*data;	/* Additional data */
+	unsigned int		watch_dreg_count;   /* Number of data breakpoints */
+	unsigned int		watch_ireg_count;   /* Number of instruction breakpoints */
+	unsigned int		watch_reg_use_cnt; /* min(NUM_WATCH_REGS, watch_dreg_count + watch_ireg_count), Usable by ptrace */
+	unsigned int		kscratch_mask; /* Usable KScratch mask. */
+} __aligned(SMP_CACHE_BYTES);
+
+extern struct cpuinfo_loongarch cpu_data[];
+#define boot_cpu_data cpu_data[0]
+#define current_cpu_data cpu_data[smp_processor_id()]
+#define raw_current_cpu_data cpu_data[raw_smp_processor_id()]
+
+extern void cpu_probe(void);
+
+extern const char *__cpu_family[];
+extern const char *__cpu_full_name[];
+#define cpu_family_string()	__cpu_family[raw_smp_processor_id()]
+#define cpu_full_name_string()	__cpu_full_name[raw_smp_processor_id()]
+
+struct seq_file;
+struct notifier_block;
+
+extern int register_proc_cpuinfo_notifier(struct notifier_block *nb);
+extern int proc_cpuinfo_notifier_call_chain(unsigned long val, void *v);
+
+#define proc_cpuinfo_notifier(fn, pri)					\
+({									\
+	static struct notifier_block fn##_nb = {			\
+		.notifier_call = fn,					\
+		.priority = pri						\
+	};								\
+									\
+	register_proc_cpuinfo_notifier(&fn##_nb);			\
+})
+
+struct proc_cpuinfo_notifier_args {
+	struct seq_file *m;
+	unsigned long n;
+};
+
+static inline unsigned int cpu_cluster(struct cpuinfo_loongarch *cpuinfo)
+{
+	return (cpuinfo->globalnumber & LOONGARCH_GLOBALNUMBER_CLUSTER) >>
+		LOONGARCH_GLOBALNUMBER_CLUSTER_SHF;
+}
+
+static inline unsigned int cpu_core(struct cpuinfo_loongarch *cpuinfo)
+{
+	return (cpuinfo->globalnumber & LOONGARCH_GLOBALNUMBER_CORE) >>
+		LOONGARCH_GLOBALNUMBER_CORE_SHF;
+}
+
+extern void cpu_set_cluster(struct cpuinfo_loongarch *cpuinfo, unsigned int cluster);
+extern void cpu_set_core(struct cpuinfo_loongarch *cpuinfo, unsigned int core);
+
+static inline bool cpus_are_siblings(int cpua, int cpub)
+{
+	struct cpuinfo_loongarch *infoa = &cpu_data[cpua];
+	struct cpuinfo_loongarch *infob = &cpu_data[cpub];
+	unsigned int gnuma, gnumb;
+
+	if (infoa->package != infob->package)
+		return false;
+
+	gnuma = infoa->globalnumber & ~LOONGARCH_GLOBALNUMBER_VP;
+	gnumb = infob->globalnumber & ~LOONGARCH_GLOBALNUMBER_VP;
+	if (gnuma != gnumb)
+		return false;
+
+	return true;
+}
+
+static inline unsigned long cpu_asid_mask(struct cpuinfo_loongarch *cpuinfo)
+{
+	return cpuinfo->asid_mask;
+}
+
+static inline void set_cpu_asid_mask(struct cpuinfo_loongarch *cpuinfo,
+				     unsigned long asid_mask)
+{
+	cpuinfo->asid_mask = asid_mask;
+}
+
+#endif /* __ASM_CPU_INFO_H */
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
new file mode 100644
index 000000000000..62e9cb6520a9
--- /dev/null
+++ b/arch/loongarch/include/asm/cpu.h
@@ -0,0 +1,127 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * cpu.h: Values of the PRId register used to match up
+ *	  various LoongArch cpu types.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_CPU_H
+#define _ASM_CPU_H
+
+/*
+ * As of the LoongArch specs from Loongson Technology, the PRId register
+ * (CPUCFG.00) is defined in this (backwards compatible) way:
+ *
+ * +----------------+----------------+----------------+----------------+
+ * | Reserved       | Company ID	    | Processor ID   | Revision	      |
+ * +----------------+----------------+----------------+----------------+
+ *  31		 24 23		  16 15		    8 7              0
+ *
+ */
+
+/*
+ * Assigned Company values for bits 23:16 of the PRId register.
+ */
+
+#define PRID_COMP_MASK		0xff0000
+
+#define PRID_COMP_LOONGSON	0x140000
+
+/*
+ * Assigned Processor ID (implementation) values for bits 15:8 of the PRId
+ * register.  Detecting a CPU type exactly may eventually require examining
+ * additional registers.
+ */
+
+#define PRID_IMP_MASK		0xff00
+
+#define PRID_IMP_LOONGSON_32	0x4200  /* Loongson 32bit */
+#define PRID_IMP_LOONGSON_64R	0x6100  /* Reduced Loongson 64bit */
+#define PRID_IMP_LOONGSON_64C	0x6300  /* Classic Loongson 64bit */
+#define PRID_IMP_LOONGSON_64G	0xc000  /* Generic Loongson 64bit */
+#define PRID_IMP_UNKNOWN	0xff00
+
+/*
+ * Particular Revision values for bits 7:0 of the PRId register.
+ */
+
+#define PRID_REV_MASK		0x00ff
+
+#if !defined(__ASSEMBLY__)
+
+enum cpu_type_enum {
+	CPU_UNKNOWN,
+	CPU_LOONGSON32,
+	CPU_LOONGSON64,
+	CPU_LAST
+};
+
+#endif /* !__ASSEMBLY__ */
+
+/*
+ * ISA Level encodings
+ *
+ */
+
+#define LOONGARCH_CPU_ISA_LA32R 0x00000001
+#define LOONGARCH_CPU_ISA_LA32S 0x00000002
+#define LOONGARCH_CPU_ISA_LA64  0x00000004
+
+#define LOONGARCH_CPU_ISA_32BIT (LOONGARCH_CPU_ISA_LA32R | LOONGARCH_CPU_ISA_LA32S)
+#define LOONGARCH_CPU_ISA_64BIT LOONGARCH_CPU_ISA_LA64
+
+/*
+ * CPU Option encodings
+ */
+#define CPU_FEATURE_CPUCFG		0	/* CPU has CPUCFG */
+#define CPU_FEATURE_LAM			1	/* CPU has Atomic instructions */
+#define CPU_FEATURE_UAL			2	/* CPU has Unaligned Access support */
+#define CPU_FEATURE_FPU			3	/* CPU has FPU */
+#define CPU_FEATURE_LSX			4	/* CPU has 128bit SIMD instructions */
+#define CPU_FEATURE_LASX		5	/* CPU has 256bit SIMD instructions */
+#define CPU_FEATURE_COMPLEX		6	/* CPU has Complex instructions */
+#define CPU_FEATURE_CRYPTO		7	/* CPU has Crypto instructions */
+#define CPU_FEATURE_LVZ			8	/* CPU has Virtualization extension */
+#define CPU_FEATURE_LBT_X86		9	/* CPU has X86 Binary Translation */
+#define CPU_FEATURE_LBT_ARM		10	/* CPU has ARM Binary Translation */
+#define CPU_FEATURE_LBT_MIPS		11	/* CPU has MIPS Binary Translation */
+#define CPU_FEATURE_TLB			12	/* CPU has TLB */
+#define CPU_FEATURE_CSR			13	/* CPU has CSR feature */
+#define CPU_FEATURE_WATCH		14	/* CPU has watchpoint registers */
+#define CPU_FEATURE_VINT		15	/* CPU has vectored interrupts */
+#define CPU_FEATURE_CSRIPI		16	/* CPU has CSR-IPI */
+#define CPU_FEATURE_EXTIOI		17	/* CPU has EXT-IOI */
+#define CPU_FEATURE_PREFETCH		18	/* CPU has prefetch instructions */
+#define CPU_FEATURE_PMP			19	/* CPU has performance counter */
+#define CPU_FEATURE_SCALEFREQ		20	/* CPU supports cpufreq scaling */
+#define CPU_FEATURE_FLATMODE		21	/* CPU has flatmode */
+#define CPU_FEATURE_EIODECODE		22	/* CPU has extioi int pin decode mode */
+#define CPU_FEATURE_GUESTID		23	/* CPU has GuestID feature */
+#define CPU_FEATURE_HYPERVISOR		24	/* CPU has hypervisor (run in VM) */
+
+#define LOONGARCH_CPU_CPUCFG		BIT_ULL(CPU_FEATURE_CPUCFG)
+#define LOONGARCH_CPU_LAM		BIT_ULL(CPU_FEATURE_LAM)
+#define LOONGARCH_CPU_UAL		BIT_ULL(CPU_FEATURE_UAL)
+#define LOONGARCH_CPU_FPU		BIT_ULL(CPU_FEATURE_FPU)
+#define LOONGARCH_CPU_LSX		BIT_ULL(CPU_FEATURE_LSX)
+#define LOONGARCH_CPU_LASX		BIT_ULL(CPU_FEATURE_LASX)
+#define LOONGARCH_CPU_COMPLEX		BIT_ULL(CPU_FEATURE_COMPLEX)
+#define LOONGARCH_CPU_CRYPTO		BIT_ULL(CPU_FEATURE_CRYPTO)
+#define LOONGARCH_CPU_LVZ		BIT_ULL(CPU_FEATURE_LVZ)
+#define LOONGARCH_CPU_LBT_X86		BIT_ULL(CPU_FEATURE_LBT_X86)
+#define LOONGARCH_CPU_LBT_ARM		BIT_ULL(CPU_FEATURE_LBT_ARM)
+#define LOONGARCH_CPU_LBT_MIPS		BIT_ULL(CPU_FEATURE_LBT_MIPS)
+#define LOONGARCH_CPU_TLB		BIT_ULL(CPU_FEATURE_TLB)
+#define LOONGARCH_CPU_CSR		BIT_ULL(CPU_FEATURE_CSR)
+#define LOONGARCH_CPU_WATCH		BIT_ULL(CPU_FEATURE_WATCH)
+#define LOONGARCH_CPU_VINT		BIT_ULL(CPU_FEATURE_VINT)
+#define LOONGARCH_CPU_CSRIPI		BIT_ULL(CPU_FEATURE_CSRIPI)
+#define LOONGARCH_CPU_EXTIOI		BIT_ULL(CPU_FEATURE_EXTIOI)
+#define LOONGARCH_CPU_PREFETCH		BIT_ULL(CPU_FEATURE_PREFETCH)
+#define LOONGARCH_CPU_PMP		BIT_ULL(CPU_FEATURE_PMP)
+#define LOONGARCH_CPU_SCALEFREQ		BIT_ULL(CPU_FEATURE_SCALEFREQ)
+#define LOONGARCH_CPU_FLATMODE		BIT_ULL(CPU_FEATURE_FLATMODE)
+#define LOONGARCH_CPU_EIODECODE		BIT_ULL(CPU_FEATURE_EIODECODE)
+#define LOONGARCH_CPU_GUESTID		BIT_ULL(CPU_FEATURE_GUESTID)
+#define LOONGARCH_CPU_HYPERVISOR	BIT_ULL(CPU_FEATURE_HYPERVISOR)
+#endif /* _ASM_CPU_H */
diff --git a/arch/loongarch/include/asm/fpregdef.h b/arch/loongarch/include/asm/fpregdef.h
new file mode 100644
index 000000000000..151dc9aee1c6
--- /dev/null
+++ b/arch/loongarch/include/asm/fpregdef.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Definitions for the FPU register names
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_FPREGDEF_H
+#define _ASM_FPREGDEF_H
+
+#define fv0	$f0	/* return value */
+#define fv1	$f2
+#define fa0	$f12	/* argument registers */
+#define fa1	$f13
+#define fa2	$f14
+#define fa3	$f15
+#define fa4	$f16
+#define fa5	$f17
+#define fa6	$f18
+#define fa7	$f19
+#define ft0	$f4	/* caller saved */
+#define ft1	$f5
+#define ft2	$f6
+#define ft3	$f7
+#define ft4	$f8
+#define ft5	$f9
+#define ft6	$f10
+#define ft7	$f11
+#define ft8	$f20
+#define ft9	$f21
+#define ft10	$f22
+#define ft11	$f23
+#define ft12	$f1
+#define ft13	$f3
+#define fs0	$f24	/* callee saved */
+#define fs1	$f25
+#define fs2	$f26
+#define fs3	$f27
+#define fs4	$f28
+#define fs5	$f29
+#define fs6	$f30
+#define fs7	$f31
+
+#define fcsr0	$r0
+#define fcsr1	$r1
+#define fcsr2	$r2
+#define fcsr3	$r3
+#define vcsr16	$r16
+
+#endif /* _ASM_FPREGDEF_H */
diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
new file mode 100644
index 000000000000..083e6726d4cb
--- /dev/null
+++ b/arch/loongarch/include/asm/loongarch.h
@@ -0,0 +1,1528 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_LOONGARCH_H
+#define _ASM_LOONGARCH_H
+
+#include <linux/bits.h>
+#include <linux/linkage.h>
+#include <linux/types.h>
+
+#ifndef __ASSEMBLY__
+#include <larchintrin.h>
+
+/*
+ * parse_r var, r - Helper assembler macro for parsing register names.
+ *
+ * This converts the register name in $n form provided in \r to the
+ * corresponding register number, which is assigned to the variable \var. It is
+ * needed to allow explicit encoding of instructions in inline assembly where
+ * registers are chosen by the compiler in $n form, allowing us to avoid using
+ * fixed register numbers.
+ *
+ * It also allows newer instructions (not implemented by the assembler) to be
+ * transparently implemented using assembler macros, instead of needing separate
+ * cases depending on toolchain support.
+ *
+ * Simple usage example:
+ * __asm__ __volatile__("parse_r addr, %0\n\t"
+ *			"#invtlb op, 0, %0\n\t"
+ *			".word ((0x6498000) | (addr << 10) | (0 << 5) | op)"
+ *			: "=r" (status));
+ */
+
+/* Match an individual register number and assign to \var */
+#define _IFC_REG(n)				\
+	".ifc	\\r, $r" #n "\n\t"		\
+	"\\var	= " #n "\n\t"			\
+	".endif\n\t"
+
+__asm__(".macro	parse_r var r\n\t"
+	"\\var	= -1\n\t"
+	_IFC_REG(0)  _IFC_REG(1)  _IFC_REG(2)  _IFC_REG(3)
+	_IFC_REG(4)  _IFC_REG(5)  _IFC_REG(6)  _IFC_REG(7)
+	_IFC_REG(8)  _IFC_REG(9)  _IFC_REG(10) _IFC_REG(11)
+	_IFC_REG(12) _IFC_REG(13) _IFC_REG(14) _IFC_REG(15)
+	_IFC_REG(16) _IFC_REG(17) _IFC_REG(18) _IFC_REG(19)
+	_IFC_REG(20) _IFC_REG(21) _IFC_REG(22) _IFC_REG(23)
+	_IFC_REG(24) _IFC_REG(25) _IFC_REG(26) _IFC_REG(27)
+	_IFC_REG(28) _IFC_REG(29) _IFC_REG(30) _IFC_REG(31)
+	".iflt	\\var\n\t"
+	".error	\"Unable to parse register name \\r\"\n\t"
+	".endif\n\t"
+	".endm");
+
+#undef _IFC_REG
+
+/* CPUCFG */
+static inline u32 read_cpucfg(u32 reg)
+{
+	return __cpucfg(reg);
+}
+
+#endif /* !__ASSEMBLY__ */
+
+#ifdef __ASSEMBLY__
+
+/* LoongArch Registers */
+#define REG_RA	0x1
+#define REG_TP	0x2
+#define REG_SP	0x3
+#define REG_A0	0x4
+#define REG_A1	0x5
+#define REG_A2	0x6
+#define REG_A3	0x7
+#define REG_A4	0x8
+#define REG_A5	0x9
+#define REG_A6	0xa
+#define REG_A7	0xb
+#define REG_V0	REG_A0
+#define REG_V1	REG_A1
+#define REG_T0	0xc
+#define REG_T1	0xd
+#define REG_T2	0xe
+#define REG_T3	0xf
+#define REG_T4	0x10
+#define REG_T5	0x11
+#define REG_T6	0x12
+#define REG_T7	0x13
+#define REG_T8	0x14
+#define REG_U0	0x15
+#define REG_FP	0x16
+#define REG_S0	0x17
+#define REG_S1	0x18
+#define REG_S2	0x19
+#define REG_S3	0x1a
+#define REG_S4	0x1b
+#define REG_S5	0x1c
+#define REG_S6	0x1d
+#define REG_S7	0x1e
+#define REG_S8	0x1f
+
+#endif /* __ASSEMBLY__ */
+
+/* Bit fields of the CPUCFG registers */
+#define LOONGARCH_CPUCFG0		0x0
+#define  CPUCFG0_PRID			GENMASK(31, 0)
+
+#define LOONGARCH_CPUCFG1		0x1
+#define  CPUCFG1_ISGR32			BIT(0)
+#define  CPUCFG1_ISGR64			BIT(1)
+#define  CPUCFG1_PAGING			BIT(2)
+#define  CPUCFG1_IOCSR			BIT(3)
+#define  CPUCFG1_PABITS			GENMASK(11, 4)
+#define  CPUCFG1_VABITS			GENMASK(19, 12)
+#define  CPUCFG1_UAL			BIT(20)
+#define  CPUCFG1_RI			BIT(21)
+#define  CPUCFG1_EP			BIT(22)
+#define  CPUCFG1_RPLV			BIT(23)
+#define  CPUCFG1_HUGEPG			BIT(24)
+#define  CPUCFG1_IOCSRBRD		BIT(25)
+#define  CPUCFG1_MSGINT			BIT(26)
+
+#define LOONGARCH_CPUCFG2		0x2
+#define  CPUCFG2_FP			BIT(0)
+#define  CPUCFG2_FPSP			BIT(1)
+#define  CPUCFG2_FPDP			BIT(2)
+#define  CPUCFG2_FPVERS			GENMASK(5, 3)
+#define  CPUCFG2_LSX			BIT(6)
+#define  CPUCFG2_LASX			BIT(7)
+#define  CPUCFG2_COMPLEX		BIT(8)
+#define  CPUCFG2_CRYPTO			BIT(9)
+#define  CPUCFG2_LVZP			BIT(10)
+#define  CPUCFG2_LVZVER			GENMASK(13, 11)
+#define  CPUCFG2_LLFTP			BIT(14)
+#define  CPUCFG2_LLFTPREV		GENMASK(17, 15)
+#define  CPUCFG2_X86BT			BIT(18)
+#define  CPUCFG2_ARMBT			BIT(19)
+#define  CPUCFG2_MIPSBT			BIT(20)
+#define  CPUCFG2_LSPW			BIT(21)
+#define  CPUCFG2_LAM			BIT(22)
+
+#define LOONGARCH_CPUCFG3		0x3
+#define  CPUCFG3_CCDMA			BIT(0)
+#define  CPUCFG3_SFB			BIT(1)
+#define  CPUCFG3_UCACC			BIT(2)
+#define  CPUCFG3_LLEXC			BIT(3)
+#define  CPUCFG3_SCDLY			BIT(4)
+#define  CPUCFG3_LLDBAR			BIT(5)
+#define  CPUCFG3_ITLBT			BIT(6)
+#define  CPUCFG3_ICACHET		BIT(7)
+#define  CPUCFG3_SPW_LVL		GENMASK(10, 8)
+#define  CPUCFG3_SPW_HG_HF		BIT(11)
+#define  CPUCFG3_RVA			BIT(12)
+#define  CPUCFG3_RVAMAX			GENMASK(16, 13)
+
+#define LOONGARCH_CPUCFG4		0x4
+#define  CPUCFG4_CCFREQ			GENMASK(31, 0)
+
+#define LOONGARCH_CPUCFG5		0x5
+#define  CPUCFG5_CCMUL			GENMASK(15, 0)
+#define  CPUCFG5_CCDIV			GENMASK(31, 16)
+
+#define LOONGARCH_CPUCFG6		0x6
+#define  CPUCFG6_PMP			BIT(0)
+#define  CPUCFG6_PAMVER			GENMASK(3, 1)
+#define  CPUCFG6_PMNUM			GENMASK(7, 4)
+#define  CPUCFG6_PMBITS			GENMASK(13, 8)
+#define  CPUCFG6_UPM			BIT(14)
+
+#define LOONGARCH_CPUCFG16		0x10
+#define  CPUCFG16_L1_IUPRE		BIT(0)
+#define  CPUCFG16_L1_IUUNIFY		BIT(1)
+#define  CPUCFG16_L1_DPRE		BIT(2)
+#define  CPUCFG16_L2_IUPRE		BIT(3)
+#define  CPUCFG16_L2_IUUNIFY		BIT(4)
+#define  CPUCFG16_L2_IUPRIV		BIT(5)
+#define  CPUCFG16_L2_IUINCL		BIT(6)
+#define  CPUCFG16_L2_DPRE		BIT(7)
+#define  CPUCFG16_L2_DPRIV		BIT(8)
+#define  CPUCFG16_L2_DINCL		BIT(9)
+#define  CPUCFG16_L3_IUPRE		BIT(10)
+#define  CPUCFG16_L3_IUUNIFY		BIT(11)
+#define  CPUCFG16_L3_IUPRIV		BIT(12)
+#define  CPUCFG16_L3_IUINCL		BIT(13)
+#define  CPUCFG16_L3_DPRE		BIT(14)
+#define  CPUCFG16_L3_DPRIV		BIT(15)
+#define  CPUCFG16_L3_DINCL		BIT(16)
+
+#define LOONGARCH_CPUCFG17		0x11
+#define  CPUCFG17_L1I_WAYS_M		GENMASK(15, 0)
+#define  CPUCFG17_L1I_SETS_M		GENMASK(23, 16)
+#define  CPUCFG17_L1I_SIZE_M		GENMASK(30, 24)
+#define  CPUCFG17_L1I_WAYS		0
+#define  CPUCFG17_L1I_SETS		16
+#define  CPUCFG17_L1I_SIZE		24
+
+#define LOONGARCH_CPUCFG18		0x12
+#define  CPUCFG18_L1D_WAYS_M		GENMASK(15, 0)
+#define  CPUCFG18_L1D_SETS_M		GENMASK(23, 16)
+#define  CPUCFG18_L1D_SIZE_M		GENMASK(30, 24)
+#define  CPUCFG18_L1D_WAYS		0
+#define  CPUCFG18_L1D_SETS		16
+#define  CPUCFG18_L1D_SIZE		24
+
+#define LOONGARCH_CPUCFG19		0x13
+#define  CPUCFG19_L2_WAYS_M		GENMASK(15, 0)
+#define  CPUCFG19_L2_SETS_M		GENMASK(23, 16)
+#define  CPUCFG19_L2_SIZE_M		GENMASK(30, 24)
+#define  CPUCFG19_L2_WAYS		0
+#define  CPUCFG19_L2_SETS		16
+#define  CPUCFG19_L2_SIZE		24
+
+#define LOONGARCH_CPUCFG20		0x14
+#define  CPUCFG20_L3_WAYS_M		GENMASK(15, 0)
+#define  CPUCFG20_L3_SETS_M		GENMASK(23, 16)
+#define  CPUCFG20_L3_SIZE_M		GENMASK(30, 24)
+#define  CPUCFG20_L3_WAYS		0
+#define  CPUCFG20_L3_SETS		16
+#define  CPUCFG20_L3_SIZE		24
+
+#define LOONGARCH_CPUCFG48		0x30
+#define  CPUCFG48_MCSR_LCK		BIT(0)
+#define  CPUCFG48_NAP_EN		BIT(1)
+#define  CPUCFG48_VFPU_CG		BIT(2)
+#define  CPUCFG48_RAM_CG		BIT(3)
+
+#ifndef __ASSEMBLY__
+
+/* CSR */
+static __always_inline u32 csr_readl(u32 reg)
+{
+	return __csrrd_w(reg);
+}
+
+static __always_inline u64 csr_readq(u32 reg)
+{
+	return __csrrd_d(reg);
+}
+
+static __always_inline void csr_writel(u32 val, u32 reg)
+{
+	__csrwr_w(val, reg);
+}
+
+static __always_inline void csr_writeq(u64 val, u32 reg)
+{
+	__csrwr_d(val, reg);
+}
+
+static __always_inline u32 csr_xchgl(u32 val, u32 mask, u32 reg)
+{
+	return __csrxchg_w(val, mask, reg);
+}
+
+static __always_inline u64 csr_xchgq(u64 val, u64 mask, u32 reg)
+{
+	return __csrxchg_d(val, mask, reg);
+}
+
+/* IOCSR */
+static __always_inline u32 iocsr_readl(u32 reg)
+{
+	return __iocsrrd_w(reg);
+}
+
+static __always_inline u64 iocsr_readq(u32 reg)
+{
+	return __iocsrrd_d(reg);
+}
+
+static __always_inline void iocsr_writel(u32 val, u32 reg)
+{
+	__iocsrwr_w(val, reg);
+}
+
+static __always_inline void iocsr_writeq(u64 val, u32 reg)
+{
+	__iocsrwr_d(val, reg);
+}
+
+#endif /* !__ASSEMBLY__ */
+
+/* CSR register number */
+
+/* Basic CSR registers */
+#define LOONGARCH_CSR_CRMD		0x0	/* Current mode info */
+#define  CSR_CRMD_WE_SHIFT		9
+#define  CSR_CRMD_WE			(_ULCAST_(0x1) << CSR_CRMD_WE_SHIFT)
+#define  CSR_CRMD_DACM_SHIFT		7
+#define  CSR_CRMD_DACM_WIDTH		2
+#define  CSR_CRMD_DACM			(_ULCAST_(0x3) << CSR_CRMD_DACM_SHIFT)
+#define  CSR_CRMD_DACF_SHIFT		5
+#define  CSR_CRMD_DACF_WIDTH		2
+#define  CSR_CRMD_DACF			(_ULCAST_(0x3) << CSR_CRMD_DACF_SHIFT)
+#define  CSR_CRMD_PG_SHIFT		4
+#define  CSR_CRMD_PG			(_ULCAST_(0x1) << CSR_CRMD_PG_SHIFT)
+#define  CSR_CRMD_DA_SHIFT		3
+#define  CSR_CRMD_DA			(_ULCAST_(0x1) << CSR_CRMD_DA_SHIFT)
+#define  CSR_CRMD_IE_SHIFT		2
+#define  CSR_CRMD_IE			(_ULCAST_(0x1) << CSR_CRMD_IE_SHIFT)
+#define  CSR_CRMD_PLV_SHIFT		0
+#define  CSR_CRMD_PLV_WIDTH		2
+#define  CSR_CRMD_PLV			(_ULCAST_(0x3) << CSR_CRMD_PLV_SHIFT)
+
+#define PLV_KERN			0
+#define PLV_USER			3
+#define PLV_MASK			0x3
+
+#define LOONGARCH_CSR_PRMD		0x1	/* Prev-exception mode info */
+#define  CSR_PRMD_PWE_SHIFT		3
+#define  CSR_PRMD_PWE			(_ULCAST_(0x1) << CSR_PRMD_PWE_SHIFT)
+#define  CSR_PRMD_PIE_SHIFT		2
+#define  CSR_PRMD_PIE			(_ULCAST_(0x1) << CSR_PRMD_PIE_SHIFT)
+#define  CSR_PRMD_PPLV_SHIFT		0
+#define  CSR_PRMD_PPLV_WIDTH		2
+#define  CSR_PRMD_PPLV			(_ULCAST_(0x3) << CSR_PRMD_PPLV_SHIFT)
+
+#define LOONGARCH_CSR_EUEN		0x2	/* Extended unit enable */
+#define  CSR_EUEN_LBTEN_SHIFT		3
+#define  CSR_EUEN_LBTEN			(_ULCAST_(0x1) << CSR_EUEN_LBTEN_SHIFT)
+#define  CSR_EUEN_LASXEN_SHIFT		2
+#define  CSR_EUEN_LASXEN		(_ULCAST_(0x1) << CSR_EUEN_LASXEN_SHIFT)
+#define  CSR_EUEN_LSXEN_SHIFT		1
+#define  CSR_EUEN_LSXEN			(_ULCAST_(0x1) << CSR_EUEN_LSXEN_SHIFT)
+#define  CSR_EUEN_FPEN_SHIFT		0
+#define  CSR_EUEN_FPEN			(_ULCAST_(0x1) << CSR_EUEN_FPEN_SHIFT)
+
+#define LOONGARCH_CSR_MISC		0x3	/* Misc config */
+
+#define LOONGARCH_CSR_ECFG		0x4	/* Exception config */
+#define  CSR_ECFG_VS_SHIFT		16
+#define  CSR_ECFG_VS_WIDTH		3
+#define  CSR_ECFG_VS			(_ULCAST_(0x7) << CSR_ECFG_VS_SHIFT)
+#define  CSR_ECFG_IM_SHIFT		0
+#define  CSR_ECFG_IM_WIDTH		13
+#define  CSR_ECFG_IM			(_ULCAST_(0x1fff) << CSR_ECFG_IM_SHIFT)
+
+#define LOONGARCH_CSR_ESTAT		0x5	/* Exception status */
+#define  CSR_ESTAT_ESUBCODE_SHIFT	22
+#define  CSR_ESTAT_ESUBCODE_WIDTH	9
+#define  CSR_ESTAT_ESUBCODE		(_ULCAST_(0x1ff) << CSR_ESTAT_ESUBCODE_SHIFT)
+#define  CSR_ESTAT_EXC_SHIFT		16
+#define  CSR_ESTAT_EXC_WIDTH		6
+#define  CSR_ESTAT_EXC			(_ULCAST_(0x3f) << CSR_ESTAT_EXC_SHIFT)
+#define  CSR_ESTAT_IS_SHIFT		0
+#define  CSR_ESTAT_IS_WIDTH		15
+#define  CSR_ESTAT_IS			(_ULCAST_(0x7fff) << CSR_ESTAT_IS_SHIFT)
+
+#define LOONGARCH_CSR_ERA		0x6	/* Exception return address */
+
+#define LOONGARCH_CSR_BADV		0x7	/* Bad virtual address */
+
+#define LOONGARCH_CSR_BADI		0x8	/* Bad instruction */
+
+#define LOONGARCH_CSR_EENTRY		0xc	/* Exception entry */
+
+/* TLB related CSR registers */
+#define LOONGARCH_CSR_TLBIDX		0x10	/* TLB Index, EHINV, PageSize, NP */
+#define  CSR_TLBIDX_EHINV_SHIFT		31
+#define  CSR_TLBIDX_EHINV		(_ULCAST_(1) << CSR_TLBIDX_EHINV_SHIFT)
+#define  CSR_TLBIDX_PS_SHIFT		24
+#define  CSR_TLBIDX_PS_WIDTH		6
+#define  CSR_TLBIDX_PS			(_ULCAST_(0x3f) << CSR_TLBIDX_PS_SHIFT)
+#define  CSR_TLBIDX_IDX_SHIFT		0
+#define  CSR_TLBIDX_IDX_WIDTH		12
+#define  CSR_TLBIDX_IDX			(_ULCAST_(0xfff) << CSR_TLBIDX_IDX_SHIFT)
+#define  CSR_TLBIDX_SIZEM		0x3f000000
+#define  CSR_TLBIDX_SIZE		CSR_TLBIDX_PS_SHIFT
+#define  CSR_TLBIDX_IDXM		0xfff
+#define  CSR_INVALID_ENTRY(e)		(CSR_TLBIDX_EHINV | (e))
+
+#define LOONGARCH_CSR_TLBEHI		0x11	/* TLB EntryHi */
+
+#define LOONGARCH_CSR_TLBELO0		0x12	/* TLB EntryLo0 */
+#define  CSR_TLBLO0_RPLV_SHIFT		63
+#define  CSR_TLBLO0_RPLV		(_ULCAST_(0x1) << CSR_TLBLO0_RPLV_SHIFT)
+#define  CSR_TLBLO0_NX_SHIFT		62
+#define  CSR_TLBLO0_NX			(_ULCAST_(0x1) << CSR_TLBLO0_NX_SHIFT)
+#define  CSR_TLBLO0_NR_SHIFT		61
+#define  CSR_TLBLO0_NR			(_ULCAST_(0x1) << CSR_TLBLO0_NR_SHIFT)
+#define  CSR_TLBLO0_PFN_SHIFT		12
+#define  CSR_TLBLO0_PFN_WIDTH		36
+#define  CSR_TLBLO0_PFN			(_ULCAST_(0xfffffffff) << CSR_TLBLO0_PFN_SHIFT)
+#define  CSR_TLBLO0_GLOBAL_SHIFT	6
+#define  CSR_TLBLO0_GLOBAL		(_ULCAST_(0x1) << CSR_TLBLO0_GLOBAL_SHIFT)
+#define  CSR_TLBLO0_CCA_SHIFT		4
+#define  CSR_TLBLO0_CCA_WIDTH		2
+#define  CSR_TLBLO0_CCA			(_ULCAST_(0x3) << CSR_TLBLO0_CCA_SHIFT)
+#define  CSR_TLBLO0_PLV_SHIFT		2
+#define  CSR_TLBLO0_PLV_WIDTH		2
+#define  CSR_TLBLO0_PLV			(_ULCAST_(0x3) << CSR_TLBLO0_PLV_SHIFT)
+#define  CSR_TLBLO0_WE_SHIFT		1
+#define  CSR_TLBLO0_WE			(_ULCAST_(0x1) << CSR_TLBLO0_WE_SHIFT)
+#define  CSR_TLBLO0_V_SHIFT		0
+#define  CSR_TLBLO0_V			(_ULCAST_(0x1) << CSR_TLBLO0_V_SHIFT)
+
+#define LOONGARCH_CSR_TLBELO1		0x13	/* TLB EntryLo1 */
+#define  CSR_TLBLO1_RPLV_SHIFT		63
+#define  CSR_TLBLO1_RPLV		(_ULCAST_(0x1) << CSR_TLBLO1_RPLV_SHIFT)
+#define  CSR_TLBLO1_NX_SHIFT		62
+#define  CSR_TLBLO1_NX			(_ULCAST_(0x1) << CSR_TLBLO1_NX_SHIFT)
+#define  CSR_TLBLO1_NR_SHIFT		61
+#define  CSR_TLBLO1_NR			(_ULCAST_(0x1) << CSR_TLBLO1_NR_SHIFT)
+#define  CSR_TLBLO1_PFN_SHIFT		12
+#define  CSR_TLBLO1_PFN_WIDTH		36
+#define  CSR_TLBLO1_PFN			(_ULCAST_(0xfffffffff) << CSR_TLBLO1_PFN_SHIFT)
+#define  CSR_TLBLO1_GLOBAL_SHIFT	6
+#define  CSR_TLBLO1_GLOBAL		(_ULCAST_(0x1) << CSR_TLBLO1_GLOBAL_SHIFT)
+#define  CSR_TLBLO1_CCA_SHIFT		4
+#define  CSR_TLBLO1_CCA_WIDTH		2
+#define  CSR_TLBLO1_CCA			(_ULCAST_(0x3) << CSR_TLBLO1_CCA_SHIFT)
+#define  CSR_TLBLO1_PLV_SHIFT		2
+#define  CSR_TLBLO1_PLV_WIDTH		2
+#define  CSR_TLBLO1_PLV			(_ULCAST_(0x3) << CSR_TLBLO1_PLV_SHIFT)
+#define  CSR_TLBLO1_WE_SHIFT		1
+#define  CSR_TLBLO1_WE			(_ULCAST_(0x1) << CSR_TLBLO1_WE_SHIFT)
+#define  CSR_TLBLO1_V_SHIFT		0
+#define  CSR_TLBLO1_V			(_ULCAST_(0x1) << CSR_TLBLO1_V_SHIFT)
+
+#define LOONGARCH_CSR_GTLBC		0x15	/* Guest TLB control */
+#define  CSR_GTLBC_RID_SHIFT		16
+#define  CSR_GTLBC_RID_WIDTH		8
+#define  CSR_GTLBC_RID			(_ULCAST_(0xff) << CSR_GTLBC_RID_SHIFT)
+#define  CSR_GTLBC_TOTI_SHIFT		13
+#define  CSR_GTLBC_TOTI			(_ULCAST_(0x1) << CSR_GTLBC_TOTI_SHIFT)
+#define  CSR_GTLBC_USERID_SHIFT		12
+#define  CSR_GTLBC_USERID		(_ULCAST_(0x1) << CSR_GTLBC_USERID_SHIFT)
+#define  CSR_GTLBC_GMTLBSZ_SHIFT	0
+#define  CSR_GTLBC_GMTLBSZ_WIDTH	6
+#define  CSR_GTLBC_GMTLBSZ		(_ULCAST_(0x3f) << CSR_GTLBC_GMTLBSZ_SHIFT)
+
+#define LOONGARCH_CSR_TRGP		0x16	/* TLBR read guest info */
+#define  CSR_TRGP_RID_SHIFT		16
+#define  CSR_TRGP_RID_WIDTH		8
+#define  CSR_TRGP_RID			(_ULCAST_(0xff) << CSR_TRGP_RID_SHIFT)
+#define  CSR_TRGP_GTLB_SHIFT		0
+#define  CSR_TRGP_GTLB			(1 << CSR_TRGP_GTLB_SHIFT)
+
+#define LOONGARCH_CSR_ASID		0x18	/* ASID */
+#define  CSR_ASID_BIT_SHIFT		16	/* ASIDBits */
+#define  CSR_ASID_BIT_WIDTH		8
+#define  CSR_ASID_BIT			(_ULCAST_(0xff) << CSR_ASID_BIT_SHIFT)
+#define  CSR_ASID_ASID_SHIFT		0
+#define  CSR_ASID_ASID_WIDTH		10
+#define  CSR_ASID_ASID			(_ULCAST_(0x3ff) << CSR_ASID_ASID_SHIFT)
+
+#define LOONGARCH_CSR_PGDL		0x19	/* Page table base address when VA[47] = 0 */
+
+#define LOONGARCH_CSR_PGDH		0x1a	/* Page table base address when VA[47] = 1 */
+
+#define LOONGARCH_CSR_PGD		0x1b	/* Page table base */
+
+#define LOONGARCH_CSR_PWCTL0		0x1c	/* PWCtl0 */
+#define  CSR_PWCTL0_PTEW_SHIFT		30
+#define  CSR_PWCTL0_PTEW_WIDTH		2
+#define  CSR_PWCTL0_PTEW		(_ULCAST_(0x3) << CSR_PWCTL0_PTEW_SHIFT)
+#define  CSR_PWCTL0_DIR1WIDTH_SHIFT	25
+#define  CSR_PWCTL0_DIR1WIDTH_WIDTH	5
+#define  CSR_PWCTL0_DIR1WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR1WIDTH_SHIFT)
+#define  CSR_PWCTL0_DIR1BASE_SHIFT	20
+#define  CSR_PWCTL0_DIR1BASE_WIDTH	5
+#define  CSR_PWCTL0_DIR1BASE		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR1BASE_SHIFT)
+#define  CSR_PWCTL0_DIR0WIDTH_SHIFT	15
+#define  CSR_PWCTL0_DIR0WIDTH_WIDTH	5
+#define  CSR_PWCTL0_DIR0WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR0WIDTH_SHIFT)
+#define  CSR_PWCTL0_DIR0BASE_SHIFT	10
+#define  CSR_PWCTL0_DIR0BASE_WIDTH	5
+#define  CSR_PWCTL0_DIR0BASE		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR0BASE_SHIFT)
+#define  CSR_PWCTL0_PTWIDTH_SHIFT	5
+#define  CSR_PWCTL0_PTWIDTH_WIDTH	5
+#define  CSR_PWCTL0_PTWIDTH		(_ULCAST_(0x1f) << CSR_PWCTL0_PTWIDTH_SHIFT)
+#define  CSR_PWCTL0_PTBASE_SHIFT	0
+#define  CSR_PWCTL0_PTBASE_WIDTH	5
+#define  CSR_PWCTL0_PTBASE		(_ULCAST_(0x1f) << CSR_PWCTL0_PTBASE_SHIFT)
+
+#define LOONGARCH_CSR_PWCTL1		0x1d	/* PWCtl1 */
+#define  CSR_PWCTL1_DIR3WIDTH_SHIFT	18
+#define  CSR_PWCTL1_DIR3WIDTH_WIDTH	5
+#define  CSR_PWCTL1_DIR3WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR3WIDTH_SHIFT)
+#define  CSR_PWCTL1_DIR3BASE_SHIFT	12
+#define  CSR_PWCTL1_DIR3BASE_WIDTH	5
+#define  CSR_PWCTL1_DIR3BASE		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR3BASE_SHIFT)
+#define  CSR_PWCTL1_DIR2WIDTH_SHIFT	6
+#define  CSR_PWCTL1_DIR2WIDTH_WIDTH	5
+#define  CSR_PWCTL1_DIR2WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR2WIDTH_SHIFT)
+#define  CSR_PWCTL1_DIR2BASE_SHIFT	0
+#define  CSR_PWCTL1_DIR2BASE_WIDTH	5
+#define  CSR_PWCTL1_DIR2BASE		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR2BASE_SHIFT)
+
+#define LOONGARCH_CSR_STLBPGSIZE	0x1e
+#define  CSR_STLBPGSIZE_PS_WIDTH	6
+#define  CSR_STLBPGSIZE_PS		(_ULCAST_(0x3f))
+
+#define LOONGARCH_CSR_RVACFG		0x1f
+#define  CSR_RVACFG_RDVA_WIDTH		4
+#define  CSR_RVACFG_RDVA		(_ULCAST_(0xf))
+
+/* Config CSR registers */
+#define LOONGARCH_CSR_CPUID		0x20	/* CPU core id */
+#define  CSR_CPUID_COREID_WIDTH		9
+#define  CSR_CPUID_COREID		_ULCAST_(0x1ff)
+
+#define LOONGARCH_CSR_PRCFG1		0x21	/* Config1 */
+#define  CSR_CONF1_VSMAX_SHIFT		12
+#define  CSR_CONF1_VSMAX_WIDTH		3
+#define  CSR_CONF1_VSMAX		(_ULCAST_(7) << CSR_CONF1_VSMAX_SHIFT)
+#define  CSR_CONF1_TMRBITS_SHIFT	4
+#define  CSR_CONF1_TMRBITS_WIDTH	8
+#define  CSR_CONF1_TMRBITS		(_ULCAST_(0xff) << CSR_CONF1_TMRBITS_SHIFT)
+#define  CSR_CONF1_KSNUM_WIDTH		4
+#define  CSR_CONF1_KSNUM		_ULCAST_(0xf)
+
+#define LOONGARCH_CSR_PRCFG2		0x22	/* Config2 */
+#define  CSR_CONF2_PGMASK_SUPP		0x3ffff000
+
+#define LOONGARCH_CSR_PRCFG3		0x23	/* Config3 */
+#define  CSR_CONF3_STLBIDX_SHIFT	20
+#define  CSR_CONF3_STLBIDX_WIDTH	6
+#define  CSR_CONF3_STLBIDX		(_ULCAST_(0x3f) << CSR_CONF3_STLBIDX_SHIFT)
+#define  CSR_CONF3_STLBWAYS_SHIFT	12
+#define  CSR_CONF3_STLBWAYS_WIDTH	8
+#define  CSR_CONF3_STLBWAYS		(_ULCAST_(0xff) << CSR_CONF3_STLBWAYS_SHIFT)
+#define  CSR_CONF3_MTLBSIZE_SHIFT	4
+#define  CSR_CONF3_MTLBSIZE_WIDTH	8
+#define  CSR_CONF3_MTLBSIZE		(_ULCAST_(0xff) << CSR_CONF3_MTLBSIZE_SHIFT)
+#define  CSR_CONF3_TLBTYPE_SHIFT	0
+#define  CSR_CONF3_TLBTYPE_WIDTH	4
+#define  CSR_CONF3_TLBTYPE		(_ULCAST_(0xf) << CSR_CONF3_TLBTYPE_SHIFT)
+
+/* Kscratch registers */
+#define LOONGARCH_CSR_KS0		0x30
+#define LOONGARCH_CSR_KS1		0x31
+#define LOONGARCH_CSR_KS2		0x32
+#define LOONGARCH_CSR_KS3		0x33
+#define LOONGARCH_CSR_KS4		0x34
+#define LOONGARCH_CSR_KS5		0x35
+#define LOONGARCH_CSR_KS6		0x36
+#define LOONGARCH_CSR_KS7		0x37
+#define LOONGARCH_CSR_KS8		0x38
+
+/* KS0, KS1 and KS2 are statically allocated for exception handling */
+#define EXCEPTION_KS0			LOONGARCH_CSR_KS0
+#define EXCEPTION_KS1			LOONGARCH_CSR_KS1
+#define EXCEPTION_KS2			LOONGARCH_CSR_KS2
+#define EXC_KSCRATCH_MASK		(1 << 0 | 1 << 1 | 1 << 2)
+
+/* KS3 is statically allocated for the per-CPU data base */
+#define PERCPU_BASE_KS			LOONGARCH_CSR_KS3
+#define PERCPU_KSCRATCH_MASK		(1 << 3)
+
+/* KS4 and KS5 are statically allocated for KVM */
+#define KVM_VCPU_KS			LOONGARCH_CSR_KS4
+#define KVM_TEMP_KS			LOONGARCH_CSR_KS5
+#define KVM_KSCRATCH_MASK		(1 << 4 | 1 << 5)
+
+/* Timer registers */
+#define LOONGARCH_CSR_TMID		0x40	/* Timer ID */
+
+#define LOONGARCH_CSR_TCFG		0x41	/* Timer config */
+#define  CSR_TCFG_VAL_SHIFT		2
+#define  CSR_TCFG_VAL_WIDTH		48
+#define  CSR_TCFG_VAL			(_ULCAST_(0x3fffffffffff) << CSR_TCFG_VAL_SHIFT)
+#define  CSR_TCFG_PERIOD_SHIFT		1
+#define  CSR_TCFG_PERIOD		(_ULCAST_(0x1) << CSR_TCFG_PERIOD_SHIFT)
+#define  CSR_TCFG_EN			(_ULCAST_(0x1))
+
+#define LOONGARCH_CSR_TVAL		0x42	/* Timer value */
+
+#define LOONGARCH_CSR_CNTC		0x43	/* Timer offset */
+
+#define LOONGARCH_CSR_TINTCLR		0x44	/* Timer interrupt clear */
+#define  CSR_TINTCLR_TI_SHIFT		0
+#define  CSR_TINTCLR_TI			(1 << CSR_TINTCLR_TI_SHIFT)
+
+/* Guest registers */
+#define LOONGARCH_CSR_GSTAT		0x50	/* Guest status */
+#define  CSR_GSTAT_GID_SHIFT		16
+#define  CSR_GSTAT_GID_WIDTH		8
+#define  CSR_GSTAT_GID			(_ULCAST_(0xff) << CSR_GSTAT_GID_SHIFT)
+#define  CSR_GSTAT_GIDBIT_SHIFT		4
+#define  CSR_GSTAT_GIDBIT_WIDTH		6
+#define  CSR_GSTAT_GIDBIT		(_ULCAST_(0x3f) << CSR_GSTAT_GIDBIT_SHIFT)
+#define  CSR_GSTAT_PVM_SHIFT		1
+#define  CSR_GSTAT_PVM			(_ULCAST_(0x1) << CSR_GSTAT_PVM_SHIFT)
+#define  CSR_GSTAT_VM_SHIFT		0
+#define  CSR_GSTAT_VM			(_ULCAST_(0x1) << CSR_GSTAT_VM_SHIFT)
+
+#define LOONGARCH_CSR_GCFG		0x51	/* Guest config */
+#define  CSR_GCFG_GPERF_SHIFT		24
+#define  CSR_GCFG_GPERF_WIDTH		3
+#define  CSR_GCFG_GPERF			(_ULCAST_(0x7) << CSR_GCFG_GPERF_SHIFT)
+#define  CSR_GCFG_GCI_SHIFT		20
+#define  CSR_GCFG_GCI_WIDTH		2
+#define  CSR_GCFG_GCI			(_ULCAST_(0x3) << CSR_GCFG_GCI_SHIFT)
+#define  CSR_GCFG_GCI_ALL		(_ULCAST_(0x0) << CSR_GCFG_GCI_SHIFT)
+#define  CSR_GCFG_GCI_HIT		(_ULCAST_(0x1) << CSR_GCFG_GCI_SHIFT)
+#define  CSR_GCFG_GCI_SECURE		(_ULCAST_(0x2) << CSR_GCFG_GCI_SHIFT)
+#define  CSR_GCFG_GCIP_SHIFT		16
+#define  CSR_GCFG_GCIP			(_ULCAST_(0xf) << CSR_GCFG_GCIP_SHIFT)
+#define  CSR_GCFG_GCIP_ALL		(_ULCAST_(0x1) << CSR_GCFG_GCIP_SHIFT)
+#define  CSR_GCFG_GCIP_HIT		(_ULCAST_(0x1) << (CSR_GCFG_GCIP_SHIFT + 1))
+#define  CSR_GCFG_GCIP_SECURE		(_ULCAST_(0x1) << (CSR_GCFG_GCIP_SHIFT + 2))
+#define  CSR_GCFG_TORU_SHIFT		15
+#define  CSR_GCFG_TORU			(_ULCAST_(0x1) << CSR_GCFG_TORU_SHIFT)
+#define  CSR_GCFG_TORUP_SHIFT		14
+#define  CSR_GCFG_TORUP			(_ULCAST_(0x1) << CSR_GCFG_TORUP_SHIFT)
+#define  CSR_GCFG_TOP_SHIFT		13
+#define  CSR_GCFG_TOP			(_ULCAST_(0x1) << CSR_GCFG_TOP_SHIFT)
+#define  CSR_GCFG_TOPP_SHIFT		12
+#define  CSR_GCFG_TOPP			(_ULCAST_(0x1) << CSR_GCFG_TOPP_SHIFT)
+#define  CSR_GCFG_TOE_SHIFT		11
+#define  CSR_GCFG_TOE			(_ULCAST_(0x1) << CSR_GCFG_TOE_SHIFT)
+#define  CSR_GCFG_TOEP_SHIFT		10
+#define  CSR_GCFG_TOEP			(_ULCAST_(0x1) << CSR_GCFG_TOEP_SHIFT)
+#define  CSR_GCFG_TIT_SHIFT		9
+#define  CSR_GCFG_TIT			(_ULCAST_(0x1) << CSR_GCFG_TIT_SHIFT)
+#define  CSR_GCFG_TITP_SHIFT		8
+#define  CSR_GCFG_TITP			(_ULCAST_(0x1) << CSR_GCFG_TITP_SHIFT)
+#define  CSR_GCFG_SIT_SHIFT		7
+#define  CSR_GCFG_SIT			(_ULCAST_(0x1) << CSR_GCFG_SIT_SHIFT)
+#define  CSR_GCFG_SITP_SHIFT		6
+#define  CSR_GCFG_SITP			(_ULCAST_(0x1) << CSR_GCFG_SITP_SHIFT)
+#define  CSR_GCFG_MATC_SHIFT		4
+#define  CSR_GCFG_MATC_WIDTH		2
+#define  CSR_GCFG_MATC_MASK		(_ULCAST_(0x3) << CSR_GCFG_MATC_SHIFT)
+#define  CSR_GCFG_MATC_GUEST		(_ULCAST_(0x0) << CSR_GCFG_MATC_SHIFT)
+#define  CSR_GCFG_MATC_ROOT		(_ULCAST_(0x1) << CSR_GCFG_MATC_SHIFT)
+#define  CSR_GCFG_MATC_NEST		(_ULCAST_(0x2) << CSR_GCFG_MATC_SHIFT)
+
+#define LOONGARCH_CSR_GINTC		0x52	/* Guest interrupt control */
+#define  CSR_GINTC_HC_SHIFT		16
+#define  CSR_GINTC_HC_WIDTH		8
+#define  CSR_GINTC_HC			(_ULCAST_(0xff) << CSR_GINTC_HC_SHIFT)
+#define  CSR_GINTC_PIP_SHIFT		8
+#define  CSR_GINTC_PIP_WIDTH		8
+#define  CSR_GINTC_PIP			(_ULCAST_(0xff) << CSR_GINTC_PIP_SHIFT)
+#define  CSR_GINTC_VIP_SHIFT		0
+#define  CSR_GINTC_VIP_WIDTH		8
+#define  CSR_GINTC_VIP			(_ULCAST_(0xff))
+
+#define LOONGARCH_CSR_GCNTC		0x53	/* Guest timer offset */
+
+/* LLBCTL register */
+#define LOONGARCH_CSR_LLBCTL		0x60	/* LLBit control */
+#define  CSR_LLBCTL_ROLLB_SHIFT		0
+#define  CSR_LLBCTL_ROLLB		(_ULCAST_(1) << CSR_LLBCTL_ROLLB_SHIFT)
+#define  CSR_LLBCTL_WCLLB_SHIFT		1
+#define  CSR_LLBCTL_WCLLB		(_ULCAST_(1) << CSR_LLBCTL_WCLLB_SHIFT)
+#define  CSR_LLBCTL_KLO_SHIFT		2
+#define  CSR_LLBCTL_KLO			(_ULCAST_(1) << CSR_LLBCTL_KLO_SHIFT)
+
+/* Implementation dependent */
+#define LOONGARCH_CSR_IMPCTL1		0x80	/* Loongson config1 */
+#define  CSR_MISPEC_SHIFT		20
+#define  CSR_MISPEC_WIDTH		8
+#define  CSR_MISPEC			(_ULCAST_(0xff) << CSR_MISPEC_SHIFT)
+#define  CSR_SSEN_SHIFT			18
+#define  CSR_SSEN			(_ULCAST_(1) << CSR_SSEN_SHIFT)
+#define  CSR_SCRAND_SHIFT		17
+#define  CSR_SCRAND			(_ULCAST_(1) << CSR_SCRAND_SHIFT)
+#define  CSR_LLEXCL_SHIFT		16
+#define  CSR_LLEXCL			(_ULCAST_(1) << CSR_LLEXCL_SHIFT)
+#define  CSR_DISVC_SHIFT		15
+#define  CSR_DISVC			(_ULCAST_(1) << CSR_DISVC_SHIFT)
+#define  CSR_VCLRU_SHIFT		14
+#define  CSR_VCLRU			(_ULCAST_(1) << CSR_VCLRU_SHIFT)
+#define  CSR_DCLRU_SHIFT		13
+#define  CSR_DCLRU			(_ULCAST_(1) << CSR_DCLRU_SHIFT)
+#define  CSR_FASTLDQ_SHIFT		12
+#define  CSR_FASTLDQ			(_ULCAST_(1) << CSR_FASTLDQ_SHIFT)
+#define  CSR_USERCAC_SHIFT		11
+#define  CSR_USERCAC			(_ULCAST_(1) << CSR_USERCAC_SHIFT)
+#define  CSR_ANTI_MISPEC_SHIFT		10
+#define  CSR_ANTI_MISPEC		(_ULCAST_(1) << CSR_ANTI_MISPEC_SHIFT)
+#define  CSR_AUTO_FLUSHSFB_SHIFT	9
+#define  CSR_AUTO_FLUSHSFB		(_ULCAST_(1) << CSR_AUTO_FLUSHSFB_SHIFT)
+#define  CSR_STFILL_SHIFT		8
+#define  CSR_STFILL			(_ULCAST_(1) << CSR_STFILL_SHIFT)
+#define  CSR_LIFEP_SHIFT		7
+#define  CSR_LIFEP			(_ULCAST_(1) << CSR_LIFEP_SHIFT)
+#define  CSR_LLSYNC_SHIFT		6
+#define  CSR_LLSYNC			(_ULCAST_(1) << CSR_LLSYNC_SHIFT)
+#define  CSR_BRBTDIS_SHIFT		5
+#define  CSR_BRBTDIS			(_ULCAST_(1) << CSR_BRBTDIS_SHIFT)
+#define  CSR_RASDIS_SHIFT		4
+#define  CSR_RASDIS			(_ULCAST_(1) << CSR_RASDIS_SHIFT)
+#define  CSR_STPRE_SHIFT		2
+#define  CSR_STPRE_WIDTH		2
+#define  CSR_STPRE			(_ULCAST_(3) << CSR_STPRE_SHIFT)
+#define  CSR_INSTPRE_SHIFT		1
+#define  CSR_INSTPRE			(_ULCAST_(1) << CSR_INSTPRE_SHIFT)
+#define  CSR_DATAPRE_SHIFT		0
+#define  CSR_DATAPRE			(_ULCAST_(1) << CSR_DATAPRE_SHIFT)
+
+#define LOONGARCH_CSR_IMPCTL2		0x81	/* Loongson config2 */
+#define  CSR_FLUSH_MTLB_SHIFT		0
+#define  CSR_FLUSH_MTLB			(_ULCAST_(1) << CSR_FLUSH_MTLB_SHIFT)
+#define  CSR_FLUSH_STLB_SHIFT		1
+#define  CSR_FLUSH_STLB			(_ULCAST_(1) << CSR_FLUSH_STLB_SHIFT)
+#define  CSR_FLUSH_DTLB_SHIFT		2
+#define  CSR_FLUSH_DTLB			(_ULCAST_(1) << CSR_FLUSH_DTLB_SHIFT)
+#define  CSR_FLUSH_ITLB_SHIFT		3
+#define  CSR_FLUSH_ITLB			(_ULCAST_(1) << CSR_FLUSH_ITLB_SHIFT)
+#define  CSR_FLUSH_BTAC_SHIFT		4
+#define  CSR_FLUSH_BTAC			(_ULCAST_(1) << CSR_FLUSH_BTAC_SHIFT)
+
+#define LOONGARCH_CSR_GNMI		0x82
+
+/* TLB Refill registers */
+#define LOONGARCH_CSR_TLBRENTRY		0x88	/* TLB refill exception entry */
+#define LOONGARCH_CSR_TLBRBADV		0x89	/* TLB refill badvaddr */
+#define LOONGARCH_CSR_TLBRERA		0x8a	/* TLB refill ERA */
+#define LOONGARCH_CSR_TLBRSAVE		0x8b	/* KScratch for TLB refill exception */
+#define LOONGARCH_CSR_TLBRELO0		0x8c	/* TLB refill entrylo0 */
+#define LOONGARCH_CSR_TLBRELO1		0x8d	/* TLB refill entrylo1 */
+#define LOONGARCH_CSR_TLBREHI		0x8e	/* TLB refill entryhi */
+#define  CSR_TLBREHI_PS_SHIFT		0
+#define  CSR_TLBREHI_PS			(_ULCAST_(0x3f) << CSR_TLBREHI_PS_SHIFT)
+#define LOONGARCH_CSR_TLBRPRMD		0x8f	/* TLB refill mode info */
+
+/* Machine Error registers */
+#define LOONGARCH_CSR_MERRCTL		0x90	/* Machine error control */
+#define LOONGARCH_CSR_MERRINFO1		0x91	/* MError info1 */
+#define LOONGARCH_CSR_MERRINFO2		0x92	/* MError info2 */
+#define LOONGARCH_CSR_MERRENTRY		0x93	/* MError exception entry */
+#define LOONGARCH_CSR_MERRERA		0x94	/* MError exception ERA */
+#define LOONGARCH_CSR_MERRSAVE		0x95	/* KScratch for machine error exception */
+
+#define LOONGARCH_CSR_CTAG		0x98	/* TagLo + TagHi */
+
+#define LOONGARCH_CSR_PRID		0xc0
+
+/* Shadow MCSR : 0xc0 ~ 0xff */
+#define LOONGARCH_CSR_MCSR0		0xc0	/* CPUCFG0 and CPUCFG1 */
+#define  MCSR0_INT_IMPL_SHIFT		58
+#define  MCSR0_INT_IMPL			0
+#define  MCSR0_IOCSR_BRD_SHIFT		57
+#define  MCSR0_IOCSR_BRD		(_ULCAST_(1) << MCSR0_IOCSR_BRD_SHIFT)
+#define  MCSR0_HUGEPG_SHIFT		56
+#define  MCSR0_HUGEPG			(_ULCAST_(1) << MCSR0_HUGEPG_SHIFT)
+#define  MCSR0_RPLMTLB_SHIFT		55
+#define  MCSR0_RPLMTLB			(_ULCAST_(1) << MCSR0_RPLMTLB_SHIFT)
+#define  MCSR0_EP_SHIFT			54
+#define  MCSR0_EP			(_ULCAST_(1) << MCSR0_EP_SHIFT)
+#define  MCSR0_RI_SHIFT			53
+#define  MCSR0_RI			(_ULCAST_(1) << MCSR0_RI_SHIFT)
+#define  MCSR0_UAL_SHIFT		52
+#define  MCSR0_UAL			(_ULCAST_(1) << MCSR0_UAL_SHIFT)
+#define  MCSR0_VABIT_SHIFT		44
+#define  MCSR0_VABIT_WIDTH		8
+#define  MCSR0_VABIT			(_ULCAST_(0xff) << MCSR0_VABIT_SHIFT)
+#define  VABIT_DEFAULT			0x2f
+#define  MCSR0_PABIT_SHIFT		36
+#define  MCSR0_PABIT_WIDTH		8
+#define  MCSR0_PABIT			(_ULCAST_(0xff) << MCSR0_PABIT_SHIFT)
+#define  PABIT_DEFAULT			0x2f
+#define  MCSR0_IOCSR_SHIFT		35
+#define  MCSR0_IOCSR			(_ULCAST_(1) << MCSR0_IOCSR_SHIFT)
+#define  MCSR0_PAGING_SHIFT		34
+#define  MCSR0_PAGING			(_ULCAST_(1) << MCSR0_PAGING_SHIFT)
+#define  MCSR0_GR64_SHIFT		33
+#define  MCSR0_GR64			(_ULCAST_(1) << MCSR0_GR64_SHIFT)
+#define  GR64_DEFAULT			1
+#define  MCSR0_GR32_SHIFT		32
+#define  MCSR0_GR32			(_ULCAST_(1) << MCSR0_GR32_SHIFT)
+#define  GR32_DEFAULT			0
+#define  MCSR0_PRID_WIDTH		32
+#define  MCSR0_PRID			0x14C010
+
+#define LOONGARCH_CSR_MCSR1		0xc1	/* CPUCFG2 and CPUCFG3 */
+#define  MCSR1_HPFOLD_SHIFT		43
+#define  MCSR1_HPFOLD			(_ULCAST_(1) << MCSR1_HPFOLD_SHIFT)
+#define  MCSR1_SPW_LVL_SHIFT		40
+#define  MCSR1_SPW_LVL_WIDTH		3
+#define  MCSR1_SPW_LVL			(_ULCAST_(7) << MCSR1_SPW_LVL_SHIFT)
+#define  MCSR1_ICACHET_SHIFT		39
+#define  MCSR1_ICACHET			(_ULCAST_(1) << MCSR1_ICACHET_SHIFT)
+#define  MCSR1_ITLBT_SHIFT		38
+#define  MCSR1_ITLBT			(_ULCAST_(1) << MCSR1_ITLBT_SHIFT)
+#define  MCSR1_LLDBAR_SHIFT		37
+#define  MCSR1_LLDBAR			(_ULCAST_(1) << MCSR1_LLDBAR_SHIFT)
+#define  MCSR1_SCDLY_SHIFT		36
+#define  MCSR1_SCDLY			(_ULCAST_(1) << MCSR1_SCDLY_SHIFT)
+#define  MCSR1_LLEXC_SHIFT		35
+#define  MCSR1_LLEXC			(_ULCAST_(1) << MCSR1_LLEXC_SHIFT)
+#define  MCSR1_UCACC_SHIFT		34
+#define  MCSR1_UCACC			(_ULCAST_(1) << MCSR1_UCACC_SHIFT)
+#define  MCSR1_SFB_SHIFT		33
+#define  MCSR1_SFB			(_ULCAST_(1) << MCSR1_SFB_SHIFT)
+#define  MCSR1_CCDMA_SHIFT		32
+#define  MCSR1_CCDMA			(_ULCAST_(1) << MCSR1_CCDMA_SHIFT)
+#define  MCSR1_LAMO_SHIFT		22
+#define  MCSR1_LAMO			(_ULCAST_(1) << MCSR1_LAMO_SHIFT)
+#define  MCSR1_LSPW_SHIFT		21
+#define  MCSR1_LSPW			(_ULCAST_(1) << MCSR1_LSPW_SHIFT)
+#define  MCSR1_MIPSBT_SHIFT		20
+#define  MCSR1_MIPSBT			(_ULCAST_(1) << MCSR1_MIPSBT_SHIFT)
+#define  MCSR1_ARMBT_SHIFT		19
+#define  MCSR1_ARMBT			(_ULCAST_(1) << MCSR1_ARMBT_SHIFT)
+#define  MCSR1_X86BT_SHIFT		18
+#define  MCSR1_X86BT			(_ULCAST_(1) << MCSR1_X86BT_SHIFT)
+#define  MCSR1_LLFTPVERS_SHIFT		15
+#define  MCSR1_LLFTPVERS_WIDTH		3
+#define  MCSR1_LLFTPVERS		(_ULCAST_(7) << MCSR1_LLFTPVERS_SHIFT)
+#define  MCSR1_LLFTP_SHIFT		14
+#define  MCSR1_LLFTP			(_ULCAST_(1) << MCSR1_LLFTP_SHIFT)
+#define  MCSR1_VZVERS_SHIFT		11
+#define  MCSR1_VZVERS_WIDTH		3
+#define  MCSR1_VZVERS			(_ULCAST_(7) << MCSR1_VZVERS_SHIFT)
+#define  MCSR1_VZ_SHIFT			10
+#define  MCSR1_VZ			(_ULCAST_(1) << MCSR1_VZ_SHIFT)
+#define  MCSR1_CRYPTO_SHIFT		9
+#define  MCSR1_CRYPTO			(_ULCAST_(1) << MCSR1_CRYPTO_SHIFT)
+#define  MCSR1_COMPLEX_SHIFT		8
+#define  MCSR1_COMPLEX			(_ULCAST_(1) << MCSR1_COMPLEX_SHIFT)
+#define  MCSR1_LASX_SHIFT		7
+#define  MCSR1_LASX			(_ULCAST_(1) << MCSR1_LASX_SHIFT)
+#define  MCSR1_LSX_SHIFT		6
+#define  MCSR1_LSX			(_ULCAST_(1) << MCSR1_LSX_SHIFT)
+#define  MCSR1_FPVERS_SHIFT		3
+#define  MCSR1_FPVERS_WIDTH		3
+#define  MCSR1_FPVERS			(_ULCAST_(7) << MCSR1_FPVERS_SHIFT)
+#define  MCSR1_FPDP_SHIFT		2
+#define  MCSR1_FPDP			(_ULCAST_(1) << MCSR1_FPDP_SHIFT)
+#define  MCSR1_FPSP_SHIFT		1
+#define  MCSR1_FPSP			(_ULCAST_(1) << MCSR1_FPSP_SHIFT)
+#define  MCSR1_FP_SHIFT			0
+#define  MCSR1_FP			(_ULCAST_(1) << MCSR1_FP_SHIFT)
+
+#define LOONGARCH_CSR_MCSR2		0xc2	/* CPUCFG4 and CPUCFG5 */
+#define  MCSR2_CCDIV_SHIFT		48
+#define  MCSR2_CCDIV_WIDTH		16
+#define  MCSR2_CCDIV			(_ULCAST_(0xffff) << MCSR2_CCDIV_SHIFT)
+#define  MCSR2_CCMUL_SHIFT		32
+#define  MCSR2_CCMUL_WIDTH		16
+#define  MCSR2_CCMUL			(_ULCAST_(0xffff) << MCSR2_CCMUL_SHIFT)
+#define  MCSR2_CCFREQ_WIDTH		32
+#define  MCSR2_CCFREQ			(_ULCAST_(0xffffffff))
+#define  CCFREQ_DEFAULT			0x5f5e100	/* 100MHz */
+
+#define LOONGARCH_CSR_MCSR3		0xc3	/* CPUCFG6 */
+#define  MCSR3_UPM_SHIFT		14
+#define  MCSR3_UPM			(_ULCAST_(1) << MCSR3_UPM_SHIFT)
+#define  MCSR3_PMBITS_SHIFT		8
+#define  MCSR3_PMBITS_WIDTH		6
+#define  MCSR3_PMBITS			(_ULCAST_(0x3f) << MCSR3_PMBITS_SHIFT)
+#define  PMBITS_DEFAULT			0x40
+#define  MCSR3_PMNUM_SHIFT		4
+#define  MCSR3_PMNUM_WIDTH		4
+#define  MCSR3_PMNUM			(_ULCAST_(0xf) << MCSR3_PMNUM_SHIFT)
+#define  MCSR3_PAMVER_SHIFT		1
+#define  MCSR3_PAMVER_WIDTH		3
+#define  MCSR3_PAMVER			(_ULCAST_(0x7) << MCSR3_PAMVER_SHIFT)
+#define  MCSR3_PMP_SHIFT		0
+#define  MCSR3_PMP			(_ULCAST_(1) << MCSR3_PMP_SHIFT)
+
+#define LOONGARCH_CSR_MCSR8		0xc8	/* CPUCFG16 and CPUCFG17 */
+#define  MCSR8_L1I_SIZE_SHIFT		56
+#define  MCSR8_L1I_SIZE_WIDTH		7
+#define  MCSR8_L1I_SIZE			(_ULCAST_(0x7f) << MCSR8_L1I_SIZE_SHIFT)
+#define  MCSR8_L1I_IDX_SHIFT		48
+#define  MCSR8_L1I_IDX_WIDTH		8
+#define  MCSR8_L1I_IDX			(_ULCAST_(0xff) << MCSR8_L1I_IDX_SHIFT)
+#define  MCSR8_L1I_WAY_SHIFT		32
+#define  MCSR8_L1I_WAY_WIDTH		16
+#define  MCSR8_L1I_WAY			(_ULCAST_(0xffff) << MCSR8_L1I_WAY_SHIFT)
+#define  MCSR8_L3DINCL_SHIFT		16
+#define  MCSR8_L3DINCL			(_ULCAST_(1) << MCSR8_L3DINCL_SHIFT)
+#define  MCSR8_L3DPRIV_SHIFT		15
+#define  MCSR8_L3DPRIV			(_ULCAST_(1) << MCSR8_L3DPRIV_SHIFT)
+#define  MCSR8_L3DPRE_SHIFT		14
+#define  MCSR8_L3DPRE			(_ULCAST_(1) << MCSR8_L3DPRE_SHIFT)
+#define  MCSR8_L3IUINCL_SHIFT		13
+#define  MCSR8_L3IUINCL			(_ULCAST_(1) << MCSR8_L3IUINCL_SHIFT)
+#define  MCSR8_L3IUPRIV_SHIFT		12
+#define  MCSR8_L3IUPRIV			(_ULCAST_(1) << MCSR8_L3IUPRIV_SHIFT)
+#define  MCSR8_L3IUUNIFY_SHIFT		11
+#define  MCSR8_L3IUUNIFY		(_ULCAST_(1) << MCSR8_L3IUUNIFY_SHIFT)
+#define  MCSR8_L3IUPRE_SHIFT		10
+#define  MCSR8_L3IUPRE			(_ULCAST_(1) << MCSR8_L3IUPRE_SHIFT)
+#define  MCSR8_L2DINCL_SHIFT		9
+#define  MCSR8_L2DINCL			(_ULCAST_(1) << MCSR8_L2DINCL_SHIFT)
+#define  MCSR8_L2DPRIV_SHIFT		8
+#define  MCSR8_L2DPRIV			(_ULCAST_(1) << MCSR8_L2DPRIV_SHIFT)
+#define  MCSR8_L2DPRE_SHIFT		7
+#define  MCSR8_L2DPRE			(_ULCAST_(1) << MCSR8_L2DPRE_SHIFT)
+#define  MCSR8_L2IUINCL_SHIFT		6
+#define  MCSR8_L2IUINCL			(_ULCAST_(1) << MCSR8_L2IUINCL_SHIFT)
+#define  MCSR8_L2IUPRIV_SHIFT		5
+#define  MCSR8_L2IUPRIV			(_ULCAST_(1) << MCSR8_L2IUPRIV_SHIFT)
+#define  MCSR8_L2IUUNIFY_SHIFT		4
+#define  MCSR8_L2IUUNIFY		(_ULCAST_(1) << MCSR8_L2IUUNIFY_SHIFT)
+#define  MCSR8_L2IUPRE_SHIFT		3
+#define  MCSR8_L2IUPRE			(_ULCAST_(1) << MCSR8_L2IUPRE_SHIFT)
+#define  MCSR8_L1DPRE_SHIFT		2
+#define  MCSR8_L1DPRE			(_ULCAST_(1) << MCSR8_L1DPRE_SHIFT)
+#define  MCSR8_L1IUUNIFY_SHIFT		1
+#define  MCSR8_L1IUUNIFY		(_ULCAST_(1) << MCSR8_L1IUUNIFY_SHIFT)
+#define  MCSR8_L1IUPRE_SHIFT		0
+#define  MCSR8_L1IUPRE			(_ULCAST_(1) << MCSR8_L1IUPRE_SHIFT)
+
+#define LOONGARCH_CSR_MCSR9		0xc9	/* CPUCFG18 and CPUCFG19 */
+#define  MCSR9_L2U_SIZE_SHIFT		56
+#define  MCSR9_L2U_SIZE_WIDTH		7
+#define  MCSR9_L2U_SIZE			(_ULCAST_(0x7f) << MCSR9_L2U_SIZE_SHIFT)
+#define  MCSR9_L2U_IDX_SHIFT		48
+#define  MCSR9_L2U_IDX_WIDTH		8
+#define  MCSR9_L2U_IDX			(_ULCAST_(0xff) << MCSR9_L2U_IDX_SHIFT)
+#define  MCSR9_L2U_WAY_SHIFT		32
+#define  MCSR9_L2U_WAY_WIDTH		16
+#define  MCSR9_L2U_WAY			(_ULCAST_(0xffff) << MCSR9_L2U_WAY_SHIFT)
+#define  MCSR9_L1D_SIZE_SHIFT		24
+#define  MCSR9_L1D_SIZE_WIDTH		7
+#define  MCSR9_L1D_SIZE			(_ULCAST_(0x7f) << MCSR9_L1D_SIZE_SHIFT)
+#define  MCSR9_L1D_IDX_SHIFT		16
+#define  MCSR9_L1D_IDX_WIDTH		8
+#define  MCSR9_L1D_IDX			(_ULCAST_(0xff) << MCSR9_L1D_IDX_SHIFT)
+#define  MCSR9_L1D_WAY_SHIFT		0
+#define  MCSR9_L1D_WAY_WIDTH		16
+#define  MCSR9_L1D_WAY			(_ULCAST_(0xffff) << MCSR9_L1D_WAY_SHIFT)
+
+#define LOONGARCH_CSR_MCSR10		0xca	/* CPUCFG20 */
+#define  MCSR10_L3U_SIZE_SHIFT		24
+#define  MCSR10_L3U_SIZE_WIDTH		7
+#define  MCSR10_L3U_SIZE		(_ULCAST_(0x7f) << MCSR10_L3U_SIZE_SHIFT)
+#define  MCSR10_L3U_IDX_SHIFT		16
+#define  MCSR10_L3U_IDX_WIDTH		8
+#define  MCSR10_L3U_IDX			(_ULCAST_(0xff) << MCSR10_L3U_IDX_SHIFT)
+#define  MCSR10_L3U_WAY_SHIFT		0
+#define  MCSR10_L3U_WAY_WIDTH		16
+#define  MCSR10_L3U_WAY			(_ULCAST_(0xffff) << MCSR10_L3U_WAY_SHIFT)
+
+#define LOONGARCH_CSR_MCSR24		0xf0	/* CPUCFG48 */
+#define  MCSR24_RAMCG_SHIFT		3
+#define  MCSR24_RAMCG			(_ULCAST_(1) << MCSR24_RAMCG_SHIFT)
+#define  MCSR24_VFPUCG_SHIFT		2
+#define  MCSR24_VFPUCG			(_ULCAST_(1) << MCSR24_VFPUCG_SHIFT)
+#define  MCSR24_NAPEN_SHIFT		1
+#define  MCSR24_NAPEN			(_ULCAST_(1) << MCSR24_NAPEN_SHIFT)
+#define  MCSR24_MCSRLOCK_SHIFT		0
+#define  MCSR24_MCSRLOCK		(_ULCAST_(1) << MCSR24_MCSRLOCK_SHIFT)
+
+/* Uncached accelerate window registers */
+#define LOONGARCH_CSR_UCAWIN		0x100
+#define LOONGARCH_CSR_UCAWIN0_LO	0x102
+#define LOONGARCH_CSR_UCAWIN0_HI	0x103
+#define LOONGARCH_CSR_UCAWIN1_LO	0x104
+#define LOONGARCH_CSR_UCAWIN1_HI	0x105
+#define LOONGARCH_CSR_UCAWIN2_LO	0x106
+#define LOONGARCH_CSR_UCAWIN2_HI	0x107
+#define LOONGARCH_CSR_UCAWIN3_LO	0x108
+#define LOONGARCH_CSR_UCAWIN3_HI	0x109
+
+/* Direct Map window registers */
+#define LOONGARCH_CSR_DMWIN0		0x180	/* 64 direct map win0: MEM & IF */
+#define LOONGARCH_CSR_DMWIN1		0x181	/* 64 direct map win1: MEM & IF */
+#define LOONGARCH_CSR_DMWIN2		0x182	/* 64 direct map win2: MEM */
+#define LOONGARCH_CSR_DMWIN3		0x183	/* 64 direct map win3: MEM */
+
+/* Direct Map window 0/1 */
+#define CSR_DMW0_PLV0		_CONST64_(1 << 0)
+#define CSR_DMW0_VSEG		_CONST64_(0x8000)
+#define CSR_DMW0_BASE		(CSR_DMW0_VSEG << DMW_PABITS)
+#define CSR_DMW0_INIT		(CSR_DMW0_BASE | CSR_DMW0_PLV0)
+
+#define CSR_DMW1_PLV0		_CONST64_(1 << 0)
+#define CSR_DMW1_MAT		_CONST64_(1 << 4)
+#define CSR_DMW1_VSEG		_CONST64_(0x9000)
+#define CSR_DMW1_BASE		(CSR_DMW1_VSEG << DMW_PABITS)
+#define CSR_DMW1_INIT		(CSR_DMW1_BASE | CSR_DMW1_MAT | CSR_DMW1_PLV0)
+
+/* Performance Counter registers */
+#define LOONGARCH_CSR_PERFCTRL0		0x200	/* 32 perf event 0 config */
+#define LOONGARCH_CSR_PERFCNTR0		0x201	/* 64 perf event 0 count value */
+#define LOONGARCH_CSR_PERFCTRL1		0x202	/* 32 perf event 1 config */
+#define LOONGARCH_CSR_PERFCNTR1		0x203	/* 64 perf event 1 count value */
+#define LOONGARCH_CSR_PERFCTRL2		0x204	/* 32 perf event 2 config */
+#define LOONGARCH_CSR_PERFCNTR2		0x205	/* 64 perf event 2 count value */
+#define LOONGARCH_CSR_PERFCTRL3		0x206	/* 32 perf event 3 config */
+#define LOONGARCH_CSR_PERFCNTR3		0x207	/* 64 perf event 3 count value */
+#define  CSR_PERFCTRL_PLV0		(_ULCAST_(1) << 16)
+#define  CSR_PERFCTRL_PLV1		(_ULCAST_(1) << 17)
+#define  CSR_PERFCTRL_PLV2		(_ULCAST_(1) << 18)
+#define  CSR_PERFCTRL_PLV3		(_ULCAST_(1) << 19)
+#define  CSR_PERFCTRL_IE		(_ULCAST_(1) << 20)
+#define  CSR_PERFCTRL_EVENT		0x3ff
+
+/* Debug registers */
+#define LOONGARCH_CSR_MWPC		0x300	/* data breakpoint config */
+#define LOONGARCH_CSR_MWPS		0x301	/* data breakpoint status */
+
+#define LOONGARCH_CSR_DB0ADDR		0x310	/* data breakpoint 0 address */
+#define LOONGARCH_CSR_DB0MASK		0x311	/* data breakpoint 0 mask */
+#define LOONGARCH_CSR_DB0CTL		0x312	/* data breakpoint 0 control */
+#define LOONGARCH_CSR_DB0ASID		0x313	/* data breakpoint 0 asid */
+
+#define LOONGARCH_CSR_DB1ADDR		0x318	/* data breakpoint 1 address */
+#define LOONGARCH_CSR_DB1MASK		0x319	/* data breakpoint 1 mask */
+#define LOONGARCH_CSR_DB1CTL		0x31a	/* data breakpoint 1 control */
+#define LOONGARCH_CSR_DB1ASID		0x31b	/* data breakpoint 1 asid */
+
+#define LOONGARCH_CSR_DB2ADDR		0x320	/* data breakpoint 2 address */
+#define LOONGARCH_CSR_DB2MASK		0x321	/* data breakpoint 2 mask */
+#define LOONGARCH_CSR_DB2CTL		0x322	/* data breakpoint 2 control */
+#define LOONGARCH_CSR_DB2ASID		0x323	/* data breakpoint 2 asid */
+
+#define LOONGARCH_CSR_DB3ADDR		0x328	/* data breakpoint 3 address */
+#define LOONGARCH_CSR_DB3MASK		0x329	/* data breakpoint 3 mask */
+#define LOONGARCH_CSR_DB3CTL		0x32a	/* data breakpoint 3 control */
+#define LOONGARCH_CSR_DB3ASID		0x32b	/* data breakpoint 3 asid */
+
+#define LOONGARCH_CSR_DB4ADDR		0x330	/* data breakpoint 4 address */
+#define LOONGARCH_CSR_DB4MASK		0x331	/* data breakpoint 4 mask */
+#define LOONGARCH_CSR_DB4CTL		0x332	/* data breakpoint 4 control */
+#define LOONGARCH_CSR_DB4ASID		0x333	/* data breakpoint 4 asid */
+
+#define LOONGARCH_CSR_DB5ADDR		0x338	/* data breakpoint 5 address */
+#define LOONGARCH_CSR_DB5MASK		0x339	/* data breakpoint 5 mask */
+#define LOONGARCH_CSR_DB5CTL		0x33a	/* data breakpoint 5 control */
+#define LOONGARCH_CSR_DB5ASID		0x33b	/* data breakpoint 5 asid */
+
+#define LOONGARCH_CSR_DB6ADDR		0x340	/* data breakpoint 6 address */
+#define LOONGARCH_CSR_DB6MASK		0x341	/* data breakpoint 6 mask */
+#define LOONGARCH_CSR_DB6CTL		0x342	/* data breakpoint 6 control */
+#define LOONGARCH_CSR_DB6ASID		0x343	/* data breakpoint 6 asid */
+
+#define LOONGARCH_CSR_DB7ADDR		0x348	/* data breakpoint 7 address */
+#define LOONGARCH_CSR_DB7MASK		0x349	/* data breakpoint 7 mask */
+#define LOONGARCH_CSR_DB7CTL		0x34a	/* data breakpoint 7 control */
+#define LOONGARCH_CSR_DB7ASID		0x34b	/* data breakpoint 7 asid */
+
+#define LOONGARCH_CSR_FWPC		0x380	/* instruction breakpoint config */
+#define LOONGARCH_CSR_FWPS		0x381	/* instruction breakpoint status */
+
+#define LOONGARCH_CSR_IB0ADDR		0x390	/* inst breakpoint 0 address */
+#define LOONGARCH_CSR_IB0MASK		0x391	/* inst breakpoint 0 mask */
+#define LOONGARCH_CSR_IB0CTL		0x392	/* inst breakpoint 0 control */
+#define LOONGARCH_CSR_IB0ASID		0x393	/* inst breakpoint 0 asid */
+
+#define LOONGARCH_CSR_IB1ADDR		0x398	/* inst breakpoint 1 address */
+#define LOONGARCH_CSR_IB1MASK		0x399	/* inst breakpoint 1 mask */
+#define LOONGARCH_CSR_IB1CTL		0x39a	/* inst breakpoint 1 control */
+#define LOONGARCH_CSR_IB1ASID		0x39b	/* inst breakpoint 1 asid */
+
+#define LOONGARCH_CSR_IB2ADDR		0x3a0	/* inst breakpoint 2 address */
+#define LOONGARCH_CSR_IB2MASK		0x3a1	/* inst breakpoint 2 mask */
+#define LOONGARCH_CSR_IB2CTL		0x3a2	/* inst breakpoint 2 control */
+#define LOONGARCH_CSR_IB2ASID		0x3a3	/* inst breakpoint 2 asid */
+
+#define LOONGARCH_CSR_IB3ADDR		0x3a8	/* inst breakpoint 3 address */
+#define LOONGARCH_CSR_IB3MASK		0x3a9	/* inst breakpoint 3 mask */
+#define LOONGARCH_CSR_IB3CTL		0x3aa	/* inst breakpoint 3 control */
+#define LOONGARCH_CSR_IB3ASID		0x3ab	/* inst breakpoint 3 asid */
+
+#define LOONGARCH_CSR_IB4ADDR		0x3b0	/* inst breakpoint 4 address */
+#define LOONGARCH_CSR_IB4MASK		0x3b1	/* inst breakpoint 4 mask */
+#define LOONGARCH_CSR_IB4CTL		0x3b2	/* inst breakpoint 4 control */
+#define LOONGARCH_CSR_IB4ASID		0x3b3	/* inst breakpoint 4 asid */
+
+#define LOONGARCH_CSR_IB5ADDR		0x3b8	/* inst breakpoint 5 address */
+#define LOONGARCH_CSR_IB5MASK		0x3b9	/* inst breakpoint 5 mask */
+#define LOONGARCH_CSR_IB5CTL		0x3ba	/* inst breakpoint 5 control */
+#define LOONGARCH_CSR_IB5ASID		0x3bb	/* inst breakpoint 5 asid */
+
+#define LOONGARCH_CSR_IB6ADDR		0x3c0	/* inst breakpoint 6 address */
+#define LOONGARCH_CSR_IB6MASK		0x3c1	/* inst breakpoint 6 mask */
+#define LOONGARCH_CSR_IB6CTL		0x3c2	/* inst breakpoint 6 control */
+#define LOONGARCH_CSR_IB6ASID		0x3c3	/* inst breakpoint 6 asid */
+
+#define LOONGARCH_CSR_IB7ADDR		0x3c8	/* inst breakpoint 7 address */
+#define LOONGARCH_CSR_IB7MASK		0x3c9	/* inst breakpoint 7 mask */
+#define LOONGARCH_CSR_IB7CTL		0x3ca	/* inst breakpoint 7 control */
+#define LOONGARCH_CSR_IB7ASID		0x3cb	/* inst breakpoint 7 asid */
+
+#define LOONGARCH_CSR_DEBUG		0x500	/* debug config */
+#define LOONGARCH_CSR_DERA		0x501	/* debug era */
+#define LOONGARCH_CSR_DESAVE		0x502	/* debug save */
+
+/*
+ * CSR_ECFG IM
+ */
+#define ECFG0_IM		0x00001fff
+#define ECFGB_SIP0		0
+#define ECFGF_SIP0		(_ULCAST_(1) << ECFGB_SIP0)
+#define ECFGB_SIP1		1
+#define ECFGF_SIP1		(_ULCAST_(1) << ECFGB_SIP1)
+#define ECFGB_IP0		2
+#define ECFGF_IP0		(_ULCAST_(1) << ECFGB_IP0)
+#define ECFGB_IP1		3
+#define ECFGF_IP1		(_ULCAST_(1) << ECFGB_IP1)
+#define ECFGB_IP2		4
+#define ECFGF_IP2		(_ULCAST_(1) << ECFGB_IP2)
+#define ECFGB_IP3		5
+#define ECFGF_IP3		(_ULCAST_(1) << ECFGB_IP3)
+#define ECFGB_IP4		6
+#define ECFGF_IP4		(_ULCAST_(1) << ECFGB_IP4)
+#define ECFGB_IP5		7
+#define ECFGF_IP5		(_ULCAST_(1) << ECFGB_IP5)
+#define ECFGB_IP6		8
+#define ECFGF_IP6		(_ULCAST_(1) << ECFGB_IP6)
+#define ECFGB_IP7		9
+#define ECFGF_IP7		(_ULCAST_(1) << ECFGB_IP7)
+#define ECFGB_PMC		10
+#define ECFGF_PMC		(_ULCAST_(1) << ECFGB_PMC)
+#define ECFGB_TIMER		11
+#define ECFGF_TIMER		(_ULCAST_(1) << ECFGB_TIMER)
+#define ECFGB_IPI		12
+#define ECFGF_IPI		(_ULCAST_(1) << ECFGB_IPI)
+#define ECFGF(hwirq)		(_ULCAST_(1) << (hwirq))
+
+#define ESTATF_IP		0x00001fff
+
+#define LOONGARCH_IOCSR_FEATURES	0x8
+#define  IOCSRF_TEMP			BIT_ULL(0)
+#define  IOCSRF_NODECNT			BIT_ULL(1)
+#define  IOCSRF_MSI			BIT_ULL(2)
+#define  IOCSRF_EXTIOI			BIT_ULL(3)
+#define  IOCSRF_CSRIPI			BIT_ULL(4)
+#define  IOCSRF_FREQCSR			BIT_ULL(5)
+#define  IOCSRF_FREQSCALE		BIT_ULL(6)
+#define  IOCSRF_DVFSV1			BIT_ULL(7)
+#define  IOCSRF_EIODECODE		BIT_ULL(9)
+#define  IOCSRF_FLATMODE		BIT_ULL(10)
+#define  IOCSRF_VM			BIT_ULL(11)
+
+#define LOONGARCH_IOCSR_VENDOR		0x10
+
+#define LOONGARCH_IOCSR_CPUNAME		0x20
+
+#define LOONGARCH_IOCSR_NODECNT		0x408
+
+#define LOONGARCH_IOCSR_MISC_FUNC	0x420
+#define  IOCSR_MISC_FUNC_TIMER_RESET	BIT_ULL(21)
+#define  IOCSR_MISC_FUNC_EXT_IOI_EN	BIT_ULL(48)
+
+#define LOONGARCH_IOCSR_CPUTEMP		0x428
+
+/* Per-core IOCSR, only accessible by local cores */
+#define LOONGARCH_IOCSR_IPI_STATUS	0x1000
+#define LOONGARCH_IOCSR_IPI_EN		0x1004
+#define LOONGARCH_IOCSR_IPI_SET		0x1008
+#define LOONGARCH_IOCSR_IPI_CLEAR	0x100c
+#define LOONGARCH_IOCSR_MBUF0		0x1020
+#define LOONGARCH_IOCSR_MBUF1		0x1028
+#define LOONGARCH_IOCSR_MBUF2		0x1030
+#define LOONGARCH_IOCSR_MBUF3		0x1038
+
+#define LOONGARCH_IOCSR_IPI_SEND	0x1040
+#define  IOCSR_IPI_SEND_IP_SHIFT	0
+#define  IOCSR_IPI_SEND_CPU_SHIFT	16
+#define  IOCSR_IPI_SEND_BLOCKING	BIT(31)
+
+#define LOONGARCH_IOCSR_MBUF_SEND	0x1048
+#define  IOCSR_MBUF_SEND_BLOCKING	BIT_ULL(31)
+#define  IOCSR_MBUF_SEND_BOX_SHIFT	2
+#define  IOCSR_MBUF_SEND_BOX_LO(box)	((box) << 1)
+#define  IOCSR_MBUF_SEND_BOX_HI(box)	(((box) << 1) + 1)
+#define  IOCSR_MBUF_SEND_CPU_SHIFT	16
+#define  IOCSR_MBUF_SEND_BUF_SHIFT	32
+#define  IOCSR_MBUF_SEND_H32_MASK	0xFFFFFFFF00000000ULL
+
+#define LOONGARCH_IOCSR_ANY_SEND	0x1158
+#define  IOCSR_ANY_SEND_BLOCKING	BIT_ULL(31)
+#define  IOCSR_ANY_SEND_CPU_SHIFT	16
+#define  IOCSR_ANY_SEND_MASK_SHIFT	27
+#define  IOCSR_ANY_SEND_BUF_SHIFT	32
+#define  IOCSR_ANY_SEND_H32_MASK	0xFFFFFFFF00000000ULL
+
+/* Register offset and bit definition for CSR access */
+#define LOONGARCH_IOCSR_TIMER_CFG       0x1060
+#define LOONGARCH_IOCSR_TIMER_TICK      0x1070
+#define  IOCSR_TIMER_CFG_RESERVED       (_ULCAST_(1) << 63)
+#define  IOCSR_TIMER_CFG_PERIODIC       (_ULCAST_(1) << 62)
+#define  IOCSR_TIMER_CFG_EN             (_ULCAST_(1) << 61)
+#define  IOCSR_TIMER_MASK		0x0ffffffffffffULL
+#define  IOCSR_TIMER_INITVAL_RST        (_ULCAST_(0xffff) << 48)
+
+#define LOONGARCH_IOCSR_EXTIOI_NODEMAP_BASE	0x14a0
+#define LOONGARCH_IOCSR_EXTIOI_IPMAP_BASE	0x14c0
+#define LOONGARCH_IOCSR_EXTIOI_EN_BASE		0x1600
+#define LOONGARCH_IOCSR_EXTIOI_BOUNCE_BASE	0x1680
+#define LOONGARCH_IOCSR_EXTIOI_ISR_BASE		0x1800
+#define LOONGARCH_IOCSR_EXTIOI_ROUTE_BASE	0x1c00
+#define IOCSR_EXTIOI_VECTOR_NUM			256
+
+#ifndef __ASSEMBLY__
+
+static inline u64 drdtime(void)
+{
+	int rID = 0;
+	u64 val = 0;
+
+	__asm__ __volatile__(
+		"rdtime.d %0, %1 \n\t"
+		: "=r"(val), "=r"(rID)
+		:
+		);
+	return val;
+}
+
+static inline unsigned int get_csr_cpuid(void)
+{
+	return csr_readl(LOONGARCH_CSR_CPUID);
+}
+
+static inline void csr_any_send(unsigned int addr, unsigned int data,
+				unsigned int data_mask, unsigned int cpu)
+{
+	uint64_t val = 0;
+
+	val = IOCSR_ANY_SEND_BLOCKING | addr;
+	val |= (cpu << IOCSR_ANY_SEND_CPU_SHIFT);
+	val |= (data_mask << IOCSR_ANY_SEND_MASK_SHIFT);
+	val |= ((uint64_t)data << IOCSR_ANY_SEND_BUF_SHIFT);
+	iocsr_writeq(val, LOONGARCH_IOCSR_ANY_SEND);
+}
+
+static inline unsigned int read_csr_excode(void)
+{
+	return (csr_readl(LOONGARCH_CSR_ESTAT) & CSR_ESTAT_EXC) >> CSR_ESTAT_EXC_SHIFT;
+}
+
+static inline void write_csr_index(unsigned int idx)
+{
+	csr_xchgl(idx, CSR_TLBIDX_IDXM, LOONGARCH_CSR_TLBIDX);
+}
+
+static inline unsigned int read_csr_pagesize(void)
+{
+	return (csr_readl(LOONGARCH_CSR_TLBIDX) & CSR_TLBIDX_SIZEM) >> CSR_TLBIDX_SIZE;
+}
+
+static inline void write_csr_pagesize(unsigned int size)
+{
+	csr_xchgl(size << CSR_TLBIDX_SIZE, CSR_TLBIDX_SIZEM, LOONGARCH_CSR_TLBIDX);
+}
+
+static inline unsigned int read_csr_tlbrefill_pagesize(void)
+{
+	return (csr_readq(LOONGARCH_CSR_TLBREHI) & CSR_TLBREHI_PS) >> CSR_TLBREHI_PS_SHIFT;
+}
+
+static inline void write_csr_tlbrefill_pagesize(unsigned int size)
+{
+	csr_xchgq(size << CSR_TLBREHI_PS_SHIFT, CSR_TLBREHI_PS, LOONGARCH_CSR_TLBREHI);
+}
+
+#define read_csr_asid()			csr_readl(LOONGARCH_CSR_ASID)
+#define write_csr_asid(val)		csr_writel(val, LOONGARCH_CSR_ASID)
+#define read_csr_entryhi()		csr_readq(LOONGARCH_CSR_TLBEHI)
+#define write_csr_entryhi(val)		csr_writeq(val, LOONGARCH_CSR_TLBEHI)
+#define read_csr_entrylo0()		csr_readq(LOONGARCH_CSR_TLBELO0)
+#define write_csr_entrylo0(val)		csr_writeq(val, LOONGARCH_CSR_TLBELO0)
+#define read_csr_entrylo1()		csr_readq(LOONGARCH_CSR_TLBELO1)
+#define write_csr_entrylo1(val)		csr_writeq(val, LOONGARCH_CSR_TLBELO1)
+#define read_csr_ecfg()			csr_readl(LOONGARCH_CSR_ECFG)
+#define write_csr_ecfg(val)		csr_writel(val, LOONGARCH_CSR_ECFG)
+#define read_csr_estat()		csr_readl(LOONGARCH_CSR_ESTAT)
+#define write_csr_estat(val)		csr_writel(val, LOONGARCH_CSR_ESTAT)
+#define read_csr_tlbidx()		csr_readl(LOONGARCH_CSR_TLBIDX)
+#define write_csr_tlbidx(val)		csr_writel(val, LOONGARCH_CSR_TLBIDX)
+#define read_csr_euen()			csr_readl(LOONGARCH_CSR_EUEN)
+#define write_csr_euen(val)		csr_writel(val, LOONGARCH_CSR_EUEN)
+#define read_csr_cpuid()		csr_readl(LOONGARCH_CSR_CPUID)
+#define read_csr_prcfg1()		csr_readq(LOONGARCH_CSR_PRCFG1)
+#define write_csr_prcfg1(val)		csr_writeq(val, LOONGARCH_CSR_PRCFG1)
+#define read_csr_prcfg2()		csr_readq(LOONGARCH_CSR_PRCFG2)
+#define write_csr_prcfg2(val)		csr_writeq(val, LOONGARCH_CSR_PRCFG2)
+#define read_csr_prcfg3()		csr_readq(LOONGARCH_CSR_PRCFG3)
+#define write_csr_prcfg3(val)		csr_writeq(val, LOONGARCH_CSR_PRCFG3)
+#define read_csr_stlbpgsize()		csr_readl(LOONGARCH_CSR_STLBPGSIZE)
+#define write_csr_stlbpgsize(val)	csr_writel(val, LOONGARCH_CSR_STLBPGSIZE)
+#define read_csr_rvacfg()		csr_readl(LOONGARCH_CSR_RVACFG)
+#define write_csr_rvacfg(val)		csr_writel(val, LOONGARCH_CSR_RVACFG)
+#define write_csr_tintclear(val)	csr_writel(val, LOONGARCH_CSR_TINTCLR)
+#define read_csr_impctl1()		csr_readq(LOONGARCH_CSR_IMPCTL1)
+#define write_csr_impctl1(val)		csr_writeq(val, LOONGARCH_CSR_IMPCTL1)
+#define write_csr_impctl2(val)		csr_writeq(val, LOONGARCH_CSR_IMPCTL2)
+
+#define read_csr_perfctrl0()		csr_readq(LOONGARCH_CSR_PERFCTRL0)
+#define read_csr_perfcntr0()		csr_readq(LOONGARCH_CSR_PERFCNTR0)
+#define read_csr_perfctrl1()		csr_readq(LOONGARCH_CSR_PERFCTRL1)
+#define read_csr_perfcntr1()		csr_readq(LOONGARCH_CSR_PERFCNTR1)
+#define read_csr_perfctrl2()		csr_readq(LOONGARCH_CSR_PERFCTRL2)
+#define read_csr_perfcntr2()		csr_readq(LOONGARCH_CSR_PERFCNTR2)
+#define read_csr_perfctrl3()		csr_readq(LOONGARCH_CSR_PERFCTRL3)
+#define read_csr_perfcntr3()		csr_readq(LOONGARCH_CSR_PERFCNTR3)
+#define write_csr_perfctrl0(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL0)
+#define write_csr_perfcntr0(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR0)
+#define write_csr_perfctrl1(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL1)
+#define write_csr_perfcntr1(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR1)
+#define write_csr_perfctrl2(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL2)
+#define write_csr_perfcntr2(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR2)
+#define write_csr_perfctrl3(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL3)
+#define write_csr_perfcntr3(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR3)
+
+/*
+ * Manipulate bits in a register.
+ */
+#define __BUILD_CSR_COMMON(name)				\
+static inline unsigned long					\
+set_##name(unsigned long set)					\
+{								\
+	unsigned long res, new;					\
+								\
+	res = read_##name();					\
+	new = res | set;					\
+	write_##name(new);					\
+								\
+	return res;						\
+}								\
+								\
+static inline unsigned long					\
+clear_##name(unsigned long clear)				\
+{								\
+	unsigned long res, new;					\
+								\
+	res = read_##name();					\
+	new = res & ~clear;					\
+	write_##name(new);					\
+								\
+	return res;						\
+}								\
+								\
+static inline unsigned long					\
+change_##name(unsigned long change, unsigned long val)		\
+{								\
+	unsigned long res, new;					\
+								\
+	res = read_##name();					\
+	new = res & ~change;					\
+	new |= (val & change);					\
+	write_##name(new);					\
+								\
+	return res;						\
+}
+
+#define __BUILD_CSR_OP(name)	__BUILD_CSR_COMMON(csr_##name)
+
+__BUILD_CSR_OP(euen)
+__BUILD_CSR_OP(ecfg)
+__BUILD_CSR_OP(tlbidx)
+
+#define set_csr_estat(val)	\
+	csr_xchgl(val, val, LOONGARCH_CSR_ESTAT)
+#define clear_csr_estat(val)	\
+	csr_xchgl(~(val), val, LOONGARCH_CSR_ESTAT)
+
+#endif /* __ASSEMBLY__ */
+
+/* Generic EntryLo bit definitions */
+#define ENTRYLO_V		(_ULCAST_(1) << 0)
+#define ENTRYLO_D		(_ULCAST_(1) << 1)
+#define ENTRYLO_PLV_SHIFT	2
+#define ENTRYLO_PLV		(_ULCAST_(3) << ENTRYLO_PLV_SHIFT)
+#define ENTRYLO_C_SHIFT		4
+#define ENTRYLO_C		(_ULCAST_(3) << ENTRYLO_C_SHIFT)
+#define ENTRYLO_G		(_ULCAST_(1) << 6)
+#define ENTRYLO_NR		(_ULCAST_(1) << 61)
+#define ENTRYLO_NX		(_ULCAST_(1) << 62)
+
+/* LoongArch GlobalNumber definitions */
+#define LOONGARCH_GLOBALNUMBER_VP_SHF	0
+#define LOONGARCH_GLOBALNUMBER_VP		(_ULCAST_(0xff) << LOONGARCH_GLOBALNUMBER_VP_SHF)
+#define LOONGARCH_GLOBALNUMBER_CORE_SHF	8
+#define LOONGARCH_GLOBALNUMBER_CORE		(_ULCAST_(0xff) << LOONGARCH_GLOBALNUMBER_CORE_SHF)
+#define LOONGARCH_GLOBALNUMBER_CLUSTER_SHF	16
+#define LOONGARCH_GLOBALNUMBER_CLUSTER	(_ULCAST_(0xf) << LOONGARCH_GLOBALNUMBER_CLUSTER_SHF)
+
+/* Values for PageSize register */
+#define PS_4K		0x0000000c
+#define PS_8K		0x0000000d
+#define PS_16K		0x0000000e
+#define PS_32K		0x0000000f
+#define PS_64K		0x00000010
+#define PS_128K		0x00000011
+#define PS_256K		0x00000012
+#define PS_512K		0x00000013
+#define PS_1M		0x00000014
+#define PS_2M		0x00000015
+#define PS_4M		0x00000016
+#define PS_8M		0x00000017
+#define PS_16M		0x00000018
+#define PS_32M		0x00000019
+#define PS_64M		0x0000001a
+#define PS_128M		0x0000001b
+#define PS_256M		0x0000001c
+#define PS_512M		0x0000001d
+#define PS_1G		0x0000001e
+
+#define PS_MASK		0x3f000000
+#define PS_SHIFT	24
+
+/* Default page size for a given kernel configuration */
+#ifdef CONFIG_PAGE_SIZE_4KB
+#define PS_DEFAULT_SIZE PS_4K
+#elif defined(CONFIG_PAGE_SIZE_16KB)
+#define PS_DEFAULT_SIZE PS_16K
+#elif defined(CONFIG_PAGE_SIZE_64KB)
+#define PS_DEFAULT_SIZE PS_64K
+#else
+#error Bad page size configuration!
+#endif
+
+/* Default huge tlb size for a given kernel configuration */
+#ifdef CONFIG_PAGE_SIZE_4KB
+#define PS_HUGE_SIZE   PS_1M
+#elif defined(CONFIG_PAGE_SIZE_16KB)
+#define PS_HUGE_SIZE   PS_16M
+#elif defined(CONFIG_PAGE_SIZE_64KB)
+#define PS_HUGE_SIZE   PS_256M
+#else
+#error Bad page size configuration for hugetlbfs!
+#endif
+
+/* ExStatus.ExcCode */
+#define EXCCODE_RSV		0	/* Reserved */
+#define EXCCODE_TLBL		1	/* TLB miss on a load */
+#define EXCCODE_TLBS		2	/* TLB miss on a store */
+#define EXCCODE_TLBI		3	/* TLB miss on an ifetch */
+#define EXCCODE_TLBM		4	/* TLB modified fault */
+#define EXCCODE_TLBNR		5	/* TLB Read-Inhibit exception */
+#define EXCCODE_TLBNX		6	/* TLB Execution-Inhibit exception */
+#define EXCCODE_TLBPE		7	/* TLB Privilege Error */
+#define EXCCODE_ADE		8	/* Address Error */
+	#define EXSUBCODE_ADEF		0	/* Fetch Instruction */
+	#define EXSUBCODE_ADEM		1	/* Access Memory */
+#define EXCCODE_ALE		9	/* Unaligned Access */
+#define EXCCODE_OOB		10	/* Out of bounds */
+#define EXCCODE_SYS		11	/* System call */
+#define EXCCODE_BP		12	/* Breakpoint */
+#define EXCCODE_INE		13	/* Inst. Not Exist */
+#define EXCCODE_IPE		14	/* Inst. Privileged Error */
+#define EXCCODE_FPDIS		15	/* FPU Disabled */
+#define EXCCODE_LSXDIS		16	/* LSX Disabled */
+#define EXCCODE_LASXDIS		17	/* LASX Disabled */
+#define EXCCODE_FPE		18	/* Floating Point Exception */
+	#define EXCSUBCODE_FPE		0	/* Floating Point Exception */
+	#define EXCSUBCODE_VFPE		1	/* Vector Exception */
+#define EXCCODE_WATCH		19	/* Watch address reference */
+#define EXCCODE_BTDIS		20	/* Binary Trans. Disabled */
+#define EXCCODE_BTE		21	/* Binary Trans. Exception */
+#define EXCCODE_PSI		22	/* Guest Privileged Error */
+#define EXCCODE_HYP		23	/* Hypercall */
+#define EXCCODE_GCM		24	/* Guest CSR modified */
+	#define EXCSUBCODE_GCSC		0	/* Software caused */
+	#define EXCSUBCODE_GCHC		1	/* Hardware caused */
+#define EXCCODE_SE		25	/* Security */
+
+#define EXCCODE_INT_START   64
+#define EXCCODE_SIP0        64
+#define EXCCODE_SIP1        65
+#define EXCCODE_IP0         66
+#define EXCCODE_IP1         67
+#define EXCCODE_IP2         68
+#define EXCCODE_IP3         69
+#define EXCCODE_IP4         70
+#define EXCCODE_IP5         71
+#define EXCCODE_IP6         72
+#define EXCCODE_IP7         73
+#define EXCCODE_PMC         74 /* Performance Counter */
+#define EXCCODE_TIMER       75
+#define EXCCODE_IPI         76
+#define EXCCODE_NMI         77
+#define EXCCODE_INT_END     78
+#define EXCCODE_INT_NUM	    (EXCCODE_INT_END - EXCCODE_INT_START)
+
+/* FPU register names */
+#define LOONGARCH_FCSR0	$r0
+#define LOONGARCH_FCSR1	$r1
+#define LOONGARCH_FCSR2	$r2
+#define LOONGARCH_FCSR3	$r3
+
+/* FPU Status Register Values */
+#define FPU_CSR_RSVD	0xe0e0fce0
+
+/*
+ * X the exception cause indicator
+ * E the exception enable
+ * S the sticky/flag bit
+ */
+#define FPU_CSR_ALL_X	0x1f000000
+#define FPU_CSR_INV_X	0x10000000
+#define FPU_CSR_DIV_X	0x08000000
+#define FPU_CSR_OVF_X	0x04000000
+#define FPU_CSR_UDF_X	0x02000000
+#define FPU_CSR_INE_X	0x01000000
+
+#define FPU_CSR_ALL_S	0x001f0000
+#define FPU_CSR_INV_S	0x00100000
+#define FPU_CSR_DIV_S	0x00080000
+#define FPU_CSR_OVF_S	0x00040000
+#define FPU_CSR_UDF_S	0x00020000
+#define FPU_CSR_INE_S	0x00010000
+
+#define FPU_CSR_ALL_E	0x0000001f
+#define FPU_CSR_INV_E	0x00000010
+#define FPU_CSR_DIV_E	0x00000008
+#define FPU_CSR_OVF_E	0x00000004
+#define FPU_CSR_UDF_E	0x00000002
+#define FPU_CSR_INE_E	0x00000001
+
+/* Bits 8 and 9 of FPU Status Register specify the rounding mode */
+#define FPU_CSR_RM	0x300
+#define FPU_CSR_RN	0x000	/* nearest */
+#define FPU_CSR_RZ	0x100	/* towards zero */
+#define FPU_CSR_RU	0x200	/* towards +Infinity */
+#define FPU_CSR_RD	0x300	/* towards -Infinity */
+
+#define read_fcsr(source)	\
+({	\
+	unsigned int __res;	\
+\
+	__asm__ __volatile__(	\
+	"	movfcsr2gr	%0, "STR(source)"	\n"	\
+	: "=r" (__res));	\
+	__res;	\
+})
+
+#define write_fcsr(dest, val) \
+do {	\
+	__asm__ __volatile__(	\
+	"	movgr2fcsr	%0, "STR(dest)"	\n"	\
+	: : "r" (val));	\
+} while (0)
+
+#endif /* _ASM_LOONGARCH_H */
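
The header above consistently pairs each field with a `*_SHIFT` constant and a pre-shifted mask. As a rough illustration of how such pairs are meant to be consumed, here is a minimal user-space sketch. The helpers `csr_field_get()` and `csr_field_set()` are hypothetical names that do not appear in the patch, and `UINT64_C()` stands in for `_ULCAST_()`; only the three `MCSR9_L1D_*` shift and mask values are copied from the header.

```c
#include <stdint.h>

/* Local copies of definitions from the patch (UINT64_C replaces _ULCAST_). */
#define MCSR9_L1D_SIZE_SHIFT	24
#define MCSR9_L1D_SIZE		(UINT64_C(0x7f) << MCSR9_L1D_SIZE_SHIFT)
#define MCSR9_L1D_IDX_SHIFT	16
#define MCSR9_L1D_IDX		(UINT64_C(0xff) << MCSR9_L1D_IDX_SHIFT)
#define MCSR9_L1D_WAY_SHIFT	0
#define MCSR9_L1D_WAY		(UINT64_C(0xffff) << MCSR9_L1D_WAY_SHIFT)

/* Hypothetical helper: isolate a field with its mask, then right-align it. */
static inline uint64_t csr_field_get(uint64_t reg, uint64_t mask,
				     unsigned int shift)
{
	return (reg & mask) >> shift;
}

/*
 * Hypothetical helper: return reg with one field replaced by val.
 * Bits of val that do not fit in the field are dropped by the mask.
 */
static inline uint64_t csr_field_set(uint64_t reg, uint64_t mask,
				     unsigned int shift, uint64_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}
```

For a raw register value of 0x06080003, `csr_field_get()` with the SIZE, IDX and WAY pairs returns 6, 8 and 3 respectively. This sketch only covers the bit manipulation; reading the register itself still goes through the `csr_readq()` accessors the patch provides.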
diff --git a/arch/loongarch/include/asm/loongson.h b/arch/loongarch/include/asm/loongson.h
new file mode 100644
index 000000000000..4cefd393fd5c
--- /dev/null
+++ b/arch/loongarch/include/asm/loongson.h
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASM_LOONGSON_H
+#define __ASM_LOONGSON_H
+
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/pci.h>
+#include <asm/addrspace.h>
+#include <asm/boot_param.h>
+
+extern const struct plat_smp_ops loongson3_smp_ops;
+
+/* Loongson-specific command line, env and memory initialization */
+extern void __init fw_init_environ(void);
+extern void __init fw_init_memory(void);
+extern void __init fw_init_numa_memory(void);
+
+#define LOONGSON_REG(x) \
+	(*(volatile u32 *)((char *)TO_UNCAC(LOONGSON_REG_BASE) + (x)))
+
+#define LOONGSON_LIO_BASE	0x18000000
+#define LOONGSON_LIO_SIZE	0x00100000	/* 1M */
+#define LOONGSON_LIO_TOP	(LOONGSON_LIO_BASE+LOONGSON_LIO_SIZE-1)
+
+#define LOONGSON_BOOT_BASE	0x1c000000
+#define LOONGSON_BOOT_SIZE	0x02000000	/* 32M */
+#define LOONGSON_BOOT_TOP	(LOONGSON_BOOT_BASE+LOONGSON_BOOT_SIZE-1)
+
+#define LOONGSON_REG_BASE	0x1fe00000
+#define LOONGSON_REG_SIZE	0x00100000	/* 1M */
+#define LOONGSON_REG_TOP	(LOONGSON_REG_BASE+LOONGSON_REG_SIZE-1)
+
+/* GPIO Regs - r/w */
+
+#define LOONGSON_GPIODATA		LOONGSON_REG(0x11c)
+#define LOONGSON_GPIOIE			LOONGSON_REG(0x120)
+#define LOONGSON_REG_GPIO_BASE          (LOONGSON_REG_BASE + 0x11c)
+
+#define MAX_PACKAGES 16
+
+/* Chip Config register of each physical cpu package */
+extern u64 loongson_chipcfg[MAX_PACKAGES];
+#define LOONGSON_CHIPCFG(id) (*(volatile u32 *)(loongson_chipcfg[id]))
+
+/* Chip Temperature register of each physical cpu package */
+extern u64 loongson_chiptemp[MAX_PACKAGES];
+#define LOONGSON_CHIPTEMP(id) (*(volatile u32 *)(loongson_chiptemp[id]))
+
+/* Freq Control register of each physical cpu package */
+extern u64 loongson_freqctrl[MAX_PACKAGES];
+#define LOONGSON_FREQCTRL(id) (*(volatile u32 *)(loongson_freqctrl[id]))
+
+#define xconf_readl(addr) readl(addr)
+#define xconf_readq(addr) readq(addr)
+
+static inline void xconf_writel(u32 val, volatile void __iomem *addr)
+{
+	asm volatile (
+	"	st.w	%[v], %[hw], 0	\n"
+	"	ld.b	$r0, %[hw], 0	\n"
+	:
+	: [hw] "r" (addr), [v] "r" (val)
+	);
+}
+
+static inline void xconf_writeq(u64 val64, volatile void __iomem *addr)
+{
+	asm volatile (
+	"	st.d	%[v], %[hw], 0	\n"
+	"	ld.b	$r0, %[hw], 0	\n"
+	:
+	: [hw] "r" (addr),  [v] "r" (val64)
+	);
+}
+
+/* ============== LS7A registers =============== */
+#define LS7A_PCH_REG_BASE		0x10000000UL
+/* LPC regs */
+#define LS7A_LPC_REG_BASE		(LS7A_PCH_REG_BASE + 0x00002000)
+/* CHIPCFG regs */
+#define LS7A_CHIPCFG_REG_BASE		(LS7A_PCH_REG_BASE + 0x00010000)
+/* MISC reg base */
+#define LS7A_MISC_REG_BASE		(LS7A_PCH_REG_BASE + 0x00080000)
+/* ACPI regs */
+#define LS7A_ACPI_REG_BASE		(LS7A_MISC_REG_BASE + 0x00050000)
+/* RTC regs */
+#define LS7A_RTC_REG_BASE		(LS7A_MISC_REG_BASE + 0x00050100)
+
+#define LS7A_DMA_CFG			(volatile void *)TO_UNCAC(LS7A_CHIPCFG_REG_BASE + 0x041c)
+#define LS7A_DMA_NODE_SHF		8
+#define LS7A_DMA_NODE_MASK		0x1F00
+
+#define LS7A_INT_MASK_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x020)
+#define LS7A_INT_EDGE_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x060)
+#define LS7A_INT_CLEAR_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x080)
+#define LS7A_INT_HTMSI_EN_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x040)
+#define LS7A_INT_ROUTE_ENTRY_REG	(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x100)
+#define LS7A_INT_HTMSI_VEC_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x200)
+#define LS7A_INT_STATUS_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x3a0)
+#define LS7A_INT_POL_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x3e0)
+#define LS7A_LPC_INT_CTL		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2000)
+#define LS7A_LPC_INT_ENA		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2004)
+#define LS7A_LPC_INT_STS		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2008)
+#define LS7A_LPC_INT_CLR		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x200c)
+#define LS7A_LPC_INT_POL		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2010)
+
+#define LS7A_PMCON_SOC_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x000)
+#define LS7A_PMCON_RESUME_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x004)
+#define LS7A_PMCON_RTC_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x008)
+#define LS7A_PM1_EVT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x00c)
+#define LS7A_PM1_ENA_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x010)
+#define LS7A_PM1_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x014)
+#define LS7A_PM1_TMR_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x018)
+#define LS7A_P_CNT_REG			(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x01c)
+#define LS7A_GPE0_STS_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x028)
+#define LS7A_GPE0_ENA_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x02c)
+#define LS7A_RST_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x030)
+#define LS7A_WD_SET_REG			(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x034)
+#define LS7A_WD_TIMER_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x038)
+#define LS7A_THSENS_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x04c)
+#define LS7A_GEN_RTC_1_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x050)
+#define LS7A_GEN_RTC_2_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x054)
+#define LS7A_DPM_CFG_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x400)
+#define LS7A_DPM_STS_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x404)
+#define LS7A_DPM_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x408)
+
+typedef enum {
+	ACPI_PCI_HOTPLUG_STATUS	= 1 << 1,
+	ACPI_CPU_HOTPLUG_STATUS	= 1 << 2,
+	ACPI_MEM_HOTPLUG_STATUS	= 1 << 3,
+	ACPI_POWERBUTTON_STATUS	= 1 << 8,
+	ACPI_RTC_WAKE_STATUS	= 1 << 10,
+	ACPI_PCI_WAKE_STATUS	= 1 << 14,
+	ACPI_ANY_WAKE_STATUS	= 1 << 15,
+} AcpiEventStatusBits;
+
+#define HT1LO_OFFSET		0xe0000000000UL
+
+/* PCI Configuration Space Base */
+#define MCFG_EXT_PCICFG_BASE		0xefe00000000UL
+
+/* REG ACCESS */
+#define ls7a_readb(addr)			  (*(volatile unsigned char  *)TO_UNCAC(addr))
+#define ls7a_readw(addr)			  (*(volatile unsigned short *)TO_UNCAC(addr))
+#define ls7a_readl(addr)			  (*(volatile unsigned int   *)TO_UNCAC(addr))
+#define ls7a_readq(addr)			  (*(volatile unsigned long  *)TO_UNCAC(addr))
+#define ls7a_writeb(val, addr)		(*(volatile unsigned char  *)TO_UNCAC(addr) = (val))
+#define ls7a_writew(val, addr)		(*(volatile unsigned short *)TO_UNCAC(addr) = (val))
+#define ls7a_writel(val, addr)		ls7a_write_type(val, addr, uint32_t)
+#define ls7a_writeq(val, addr)		ls7a_write_type(val, addr, uint64_t)
+#define ls7a_write(val, addr)		ls7a_write_type(val, addr, uint64_t)
+
+#endif /* __ASM_LOONGSON_H */
diff --git a/arch/loongarch/include/asm/regdef.h b/arch/loongarch/include/asm/regdef.h
new file mode 100644
index 000000000000..9f24f0c05fe3
--- /dev/null
+++ b/arch/loongarch/include/asm/regdef.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_REGDEF_H
+#define _ASM_REGDEF_H
+
+#define zero	$r0	/* wired zero */
+#define ra	$r1	/* return address */
+#define tp	$r2
+#define sp	$r3	/* stack pointer */
+#define v0	$r4	/* return value - caller saved */
+#define v1	$r5
+#define a0	$r4	/* argument registers */
+#define a1	$r5
+#define a2	$r6
+#define a3	$r7
+#define a4	$r8
+#define a5	$r9
+#define a6	$r10
+#define a7	$r11
+#define t0	$r12	/* caller saved */
+#define t1	$r13
+#define t2	$r14
+#define t3	$r15
+#define t4	$r16
+#define t5	$r17
+#define t6	$r18
+#define t7	$r19
+#define t8	$r20
+#define u0	$r21
+#define fp	$r22	/* frame pointer */
+#define s0	$r23	/* callee saved */
+#define s1	$r24
+#define s2	$r25
+#define s3	$r26
+#define s4	$r27
+#define s5	$r28
+#define s6	$r29
+#define s7	$r30
+#define s8	$r31
+
+#endif /* _ASM_REGDEF_H */
-- 
2.27.0



* [PATCH V9 07/24] LoongArch: Add atomic/locking headers
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (5 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 06/24] LoongArch: Add CPU definition headers Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-05-01 11:16   ` WANG Xuerui
  2022-04-30  9:05 ` [PATCH V9 08/24] LoongArch: Add other common headers Huacai Chen
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds common headers (atomic, bitops, barrier and locking)
for basic LoongArch support.

LoongArch currently has no native sub-word xchg/cmpxchg instructions,
but LoongArch-based CPUs support NUMA (e.g., the quad-core
Loongson-3A5000 supports up to 16 nodes, 64 cores in total), where
qspinlock/qrwlock scale better than ticket locks. So we emulate
sub-word xchg/cmpxchg in software, which the generic qspinlock code
needs, and use qspinlock/qrwlock rather than ticket locks.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
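
For illustration only, not part of this patch: the software emulation
of sub-word cmpxchg mentioned above can be built on a word-sized
compare-and-exchange. The sketch below is a user-space approximation,
not the kernel's __cmpxchg_small(). The name cmpxchg_small_sketch is
made up, it uses the GCC/Clang __atomic builtins instead of ll.w/sc.w,
and it assumes little-endian byte order and an object that does not
cross a 4-byte boundary.

```c
#include <stdint.h>

/*
 * Hypothetical helper: emulate a 1- or 2-byte compare-and-exchange with a
 * 4-byte one. Returns the value the addressed bytes held before the call.
 */
static uint32_t cmpxchg_small_sketch(volatile void *ptr, uint32_t old_val,
				     uint32_t new_val, unsigned int size)
{
	uintptr_t addr = (uintptr_t)ptr;
	/* Containing aligned 32-bit word, and where our bytes sit inside it. */
	volatile uint32_t *word = (volatile uint32_t *)(addr & ~(uintptr_t)3);
	unsigned int shift = (unsigned int)(addr & 3) * 8;
	uint32_t mask = ((size == 1) ? 0xffu : 0xffffu) << shift;
	uint32_t cur, want;

	old_val = (old_val << shift) & mask;
	new_val = (new_val << shift) & mask;

	cur = __atomic_load_n(word, __ATOMIC_RELAXED);
	for (;;) {
		/* Give up if the addressed bytes do not hold the expected value. */
		if ((cur & mask) != old_val)
			break;
		/* Splice the new bytes into the otherwise unchanged word. */
		want = (cur & ~mask) | new_val;
		/* On failure, cur is refreshed with the current word contents. */
		if (__atomic_compare_exchange_n(word, &cur, want, 0,
						__ATOMIC_SEQ_CST,
						__ATOMIC_RELAXED))
			break;
	}

	return (cur & mask) >> shift;
}
```

With a word holding 0x44332211, exchanging the byte at offset 1 from
0x22 to 0xaa returns 0x22 and leaves 0x4433aa11. The neighbouring bytes
are only rewritten with the values they already had, which is why the
retry loop is needed when another CPU changes them concurrently.
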
 arch/loongarch/include/asm/atomic.h         | 358 ++++++++++++++++++++
 arch/loongarch/include/asm/barrier.h        |  51 +++
 arch/loongarch/include/asm/bitops.h         |  33 ++
 arch/loongarch/include/asm/bitrev.h         |  34 ++
 arch/loongarch/include/asm/cmpxchg.h        | 135 ++++++++
 arch/loongarch/include/asm/local.h          | 138 ++++++++
 arch/loongarch/include/asm/percpu.h         |  20 ++
 arch/loongarch/include/asm/spinlock.h       |  12 +
 arch/loongarch/include/asm/spinlock_types.h |  11 +
 9 files changed, 792 insertions(+)
 create mode 100644 arch/loongarch/include/asm/atomic.h
 create mode 100644 arch/loongarch/include/asm/barrier.h
 create mode 100644 arch/loongarch/include/asm/bitops.h
 create mode 100644 arch/loongarch/include/asm/bitrev.h
 create mode 100644 arch/loongarch/include/asm/cmpxchg.h
 create mode 100644 arch/loongarch/include/asm/local.h
 create mode 100644 arch/loongarch/include/asm/percpu.h
 create mode 100644 arch/loongarch/include/asm/spinlock.h
 create mode 100644 arch/loongarch/include/asm/spinlock_types.h
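
For illustration only, not part of this patch: the contract of
arch_atomic_sub_if_positive() in atomic.h below is easy to misread,
because the return value is the old value minus @i even when nothing is
stored. A portable C11 sketch of the same contract follows.
sub_if_positive_sketch is a made-up name, a compare-and-exchange loop
stands in for the ll.w/sc.w sequence, and memory ordering is
deliberately left relaxed rather than modeled.

```c
#include <stdatomic.h>

/*
 * Hypothetical user-space sketch: store old - i only if that result is
 * non-negative, and return old - i either way. Signed overflow of
 * old - i is not handled here.
 */
static int sub_if_positive_sketch(int i, atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);
	int result;

	do {
		result = old - i;
		if (result < 0)
			break;	/* would go negative: leave *v untouched */
		/* On failure, old is reloaded and the loop retries. */
	} while (!atomic_compare_exchange_weak_explicit(v, &old, result,
							memory_order_relaxed,
							memory_order_relaxed));

	return result;
}
```

A caller such as arch_atomic_dec_if_positive() therefore tests the
return value: a negative result means the counter was left unchanged.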

diff --git a/arch/loongarch/include/asm/atomic.h b/arch/loongarch/include/asm/atomic.h
new file mode 100644
index 000000000000..f0ed7f9c08c9
--- /dev/null
+++ b/arch/loongarch/include/asm/atomic.h
@@ -0,0 +1,358 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Atomic operations.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_ATOMIC_H
+#define _ASM_ATOMIC_H
+
+#include <linux/types.h>
+#include <asm/barrier.h>
+#include <asm/cmpxchg.h>
+#include <asm/compiler.h>
+
+#if _LOONGARCH_SZLONG == 32
+#define __LL		"ll.w	"
+#define __SC		"sc.w	"
+#define __AMADD		"amadd.w	"
+#define __AMAND_SYNC	"amand_db.w	"
+#define __AMOR_SYNC	"amor_db.w	"
+#define __AMXOR_SYNC	"amxor_db.w	"
+#elif _LOONGARCH_SZLONG == 64
+#define __LL		"ll.d	"
+#define __SC		"sc.d	"
+#define __AMADD		"amadd.d	"
+#define __AMAND_SYNC	"amand_db.d	"
+#define __AMOR_SYNC	"amor_db.d	"
+#define __AMXOR_SYNC	"amxor_db.d	"
+#endif
+
+#define ATOMIC_INIT(i)	  { (i) }
+
+/*
+ * arch_atomic_read - read atomic variable
+ * @v: pointer of type atomic_t
+ *
+ * Atomically reads the value of @v.
+ */
+#define arch_atomic_read(v)	READ_ONCE((v)->counter)
+
+/*
+ * arch_atomic_set - set atomic variable
+ * @v: pointer of type atomic_t
+ * @i: required value
+ *
+ * Atomically sets the value of @v to @i.
+ */
+#define arch_atomic_set(v, i)	WRITE_ONCE((v)->counter, (i))
+
+#define ATOMIC_OP(op, I, asm_op)					\
+static inline void arch_atomic_##op(int i, atomic_t *v)			\
+{									\
+	__asm__ __volatile__(						\
+	"am"#asm_op"_db.w" " $zero, %1, %0	\n"			\
+	: "+ZB" (v->counter)						\
+	: "r" (I)							\
+	: "memory");							\
+}
+
+#define ATOMIC_OP_RETURN(op, I, asm_op, c_op)				\
+static inline int arch_atomic_##op##_return_relaxed(int i, atomic_t *v)	\
+{									\
+	int result;							\
+									\
+	__asm__ __volatile__(						\
+	"am"#asm_op"_db.w" " %1, %2, %0		\n"			\
+	: "+ZB" (v->counter), "=&r" (result)				\
+	: "r" (I)							\
+	: "memory");							\
+									\
+	return result c_op I;						\
+}
+
+#define ATOMIC_FETCH_OP(op, I, asm_op)					\
+static inline int arch_atomic_fetch_##op##_relaxed(int i, atomic_t *v)	\
+{									\
+	int result;							\
+									\
+	__asm__ __volatile__(						\
+	"am"#asm_op"_db.w" " %1, %2, %0		\n"			\
+	: "+ZB" (v->counter), "=&r" (result)				\
+	: "r" (I)							\
+	: "memory");							\
+									\
+	return result;							\
+}
+
+#define ATOMIC_OPS(op, I, asm_op, c_op)					\
+	ATOMIC_OP(op, I, asm_op)					\
+	ATOMIC_OP_RETURN(op, I, asm_op, c_op)				\
+	ATOMIC_FETCH_OP(op, I, asm_op)
+
+ATOMIC_OPS(add, i, add, +)
+ATOMIC_OPS(sub, -i, add, +)
+
+#define arch_atomic_add_return_relaxed	arch_atomic_add_return_relaxed
+#define arch_atomic_sub_return_relaxed	arch_atomic_sub_return_relaxed
+#define arch_atomic_fetch_add_relaxed	arch_atomic_fetch_add_relaxed
+#define arch_atomic_fetch_sub_relaxed	arch_atomic_fetch_sub_relaxed
+
+#undef ATOMIC_OPS
+
+#define ATOMIC_OPS(op, I, asm_op)					\
+	ATOMIC_OP(op, I, asm_op)					\
+	ATOMIC_FETCH_OP(op, I, asm_op)
+
+ATOMIC_OPS(and, i, and)
+ATOMIC_OPS(or, i, or)
+ATOMIC_OPS(xor, i, xor)
+
+#define arch_atomic_fetch_and_relaxed	arch_atomic_fetch_and_relaxed
+#define arch_atomic_fetch_or_relaxed	arch_atomic_fetch_or_relaxed
+#define arch_atomic_fetch_xor_relaxed	arch_atomic_fetch_xor_relaxed
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_OP_RETURN
+#undef ATOMIC_OP
+
+static inline int arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
+{
+	int prev, rc;
+
+	__asm__ __volatile__ (
+		"0:	ll.w	%[p],  %[c]\n"
+		"	beq	%[p],  %[u], 1f\n"
+		"	add.w	%[rc], %[p], %[a]\n"
+		"	sc.w	%[rc], %[c]\n"
+		"	beqz	%[rc], 0b\n"
+		"	b	2f\n"
+		"1:\n"
+		__WEAK_LLSC_MB
+		"2:\n"
+		: [p]"=&r" (prev), [rc]"=&r" (rc),
+		  [c]"=ZB" (v->counter)
+		: [a]"r" (a), [u]"r" (u)
+		: "memory");
+
+	return prev;
+}
+#define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
+
+/*
+ * arch_atomic_sub_if_positive - conditionally subtract integer from atomic variable
+ * @i: integer value to subtract
+ * @v: pointer of type atomic_t
+ *
+ * Atomically test @v and subtract @i if @v is greater than or equal to @i.
+ * Returns the old value of @v minus @i, whether or not it was stored.
+ */
+static inline int arch_atomic_sub_if_positive(int i, atomic_t *v)
+{
+	int result;
+	int temp;
+
+	if (__builtin_constant_p(i)) {
+		__asm__ __volatile__(
+		"1:	ll.w	%1, %2		# atomic_sub_if_positive\n"
+		"	addi.w	%0, %1, %3				\n"
+		"	or	%1, %0, $zero				\n"
+		"	blt	%0, $zero, 2f				\n"
+		"	sc.w	%1, %2					\n"
+		"	beq	$zero, %1, 1b				\n"
+		"2:							\n"
+		: "=&r" (result), "=&r" (temp),
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)
+		: "I" (-i));
+	} else {
+		__asm__ __volatile__(
+		"1:	ll.w	%1, %2		# atomic_sub_if_positive\n"
+		"	sub.w	%0, %1, %3				\n"
+		"	or	%1, %0, $zero				\n"
+		"	blt	%0, $zero, 2f				\n"
+		"	sc.w	%1, %2					\n"
+		"	beq	$zero, %1, 1b				\n"
+		"2:							\n"
+		: "=&r" (result), "=&r" (temp),
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)
+		: "r" (i));
+	}
+
+	return result;
+}
+
+#define arch_atomic_cmpxchg(v, o, n) (arch_cmpxchg(&((v)->counter), (o), (n)))
+#define arch_atomic_xchg(v, new) (arch_xchg(&((v)->counter), (new)))
+
+/*
+ * arch_atomic_dec_if_positive - decrement by 1 if old value positive
+ * @v: pointer of type atomic_t
+ */
+#define arch_atomic_dec_if_positive(v)	arch_atomic_sub_if_positive(1, v)
+
+#ifdef CONFIG_64BIT
+
+#define ATOMIC64_INIT(i)    { (i) }
+
+/*
+ * arch_atomic64_read - read atomic variable
+ * @v: pointer of type atomic64_t
+ *
+ */
+#define arch_atomic64_read(v)	READ_ONCE((v)->counter)
+
+/*
+ * arch_atomic64_set - set atomic variable
+ * @v: pointer of type atomic64_t
+ * @i: required value
+ */
+#define arch_atomic64_set(v, i)	WRITE_ONCE((v)->counter, (i))
+
+#define ATOMIC64_OP(op, I, asm_op)					\
+static inline void arch_atomic64_##op(long i, atomic64_t *v)		\
+{									\
+	__asm__ __volatile__(						\
+	"am"#asm_op"_db.d " " $zero, %1, %0	\n"			\
+	: "+ZB" (v->counter)						\
+	: "r" (I)							\
+	: "memory");							\
+}
+
+#define ATOMIC64_OP_RETURN(op, I, asm_op, c_op)					\
+static inline long arch_atomic64_##op##_return_relaxed(long i, atomic64_t *v)	\
+{										\
+	long result;								\
+	__asm__ __volatile__(							\
+	"am"#asm_op"_db.d " " %1, %2, %0		\n"			\
+	: "+ZB" (v->counter), "=&r" (result)					\
+	: "r" (I)								\
+	: "memory");								\
+										\
+	return result c_op I;							\
+}
+
+#define ATOMIC64_FETCH_OP(op, I, asm_op)					\
+static inline long arch_atomic64_fetch_##op##_relaxed(long i, atomic64_t *v)	\
+{										\
+	long result;								\
+										\
+	__asm__ __volatile__(							\
+	"am"#asm_op"_db.d " " %1, %2, %0		\n"			\
+	: "+ZB" (v->counter), "=&r" (result)					\
+	: "r" (I)								\
+	: "memory");								\
+										\
+	return result;								\
+}
+
+#define ATOMIC64_OPS(op, I, asm_op, c_op)				      \
+	ATOMIC64_OP(op, I, asm_op)					      \
+	ATOMIC64_OP_RETURN(op, I, asm_op, c_op)				      \
+	ATOMIC64_FETCH_OP(op, I, asm_op)
+
+ATOMIC64_OPS(add, i, add, +)
+ATOMIC64_OPS(sub, -i, add, +)
+
+#define arch_atomic64_add_return_relaxed	arch_atomic64_add_return_relaxed
+#define arch_atomic64_sub_return_relaxed	arch_atomic64_sub_return_relaxed
+#define arch_atomic64_fetch_add_relaxed		arch_atomic64_fetch_add_relaxed
+#define arch_atomic64_fetch_sub_relaxed		arch_atomic64_fetch_sub_relaxed
+
+#undef ATOMIC64_OPS
+
+#define ATOMIC64_OPS(op, I, asm_op)					      \
+	ATOMIC64_OP(op, I, asm_op)					      \
+	ATOMIC64_FETCH_OP(op, I, asm_op)
+
+ATOMIC64_OPS(and, i, and)
+ATOMIC64_OPS(or, i, or)
+ATOMIC64_OPS(xor, i, xor)
+
+#define arch_atomic64_fetch_and_relaxed	arch_atomic64_fetch_and_relaxed
+#define arch_atomic64_fetch_or_relaxed	arch_atomic64_fetch_or_relaxed
+#define arch_atomic64_fetch_xor_relaxed	arch_atomic64_fetch_xor_relaxed
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
+#undef ATOMIC64_OP_RETURN
+#undef ATOMIC64_OP
+
+static inline long arch_atomic64_fetch_add_unless(atomic64_t *v, long a, long u)
+{
+	long prev, rc;
+
+	__asm__ __volatile__ (
+		"0:	ll.d	%[p],  %[c]\n"
+		"	beq	%[p],  %[u], 1f\n"
+		"	add.d	%[rc], %[p], %[a]\n"
+		"	sc.d	%[rc], %[c]\n"
+		"	beqz	%[rc], 0b\n"
+		"	b	2f\n"
+		"1:\n"
+		__WEAK_LLSC_MB
+		"2:\n"
+		: [p]"=&r" (prev), [rc]"=&r" (rc),
+		  [c] "=ZB" (v->counter)
+		: [a]"r" (a), [u]"r" (u)
+		: "memory");
+
+	return prev;
+}
+#define arch_atomic64_fetch_add_unless arch_atomic64_fetch_add_unless
+
+/*
+ * arch_atomic64_sub_if_positive - conditionally subtract integer from atomic variable
+ * @i: integer value to subtract
+ * @v: pointer of type atomic64_t
+ *
+ * Atomically test @v and subtract @i if @v is greater than or equal to @i.
+ * Returns the old value of @v minus @i, whether or not it was stored.
+ */
+static inline long arch_atomic64_sub_if_positive(long i, atomic64_t *v)
+{
+	long result;
+	long temp;
+
+	if (__builtin_constant_p(i)) {
+		__asm__ __volatile__(
+		"1:	ll.d	%1, %2	# atomic64_sub_if_positive	\n"
+		"	addi.d	%0, %1, %3				\n"
+		"	or	%1, %0, $zero				\n"
+		"	blt	%0, $zero, 2f				\n"
+		"	sc.d	%1, %2					\n"
+		"	beq	%1, $zero, 1b				\n"
+		"2:							\n"
+		: "=&r" (result), "=&r" (temp),
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)
+		: "I" (-i));
+	} else {
+		__asm__ __volatile__(
+		"1:	ll.d	%1, %2	# atomic64_sub_if_positive	\n"
+		"	sub.d	%0, %1, %3				\n"
+		"	or	%1, %0, $zero				\n"
+		"	blt	%0, $zero, 2f				\n"
+		"	sc.d	%1, %2					\n"
+		"	beq	%1, $zero, 1b				\n"
+		"2:							\n"
+		: "=&r" (result), "=&r" (temp),
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)
+		: "r" (i));
+	}
+
+	return result;
+}
+
+#define arch_atomic64_cmpxchg(v, o, n) \
+	((__typeof__((v)->counter))arch_cmpxchg(&((v)->counter), (o), (n)))
+#define arch_atomic64_xchg(v, new) (arch_xchg(&((v)->counter), (new)))
+
+/*
+ * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
+ * @v: pointer of type atomic64_t
+ */
+#define arch_atomic64_dec_if_positive(v)	arch_atomic64_sub_if_positive(1, v)
+
+#endif /* CONFIG_64BIT */
+
+#endif /* _ASM_ATOMIC_H */
diff --git a/arch/loongarch/include/asm/barrier.h b/arch/loongarch/include/asm/barrier.h
new file mode 100644
index 000000000000..cc6c7e3f5ce6
--- /dev/null
+++ b/arch/loongarch/include/asm/barrier.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_BARRIER_H
+#define __ASM_BARRIER_H
+
+#define __sync()	__asm__ __volatile__("dbar 0" : : : "memory")
+
+#define fast_wmb()	__sync()
+#define fast_rmb()	__sync()
+#define fast_mb()	__sync()
+#define fast_iob()	__sync()
+#define wbflush()	__sync()
+
+#define wmb()		fast_wmb()
+#define rmb()		fast_rmb()
+#define mb()		fast_mb()
+#define iob()		fast_iob()
+
+/**
+ * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
+ * @index: array element index
+ * @size: number of elements in array
+ *
+ * Returns:
+ *     0 - (@index < @size)
+ */
+#define array_index_mask_nospec array_index_mask_nospec
+static inline unsigned long array_index_mask_nospec(unsigned long index,
+						    unsigned long size)
+{
+	unsigned long mask;
+
+	__asm__ __volatile__(
+		"sltu	%0, %1, %2\n\t"
+#if (_LOONGARCH_SZLONG == 32)
+		"sub.w	%0, $r0, %0\n\t"
+#elif (_LOONGARCH_SZLONG == 64)
+		"sub.d	%0, $r0, %0\n\t"
+#endif
+		: "=r" (mask)
+		: "r" (index), "r" (size)
+		:);
+
+	return mask;
+}
+
+#include <asm-generic/barrier.h>
+
+#endif /* __ASM_BARRIER_H */
diff --git a/arch/loongarch/include/asm/bitops.h b/arch/loongarch/include/asm/bitops.h
new file mode 100644
index 000000000000..69e00f8d8034
--- /dev/null
+++ b/arch/loongarch/include/asm/bitops.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_BITOPS_H
+#define _ASM_BITOPS_H
+
+#include <linux/compiler.h>
+
+#ifndef _LINUX_BITOPS_H
+#error only <linux/bitops.h> can be included directly
+#endif
+
+#include <asm/barrier.h>
+
+#include <asm-generic/bitops/builtin-ffs.h>
+#include <asm-generic/bitops/builtin-fls.h>
+#include <asm-generic/bitops/builtin-__ffs.h>
+#include <asm-generic/bitops/builtin-__fls.h>
+
+#include <asm-generic/bitops/ffz.h>
+#include <asm-generic/bitops/fls64.h>
+
+#include <asm-generic/bitops/sched.h>
+#include <asm-generic/bitops/hweight.h>
+
+#include <asm-generic/bitops/atomic.h>
+#include <asm-generic/bitops/non-atomic.h>
+#include <asm-generic/bitops/lock.h>
+#include <asm-generic/bitops/le.h>
+#include <asm-generic/bitops/ext2-atomic.h>
+
+#endif /* _ASM_BITOPS_H */
diff --git a/arch/loongarch/include/asm/bitrev.h b/arch/loongarch/include/asm/bitrev.h
new file mode 100644
index 000000000000..46f275b9cdf7
--- /dev/null
+++ b/arch/loongarch/include/asm/bitrev.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __LOONGARCH_ASM_BITREV_H__
+#define __LOONGARCH_ASM_BITREV_H__
+
+#include <linux/swab.h>
+
+static __always_inline __attribute_const__ u32 __arch_bitrev32(u32 x)
+{
+	u32 ret;
+
+	asm("bitrev.4b	%0, %1" : "=r"(ret) : "r"(__swab32(x)));
+	return ret;
+}
+
+static __always_inline __attribute_const__ u16 __arch_bitrev16(u16 x)
+{
+	u16 ret;
+
+	asm("bitrev.4b	%0, %1" : "=r"(ret) : "r"(__swab16(x)));
+	return ret;
+}
+
+static __always_inline __attribute_const__ u8 __arch_bitrev8(u8 x)
+{
+	u8 ret;
+
+	asm("bitrev.4b	%0, %1" : "=r"(ret) : "r"(x));
+	return ret;
+}
+
+#endif /* __LOONGARCH_ASM_BITREV_H__ */
diff --git a/arch/loongarch/include/asm/cmpxchg.h b/arch/loongarch/include/asm/cmpxchg.h
new file mode 100644
index 000000000000..69c3e2b7827d
--- /dev/null
+++ b/arch/loongarch/include/asm/cmpxchg.h
@@ -0,0 +1,135 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_CMPXCHG_H
+#define __ASM_CMPXCHG_H
+
+#include <linux/build_bug.h>
+
+#define __xchg_asm(amswap_db, m, val)		\
+({						\
+		__typeof(val) __ret;		\
+						\
+		__asm__ __volatile__ (		\
+		" "amswap_db" %1, %z2, %0 \n"	\
+		: "+ZB" (*m), "=&r" (__ret)	\
+		: "Jr" (val)			\
+		: "memory");			\
+						\
+		__ret;				\
+})
+
+extern unsigned long __xchg_small(volatile void *ptr, unsigned long x,
+				  unsigned int size);
+
+static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
+				   int size)
+{
+	switch (size) {
+	case 1:
+	case 2:
+		return __xchg_small(ptr, x, size);
+
+	case 4:
+		return __xchg_asm("amswap_db.w", (volatile u32 *)ptr, (u32)x);
+
+	case 8:
+		return __xchg_asm("amswap_db.d", (volatile u64 *)ptr, (u64)x);
+
+	default:
+		BUILD_BUG();
+	}
+
+	return 0;
+}
+
+#define arch_xchg(ptr, x)						\
+({									\
+	__typeof__(*(ptr)) __res;					\
+									\
+	__res = (__typeof__(*(ptr)))					\
+		__xchg((ptr), (unsigned long)(x), sizeof(*(ptr)));	\
+									\
+	__res;								\
+})
+
+#define __cmpxchg_asm(ld, st, m, old, new)				\
+({									\
+	__typeof(old) __ret;						\
+									\
+	__asm__ __volatile__(						\
+	"1:	" ld "	%0, %2		# __cmpxchg_asm \n"		\
+	"	bne	%0, %z3, 2f			\n"		\
+	"	or	$t0, %z4, $zero			\n"		\
+	"	" st "	$t0, %1				\n"		\
+	"	beq	$zero, $t0, 1b			\n"		\
+	"2:						\n"		\
+	: "=&r" (__ret), "=ZB"(*m)					\
+	: "ZB"(*m), "Jr" (old), "Jr" (new)				\
+	: "t0", "memory");						\
+									\
+	__ret;								\
+})
+
+extern unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
+				     unsigned long new, unsigned int size);
+
+static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
+				      unsigned long new, unsigned int size)
+{
+	switch (size) {
+	case 1:
+	case 2:
+		return __cmpxchg_small(ptr, old, new, size);
+
+	case 4:
+		return __cmpxchg_asm("ll.w", "sc.w", (volatile u32 *)ptr,
+				     (u32)old, new);
+
+	case 8:
+		return __cmpxchg_asm("ll.d", "sc.d", (volatile u64 *)ptr,
+				     (u64)old, new);
+
+	default:
+		BUILD_BUG();
+	}
+
+	return 0;
+}
+
+#define arch_cmpxchg_local(ptr, old, new)				\
+	((__typeof__(*(ptr)))						\
+		__cmpxchg((ptr),					\
+			  (unsigned long)(__typeof__(*(ptr)))(old),	\
+			  (unsigned long)(__typeof__(*(ptr)))(new),	\
+			  sizeof(*(ptr))))
+
+#define arch_cmpxchg(ptr, old, new)					\
+({									\
+	__typeof__(*(ptr)) __res;					\
+									\
+	__res = arch_cmpxchg_local((ptr), (old), (new));		\
+									\
+	__res;								\
+})
+
+#ifdef CONFIG_64BIT
+#define arch_cmpxchg64_local(ptr, o, n)					\
+  ({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg_local((ptr), (o), (n));				\
+  })
+
+#define arch_cmpxchg64(ptr, o, n)					\
+  ({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg((ptr), (o), (n));					\
+  })
+#else
+#include <asm-generic/cmpxchg-local.h>
+#define arch_cmpxchg64_local(ptr, o, n) __generic_cmpxchg64_local((ptr), (o), (n))
+#define arch_cmpxchg64(ptr, o, n) arch_cmpxchg64_local((ptr), (o), (n))
+#endif
+
+#endif /* __ASM_CMPXCHG_H */
diff --git a/arch/loongarch/include/asm/local.h b/arch/loongarch/include/asm/local.h
new file mode 100644
index 000000000000..2052a2267337
--- /dev/null
+++ b/arch/loongarch/include/asm/local.h
@@ -0,0 +1,138 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ARCH_LOONGARCH_LOCAL_H
+#define _ARCH_LOONGARCH_LOCAL_H
+
+#include <linux/percpu.h>
+#include <linux/bitops.h>
+#include <linux/atomic.h>
+#include <asm/cmpxchg.h>
+#include <asm/compiler.h>
+
+typedef struct {
+	atomic_long_t a;
+} local_t;
+
+#define LOCAL_INIT(i)	{ ATOMIC_LONG_INIT(i) }
+
+#define local_read(l)	atomic_long_read(&(l)->a)
+#define local_set(l, i) atomic_long_set(&(l)->a, (i))
+
+#define local_add(i, l) atomic_long_add((i), (&(l)->a))
+#define local_sub(i, l) atomic_long_sub((i), (&(l)->a))
+#define local_inc(l)	atomic_long_inc(&(l)->a)
+#define local_dec(l)	atomic_long_dec(&(l)->a)
+
+/*
+ * Same as above, but return the result value
+ */
+static inline long local_add_return(long i, local_t *l)
+{
+	unsigned long result;
+
+	__asm__ __volatile__(
+	"   " __AMADD " %1, %2, %0      \n"
+	: "+ZB" (l->a.counter), "=&r" (result)
+	: "r" (i)
+	: "memory");
+	result = result + i;
+
+	return result;
+}
+
+static inline long local_sub_return(long i, local_t *l)
+{
+	unsigned long result;
+
+	__asm__ __volatile__(
+	"   " __AMADD "%1, %2, %0       \n"
+	: "+ZB" (l->a.counter), "=&r" (result)
+	: "r" (-i)
+	: "memory");
+
+	result = result - i;
+
+	return result;
+}
+
+#define local_cmpxchg(l, o, n) \
+	((long)cmpxchg_local(&((l)->a.counter), (o), (n)))
+#define local_xchg(l, n) (atomic_long_xchg((&(l)->a), (n)))
+
+/**
+ * local_add_unless - add unless the number is a given value
+ * @l: pointer of type local_t
+ * @a: the amount to add to l...
+ * @u: ...unless l is equal to u.
+ *
+ * Atomically adds @a to @l, so long as it was not @u.
+ * Returns non-zero if @l was not @u, and zero otherwise.
+ */
+#define local_add_unless(l, a, u)				\
+({								\
+	long c, old;						\
+	c = local_read(l);					\
+	while (c != (u) && (old = local_cmpxchg((l), c, c + (a))) != c) \
+		c = old;					\
+	c != (u);						\
+})
+#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
+
+#define local_dec_return(l) local_sub_return(1, (l))
+#define local_inc_return(l) local_add_return(1, (l))
+
+/*
+ * local_sub_and_test - subtract value from variable and test result
+ * @i: integer value to subtract
+ * @l: pointer of type local_t
+ *
+ * Atomically subtracts @i from @l and returns
+ * true if the result is zero, or false for all
+ * other cases.
+ */
+#define local_sub_and_test(i, l) (local_sub_return((i), (l)) == 0)
+
+/*
+ * local_inc_and_test - increment and test
+ * @l: pointer of type local_t
+ *
+ * Atomically increments @l by 1
+ * and returns true if the result is zero, or false for all
+ * other cases.
+ */
+#define local_inc_and_test(l) (local_inc_return(l) == 0)
+
+/*
+ * local_dec_and_test - decrement by 1 and test
+ * @l: pointer of type local_t
+ *
+ * Atomically decrements @l by 1 and
+ * returns true if the result is 0, or false for all other
+ * cases.
+ */
+#define local_dec_and_test(l) (local_sub_return(1, (l)) == 0)
+
+/*
+ * local_add_negative - add and test if negative
+ * @l: pointer of type local_t
+ * @i: integer value to add
+ *
+ * Atomically adds @i to @l and returns true
+ * if the result is negative, or false when
+ * result is greater than or equal to zero.
+ */
+#define local_add_negative(i, l) (local_add_return(i, (l)) < 0)
+
+/* Use these for per-cpu local_t variables: on some archs they are
+ * much more efficient than these naive implementations.  Note they take
+ * a variable, not an address.
+ */
+
+#define __local_inc(l)		((l)->a.counter++)
+#define __local_dec(l)		((l)->a.counter--)
+#define __local_add(i, l)	((l)->a.counter += (i))
+#define __local_sub(i, l)	((l)->a.counter -= (i))
+
+#endif /* _ARCH_LOONGARCH_LOCAL_H */
diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
new file mode 100644
index 000000000000..7d5b22ebd834
--- /dev/null
+++ b/arch/loongarch/include/asm/percpu.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_PERCPU_H
+#define __ASM_PERCPU_H
+
+/* Use r21 for fast access */
+register unsigned long __my_cpu_offset __asm__("$r21");
+
+static inline void set_my_cpu_offset(unsigned long off)
+{
+	__my_cpu_offset = off;
+	csr_writeq(off, PERCPU_BASE_KS);
+}
+#define __my_cpu_offset __my_cpu_offset
+
+#include <asm-generic/percpu.h>
+
+#endif /* __ASM_PERCPU_H */
diff --git a/arch/loongarch/include/asm/spinlock.h b/arch/loongarch/include/asm/spinlock.h
new file mode 100644
index 000000000000..7cb3476999be
--- /dev/null
+++ b/arch/loongarch/include/asm/spinlock.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_SPINLOCK_H
+#define _ASM_SPINLOCK_H
+
+#include <asm/processor.h>
+#include <asm/qspinlock.h>
+#include <asm/qrwlock.h>
+
+#endif /* _ASM_SPINLOCK_H */
diff --git a/arch/loongarch/include/asm/spinlock_types.h b/arch/loongarch/include/asm/spinlock_types.h
new file mode 100644
index 000000000000..7458d036c161
--- /dev/null
+++ b/arch/loongarch/include/asm/spinlock_types.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_SPINLOCK_TYPES_H
+#define _ASM_SPINLOCK_TYPES_H
+
+#include <asm-generic/qspinlock_types.h>
+#include <asm-generic/qrwlock_types.h>
+
+#endif
-- 
2.27.0



* [PATCH V9 08/24] LoongArch: Add other common headers
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (6 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 07/24] LoongArch: Add atomic/locking headers Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-05-01 11:39   ` WANG Xuerui
  2022-04-30  9:05 ` [PATCH V9 09/24] LoongArch: Add boot and setup routines Huacai Chen
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

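This patch adds the remaining common headers for basic LoongArch
support, including assembler helpers (asm.h, asmmacro.h), instruction
definitions (inst.h), timekeeping (time.h, timex.h) and UAPI headers.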

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
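
For illustration only, not part of this patch: ALSZ and ALMASK in
asm.h below describe 16-byte stack alignment. What the mask does to an
address can be shown in C. stack_align_down is a hypothetical helper,
and the two macros are repeated here only so the fragment builds on its
own.

```c
#include <stdint.h>

#define ALSZ	0xf
#define ALMASK	~ALSZ

/* Hypothetical helper: round an address down to a 16-byte boundary. */
static uintptr_t stack_align_down(uintptr_t sp)
{
	/*
	 * ~0xf is the int value -16. Converting it to uintptr_t keeps all
	 * upper bits set, so only the low four bits of sp are cleared.
	 */
	return sp & ALMASK;
}
```

An already aligned address is returned unchanged, so the helper can be
applied unconditionally.
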
 arch/loongarch/include/asm/asm-prototypes.h   |   7 +
 arch/loongarch/include/asm/asm.h              | 190 +++++++++++
 arch/loongarch/include/asm/asmmacro.h         | 294 ++++++++++++++++++
 arch/loongarch/include/asm/clocksource.h      |  12 +
 arch/loongarch/include/asm/compiler.h         |  15 +
 arch/loongarch/include/asm/inst.h             |  63 ++++
 arch/loongarch/include/asm/linkage.h          |  36 +++
 arch/loongarch/include/asm/perf_event.h       |  10 +
 arch/loongarch/include/asm/prefetch.h         |  29 ++
 arch/loongarch/include/asm/serial.h           |  11 +
 arch/loongarch/include/asm/time.h             |  50 +++
 arch/loongarch/include/asm/timex.h            |  31 ++
 arch/loongarch/include/asm/topology.h         |  15 +
 arch/loongarch/include/asm/types.h            |  33 ++
 arch/loongarch/include/uapi/asm/bitfield.h    |  15 +
 arch/loongarch/include/uapi/asm/bitsperlong.h |   9 +
 arch/loongarch/include/uapi/asm/byteorder.h   |  13 +
 arch/loongarch/include/uapi/asm/inst.h        |  57 ++++
 arch/loongarch/include/uapi/asm/reg.h         |  59 ++++
 tools/include/uapi/asm/bitsperlong.h          |   2 +
 20 files changed, 951 insertions(+)
 create mode 100644 arch/loongarch/include/asm/asm-prototypes.h
 create mode 100644 arch/loongarch/include/asm/asm.h
 create mode 100644 arch/loongarch/include/asm/asmmacro.h
 create mode 100644 arch/loongarch/include/asm/clocksource.h
 create mode 100644 arch/loongarch/include/asm/compiler.h
 create mode 100644 arch/loongarch/include/asm/inst.h
 create mode 100644 arch/loongarch/include/asm/linkage.h
 create mode 100644 arch/loongarch/include/asm/perf_event.h
 create mode 100644 arch/loongarch/include/asm/prefetch.h
 create mode 100644 arch/loongarch/include/asm/serial.h
 create mode 100644 arch/loongarch/include/asm/time.h
 create mode 100644 arch/loongarch/include/asm/timex.h
 create mode 100644 arch/loongarch/include/asm/topology.h
 create mode 100644 arch/loongarch/include/asm/types.h
 create mode 100644 arch/loongarch/include/uapi/asm/bitfield.h
 create mode 100644 arch/loongarch/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/loongarch/include/uapi/asm/byteorder.h
 create mode 100644 arch/loongarch/include/uapi/asm/inst.h
 create mode 100644 arch/loongarch/include/uapi/asm/reg.h

diff --git a/arch/loongarch/include/asm/asm-prototypes.h b/arch/loongarch/include/asm/asm-prototypes.h
new file mode 100644
index 000000000000..ed06d3997420
--- /dev/null
+++ b/arch/loongarch/include/asm/asm-prototypes.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/uaccess.h>
+#include <asm/fpu.h>
+#include <asm/mmu_context.h>
+#include <asm/page.h>
+#include <asm/ftrace.h>
+#include <asm-generic/asm-prototypes.h>
diff --git a/arch/loongarch/include/asm/asm.h b/arch/loongarch/include/asm/asm.h
new file mode 100644
index 000000000000..6de8f9e6a21e
--- /dev/null
+++ b/arch/loongarch/include/asm/asm.h
@@ -0,0 +1,190 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Some useful macros for LoongArch assembler code
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1995, 1996, 1997, 1999, 2001 by Ralf Baechle
+ * Copyright (C) 1999 by Silicon Graphics, Inc.
+ * Copyright (C) 2001 MIPS Technologies, Inc.
+ * Copyright (C) 2002  Maciej W. Rozycki
+ */
+#ifndef __ASM_ASM_H
+#define __ASM_ASM_H
+
+/* LoongArch pref instruction. */
+#ifdef CONFIG_CPU_HAS_PREFETCH
+
+#define PREF(hint, addr, offs)				\
+		preld	hint, addr, offs;
+
+#define PREFX(hint, addr, index)			\
+		preldx	hint, addr, index;
+
+#else /* !CONFIG_CPU_HAS_PREFETCH */
+
+#define PREF(hint, addr, offs)
+#define PREFX(hint, addr, index)
+
+#endif /* !CONFIG_CPU_HAS_PREFETCH */
+
+/*
+ * Stack alignment
+ */
+#define ALSZ	0xf
+#define ALMASK	~ALSZ
+
+/*
+ * Macros to handle different pointer/register sizes for 32/64-bit code
+ */
+
+/*
+ * Size of a register
+ */
+#ifndef __loongarch64
+#define SZREG	4
+#else
+#define SZREG	8
+#endif
+
+/*
+ * Use the following macros in assembler code to load/store registers,
+ * pointers etc.
+ */
+#if (SZREG == 4)
+#define REG_L		ld.w
+#define REG_S		st.w
+#define REG_ADDU	add.w
+#define REG_SUBU	sub.w
+#else /* SZREG == 8 */
+#define REG_L		ld.d
+#define REG_S		st.d
+#define REG_ADDU	add.d
+#define REG_SUBU	sub.d
+#endif
+
+/*
+ * How to add/sub/load/store/shift C int variables.
+ */
+#if (_LOONGARCH_SZINT == 32)
+#define INT_ADDU	add.w
+#define INT_ADDIU	addi.w
+#define INT_SUBU	sub.w
+#define INT_L		ld.w
+#define INT_S		st.w
+#define INT_SLL		slli.w
+#define INT_SLLV	sll.w
+#define INT_SRL		srli.w
+#define INT_SRLV	srl.w
+#define INT_SRA		srai.w
+#define INT_SRAV	sra.w
+#endif
+
+#if (_LOONGARCH_SZINT == 64)
+#define INT_ADDU	add.d
+#define INT_ADDIU	addi.d
+#define INT_SUBU	sub.d
+#define INT_L		ld.d
+#define INT_S		st.d
+#define INT_SLL		slli.d
+#define INT_SLLV	sll.d
+#define INT_SRL		srli.d
+#define INT_SRLV	srl.d
+#define INT_SRA		srai.d
+#define INT_SRAV	sra.d
+#endif
+
+/*
+ * How to add/sub/load/store/shift C long variables.
+ */
+#if (_LOONGARCH_SZLONG == 32)
+#define LONG_ADDU	add.w
+#define LONG_ADDIU	addi.w
+#define LONG_SUBU	sub.w
+#define LONG_L		ld.w
+#define LONG_S		st.w
+#define LONG_SP		swp
+#define LONG_SLL	slli.w
+#define LONG_SLLV	sll.w
+#define LONG_SRL	srli.w
+#define LONG_SRLV	srl.w
+#define LONG_SRA	srai.w
+#define LONG_SRAV	sra.w
+
+#ifdef __ASSEMBLY__
+#define LONG		.word
+#endif
+#define LONGSIZE	4
+#define LONGMASK	3
+#define LONGLOG		2
+#endif
+
+#if (_LOONGARCH_SZLONG == 64)
+#define LONG_ADDU	add.d
+#define LONG_ADDIU	addi.d
+#define LONG_SUBU	sub.d
+#define LONG_L		ld.d
+#define LONG_S		st.d
+#define LONG_SP		sdp
+#define LONG_SLL	slli.d
+#define LONG_SLLV	sll.d
+#define LONG_SRL	srli.d
+#define LONG_SRLV	srl.d
+#define LONG_SRA	srai.d
+#define LONG_SRAV	sra.d
+
+#ifdef __ASSEMBLY__
+#define LONG		.dword
+#endif
+#define LONGSIZE	8
+#define LONGMASK	7
+#define LONGLOG		3
+#endif
+
+/*
+ * How to add/sub/load/store/shift pointers.
+ */
+#if (_LOONGARCH_SZPTR == 32)
+#define PTR_ADDU	add.w
+#define PTR_ADDIU	addi.w
+#define PTR_SUBU	sub.w
+#define PTR_L		ld.w
+#define PTR_S		st.w
+#define PTR_LI		li.w
+#define PTR_SLL		slli.w
+#define PTR_SLLV	sll.w
+#define PTR_SRL		srli.w
+#define PTR_SRLV	srl.w
+#define PTR_SRA		srai.w
+#define PTR_SRAV	sra.w
+
+#define PTR_SCALESHIFT	2
+
+#define PTR		.word
+#define PTRSIZE		4
+#define PTRLOG		2
+#endif
+
+#if (_LOONGARCH_SZPTR == 64)
+#define PTR_ADDU	add.d
+#define PTR_ADDIU	addi.d
+#define PTR_SUBU	sub.d
+#define PTR_L		ld.d
+#define PTR_S		st.d
+#define PTR_LI		li.d
+#define PTR_SLL		slli.d
+#define PTR_SLLV	sll.d
+#define PTR_SRL		srli.d
+#define PTR_SRLV	srl.d
+#define PTR_SRA		srai.d
+#define PTR_SRAV	sra.d
+
+#define PTR_SCALESHIFT	3
+
+#define PTR		.dword
+#define PTRSIZE		8
+#define PTRLOG		3
+#endif
+
+#endif /* __ASM_ASM_H */
diff --git a/arch/loongarch/include/asm/asmmacro.h b/arch/loongarch/include/asm/asmmacro.h
new file mode 100644
index 000000000000..d7089fab00e1
--- /dev/null
+++ b/arch/loongarch/include/asm/asmmacro.h
@@ -0,0 +1,294 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_ASMMACRO_H
+#define _ASM_ASMMACRO_H
+
+#include <asm/asm-offsets.h>
+#include <asm/regdef.h>
+#include <asm/fpregdef.h>
+#include <asm/loongarch.h>
+
+#undef v0
+#undef v1
+
+	.macro	parse_v var val
+	\var	= \val
+	.endm
+
+	.macro	parse_r var r
+	\var	= -1
+	.ifc	\r, $r0
+	\var	= 0
+	.endif
+	.ifc	\r, $r1
+	\var	= 1
+	.endif
+	.ifc	\r, $r2
+	\var	= 2
+	.endif
+	.ifc	\r, $r3
+	\var	= 3
+	.endif
+	.ifc	\r, $r4
+	\var	= 4
+	.endif
+	.ifc	\r, $r5
+	\var	= 5
+	.endif
+	.ifc	\r, $r6
+	\var	= 6
+	.endif
+	.ifc	\r, $r7
+	\var	= 7
+	.endif
+	.ifc	\r, $r8
+	\var	= 8
+	.endif
+	.ifc	\r, $r9
+	\var	= 9
+	.endif
+	.ifc	\r, $r10
+	\var	= 10
+	.endif
+	.ifc	\r, $r11
+	\var	= 11
+	.endif
+	.ifc	\r, $r12
+	\var	= 12
+	.endif
+	.ifc	\r, $r13
+	\var	= 13
+	.endif
+	.ifc	\r, $r14
+	\var	= 14
+	.endif
+	.ifc	\r, $r15
+	\var	= 15
+	.endif
+	.ifc	\r, $r16
+	\var	= 16
+	.endif
+	.ifc	\r, $r17
+	\var	= 17
+	.endif
+	.ifc	\r, $r18
+	\var	= 18
+	.endif
+	.ifc	\r, $r19
+	\var	= 19
+	.endif
+	.ifc	\r, $r20
+	\var	= 20
+	.endif
+	.ifc	\r, $r21
+	\var	= 21
+	.endif
+	.ifc	\r, $r22
+	\var	= 22
+	.endif
+	.ifc	\r, $r23
+	\var	= 23
+	.endif
+	.ifc	\r, $r24
+	\var	= 24
+	.endif
+	.ifc	\r, $r25
+	\var	= 25
+	.endif
+	.ifc	\r, $r26
+	\var	= 26
+	.endif
+	.ifc	\r, $r27
+	\var	= 27
+	.endif
+	.ifc	\r, $r28
+	\var	= 28
+	.endif
+	.ifc	\r, $r29
+	\var	= 29
+	.endif
+	.ifc	\r, $r30
+	\var	= 30
+	.endif
+	.ifc	\r, $r31
+	\var	= 31
+	.endif
+	.iflt	\var
+	.error	"Unable to parse register name \r"
+	.endif
+	.endm
+
+	.macro	cpu_save_nonscratch thread
+	stptr.d	s0, \thread, THREAD_REG23
+	stptr.d	s1, \thread, THREAD_REG24
+	stptr.d	s2, \thread, THREAD_REG25
+	stptr.d	s3, \thread, THREAD_REG26
+	stptr.d	s4, \thread, THREAD_REG27
+	stptr.d	s5, \thread, THREAD_REG28
+	stptr.d	s6, \thread, THREAD_REG29
+	stptr.d	s7, \thread, THREAD_REG30
+	stptr.d	s8, \thread, THREAD_REG31
+	stptr.d	sp, \thread, THREAD_REG03
+	stptr.d	fp, \thread, THREAD_REG22
+	.endm
+
+	.macro	cpu_restore_nonscratch thread
+	ldptr.d	s0, \thread, THREAD_REG23
+	ldptr.d	s1, \thread, THREAD_REG24
+	ldptr.d	s2, \thread, THREAD_REG25
+	ldptr.d	s3, \thread, THREAD_REG26
+	ldptr.d	s4, \thread, THREAD_REG27
+	ldptr.d	s5, \thread, THREAD_REG28
+	ldptr.d	s6, \thread, THREAD_REG29
+	ldptr.d	s7, \thread, THREAD_REG30
+	ldptr.d	s8, \thread, THREAD_REG31
+	ldptr.d	ra, \thread, THREAD_REG01
+	ldptr.d	sp, \thread, THREAD_REG03
+	ldptr.d	fp, \thread, THREAD_REG22
+	.endm
+
+	.macro fpu_save_csr thread tmp
+	movfcsr2gr	\tmp, fcsr0
+	stptr.w	\tmp, \thread, THREAD_FCSR
+	.endm
+
+	.macro fpu_restore_csr thread tmp
+	ldptr.w	\tmp, \thread, THREAD_FCSR
+	movgr2fcsr	fcsr0, \tmp
+	.endm
+
+	.macro fpu_save_cc thread tmp0 tmp1
+	movcf2gr	\tmp0, $fcc0
+	move	\tmp1, \tmp0
+	movcf2gr	\tmp0, $fcc1
+	bstrins.d	\tmp1, \tmp0, 15, 8
+	movcf2gr	\tmp0, $fcc2
+	bstrins.d	\tmp1, \tmp0, 23, 16
+	movcf2gr	\tmp0, $fcc3
+	bstrins.d	\tmp1, \tmp0, 31, 24
+	movcf2gr	\tmp0, $fcc4
+	bstrins.d	\tmp1, \tmp0, 39, 32
+	movcf2gr	\tmp0, $fcc5
+	bstrins.d	\tmp1, \tmp0, 47, 40
+	movcf2gr	\tmp0, $fcc6
+	bstrins.d	\tmp1, \tmp0, 55, 48
+	movcf2gr	\tmp0, $fcc7
+	bstrins.d	\tmp1, \tmp0, 63, 56
+	stptr.d		\tmp1, \thread, THREAD_FCC
+	.endm
+
+	.macro fpu_restore_cc thread tmp0 tmp1
+	ldptr.d	\tmp0, \thread, THREAD_FCC
+	bstrpick.d	\tmp1, \tmp0, 7, 0
+	movgr2cf	$fcc0, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 15, 8
+	movgr2cf	$fcc1, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 23, 16
+	movgr2cf	$fcc2, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 31, 24
+	movgr2cf	$fcc3, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 39, 32
+	movgr2cf	$fcc4, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 47, 40
+	movgr2cf	$fcc5, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 55, 48
+	movgr2cf	$fcc6, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 63, 56
+	movgr2cf	$fcc7, \tmp1
+	.endm
+
+	.macro	fpu_save_double thread tmp
+	li.w	\tmp, THREAD_FPR0
+	PTR_ADDU \tmp, \tmp, \thread
+	fst.d	$f0, \tmp, THREAD_FPR0  - THREAD_FPR0
+	fst.d	$f1, \tmp, THREAD_FPR1  - THREAD_FPR0
+	fst.d	$f2, \tmp, THREAD_FPR2  - THREAD_FPR0
+	fst.d	$f3, \tmp, THREAD_FPR3  - THREAD_FPR0
+	fst.d	$f4, \tmp, THREAD_FPR4  - THREAD_FPR0
+	fst.d	$f5, \tmp, THREAD_FPR5  - THREAD_FPR0
+	fst.d	$f6, \tmp, THREAD_FPR6  - THREAD_FPR0
+	fst.d	$f7, \tmp, THREAD_FPR7  - THREAD_FPR0
+	fst.d	$f8, \tmp, THREAD_FPR8  - THREAD_FPR0
+	fst.d	$f9, \tmp, THREAD_FPR9  - THREAD_FPR0
+	fst.d	$f10, \tmp, THREAD_FPR10 - THREAD_FPR0
+	fst.d	$f11, \tmp, THREAD_FPR11 - THREAD_FPR0
+	fst.d	$f12, \tmp, THREAD_FPR12 - THREAD_FPR0
+	fst.d	$f13, \tmp, THREAD_FPR13 - THREAD_FPR0
+	fst.d	$f14, \tmp, THREAD_FPR14 - THREAD_FPR0
+	fst.d	$f15, \tmp, THREAD_FPR15 - THREAD_FPR0
+	fst.d	$f16, \tmp, THREAD_FPR16 - THREAD_FPR0
+	fst.d	$f17, \tmp, THREAD_FPR17 - THREAD_FPR0
+	fst.d	$f18, \tmp, THREAD_FPR18 - THREAD_FPR0
+	fst.d	$f19, \tmp, THREAD_FPR19 - THREAD_FPR0
+	fst.d	$f20, \tmp, THREAD_FPR20 - THREAD_FPR0
+	fst.d	$f21, \tmp, THREAD_FPR21 - THREAD_FPR0
+	fst.d	$f22, \tmp, THREAD_FPR22 - THREAD_FPR0
+	fst.d	$f23, \tmp, THREAD_FPR23 - THREAD_FPR0
+	fst.d	$f24, \tmp, THREAD_FPR24 - THREAD_FPR0
+	fst.d	$f25, \tmp, THREAD_FPR25 - THREAD_FPR0
+	fst.d	$f26, \tmp, THREAD_FPR26 - THREAD_FPR0
+	fst.d	$f27, \tmp, THREAD_FPR27 - THREAD_FPR0
+	fst.d	$f28, \tmp, THREAD_FPR28 - THREAD_FPR0
+	fst.d	$f29, \tmp, THREAD_FPR29 - THREAD_FPR0
+	fst.d	$f30, \tmp, THREAD_FPR30 - THREAD_FPR0
+	fst.d	$f31, \tmp, THREAD_FPR31 - THREAD_FPR0
+	.endm
+
+	.macro	fpu_restore_double thread tmp
+	li.w	\tmp, THREAD_FPR0
+	PTR_ADDU \tmp, \tmp, \thread
+	fld.d	$f0, \tmp, THREAD_FPR0  - THREAD_FPR0
+	fld.d	$f1, \tmp, THREAD_FPR1  - THREAD_FPR0
+	fld.d	$f2, \tmp, THREAD_FPR2  - THREAD_FPR0
+	fld.d	$f3, \tmp, THREAD_FPR3  - THREAD_FPR0
+	fld.d	$f4, \tmp, THREAD_FPR4  - THREAD_FPR0
+	fld.d	$f5, \tmp, THREAD_FPR5  - THREAD_FPR0
+	fld.d	$f6, \tmp, THREAD_FPR6  - THREAD_FPR0
+	fld.d	$f7, \tmp, THREAD_FPR7  - THREAD_FPR0
+	fld.d	$f8, \tmp, THREAD_FPR8  - THREAD_FPR0
+	fld.d	$f9, \tmp, THREAD_FPR9  - THREAD_FPR0
+	fld.d	$f10, \tmp, THREAD_FPR10 - THREAD_FPR0
+	fld.d	$f11, \tmp, THREAD_FPR11 - THREAD_FPR0
+	fld.d	$f12, \tmp, THREAD_FPR12 - THREAD_FPR0
+	fld.d	$f13, \tmp, THREAD_FPR13 - THREAD_FPR0
+	fld.d	$f14, \tmp, THREAD_FPR14 - THREAD_FPR0
+	fld.d	$f15, \tmp, THREAD_FPR15 - THREAD_FPR0
+	fld.d	$f16, \tmp, THREAD_FPR16 - THREAD_FPR0
+	fld.d	$f17, \tmp, THREAD_FPR17 - THREAD_FPR0
+	fld.d	$f18, \tmp, THREAD_FPR18 - THREAD_FPR0
+	fld.d	$f19, \tmp, THREAD_FPR19 - THREAD_FPR0
+	fld.d	$f20, \tmp, THREAD_FPR20 - THREAD_FPR0
+	fld.d	$f21, \tmp, THREAD_FPR21 - THREAD_FPR0
+	fld.d	$f22, \tmp, THREAD_FPR22 - THREAD_FPR0
+	fld.d	$f23, \tmp, THREAD_FPR23 - THREAD_FPR0
+	fld.d	$f24, \tmp, THREAD_FPR24 - THREAD_FPR0
+	fld.d	$f25, \tmp, THREAD_FPR25 - THREAD_FPR0
+	fld.d	$f26, \tmp, THREAD_FPR26 - THREAD_FPR0
+	fld.d	$f27, \tmp, THREAD_FPR27 - THREAD_FPR0
+	fld.d	$f28, \tmp, THREAD_FPR28 - THREAD_FPR0
+	fld.d	$f29, \tmp, THREAD_FPR29 - THREAD_FPR0
+	fld.d	$f30, \tmp, THREAD_FPR30 - THREAD_FPR0
+	fld.d	$f31, \tmp, THREAD_FPR31 - THREAD_FPR0
+	.endm
+
+.macro not dst src
+	nor	\dst, \src, zero
+.endm
+
+.macro bgt r0 r1 label
+	blt	\r1, \r0, \label
+.endm
+
+.macro bltz r0 label
+	blt	\r0, zero, \label
+.endm
+
+.macro bgez r0 label
+	bge	\r0, zero, \label
+.endm
+
+#define v0 $r4
+#define v1 $r5
+#endif /* _ASM_ASMMACRO_H */
diff --git a/arch/loongarch/include/asm/clocksource.h b/arch/loongarch/include/asm/clocksource.h
new file mode 100644
index 000000000000..58e64aa05d26
--- /dev/null
+++ b/arch/loongarch/include/asm/clocksource.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASM_CLOCKSOURCE_H
+#define __ASM_CLOCKSOURCE_H
+
+#include <asm/vdso/clocksource.h>
+
+#endif /* __ASM_CLOCKSOURCE_H */
diff --git a/arch/loongarch/include/asm/compiler.h b/arch/loongarch/include/asm/compiler.h
new file mode 100644
index 000000000000..657cebe70ace
--- /dev/null
+++ b/arch/loongarch/include/asm/compiler.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_COMPILER_H
+#define _ASM_COMPILER_H
+
+#define GCC_OFF_SMALL_ASM() "ZC"
+
+#define LOONGARCH_ISA_LEVEL "loongarch"
+#define LOONGARCH_ISA_ARCH_LEVEL "arch=loongarch"
+#define LOONGARCH_ISA_LEVEL_RAW loongarch
+#define LOONGARCH_ISA_ARCH_LEVEL_RAW LOONGARCH_ISA_LEVEL_RAW
+
+#endif /* _ASM_COMPILER_H */
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
new file mode 100644
index 000000000000..46166ee1e33f
--- /dev/null
+++ b/arch/loongarch/include/asm/inst.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_INST_H
+#define _ASM_INST_H
+
+#include <linux/types.h>
+#include <asm/asm.h>
+#include <uapi/asm/inst.h>
+
+#define ADDR_IMMMASK_LU52ID	0xFFF0000000000000
+#define ADDR_IMMMASK_LU32ID	0x000FFFFF00000000
+#define ADDR_IMMMASK_ADDU16ID	0x00000000FFFF0000
+
+#define ADDR_IMMSHIFT_LU52ID	52
+#define ADDR_IMMSHIFT_LU32ID	32
+#define ADDR_IMMSHIFT_ADDU16ID	16
+
+#define ADDR_IMM(addr, INSN)	(((addr) & ADDR_IMMMASK_##INSN) >> ADDR_IMMSHIFT_##INSN)
+
+enum loongarch_gpr {
+	LOONGARCH_GPR_ZERO = 0,
+	LOONGARCH_GPR_RA = 1,
+	LOONGARCH_GPR_TP = 2,
+	LOONGARCH_GPR_SP = 3,
+	LOONGARCH_GPR_A0 = 4,
+	LOONGARCH_GPR_A1,
+	LOONGARCH_GPR_A2,
+	LOONGARCH_GPR_A3,
+	LOONGARCH_GPR_A4,
+	LOONGARCH_GPR_A5,
+	LOONGARCH_GPR_A6,
+	LOONGARCH_GPR_A7,
+	LOONGARCH_GPR_V0 = 4,
+	LOONGARCH_GPR_V1 = 5,
+	LOONGARCH_GPR_T0 = 12,
+	LOONGARCH_GPR_T1,
+	LOONGARCH_GPR_T2,
+	LOONGARCH_GPR_T3,
+	LOONGARCH_GPR_T4,
+	LOONGARCH_GPR_T5,
+	LOONGARCH_GPR_T6,
+	LOONGARCH_GPR_T7,
+	LOONGARCH_GPR_T8,
+	LOONGARCH_GPR_FP = 22,
+	LOONGARCH_GPR_S0 = 23,
+	LOONGARCH_GPR_S1,
+	LOONGARCH_GPR_S2,
+	LOONGARCH_GPR_S3,
+	LOONGARCH_GPR_S4,
+	LOONGARCH_GPR_S5,
+	LOONGARCH_GPR_S6,
+	LOONGARCH_GPR_S7,
+	LOONGARCH_GPR_S8,
+	LOONGARCH_GPR_MAX
+};
+
+u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
+u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
+u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, unsigned long pc, unsigned long dest);
+
+#endif /* _ASM_INST_H */
diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h
new file mode 100644
index 000000000000..283b3389b561
--- /dev/null
+++ b/arch/loongarch/include/asm/linkage.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_LINKAGE_H
+#define __ASM_LINKAGE_H
+
+#define __ALIGN		.align 2
+#define __ALIGN_STR	".align 2"
+
+#define SYM_FUNC_START(name)				\
+	SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)	\
+	.cfi_startproc;
+
+#define SYM_FUNC_START_NOALIGN(name)			\
+	SYM_START(name, SYM_L_GLOBAL, SYM_A_NONE)	\
+	.cfi_startproc;
+
+#define SYM_FUNC_START_LOCAL(name)			\
+	SYM_START(name, SYM_L_LOCAL, SYM_A_ALIGN)	\
+	.cfi_startproc;
+
+#define SYM_FUNC_START_LOCAL_NOALIGN(name)		\
+	SYM_START(name, SYM_L_LOCAL, SYM_A_NONE)	\
+	.cfi_startproc;
+
+#define SYM_FUNC_START_WEAK(name)			\
+	SYM_START(name, SYM_L_WEAK, SYM_A_ALIGN)	\
+	.cfi_startproc;
+
+#define SYM_FUNC_START_WEAK_NOALIGN(name)		\
+	SYM_START(name, SYM_L_WEAK, SYM_A_NONE)		\
+	.cfi_startproc;
+
+#define SYM_FUNC_END(name)				\
+	.cfi_endproc;					\
+	SYM_END(name, SYM_T_FUNC)
+
+#endif
diff --git a/arch/loongarch/include/asm/perf_event.h b/arch/loongarch/include/asm/perf_event.h
new file mode 100644
index 000000000000..44293ec8c153
--- /dev/null
+++ b/arch/loongarch/include/asm/perf_event.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __LOONGARCH_PERF_EVENT_H__
+#define __LOONGARCH_PERF_EVENT_H__
+/* Leave it empty here. The file is required by linux/perf_event.h */
+#endif /* __LOONGARCH_PERF_EVENT_H__ */
diff --git a/arch/loongarch/include/asm/prefetch.h b/arch/loongarch/include/asm/prefetch.h
new file mode 100644
index 000000000000..1672262a5e2e
--- /dev/null
+++ b/arch/loongarch/include/asm/prefetch.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_PREFETCH_H
+#define __ASM_PREFETCH_H
+
+#define Pref_Load	0
+#define Pref_Store	8
+
+#ifdef __ASSEMBLY__
+
+	.macro	__pref hint addr
+#ifdef CONFIG_CPU_HAS_PREFETCH
+	preld	\hint, \addr, 0
+#endif
+	.endm
+
+	.macro	pref_load addr
+	__pref	Pref_Load, \addr
+	.endm
+
+	.macro	pref_store addr
+	__pref	Pref_Store, \addr
+	.endm
+
+#endif
+
+#endif /* __ASM_PREFETCH_H */
diff --git a/arch/loongarch/include/asm/serial.h b/arch/loongarch/include/asm/serial.h
new file mode 100644
index 000000000000..3fb550eb9115
--- /dev/null
+++ b/arch/loongarch/include/asm/serial.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM__SERIAL_H
+#define __ASM__SERIAL_H
+
+#define BASE_BAUD 0
+#define STD_COM_FLAGS (ASYNC_BOOT_AUTOCONF | ASYNC_SKIP_TEST)
+
+#endif /* __ASM__SERIAL_H */
diff --git a/arch/loongarch/include/asm/time.h b/arch/loongarch/include/asm/time.h
new file mode 100644
index 000000000000..ace1665695b8
--- /dev/null
+++ b/arch/loongarch/include/asm/time.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_TIME_H
+#define _ASM_TIME_H
+
+#include <linux/clockchips.h>
+#include <linux/clocksource.h>
+#include <asm/loongarch.h>
+
+extern u64 cpu_clock_freq;
+extern u64 const_clock_freq;
+
+extern void sync_counter(void);
+
+static inline unsigned int calc_const_freq(void)
+{
+	unsigned int res;
+	unsigned int base_freq;
+	unsigned int cfm, cfd;
+
+	res = read_cpucfg(LOONGARCH_CPUCFG2);
+	if (!(res & CPUCFG2_LLFTP))
+		return 0;
+
+	base_freq = read_cpucfg(LOONGARCH_CPUCFG4);
+	res = read_cpucfg(LOONGARCH_CPUCFG5);
+	cfm = res & 0xffff;
+	cfd = (res >> 16) & 0xffff;
+
+	if (!base_freq || !cfm || !cfd)
+		return 0;
+	else
+		return (base_freq * cfm / cfd);
+}
+
+/*
+ * Initialize the calling CPU's timer interrupt as clockevent device
+ */
+extern int constant_clockevent_init(void);
+extern int constant_clocksource_init(void);
+
+static inline void clockevent_set_clock(struct clock_event_device *cd,
+					unsigned int clock)
+{
+	clockevents_calc_mult_shift(cd, clock, 4);
+}
+
+#endif /* _ASM_TIME_H */
diff --git a/arch/loongarch/include/asm/timex.h b/arch/loongarch/include/asm/timex.h
new file mode 100644
index 000000000000..3f8db082f00d
--- /dev/null
+++ b/arch/loongarch/include/asm/timex.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_TIMEX_H
+#define _ASM_TIMEX_H
+
+#ifdef __KERNEL__
+
+#include <linux/compiler.h>
+
+#include <asm/cpu.h>
+#include <asm/cpu-features.h>
+
+/*
+ * Standard way to access the cycle counter.
+ * Currently only used on SMP for scheduling.
+ *
+ * We know that all SMP capable CPUs have cycle counters.
+ */
+
+typedef unsigned long cycles_t;
+
+static inline cycles_t get_cycles(void)
+{
+	return drdtime();
+}
+
+#endif /* __KERNEL__ */
+
+#endif /*  _ASM_TIMEX_H */
diff --git a/arch/loongarch/include/asm/topology.h b/arch/loongarch/include/asm/topology.h
new file mode 100644
index 000000000000..9ac71a25207a
--- /dev/null
+++ b/arch/loongarch/include/asm/topology.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_TOPOLOGY_H
+#define __ASM_TOPOLOGY_H
+
+#include <linux/smp.h>
+
+#define cpu_logical_map(cpu)  0
+
+#include <asm-generic/topology.h>
+
+static inline void arch_fix_phys_package_id(int num, u32 slot) { }
+#endif /* __ASM_TOPOLOGY_H */
diff --git a/arch/loongarch/include/asm/types.h b/arch/loongarch/include/asm/types.h
new file mode 100644
index 000000000000..f783cf11ea52
--- /dev/null
+++ b/arch/loongarch/include/asm/types.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_TYPES_H
+#define _ASM_TYPES_H
+
+#include <asm-generic/int-ll64.h>
+#include <uapi/asm/types.h>
+
+/*
+ * The following macros are especially useful for __asm__
+ * inline assembler.
+ */
+#ifndef __STR
+#define __STR(x) #x
+#endif
+#ifndef STR
+#define STR(x) __STR(x)
+#endif
+
+/*
+ *  Configure language
+ */
+#ifdef __ASSEMBLY__
+#define _ULCAST_
+#define _U64CAST_
+#else
+#define _ULCAST_ (unsigned long)
+#define _U64CAST_ (u64)
+#endif
+
+#endif /* _ASM_TYPES_H */
diff --git a/arch/loongarch/include/uapi/asm/bitfield.h b/arch/loongarch/include/uapi/asm/bitfield.h
new file mode 100644
index 000000000000..e31a719b7007
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/bitfield.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __UAPI_ASM_BITFIELD_H
+#define __UAPI_ASM_BITFIELD_H
+
+#define __BITFIELD_FIELD(field, more)					\
+	more								\
+	field;
+
+#endif /* __UAPI_ASM_BITFIELD_H */
diff --git a/arch/loongarch/include/uapi/asm/bitsperlong.h b/arch/loongarch/include/uapi/asm/bitsperlong.h
new file mode 100644
index 000000000000..5c2c8779a695
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/bitsperlong.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef __ASM_LOONGARCH_BITSPERLONG_H
+#define __ASM_LOONGARCH_BITSPERLONG_H
+
+#define __BITS_PER_LONG _LOONGARCH_SZLONG
+
+#include <asm-generic/bitsperlong.h>
+
+#endif /* __ASM_LOONGARCH_BITSPERLONG_H */
diff --git a/arch/loongarch/include/uapi/asm/byteorder.h b/arch/loongarch/include/uapi/asm/byteorder.h
new file mode 100644
index 000000000000..b1722d890deb
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/byteorder.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_BYTEORDER_H
+#define _ASM_BYTEORDER_H
+
+#include <linux/byteorder/little_endian.h>
+
+#endif /* _ASM_BYTEORDER_H */
diff --git a/arch/loongarch/include/uapi/asm/inst.h b/arch/loongarch/include/uapi/asm/inst.h
new file mode 100644
index 000000000000..fa00cc5ede9d
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/inst.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Format of an instruction in memory.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _UAPI_ASM_INST_H
+#define _UAPI_ASM_INST_H
+
+#include <asm/bitfield.h>
+
+enum reg1i20_op {
+	lu12iw_op	= 0x0a,
+	lu32id_op	= 0x0b,
+};
+
+enum reg2i12_op {
+	lu52id_op	= 0x0c,
+};
+
+enum reg2i16_op {
+	jirl_op		= 0x13,
+};
+
+struct reg1i20_format {
+	__BITFIELD_FIELD(unsigned int opcode : 7,
+	__BITFIELD_FIELD(unsigned int simmediate : 20,
+	__BITFIELD_FIELD(unsigned int rd : 5,
+	;)))
+};
+
+struct reg2i12_format {
+	__BITFIELD_FIELD(unsigned int opcode : 10,
+	__BITFIELD_FIELD(signed int simmediate : 12,
+	__BITFIELD_FIELD(unsigned int rj : 5,
+	__BITFIELD_FIELD(unsigned int rd : 5,
+	;))))
+};
+
+struct reg2i16_format {
+	__BITFIELD_FIELD(unsigned int opcode : 6,
+	__BITFIELD_FIELD(unsigned int simmediate : 16,
+	__BITFIELD_FIELD(unsigned int rj : 5,
+	__BITFIELD_FIELD(unsigned int rd : 5,
+	;))))
+};
+
+union loongarch_instruction {
+	unsigned int word;
+	struct reg1i20_format reg1i20_format;
+	struct reg2i12_format reg2i12_format;
+	struct reg2i16_format reg2i16_format;
+};
+
+#define LOONGARCH_INSN_SIZE	sizeof(union loongarch_instruction)
+
+#endif /* _UAPI_ASM_INST_H */
diff --git a/arch/loongarch/include/uapi/asm/reg.h b/arch/loongarch/include/uapi/asm/reg.h
new file mode 100644
index 000000000000..90ad910c60eb
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/reg.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Various register offset definitions for debuggers, core file
+ * examiners and whatnot.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __UAPI_ASM_LOONGARCH_REG_H
+#define __UAPI_ASM_LOONGARCH_REG_H
+
+#define LOONGARCH_EF_R0		0
+#define LOONGARCH_EF_R1		1
+#define LOONGARCH_EF_R2		2
+#define LOONGARCH_EF_R3		3
+#define LOONGARCH_EF_R4		4
+#define LOONGARCH_EF_R5		5
+#define LOONGARCH_EF_R6		6
+#define LOONGARCH_EF_R7		7
+#define LOONGARCH_EF_R8		8
+#define LOONGARCH_EF_R9		9
+#define LOONGARCH_EF_R10	10
+#define LOONGARCH_EF_R11	11
+#define LOONGARCH_EF_R12	12
+#define LOONGARCH_EF_R13	13
+#define LOONGARCH_EF_R14	14
+#define LOONGARCH_EF_R15	15
+#define LOONGARCH_EF_R16	16
+#define LOONGARCH_EF_R17	17
+#define LOONGARCH_EF_R18	18
+#define LOONGARCH_EF_R19	19
+#define LOONGARCH_EF_R20	20
+#define LOONGARCH_EF_R21	21
+#define LOONGARCH_EF_R22	22
+#define LOONGARCH_EF_R23	23
+#define LOONGARCH_EF_R24	24
+#define LOONGARCH_EF_R25	25
+#define LOONGARCH_EF_R26	26
+#define LOONGARCH_EF_R27	27
+#define LOONGARCH_EF_R28	28
+#define LOONGARCH_EF_R29	29
+#define LOONGARCH_EF_R30	30
+#define LOONGARCH_EF_R31	31
+
+/*
+ * Saved special registers
+ */
+#define LOONGARCH_EF_ORIG_A0	32
+#define LOONGARCH_EF_CSR_ERA	33
+#define LOONGARCH_EF_CSR_BADV	34
+#define LOONGARCH_EF_CSR_CRMD	35
+#define LOONGARCH_EF_CSR_PRMD	36
+#define LOONGARCH_EF_CSR_EUEN	37
+#define LOONGARCH_EF_CSR_ECFG	38
+#define LOONGARCH_EF_CSR_ESTAT	39
+
+#define LOONGARCH_EF_SIZE	320	/* size in bytes */
+
+#endif /* __UAPI_ASM_LOONGARCH_REG_H */
diff --git a/tools/include/uapi/asm/bitsperlong.h b/tools/include/uapi/asm/bitsperlong.h
index edba4d93e9e6..da5206517158 100644
--- a/tools/include/uapi/asm/bitsperlong.h
+++ b/tools/include/uapi/asm/bitsperlong.h
@@ -17,6 +17,8 @@
 #include "../../../arch/riscv/include/uapi/asm/bitsperlong.h"
 #elif defined(__alpha__)
 #include "../../../arch/alpha/include/uapi/asm/bitsperlong.h"
+#elif defined(__loongarch__)
+#include "../../../arch/loongarch/include/uapi/asm/bitsperlong.h"
 #else
 #include <asm-generic/bitsperlong.h>
 #endif
-- 
2.27.0
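
As an illustration of the ADDR_IMM() helper that asm/inst.h adds in this
patch, the user-space sketch below splits a 64-bit address into the three
immediate fields selected by the LU52ID, LU32ID and ADDU16ID masks. The
struct addr_imms type and split_addr() are hypothetical names introduced
only for this example, and the ULL suffixes are added here so the
constants are 64-bit on any host. Note that the three masks do not cover
the low 16 bits of the address.

```c
#include <stdint.h>

/* Masks and shifts as defined in asm/inst.h (ULL suffixes added here). */
#define ADDR_IMMMASK_LU52ID	0xFFF0000000000000ULL
#define ADDR_IMMMASK_LU32ID	0x000FFFFF00000000ULL
#define ADDR_IMMMASK_ADDU16ID	0x00000000FFFF0000ULL

#define ADDR_IMMSHIFT_LU52ID	52
#define ADDR_IMMSHIFT_LU32ID	32
#define ADDR_IMMSHIFT_ADDU16ID	16

/* Select the bits named by INSN and shift them down to bit 0. */
#define ADDR_IMM(addr, INSN) \
	(((addr) & ADDR_IMMMASK_##INSN) >> ADDR_IMMSHIFT_##INSN)

/* Hypothetical container for the three immediates of one address. */
struct addr_imms {
	uint32_t lu52id;	/* address bits 63..52 (12 bits) */
	uint32_t lu32id;	/* address bits 51..32 (20 bits) */
	uint32_t addu16id;	/* address bits 31..16 (16 bits) */
};

/* Split addr into the three fields; bits 15..0 are not represented. */
static struct addr_imms split_addr(uint64_t addr)
{
	struct addr_imms imms = {
		.lu52id   = (uint32_t)ADDR_IMM(addr, LU52ID),
		.lu32id   = (uint32_t)ADDR_IMM(addr, LU32ID),
		.addu16id = (uint32_t)ADDR_IMM(addr, ADDU16ID),
	};

	return imms;
}
```

For example, split_addr(0x123456789ABCDEF0) yields 0x123, 0x45678 and
0x9ABC for the three fields respectively.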



* [PATCH V9 09/24] LoongArch: Add boot and setup routines
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (7 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 08/24] LoongArch: Add other common headers Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 10/24] LoongArch: Add exception/interrupt handling Huacai Chen
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen,
	Ard Biesheuvel, linux-efi

This patch adds basic boot, setup and reset routines for LoongArch.
LoongArch uses UEFI-based firmware. The firmware uses ACPI and DMI/
SMBIOS to pass configuration information to the Linux kernel.

Currently, the boot information is passed to the kernel as follows:
1, The kernel gets three register values (a0, a1 and a2) from the
   bootloader.
2, a0 is "argc" and a1 is "argv", so the kernel command line comes from
   a0/a1.
3, a2 is "environ", which is a pointer to "struct boot_params".
4, "struct boot_params" includes a "systemtable" pointer, whose type is
   "efi_system_table_t". Most configuration information, including the
   ACPI tables and SMBIOS tables, comes from here.

The above interface is an internal interface between the bootloader (grub,
efistub, etc.) and the raw kernel. You can use this method to boot the
Linux kernel in raw ELF format, but it is recommended to use the standard
UEFI boot protocol, which is added in later patches.
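
The register convention described above could be consumed roughly as in
the following user-space sketch. It is not the kernel's implementation:
build_cmdline() and bpi_version() are hypothetical names, skipping argv[0]
is an assumption, and the version decoding relies only on the signature
examples given in the bpi_version comments of boot_param.h (for instance
"BPI01000" for version 1000).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: join argv[1..argc-1] with single spaces into buf.
 * Assumes argv[0] is not part of the command line (an assumption made
 * for this sketch). The output is truncated to fit and always
 * NUL-terminated when size > 0. Returns the resulting string length.
 */
static size_t build_cmdline(char *buf, size_t size, int argc, char *const argv[])
{
	size_t used = 0;
	int i;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 1; i < argc; i++) {
		size_t len = strlen(argv[i]);

		/* Separate arguments with one space if there is room. */
		if (i > 1 && used + 1 < size)
			buf[used++] = ' ';
		/* Clip the argument to the space left before the NUL. */
		if (len > size - 1 - used)
			len = size - 1 - used;
		memcpy(buf + used, argv[i], len);
		used += len;
		buf[used] = '\0';
	}
	return used;
}

/*
 * Decode the decimal version that follows "BPI" in the 8-byte signature,
 * e.g. "BPI01000" -> 1000. Returns -1 if the prefix or digits are invalid.
 */
static int bpi_version(uint64_t signature)
{
	char sig[sizeof(signature) + 1];
	char *end;
	long ver;

	/* Copy the raw bytes so they can be handled as a C string. */
	memcpy(sig, &signature, sizeof(signature));
	sig[sizeof(signature)] = '\0';

	if (strncmp(sig, "BPI", 3) != 0)
		return -1;

	ver = strtol(sig + 3, &end, 10);
	if (end == sig + 3 || *end != '\0')
		return -1;
	return (int)ver;
}
```

In this sketch a0 and a1 would be passed as argc and argv, and the
"signature" field of the structure that a2 points to would be passed to
bpi_version() before any other field is trusted.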

ECR for adding LoongArch support in ACPI:
https://mantis.uefi.org/mantis/view.php?id=2203

ECR for adding LoongArch support in ACPI (version update):
https://mantis.uefi.org/mantis/view.php?id=2268

ECR for adding LoongArch support in UEFI:
https://mantis.uefi.org/mantis/view.php?id=2313

The ACPI changes for LoongArch were approved last year, but the new
version of the ACPI specification has not been made public yet. The UEFI
changes for LoongArch are currently under review.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-efi@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/include/asm/acenv.h      |  17 +
 arch/loongarch/include/asm/acpi.h       |  38 +++
 arch/loongarch/include/asm/boot_param.h |  97 ++++++
 arch/loongarch/include/asm/bootinfo.h   |  29 ++
 arch/loongarch/include/asm/dmi.h        |  24 ++
 arch/loongarch/include/asm/efi.h        |  33 ++
 arch/loongarch/include/asm/fw.h         |  18 +
 arch/loongarch/include/asm/reboot.h     |  10 +
 arch/loongarch/include/asm/setup.h      |  21 ++
 arch/loongarch/kernel/acpi.c            | 338 ++++++++++++++++++
 arch/loongarch/kernel/cacheinfo.c       | 122 +++++++
 arch/loongarch/kernel/cmdline.c         |  31 ++
 arch/loongarch/kernel/cpu-probe.c       | 305 +++++++++++++++++
 arch/loongarch/kernel/efi.c             | 235 +++++++++++++
 arch/loongarch/kernel/env.c             | 176 ++++++++++
 arch/loongarch/kernel/head.S            |  72 ++++
 arch/loongarch/kernel/mem.c             |  83 +++++
 arch/loongarch/kernel/reset.c           |  90 +++++
 arch/loongarch/kernel/setup.c           | 434 ++++++++++++++++++++++++
 arch/loongarch/kernel/time.c            | 220 ++++++++++++
 arch/loongarch/kernel/topology.c        |  13 +
 include/linux/efi.h                     |   1 +
 22 files changed, 2407 insertions(+)
 create mode 100644 arch/loongarch/include/asm/acenv.h
 create mode 100644 arch/loongarch/include/asm/acpi.h
 create mode 100644 arch/loongarch/include/asm/boot_param.h
 create mode 100644 arch/loongarch/include/asm/bootinfo.h
 create mode 100644 arch/loongarch/include/asm/dmi.h
 create mode 100644 arch/loongarch/include/asm/efi.h
 create mode 100644 arch/loongarch/include/asm/fw.h
 create mode 100644 arch/loongarch/include/asm/reboot.h
 create mode 100644 arch/loongarch/include/asm/setup.h
 create mode 100644 arch/loongarch/kernel/acpi.c
 create mode 100644 arch/loongarch/kernel/cacheinfo.c
 create mode 100644 arch/loongarch/kernel/cmdline.c
 create mode 100644 arch/loongarch/kernel/cpu-probe.c
 create mode 100644 arch/loongarch/kernel/efi.c
 create mode 100644 arch/loongarch/kernel/env.c
 create mode 100644 arch/loongarch/kernel/head.S
 create mode 100644 arch/loongarch/kernel/mem.c
 create mode 100644 arch/loongarch/kernel/reset.c
 create mode 100644 arch/loongarch/kernel/setup.c
 create mode 100644 arch/loongarch/kernel/time.c
 create mode 100644 arch/loongarch/kernel/topology.c

diff --git a/arch/loongarch/include/asm/acenv.h b/arch/loongarch/include/asm/acenv.h
new file mode 100644
index 000000000000..290a15a41258
--- /dev/null
+++ b/arch/loongarch/include/asm/acenv.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * LoongArch specific ACPICA environments and implementation
+ *
+ * Author: Jianmin Lv <lvjianmin@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef _ASM_LOONGARCH_ACENV_H
+#define _ASM_LOONGARCH_ACENV_H
+
+/* This header file is required by the ACPI core, but there is nothing
+ * to put in it yet; update it later when needed.
+ */
+
+#endif /* _ASM_LOONGARCH_ACENV_H */
diff --git a/arch/loongarch/include/asm/acpi.h b/arch/loongarch/include/asm/acpi.h
new file mode 100644
index 000000000000..62044cd5b7bc
--- /dev/null
+++ b/arch/loongarch/include/asm/acpi.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Jianmin Lv <lvjianmin@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef _ASM_LOONGARCH_ACPI_H
+#define _ASM_LOONGARCH_ACPI_H
+
+#ifdef CONFIG_ACPI
+extern int acpi_strict;
+extern int acpi_disabled;
+extern int acpi_pci_disabled;
+extern int acpi_noirq;
+
+#define acpi_os_ioremap acpi_os_ioremap
+void __init __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size);
+
+static inline void disable_acpi(void)
+{
+	acpi_disabled = 1;
+	acpi_pci_disabled = 1;
+	acpi_noirq = 1;
+}
+
+static inline bool acpi_has_cpu_in_madt(void)
+{
+	return true;
+}
+
+extern struct list_head acpi_wakeup_device_list;
+
+#endif /* CONFIG_ACPI */
+
+#define ACPI_TABLE_UPGRADE_MAX_PHYS ARCH_LOW_ADDRESS_LIMIT
+
+#endif /* _ASM_LOONGARCH_ACPI_H */
diff --git a/arch/loongarch/include/asm/boot_param.h b/arch/loongarch/include/asm/boot_param.h
new file mode 100644
index 000000000000..cae274b7aa14
--- /dev/null
+++ b/arch/loongarch/include/asm/boot_param.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_BOOT_PARAM_H_
+#define __ASM_BOOT_PARAM_H_
+
+#ifdef CONFIG_VT
+#include <linux/screen_info.h>
+#endif
+
+#define ADDRESS_TYPE_SYSRAM	1
+#define ADDRESS_TYPE_RESERVED	2
+#define ADDRESS_TYPE_ACPI	3
+#define ADDRESS_TYPE_NVS	4
+#define ADDRESS_TYPE_PMEM	5
+
+#define EFI_RUNTIME_MAP_START	   100
+#define LOONGSON3_BOOT_MEM_MAP_MAX 128
+
+#define LOONGSON_EFIBOOT_SIGNATURE	"BPI"
+#define LOONGSON_MEM_SIGNATURE		"MEM"
+#define LOONGSON_VBIOS_SIGNATURE	"VBIOS"
+#define LOONGSON_SCREENINFO_SIGNATURE	"SINFO"
+
+/* Values for the BPI version */
+enum bpi_version {
+	BPI_VERSION_V1 = 1000, /* Signature="BPI01000" */
+	BPI_VERSION_V2 = 1001, /* Signature="BPI01001" */
+	BPI_VERSION_V3 = 1002, /* Signature="BPI01002" */
+};
+
+/* Flags in the boot params interface (BPI) */
+#define BPI_FLAGS_UEFI_SUPPORTED BIT(0)
+
+struct _extention_list_hdr {
+	u64	signature;
+	u32	length;
+	u8	revision;
+	u8	checksum;
+	u64	next_offset;
+} __packed;
+
+struct boot_params {
+	u64	signature;	/* {"B", "P", "I", "0", "1", ... } */
+	void	*systemtable;
+	u64	extlist_offset;
+	u64	flags;
+} __packed;
+
+struct loongsonlist_mem_map {
+	struct	_extention_list_hdr header;	/* {"M", "E", "M"} */
+	u8	map_count;
+	u32	desc_version;
+	struct efi_mmap {
+		u32 mem_type;
+		u32 padding;
+		u64 mem_start;
+		u64 mem_vaddr;
+		u64 mem_size;
+		u64 attribute;
+	} __packed map[LOONGSON3_BOOT_MEM_MAP_MAX];
+} __packed;
+
+struct loongsonlist_vbios {
+	struct	_extention_list_hdr header;	/* {"V", "B", "I", "O", "S"} */
+	u64	vbios_addr;
+} __packed;
+
+struct loongsonlist_screeninfo {
+	struct	_extention_list_hdr header;	/* {"S", "I", "N", "F", "O"} */
+	struct	screen_info si;
+} __packed;
+
+struct loongson_board_info {
+	int bios_size;
+	char *bios_vendor;
+	char *bios_version;
+	char *bios_release_date;
+	char *board_name;
+	char *board_vendor;
+};
+
+struct loongson_system_configuration {
+	int bpi_ver;
+	int nr_cpus;
+	int nr_nodes;
+	int nr_io_pics;
+	int boot_cpu_id;
+	int cores_per_node;
+	int cores_per_package;
+	char *cpuname;
+	u64 vgabios_addr;
+};
+
+extern struct boot_params *efi_bp;
+extern struct loongson_board_info b_info;
+extern struct loongsonlist_mem_map *loongson_mem_map;
+extern struct loongson_system_configuration loongson_sysconf;
+#endif
diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h
new file mode 100644
index 000000000000..74fbba536568
--- /dev/null
+++ b/arch/loongarch/include/asm/bootinfo.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_BOOTINFO_H
+#define _ASM_BOOTINFO_H
+
+#include <linux/types.h>
+#include <asm/setup.h>
+
+const char *get_system_type(void);
+
+extern void early_init(void);
+extern void early_memblock_init(void);
+extern void platform_init(void);
+
+/*
+ * Initial kernel command line, usually set up by fw_init_cmdline()
+ */
+extern char arcs_cmdline[COMMAND_LINE_SIZE];
+
+/*
+ * Registers a0, a1, a2 and a3 as passed to the kernel entry by firmware
+ */
+extern unsigned long fw_arg0, fw_arg1, fw_arg2, fw_arg3;
+
+extern unsigned long initrd_start, initrd_end;
+
+#endif /* _ASM_BOOTINFO_H */
diff --git a/arch/loongarch/include/asm/dmi.h b/arch/loongarch/include/asm/dmi.h
new file mode 100644
index 000000000000..3df58ddce50a
--- /dev/null
+++ b/arch/loongarch/include/asm/dmi.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_DMI_H
+#define _ASM_DMI_H
+
+#include <linux/io.h>
+#include <linux/memblock.h>
+
+#define dmi_early_remap(x, l)	dmi_remap(x, l)
+#define dmi_early_unmap(x, l)	dmi_unmap(x)
+#define dmi_alloc(l)		memblock_alloc(l, PAGE_SIZE)
+
+static inline void *dmi_remap(u64 phys_addr, unsigned long size)
+{
+	return ((void *)TO_CAC(phys_addr));
+}
+
+static inline void dmi_unmap(void *addr)
+{
+}
+
+#endif /* _ASM_DMI_H */
diff --git a/arch/loongarch/include/asm/efi.h b/arch/loongarch/include/asm/efi.h
new file mode 100644
index 000000000000..00623ca3f52e
--- /dev/null
+++ b/arch/loongarch/include/asm/efi.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_LOONGARCH_EFI_H
+#define _ASM_LOONGARCH_EFI_H
+
+#include <linux/efi.h>
+
+extern void __init efi_init(void);
+extern void __init efi_runtime_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
+
+#define ARCH_EFI_IRQ_FLAGS_MASK  0x00000004  /* Bit 2: CSR.CRMD.IE */
+
+#define arch_efi_call_virt_setup()               \
+({                                               \
+})
+
+#define arch_efi_call_virt(p, f, args...)        \
+({                                               \
+	efi_##f##_t * __f;                       \
+	__f = p->f;                              \
+	__f(args);                               \
+})
+
+#define arch_efi_call_virt_teardown()            \
+({                                               \
+})
+
+#define EFI_ALLOC_ALIGN		SZ_64K
+
+#endif /* _ASM_LOONGARCH_EFI_H */
diff --git a/arch/loongarch/include/asm/fw.h b/arch/loongarch/include/asm/fw.h
new file mode 100644
index 000000000000..c1c3384630c2
--- /dev/null
+++ b/arch/loongarch/include/asm/fw.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_FW_H_
+#define __ASM_FW_H_
+
+#include <asm/bootinfo.h>
+
+extern int fw_argc;
+extern long *_fw_argv, *_fw_envp;
+
+#define fw_argv(index)		((char *)TO_CAC((long)_fw_argv[(index)]))
+#define fw_envp(index)		((char *)TO_CAC((long)_fw_envp[(index)]))
+
+extern void fw_init_cmdline(void);
+
+#endif /* __ASM_FW_H_ */
diff --git a/arch/loongarch/include/asm/reboot.h b/arch/loongarch/include/asm/reboot.h
new file mode 100644
index 000000000000..51151749d8f0
--- /dev/null
+++ b/arch/loongarch/include/asm/reboot.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_REBOOT_H
+#define _ASM_REBOOT_H
+
+extern void (*pm_restart)(void);
+
+#endif /* _ASM_REBOOT_H */
diff --git a/arch/loongarch/include/asm/setup.h b/arch/loongarch/include/asm/setup.h
new file mode 100644
index 000000000000..6d7d2a3e23dd
--- /dev/null
+++ b/arch/loongarch/include/asm/setup.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef _LOONGARCH_SETUP_H
+#define _LOONGARCH_SETUP_H
+
+#include <linux/types.h>
+#include <uapi/asm/setup.h>
+
+#define VECSIZE 0x200
+
+extern unsigned long eentry;
+extern unsigned long tlbrentry;
+extern void cpu_cache_init(void);
+extern void per_cpu_trap_init(int cpu);
+extern void set_handler(unsigned long offset, void *addr, unsigned long len);
+extern void set_merr_handler(unsigned long offset, void *addr, unsigned long len);
+
+#endif /* _LOONGARCH_SETUP_H */
diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
new file mode 100644
index 000000000000..506ab9912c51
--- /dev/null
+++ b/arch/loongarch/kernel/acpi.c
@@ -0,0 +1,338 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * acpi.c - Architecture-Specific Low-Level ACPI Boot Support
+ *
+ * Author: Jianmin Lv <lvjianmin@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/init.h>
+#include <linux/acpi.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
+#include <linux/memblock.h>
+#include <linux/serial_core.h>
+#include <asm/io.h>
+#include <asm/loongson.h>
+
+int acpi_disabled;
+EXPORT_SYMBOL(acpi_disabled);
+int acpi_noirq;
+int acpi_pci_disabled;
+EXPORT_SYMBOL(acpi_pci_disabled);
+int acpi_strict = 1; /* We have no workarounds on LoongArch */
+int num_processors;
+int disabled_cpus;
+enum acpi_irq_model_id acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
+
+u64 acpi_saved_sp;
+
+#define MAX_CORE_PIC 256
+
+#define PREFIX			"ACPI: "
+
+int acpi_gsi_to_irq(u32 gsi, unsigned int *irqp)
+{
+	if (!irqp || (int)(*irqp = acpi_register_gsi(NULL, gsi, -1, -1)) < 0)
+		return -EINVAL;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(acpi_gsi_to_irq);
+
+int acpi_isa_irq_to_gsi(unsigned int isa_irq, u32 *gsi)
+{
+	if (gsi)
+		*gsi = isa_irq;
+	return 0;
+}
+
+/*
+ * success: return IRQ number (>=0)
+ * failure: return < 0
+ */
+int acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
+{
+	int id;
+	struct irq_fwspec fwspec;
+
+	switch (gsi) {
+	case GSI_MIN_CPU_IRQ ... GSI_MAX_CPU_IRQ:
+		fwspec.fwnode = liointc_domain->fwnode;
+		fwspec.param[0] = gsi - GSI_MIN_CPU_IRQ;
+		fwspec.param_count = 1;
+
+		return irq_create_fwspec_mapping(&fwspec);
+
+	case GSI_MIN_LPC_IRQ ... GSI_MAX_LPC_IRQ:
+		if (!pch_lpc_domain)
+			return -EINVAL;
+
+		fwspec.fwnode = pch_lpc_domain->fwnode;
+		fwspec.param[0] = gsi - GSI_MIN_LPC_IRQ;
+		fwspec.param[1] = acpi_dev_get_irq_type(trigger, polarity);
+		fwspec.param_count = 2;
+
+		return irq_create_fwspec_mapping(&fwspec);
+
+	case GSI_MIN_PCH_IRQ ... GSI_MAX_PCH_IRQ:
+		id = find_pch_pic(gsi);
+		if (id < 0)
+			return -EINVAL;
+
+		fwspec.fwnode = pch_pic_domain[id]->fwnode;
+		fwspec.param[0] = gsi - acpi_pchpic[id]->gsi_base;
+		fwspec.param[1] = IRQ_TYPE_LEVEL_HIGH;
+		fwspec.param_count = 2;
+
+		return irq_create_fwspec_mapping(&fwspec);
+	}
+
+	return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(acpi_register_gsi);
+
+void acpi_unregister_gsi(u32 gsi)
+{
+
+}
+EXPORT_SYMBOL_GPL(acpi_unregister_gsi);
+
+void __init __iomem * __acpi_map_table(unsigned long phys, unsigned long size)
+{
+
+	if (!phys || !size)
+		return NULL;
+
+	return early_memremap(phys, size);
+}
+void __init __acpi_unmap_table(void __iomem *map, unsigned long size)
+{
+	if (!map || !size)
+		return;
+
+	early_memunmap(map, size);
+}
+
+void __init __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size)
+{
+	if (!memblock_is_memory(phys))
+		return ioremap(phys, size);
+	else
+		return ioremap_cache(phys, size);
+}
+
+void __init acpi_boot_table_init(void)
+{
+	/*
+	 * If acpi_disabled, bail out
+	 */
+	if (acpi_disabled)
+		return;
+
+	/*
+	 * Initialize the ACPI boot-time table parser.
+	 */
+	if (acpi_table_init()) {
+		disable_acpi();
+		return;
+	}
+}
+
+static int __init
+acpi_parse_cpuintc(union acpi_subtable_headers *header, const unsigned long end)
+{
+	struct acpi_madt_core_pic *processor = NULL;
+
+	processor = (struct acpi_madt_core_pic *)header;
+	if (BAD_MADT_ENTRY(processor, end))
+		return -EINVAL;
+
+	acpi_table_print_madt_entry(&header->common);
+
+	return 0;
+}
+
+static int __init
+acpi_parse_liointc(union acpi_subtable_headers *header, const unsigned long end)
+{
+	struct acpi_madt_lio_pic *liointc = NULL;
+
+	liointc = (struct acpi_madt_lio_pic *)header;
+
+	if (BAD_MADT_ENTRY(liointc, end))
+		return -EINVAL;
+
+	acpi_liointc = liointc;
+
+	return 0;
+}
+
+static int __init
+acpi_parse_eiointc(union acpi_subtable_headers *header, const unsigned long end)
+{
+	static int id;
+	struct acpi_madt_eio_pic *eiointc = NULL;
+
+	eiointc = (struct acpi_madt_eio_pic *)header;
+
+	if (BAD_MADT_ENTRY(eiointc, end))
+		return -EINVAL;
+
+	acpi_eiointc[id++] = eiointc;
+	loongson_sysconf.nr_io_pics = id;
+
+	return 0;
+}
+
+static int __init
+acpi_parse_htintc(union acpi_subtable_headers *header, const unsigned long end)
+{
+	struct acpi_madt_ht_pic *htintc = NULL;
+
+	htintc = (struct acpi_madt_ht_pic *)header;
+
+	if (BAD_MADT_ENTRY(htintc, end))
+		return -EINVAL;
+
+	acpi_htintc = htintc;
+	loongson_sysconf.nr_io_pics = 1;
+
+	return 0;
+}
+
+static int __init
+acpi_parse_pch_pic(union acpi_subtable_headers *header, const unsigned long end)
+{
+	static int id;
+	struct acpi_madt_bio_pic *pchpic = NULL;
+
+	pchpic = (struct acpi_madt_bio_pic *)header;
+
+	if (BAD_MADT_ENTRY(pchpic, end))
+		return -EINVAL;
+
+	acpi_pchpic[id++] = pchpic;
+
+	return 0;
+}
+
+static int __init
+acpi_parse_pch_msi(union acpi_subtable_headers *header, const unsigned long end)
+{
+	static int id;
+	struct acpi_madt_msi_pic *pchmsi = NULL;
+
+	pchmsi = (struct acpi_madt_msi_pic *)header;
+
+	if (BAD_MADT_ENTRY(pchmsi, end))
+		return -EINVAL;
+
+	acpi_pchmsi[id++] = pchmsi;
+
+	return 0;
+}
+
+static int __init
+acpi_parse_pch_lpc(union acpi_subtable_headers *header, const unsigned long end)
+{
+	struct acpi_madt_lpc_pic *pchlpc = NULL;
+
+	pchlpc = (struct acpi_madt_lpc_pic *)header;
+
+	if (BAD_MADT_ENTRY(pchlpc, end))
+		return -EINVAL;
+
+	acpi_pchlpc = pchlpc;
+
+	return 0;
+}
+
+static void __init acpi_process_madt(void)
+{
+	int error;
+
+	/* Parse MADT CPUINTC entries */
+	error = acpi_table_parse_madt(ACPI_MADT_TYPE_CORE_PIC, acpi_parse_cpuintc, MAX_CORE_PIC);
+	if (error < 0) {
+		disable_acpi();
+		pr_err(PREFIX "Invalid BIOS MADT (CPUINTC entries), ACPI disabled\n");
+		return;
+	}
+
+	loongson_sysconf.nr_cpus = num_processors;
+
+	/* Parse MADT LIOINTC entries */
+	error = acpi_table_parse_madt(ACPI_MADT_TYPE_LIO_PIC, acpi_parse_liointc, 1);
+	if (error < 0) {
+		disable_acpi();
+		pr_err(PREFIX "Invalid BIOS MADT (LIOINTC entries), ACPI disabled\n");
+		return;
+	}
+
+	/* Parse MADT EIOINTC entries */
+	error = acpi_table_parse_madt(ACPI_MADT_TYPE_EIO_PIC, acpi_parse_eiointc, MAX_IO_PICS);
+	if (error < 0) {
+		disable_acpi();
+		pr_err(PREFIX "Invalid BIOS MADT (EIOINTC entries), ACPI disabled\n");
+		return;
+	}
+
+	/* Parse MADT HTVEC entries */
+	error = acpi_table_parse_madt(ACPI_MADT_TYPE_HT_PIC, acpi_parse_htintc, 1);
+	if (error < 0) {
+		disable_acpi();
+		pr_err(PREFIX "Invalid BIOS MADT (HTVEC entries), ACPI disabled\n");
+		return;
+	}
+
+	/* Parse MADT PCHPIC entries */
+	error = acpi_table_parse_madt(ACPI_MADT_TYPE_BIO_PIC, acpi_parse_pch_pic, MAX_IO_PICS);
+	if (error < 0) {
+		disable_acpi();
+		pr_err(PREFIX "Invalid BIOS MADT (PCHPIC entries), ACPI disabled\n");
+		return;
+	}
+
+	/* Parse MADT PCHMSI entries */
+	error = acpi_table_parse_madt(ACPI_MADT_TYPE_MSI_PIC, acpi_parse_pch_msi, MAX_IO_PICS);
+	if (error < 0) {
+		disable_acpi();
+		pr_err(PREFIX "Invalid BIOS MADT (PCHMSI entries), ACPI disabled\n");
+		return;
+	}
+
+	/* Parse MADT PCHLPC entries */
+	error = acpi_table_parse_madt(ACPI_MADT_TYPE_LPC_PIC, acpi_parse_pch_lpc, 1);
+	if (error < 0) {
+		disable_acpi();
+		pr_err(PREFIX "Invalid BIOS MADT (PCHLPC entries), ACPI disabled\n");
+		return;
+	}
+}
+
+int __init acpi_boot_init(void)
+{
+	/*
+	 * If acpi_disabled, bail out
+	 */
+	if (acpi_disabled)
+		return -1;
+
+	loongson_sysconf.boot_cpu_id = read_csr_cpuid();
+
+	/*
+	 * Process the Multiple APIC Description Table (MADT), if present
+	 */
+	acpi_process_madt();
+
+	/* Do not enable ACPI SPCR console by default */
+	acpi_parse_spcr(earlycon_acpi_spcr_enable, false);
+
+	return 0;
+}
+
+void __init arch_reserve_mem_area(acpi_physical_address addr, size_t size)
+{
+	memblock_reserve(addr, size);
+}
diff --git a/arch/loongarch/kernel/cacheinfo.c b/arch/loongarch/kernel/cacheinfo.c
new file mode 100644
index 000000000000..8c9fe29e98f0
--- /dev/null
+++ b/arch/loongarch/kernel/cacheinfo.c
@@ -0,0 +1,122 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * LoongArch cacheinfo support
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/cacheinfo.h>
+
+/* Populates leaf and increments to next leaf */
+#define populate_cache(cache, leaf, c_level, c_type)		\
+do {								\
+	leaf->type = c_type;					\
+	leaf->level = c_level;					\
+	leaf->coherency_line_size = c->cache.linesz;		\
+	leaf->number_of_sets = c->cache.sets;			\
+	leaf->ways_of_associativity = c->cache.ways;		\
+	leaf->size = c->cache.linesz * c->cache.sets *		\
+		c->cache.ways;					\
+	leaf++;							\
+} while (0)
+
+int init_cache_level(unsigned int cpu)
+{
+	struct cpuinfo_loongarch *c = &current_cpu_data;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	int levels = 0, leaves = 0;
+
+	/*
+	 * If Dcache is not set, we assume the cache structures
+	 * are not properly initialized.
+	 */
+	if (c->dcache.waysize)
+		levels += 1;
+	else
+		return -ENOENT;
+
+
+	leaves += (c->icache.waysize) ? 2 : 1;
+
+	if (c->vcache.waysize) {
+		levels++;
+		leaves++;
+	}
+
+	if (c->scache.waysize) {
+		levels++;
+		leaves++;
+	}
+
+	if (c->tcache.waysize) {
+		levels++;
+		leaves++;
+	}
+
+	this_cpu_ci->num_levels = levels;
+	this_cpu_ci->num_leaves = leaves;
+	return 0;
+}
+
+static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
+					   struct cacheinfo *sib_leaf)
+{
+	return !((this_leaf->level == 1) || (this_leaf->level == 2));
+}
+
+static void cache_cpumap_setup(unsigned int cpu)
+{
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf, *sib_leaf;
+	unsigned int index;
+
+	for (index = 0; index < this_cpu_ci->num_leaves; index++) {
+		unsigned int i;
+
+		this_leaf = this_cpu_ci->info_list + index;
+		/* skip if shared_cpu_map is already populated */
+		if (!cpumask_empty(&this_leaf->shared_cpu_map))
+			continue;
+
+		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
+		for_each_online_cpu(i) {
+			struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);
+
+			if (i == cpu || !sib_cpu_ci->info_list)
+				continue; /* skip if itself or no cacheinfo */
+			sib_leaf = sib_cpu_ci->info_list + index;
+			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
+				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
+				cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
+			}
+		}
+	}
+}
+
+int populate_cache_leaves(unsigned int cpu)
+{
+	int level = 1;
+	struct cpuinfo_loongarch *c = &current_cpu_data;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+
+	if (c->icache.waysize) {
+		populate_cache(dcache, this_leaf, level, CACHE_TYPE_DATA);
+		populate_cache(icache, this_leaf, level++, CACHE_TYPE_INST);
+	} else {
+		populate_cache(dcache, this_leaf, level++, CACHE_TYPE_UNIFIED);
+	}
+
+	if (c->vcache.waysize)
+		populate_cache(vcache, this_leaf, level++, CACHE_TYPE_UNIFIED);
+
+	if (c->scache.waysize)
+		populate_cache(scache, this_leaf, level++, CACHE_TYPE_UNIFIED);
+
+	if (c->tcache.waysize)
+		populate_cache(tcache, this_leaf, level++, CACHE_TYPE_UNIFIED);
+
+	cache_cpumap_setup(cpu);
+	this_cpu_ci->cpu_map_populated = true;
+
+	return 0;
+}
diff --git a/arch/loongarch/kernel/cmdline.c b/arch/loongarch/kernel/cmdline.c
new file mode 100644
index 000000000000..a748214e0f2c
--- /dev/null
+++ b/arch/loongarch/kernel/cmdline.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/sizes.h>
+#include <linux/string.h>
+
+#include <asm/addrspace.h>
+#include <asm/early_ioremap.h>
+#include <asm/fw.h>
+
+int fw_argc;
+long *_fw_argv, *_fw_envp;
+
+void __init fw_init_cmdline(void)
+{
+	int i;
+
+	fw_argc = fw_arg0;
+	_fw_argv = (long *)early_memremap_ro(fw_arg1, SZ_16K);
+	_fw_envp = (long *)early_memremap_ro(fw_arg2, SZ_64K);
+
+	arcs_cmdline[0] = '\0';
+	for (i = 1; i < fw_argc; i++) {
+		strlcat(arcs_cmdline, fw_argv(i), COMMAND_LINE_SIZE);
+		if (i < (fw_argc - 1))
+			strlcat(arcs_cmdline, " ", COMMAND_LINE_SIZE);
+	}
+}
diff --git a/arch/loongarch/kernel/cpu-probe.c b/arch/loongarch/kernel/cpu-probe.c
new file mode 100644
index 000000000000..ea591fe747bd
--- /dev/null
+++ b/arch/loongarch/kernel/cpu-probe.c
@@ -0,0 +1,305 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Processor capabilities determination functions.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/ptrace.h>
+#include <linux/smp.h>
+#include <linux/stddef.h>
+#include <linux/export.h>
+#include <linux/printk.h>
+#include <linux/uaccess.h>
+
+#include <asm/cpu-features.h>
+#include <asm/elf.h>
+#include <asm/fpu.h>
+#include <asm/loongarch.h>
+#include <asm/pgtable-bits.h>
+#include <asm/setup.h>
+
+/* Hardware capabilities */
+unsigned int elf_hwcap __read_mostly;
+EXPORT_SYMBOL_GPL(elf_hwcap);
+
+/*
+ * Determine the FCSR mask for FPU hardware.
+ */
+static inline void cpu_set_fpu_fcsr_mask(struct cpuinfo_loongarch *c)
+{
+	unsigned long sr, mask, fcsr, fcsr0, fcsr1;
+
+	fcsr = c->fpu_csr0;
+	mask = FPU_CSR_ALL_X | FPU_CSR_ALL_E | FPU_CSR_ALL_S | FPU_CSR_RM;
+
+	sr = read_csr_euen();
+	enable_fpu();
+
+	fcsr0 = fcsr & mask;
+	write_fcsr(LOONGARCH_FCSR0, fcsr0);
+	fcsr0 = read_fcsr(LOONGARCH_FCSR0);
+
+	fcsr1 = fcsr | ~mask;
+	write_fcsr(LOONGARCH_FCSR0, fcsr1);
+	fcsr1 = read_fcsr(LOONGARCH_FCSR0);
+
+	write_fcsr(LOONGARCH_FCSR0, fcsr);
+
+	write_csr_euen(sr);
+
+	c->fpu_mask = ~(fcsr0 ^ fcsr1) & ~mask;
+}
+
+static inline void set_elf_platform(int cpu, const char *plat)
+{
+	if (cpu == 0)
+		__elf_platform = plat;
+}
+
+/* MAP BASE */
+unsigned long vm_map_base;
+EXPORT_SYMBOL_GPL(vm_map_base);
+
+static void cpu_probe_addrbits(struct cpuinfo_loongarch *c)
+{
+#ifdef __NEED_ADDRBITS_PROBE
+	c->pabits = (read_cpucfg(LOONGARCH_CPUCFG1) & CPUCFG1_PABITS) >> 4;
+	c->vabits = (read_cpucfg(LOONGARCH_CPUCFG1) & CPUCFG1_VABITS) >> 12;
+	vm_map_base = 0UL - (1UL << c->vabits);
+#endif
+}
+
+static void set_isa(struct cpuinfo_loongarch *c, unsigned int isa)
+{
+	switch (isa) {
+	case LOONGARCH_CPU_ISA_LA64:
+		c->isa_level |= LOONGARCH_CPU_ISA_LA64;
+		fallthrough;
+	case LOONGARCH_CPU_ISA_LA32S:
+		c->isa_level |= LOONGARCH_CPU_ISA_LA32S;
+		fallthrough;
+	case LOONGARCH_CPU_ISA_LA32R:
+		c->isa_level |= LOONGARCH_CPU_ISA_LA32R;
+		break;
+	}
+}
+
+static void cpu_probe_common(struct cpuinfo_loongarch *c)
+{
+	unsigned int config;
+	unsigned long asid_mask;
+
+	c->options = LOONGARCH_CPU_CPUCFG | LOONGARCH_CPU_CSR |
+		     LOONGARCH_CPU_TLB | LOONGARCH_CPU_VINT | LOONGARCH_CPU_WATCH;
+
+	elf_hwcap |= HWCAP_LOONGARCH_CRC32;
+
+	config = read_cpucfg(LOONGARCH_CPUCFG1);
+	if (config & CPUCFG1_UAL) {
+		c->options |= LOONGARCH_CPU_UAL;
+		elf_hwcap |= HWCAP_LOONGARCH_UAL;
+	}
+
+	config = read_cpucfg(LOONGARCH_CPUCFG2);
+	if (config & CPUCFG2_LAM) {
+		c->options |= LOONGARCH_CPU_LAM;
+		elf_hwcap |= HWCAP_LOONGARCH_LAM;
+	}
+	if (config & CPUCFG2_FP) {
+		c->options |= LOONGARCH_CPU_FPU;
+		elf_hwcap |= HWCAP_LOONGARCH_FPU;
+	}
+	if (config & CPUCFG2_COMPLEX) {
+		c->options |= LOONGARCH_CPU_COMPLEX;
+		elf_hwcap |= HWCAP_LOONGARCH_COMPLEX;
+	}
+	if (config & CPUCFG2_CRYPTO) {
+		c->options |= LOONGARCH_CPU_CRYPTO;
+		elf_hwcap |= HWCAP_LOONGARCH_CRYPTO;
+	}
+	if (config & CPUCFG2_LVZP) {
+		c->options |= LOONGARCH_CPU_LVZ;
+		elf_hwcap |= HWCAP_LOONGARCH_LVZ;
+	}
+
+	config = read_cpucfg(LOONGARCH_CPUCFG6);
+	if (config & CPUCFG6_PMP)
+		c->options |= LOONGARCH_CPU_PMP;
+
+	config = iocsr_readl(LOONGARCH_IOCSR_FEATURES);
+	if (config & IOCSRF_CSRIPI)
+		c->options |= LOONGARCH_CPU_CSRIPI;
+	if (config & IOCSRF_EXTIOI)
+		c->options |= LOONGARCH_CPU_EXTIOI;
+	if (config & IOCSRF_FREQSCALE)
+		c->options |= LOONGARCH_CPU_SCALEFREQ;
+	if (config & IOCSRF_FLATMODE)
+		c->options |= LOONGARCH_CPU_FLATMODE;
+	if (config & IOCSRF_EIODECODE)
+		c->options |= LOONGARCH_CPU_EIODECODE;
+	if (config & IOCSRF_VM)
+		c->options |= LOONGARCH_CPU_HYPERVISOR;
+
+	config = csr_readl(LOONGARCH_CSR_ASID);
+	config = (config & CSR_ASID_BIT) >> CSR_ASID_BIT_SHIFT;
+	asid_mask = GENMASK(config - 1, 0);
+	set_cpu_asid_mask(c, asid_mask);
+
+	config = read_csr_prcfg1();
+	c->kscratch_mask = GENMASK((config & CSR_CONF1_KSNUM) - 1, 0);
+	c->kscratch_mask &= ~(EXC_KSCRATCH_MASK | PERCPU_KSCRATCH_MASK | KVM_KSCRATCH_MASK);
+
+	config = read_csr_prcfg3();
+	switch (config & CSR_CONF3_TLBTYPE) {
+	case 0:
+		c->tlbsizemtlb = 0;
+		c->tlbsizestlbsets = 0;
+		c->tlbsizestlbways = 0;
+		c->tlbsize = 0;
+		break;
+	case 1:
+		c->tlbsizemtlb = ((config & CSR_CONF3_MTLBSIZE) >> CSR_CONF3_MTLBSIZE_SHIFT) + 1;
+		c->tlbsizestlbsets = 0;
+		c->tlbsizestlbways = 0;
+		c->tlbsize = c->tlbsizemtlb + c->tlbsizestlbsets * c->tlbsizestlbways;
+		break;
+	case 2:
+		c->tlbsizemtlb = ((config & CSR_CONF3_MTLBSIZE) >> CSR_CONF3_MTLBSIZE_SHIFT) + 1;
+		c->tlbsizestlbsets = 1 << ((config & CSR_CONF3_STLBIDX) >> CSR_CONF3_STLBIDX_SHIFT);
+		c->tlbsizestlbways = ((config & CSR_CONF3_STLBWAYS) >> CSR_CONF3_STLBWAYS_SHIFT) + 1;
+		c->tlbsize = c->tlbsizemtlb + c->tlbsizestlbsets * c->tlbsizestlbways;
+		break;
+	default:
+		pr_warn("Warning: unimplemented tlb type\n");
+	}
+}
+
+#define MAX_NAME_LEN	32
+#define VENDOR_OFFSET	0
+#define CPUNAME_OFFSET	9
+
+static char cpu_full_name[MAX_NAME_LEN] = "        -        ";
+
+static inline void cpu_probe_loongson(struct cpuinfo_loongarch *c, unsigned int cpu)
+{
+	uint64_t *vendor = (void *)(&cpu_full_name[VENDOR_OFFSET]);
+	uint64_t *cpuname = (void *)(&cpu_full_name[CPUNAME_OFFSET]);
+
+	__cpu_full_name[cpu] = cpu_full_name;
+	*vendor = iocsr_readq(LOONGARCH_IOCSR_VENDOR);
+	*cpuname = iocsr_readq(LOONGARCH_IOCSR_CPUNAME);
+
+	switch (c->processor_id & PRID_IMP_MASK) {
+	case PRID_IMP_LOONGSON_32:
+		c->cputype = CPU_LOONGSON32;
+		set_isa(c, LOONGARCH_CPU_ISA_LA32S);
+		__cpu_family[cpu] = "Loongson-32bit";
+		pr_info("Standard 32-bit Loongson Processor probed\n");
+		break;
+	case PRID_IMP_LOONGSON_64R:
+		c->cputype = CPU_LOONGSON64;
+		set_isa(c, LOONGARCH_CPU_ISA_LA64);
+		__cpu_family[cpu] = "Loongson-64bit";
+		pr_info("Reduced 64-bit Loongson Processor probed\n");
+		break;
+	case PRID_IMP_LOONGSON_64C:
+		c->cputype = CPU_LOONGSON64;
+		set_isa(c, LOONGARCH_CPU_ISA_LA64);
+		__cpu_family[cpu] = "Loongson-64bit";
+		pr_info("Classic 64-bit Loongson Processor probed\n");
+		break;
+	case PRID_IMP_LOONGSON_64G:
+		c->cputype = CPU_LOONGSON64;
+		set_isa(c, LOONGARCH_CPU_ISA_LA64);
+		__cpu_family[cpu] = "Loongson-64bit";
+		pr_info("Generic 64-bit Loongson Processor probed\n");
+		break;
+	default: /* Default to 64 bit */
+		c->cputype = CPU_LOONGSON64;
+		set_isa(c, LOONGARCH_CPU_ISA_LA64);
+		__cpu_family[cpu] = "Loongson-64bit";
+		pr_info("Unknown 64-bit Loongson Processor probed\n");
+	}
+}
+
+#ifdef CONFIG_64BIT
+/* For use by uaccess.h */
+u64 __ua_limit;
+EXPORT_SYMBOL(__ua_limit);
+#endif
+
+const char *__cpu_family[NR_CPUS];
+const char *__cpu_full_name[NR_CPUS];
+const char *__elf_platform;
+
+static void cpu_report(void)
+{
+	struct cpuinfo_loongarch *c = &current_cpu_data;
+
+	pr_info("CPU%d revision is: %08x (%s)\n",
+		smp_processor_id(), c->processor_id, cpu_family_string());
+	if (c->options & LOONGARCH_CPU_FPU)
+		pr_info("FPU%d revision is: %08x\n", smp_processor_id(), c->fpu_vers);
+}
+
+void cpu_probe(void)
+{
+	unsigned int cpu = smp_processor_id();
+	struct cpuinfo_loongarch *c = &current_cpu_data;
+
+	/*
+	 * Set a default ELF platform; the CPU probe may later
+	 * overwrite it with a more precise value.
+	 */
+	set_elf_platform(cpu, "loongarch");
+
+	c->cputype	= CPU_UNKNOWN;
+	c->processor_id = read_cpucfg(LOONGARCH_CPUCFG0);
+	c->fpu_vers	= (read_cpucfg(LOONGARCH_CPUCFG2) >> 3) & 0x3;
+
+	c->fpu_csr0	= FPU_CSR_RN;
+	c->fpu_mask	= FPU_CSR_RSVD;
+
+	cpu_probe_common(c);
+
+	per_cpu_trap_init(cpu);
+
+	switch (c->processor_id & PRID_COMP_MASK) {
+	case PRID_COMP_LOONGSON:
+		cpu_probe_loongson(c, cpu);
+		break;
+	}
+
+	BUG_ON(!__cpu_family[cpu]);
+	BUG_ON(c->cputype == CPU_UNKNOWN);
+
+	cpu_probe_addrbits(c);
+
+#ifdef CONFIG_64BIT
+	if (cpu == 0)
+		__ua_limit = ~((1ull << cpu_vabits) - 1);
+#endif
+
+	cpu_report();
+}
+
+void cpu_set_cluster(struct cpuinfo_loongarch *cpuinfo, unsigned int cluster)
+{
+	/* Ensure the cluster number fits in the field */
+	WARN_ON(cluster > (LOONGARCH_GLOBALNUMBER_CLUSTER >>
+			   LOONGARCH_GLOBALNUMBER_CLUSTER_SHF));
+
+	cpuinfo->globalnumber &= ~LOONGARCH_GLOBALNUMBER_CLUSTER;
+	cpuinfo->globalnumber |= cluster << LOONGARCH_GLOBALNUMBER_CLUSTER_SHF;
+}
+
+void cpu_set_core(struct cpuinfo_loongarch *cpuinfo, unsigned int core)
+{
+	/* Ensure the core number fits in the field */
+	WARN_ON(core > (LOONGARCH_GLOBALNUMBER_CORE >> LOONGARCH_GLOBALNUMBER_CORE_SHF));
+
+	cpuinfo->globalnumber &= ~LOONGARCH_GLOBALNUMBER_CORE;
+	cpuinfo->globalnumber |= core << LOONGARCH_GLOBALNUMBER_CORE_SHF;
+}
diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c
new file mode 100644
index 000000000000..176aa06c1226
--- /dev/null
+++ b/arch/loongarch/kernel/efi.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * EFI initialization
+ *
+ * Author: Jianmin Lv <lvjianmin@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/acpi.h>
+#include <linux/efi.h>
+#include <linux/efi-bgrt.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/io.h>
+#include <linux/kobject.h>
+#include <linux/memblock.h>
+#include <linux/reboot.h>
+#include <linux/uaccess.h>
+
+#include <asm/early_ioremap.h>
+#include <asm/efi.h>
+#include <asm/tlb.h>
+#include <asm/loongson.h>
+
+static unsigned long efi_nr_tables;
+static unsigned long efi_config_table;
+static unsigned long screen_info_table __initdata = EFI_INVALID_TABLE_ADDR;
+
+static efi_system_table_t *efi_systab;
+static efi_config_table_type_t arch_tables[] __initdata = {
+	{LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID, &screen_info_table, "SINFO"},
+	{},
+};
+
+static void __init init_screen_info(void)
+{
+	struct screen_info *si;
+
+	if (screen_info_table == EFI_INVALID_TABLE_ADDR)
+		return;
+
+	si = early_memremap_ro(screen_info_table, sizeof(*si));
+	if (!si) {
+		pr_err("Could not map screen_info config table\n");
+		return;
+	}
+	screen_info = *si;
+	early_memunmap(si, sizeof(*si));
+
+	if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI)
+		memblock_reserve(screen_info.lfb_base, screen_info.lfb_size);
+}
+
+static void __init create_tlb(u32 index, u64 vppn, u32 ps, u32 mat)
+{
+	unsigned long tlblo0, tlblo1;
+
+	write_csr_pagesize(ps);
+
+	tlblo0 = vppn | CSR_TLBLO0_V | CSR_TLBLO0_WE |
+		CSR_TLBLO0_GLOBAL | (mat << CSR_TLBLO0_CCA_SHIFT);
+	tlblo1 = tlblo0 + (1 << ps);
+
+	csr_writeq(vppn, LOONGARCH_CSR_TLBEHI);
+	csr_writeq(tlblo0, LOONGARCH_CSR_TLBELO0);
+	csr_writeq(tlblo1, LOONGARCH_CSR_TLBELO1);
+	csr_xchgl(0, CSR_TLBIDX_EHINV, LOONGARCH_CSR_TLBIDX);
+	csr_xchgl(index, CSR_TLBIDX_IDX, LOONGARCH_CSR_TLBIDX);
+
+	tlb_write_indexed();
+}
+
+#define MTLB_ENTRY_INDEX	0x800
+
+/* Create VA == PA mappings, as UEFI firmware expects */
+static void __init fix_efi_mapping(void)
+{
+	unsigned int i;
+	unsigned int index = MTLB_ENTRY_INDEX;
+	unsigned int tlbnr = boot_cpu_data.tlbsizemtlb - 2;
+	unsigned long vppn;
+
+	/* Low Memory, Cached */
+	create_tlb(index++, 0x00000000, PS_128M, 1);
+	/* MMIO Registers, Uncached */
+	create_tlb(index++, 0x10000000, PS_128M, 0);
+
+	/* High Memory, Cached */
+	for (i = 0; i < tlbnr; i++) {
+		vppn = 0x80000000ULL + ((u64)SZ_2G * i);
+		create_tlb(index++, vppn, PS_1G, 1);
+	}
+}
+
+/*
+ * set_virtual_map() - create a virtual mapping for the EFI memory map and call
+ * SetVirtualAddressMap() so that the runtime services enter virtual mode
+ *
+ * This function populates the mem_vaddr fields of all memory region descriptors
+ * in loongson_mem_map whose EFI_MEMORY_RUNTIME attribute is set. Those descriptors
+ * are also copied to the runtime map. Returns 0 on success, -1 on failure.
+ */
+static int __init set_virtual_map(void)
+{
+	int i, count = 0;
+	unsigned int size;
+	unsigned long attr;
+	efi_status_t status;
+	efi_runtime_services_t *rt;
+	efi_set_virtual_address_map_t *svam;
+	efi_memory_desc_t *runtime_map, *out;
+	struct loongsonlist_mem_map *map = loongson_mem_map;
+
+	size = sizeof(struct efi_mmap);
+	out = runtime_map = (efi_memory_desc_t *)&map->map[EFI_RUNTIME_MAP_START];
+
+	for (i = 0; i < map->map_count; i++) {
+		attr = map->map[i].attribute;
+		if (!(attr & EFI_MEMORY_RUNTIME))
+			continue;
+
+		map->map[i].mem_vaddr = TO_CAC(map->map[i].mem_start);
+		map->map[i].mem_size  = map->map[i].mem_size >> EFI_PAGE_SHIFT;
+
+		memcpy(out, &map->map[i], size);
+		out = (void *)out + size;
+		++count;
+
+	}
+
+	rt = early_memremap_ro((unsigned long)efi_systab->runtime, sizeof(*rt));
+
+	/* Install the new virtual address map */
+	svam = rt->set_virtual_address_map;
+
+	fix_efi_mapping();
+
+	status = svam(size * count, size, map->desc_version,
+			(efi_memory_desc_t *)TO_PHYS((unsigned long)runtime_map));
+
+	local_flush_tlb_all();
+	write_csr_pagesize(PS_DEFAULT_SIZE);
+
+	if (status != EFI_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
+void __init efi_runtime_init(void)
+{
+	int status;
+
+	if (!efi_enabled(EFI_BOOT))
+		return;
+
+	if (!efi_systab->runtime)
+		return;
+
+	status = set_virtual_map();
+	if (status < 0)
+		return;
+
+	if (efi_runtime_disabled()) {
+		pr_info("EFI runtime services will be disabled.\n");
+		return;
+	}
+
+	efi.runtime = (efi_runtime_services_t *)efi_systab->runtime;
+	efi.runtime_version = (unsigned int)efi.runtime->hdr.revision;
+
+	efi_native_runtime_setup();
+	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
+}
+
+void __init efi_init(void)
+{
+	int size;
+	void *config_tables;
+
+	if (!efi_bp)
+		return;
+
+	efi_systab = (efi_system_table_t *)early_memremap_ro
+		((unsigned long)efi_bp->systemtable, sizeof(*efi_systab));
+
+	if (!efi_systab) {
+		pr_err("Can't find EFI system table.\n");
+		return;
+	}
+
+	set_bit(EFI_64BIT, &efi.flags);
+	efi_nr_tables	 = efi_systab->nr_tables;
+	efi_config_table = (unsigned long)efi_systab->tables;
+
+	size = sizeof(efi_config_table_t);
+	config_tables = early_memremap(efi_config_table, efi_nr_tables * size);
+	efi_config_parse_tables(config_tables, efi_systab->nr_tables, arch_tables);
+	early_memunmap(config_tables, efi_nr_tables * size);
+
+	init_screen_info();
+}
+
+static ssize_t boardinfo_show(struct kobject *kobj,
+			      struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf,
+		"BIOS Information\n"
+		"Vendor\t\t\t: %s\n"
+		"Version\t\t\t: %s\n"
+		"ROM Size\t\t: %d KB\n"
+		"Release Date\t\t: %s\n\n"
+		"Board Information\n"
+		"Manufacturer\t\t: %s\n"
+		"Board Name\t\t: %s\n"
+		"Family\t\t\t: LOONGSON64\n\n",
+		b_info.bios_vendor, b_info.bios_version,
+		b_info.bios_size, b_info.bios_release_date,
+		b_info.board_vendor, b_info.board_name);
+}
+
+static struct kobj_attribute boardinfo_attr = __ATTR(boardinfo, 0444,
+						     boardinfo_show, NULL);
+
+static int __init boardinfo_init(void)
+{
+	if (!efi_kobj)
+		return -EINVAL;
+
+	return sysfs_create_file(efi_kobj, &boardinfo_attr.attr);
+}
+late_initcall(boardinfo_init);
diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c
new file mode 100644
index 000000000000..491c209ec260
--- /dev/null
+++ b/arch/loongarch/kernel/env.c
@@ -0,0 +1,176 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/export.h>
+#include <linux/acpi.h>
+#include <linux/efi.h>
+#include <asm/early_ioremap.h>
+#include <asm/fw.h>
+#include <asm/time.h>
+#include <asm/bootinfo.h>
+#include <asm/loongson.h>
+
+struct boot_params *efi_bp;
+struct loongsonlist_mem_map *loongson_mem_map;
+struct loongsonlist_vbios *pvbios;
+struct loongson_system_configuration loongson_sysconf;
+EXPORT_SYMBOL(loongson_sysconf);
+
+u64 loongson_chipcfg[MAX_PACKAGES];
+u64 loongson_chiptemp[MAX_PACKAGES];
+u64 loongson_freqctrl[MAX_PACKAGES];
+unsigned long long smp_group[MAX_PACKAGES];
+
+static void __init register_addrs_set(u64 *registers, const u64 addr, int num)
+{
+	u64 i;
+
+	for (i = 0; i < num; i++) {
+		*registers = (i << 44) | addr;
+		registers++;
+	}
+}
+
+static u8 ext_listhdr_checksum(u8 *buffer, u32 length)
+{
+	u8 sum = 0;
+	u8 *end = buffer + length;
+
+	while (buffer < end) {
+		sum = (u8)(sum + *(buffer++));
+	}
+
+	return sum;
+}
+
+static int parse_mem(struct _extention_list_hdr *head)
+{
+	loongson_mem_map = (struct loongsonlist_mem_map *)head;
+
+	if (ext_listhdr_checksum((u8 *)loongson_mem_map, head->length)) {
+		pr_warn("mem checksum error\n");
+		return -EPERM;
+	}
+
+	return 0;
+}
+
+static int parse_vbios(struct _extention_list_hdr *head)
+{
+	pvbios = (struct loongsonlist_vbios *)head;
+
+	if (ext_listhdr_checksum((u8 *)pvbios, head->length)) {
+		pr_warn("vbios_addr checksum error\n");
+		return -EPERM;
+	}
+
+	loongson_sysconf.vgabios_addr =
+		(u64)early_memremap_ro(pvbios->vbios_addr, 64);
+
+	return 0;
+}
+
+static int parse_screeninfo(struct _extention_list_hdr *head)
+{
+	struct loongsonlist_screeninfo *pscreeninfo;
+
+	pscreeninfo = (struct loongsonlist_screeninfo *)head;
+	if (ext_listhdr_checksum((u8 *)pscreeninfo, head->length)) {
+		pr_warn("screeninfo_addr checksum error\n");
+		return -EPERM;
+	}
+
+	memcpy(&screen_info, &pscreeninfo->si, sizeof(screen_info));
+
+	return 0;
+}
+
+static int parse_extlist(struct boot_params *bp)
+{
+	unsigned long next_offset;
+	struct _extention_list_hdr *fhead;
+
+	fhead = (struct _extention_list_hdr *)((void *)bp + bp->extlist_offset);
+	if (fhead == NULL) {
+		pr_warn("the ext struct is empty!\n");
+		return -1;
+	}
+
+	do {
+		next_offset = fhead->next_offset;
+		if (memcmp(&(fhead->signature), LOONGSON_MEM_SIGNATURE, 3) == 0) {
+			if (parse_mem(fhead) != 0) {
+				pr_warn("parse mem failed\n");
+				return -EPERM;
+			}
+		} else if (memcmp(&(fhead->signature), LOONGSON_VBIOS_SIGNATURE, 5) == 0) {
+			if (parse_vbios(fhead) != 0) {
+				pr_warn("parse vbios failed\n");
+				return -EPERM;
+			}
+		} else if (memcmp(&(fhead->signature), LOONGSON_SCREENINFO_SIGNATURE, 5) == 0) {
+			if (parse_screeninfo(fhead) != 0) {
+				pr_warn("parse screeninfo failed\n");
+				return -EPERM;
+			}
+		}
+		fhead = (struct _extention_list_hdr *)((void *)bp + next_offset);
+	} while (next_offset);
+
+	return 0;
+}
+
+static void __init parse_flags(u64 flags)
+{
+	if (flags & BPI_FLAGS_UEFI_SUPPORTED)
+		set_bit(EFI_BOOT, &efi.flags);
+	else
+		clear_bit(EFI_BOOT, &efi.flags);
+}
+
+static int get_bpi_version(void *signature)
+{
+	char data[8];
+	int r, version = 0;
+
+	memset(data, 0, 8);
+	memcpy(data, signature + 4, 4);
+	r = kstrtoint(data, 10, &version);
+
+	if (r < 0 || version < BPI_VERSION_V1)
+		panic("Fatal error, invalid BPI version: %d\n", version);
+
+	if (version >= BPI_VERSION_V2)
+		parse_flags(efi_bp->flags);
+
+	return version;
+}
+
+void __init fw_init_environ(void)
+{
+	efi_bp = (struct boot_params *)_fw_envp;
+	loongson_sysconf.bpi_ver = get_bpi_version(&efi_bp->signature);
+
+	register_addrs_set(smp_group, TO_UNCAC(0x1fe01000), 16);
+	register_addrs_set(loongson_chipcfg, TO_UNCAC(0x1fe00180), 16);
+	register_addrs_set(loongson_chiptemp, TO_UNCAC(0x1fe0019c), 16);
+	register_addrs_set(loongson_freqctrl, TO_UNCAC(0x1fe001d0), 16);
+
+	if (parse_extlist(efi_bp))
+		pr_warn("Scan bootparam failed\n");
+}
+
+static int __init init_cpu_fullname(void)
+{
+	int cpu;
+
+	if (loongson_sysconf.cpuname && !strncmp(loongson_sysconf.cpuname, "Loongson", 8)) {
+		for (cpu = 0; cpu < NR_CPUS; cpu++)
+			__cpu_full_name[cpu] = loongson_sysconf.cpuname;
+	}
+	return 0;
+}
+arch_initcall(init_cpu_fullname);
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
new file mode 100644
index 000000000000..b4a0b28da3e7
--- /dev/null
+++ b/arch/loongarch/kernel/head.S
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/threads.h>
+
+#include <asm/addrspace.h>
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/regdef.h>
+#include <asm/loongarch.h>
+#include <asm/stackframe.h>
+
+SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
+
+	__REF
+
+SYM_CODE_START(kernel_entry)			# kernel entry point
+
+	/* Config direct window and set PG */
+	li.d		t0, CSR_DMW0_INIT	# UC, PLV0, 0x8000 xxxx xxxx xxxx
+	csrwr		t0, LOONGARCH_CSR_DMWIN0
+	li.d		t0, CSR_DMW1_INIT	# CA, PLV0, 0x9000 xxxx xxxx xxxx
+	csrwr		t0, LOONGARCH_CSR_DMWIN1
+	/* Enable PG */
+	li.w		t0, 0xb0		# PLV=0, IE=0, PG=1
+	csrwr		t0, LOONGARCH_CSR_CRMD
+	li.w		t0, 0x04		# PLV=0, PIE=1, PWE=0
+	csrwr		t0, LOONGARCH_CSR_PRMD
+	li.w		t0, 0x00		# FPE=0, SXE=0, ASXE=0, BTE=0
+	csrwr		t0, LOONGARCH_CSR_EUEN
+
+	/* We might not get launched at the address the kernel is linked to,
+	   so we jump there.  */
+	la.abs		t0, 0f
+	jirl		zero, t0, 0
+0:
+	la		t0, __bss_start		# clear .bss
+	st.d		zero, t0, 0
+	la		t1, __bss_stop - LONGSIZE
+1:
+	addi.d		t0, t0, LONGSIZE
+	st.d		zero, t0, 0
+	bne		t0, t1, 1b
+
+	la		t0, fw_arg0
+	st.d		a0, t0, 0		# firmware arguments
+	la		t0, fw_arg1
+	st.d		a1, t0, 0
+	la		t0, fw_arg2
+	st.d		a2, t0, 0
+	la		t0, fw_arg3
+	st.d		a3, t0, 0
+
+	/* KScratch3 used for percpu base, initialized as 0 */
+	csrwr		zero, PERCPU_BASE_KS
+	/* GPR21 used for percpu base (runtime), initialized as 0 */
+	or		u0, zero, zero
+
+	la		tp, init_thread_union
+	/* Set the SP after an empty pt_regs.  */
+	PTR_LI		sp, (_THREAD_SIZE - 32 - PT_SIZE)
+	PTR_ADDU	sp, sp, tp
+	set_saved_sp	sp, t0, t1
+	PTR_ADDIU	sp, sp, -4 * SZREG	# init stack pointer
+
+	bl		start_kernel
+
+SYM_CODE_END(kernel_entry)
+
+SYM_ENTRY(kernel_entry_end, SYM_L_GLOBAL, SYM_A_NONE)
diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c
new file mode 100644
index 000000000000..53a0449f09b2
--- /dev/null
+++ b/arch/loongarch/kernel/mem.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/memblock.h>
+
+#include <asm/bootinfo.h>
+#include <asm/loongson.h>
+#include <asm/sections.h>
+
+void __init early_memblock_init(void)
+{
+	int i;
+	u32 mem_type;
+	u64 mem_start, mem_end, mem_size;
+
+	/* Parse memory information */
+	for (i = 0; i < loongson_mem_map->map_count; i++) {
+		mem_type = loongson_mem_map->map[i].mem_type;
+		mem_start = loongson_mem_map->map[i].mem_start;
+		mem_size = loongson_mem_map->map[i].mem_size;
+		mem_end = mem_start + mem_size;
+
+		switch (mem_type) {
+		case ADDRESS_TYPE_SYSRAM:
+			memblock_add(mem_start, mem_size);
+			if (max_low_pfn < (mem_end >> PAGE_SHIFT))
+				max_low_pfn = mem_end >> PAGE_SHIFT;
+			break;
+		}
+	}
+	memblock_set_current_limit(PFN_PHYS(max_low_pfn));
+	memblock_set_node(0, PHYS_ADDR_MAX, &memblock.memory, 0);
+
+	/* Reserve the first 2MB */
+	memblock_reserve(PHYS_OFFSET, 0x200000);
+
+	/* Reserve the kernel text/data/bss */
+	memblock_reserve(__pa_symbol(&_text),
+			 __pa_symbol(&_end) - __pa_symbol(&_text));
+}
+
+void __init fw_init_memory(void)
+{
+	int i;
+	u32 mem_type;
+	u64 mem_start, mem_end, mem_size;
+	unsigned long start_pfn, end_pfn;
+	static unsigned long num_physpages;
+
+	/* Parse memory information */
+	for (i = 0; i < loongson_mem_map->map_count; i++) {
+		mem_type = loongson_mem_map->map[i].mem_type;
+		mem_start = loongson_mem_map->map[i].mem_start;
+		mem_size = loongson_mem_map->map[i].mem_size;
+		mem_end = mem_start + mem_size;
+
+		switch (mem_type) {
+		case ADDRESS_TYPE_SYSRAM:
+			mem_start = PFN_ALIGN(mem_start);
+			mem_end = PFN_ALIGN(mem_end - PAGE_SIZE + 1);
+			num_physpages += (mem_size >> PAGE_SHIFT);
+			memblock_set_node(mem_start, mem_size, &memblock.memory, 0);
+			break;
+		case ADDRESS_TYPE_ACPI:
+			mem_start = PFN_ALIGN(mem_start);
+			mem_end = PFN_ALIGN(mem_end - PAGE_SIZE + 1);
+			num_physpages += (mem_size >> PAGE_SHIFT);
+			memblock_add(mem_start, mem_size);
+			memblock_set_node(mem_start, mem_size, &memblock.memory, 0);
+			fallthrough;
+		case ADDRESS_TYPE_RESERVED:
+			memblock_reserve(mem_start, mem_size);
+			break;
+		}
+	}
+
+	get_pfn_range_for_nid(0, &start_pfn, &end_pfn);
+	pr_info("start_pfn=0x%lx, end_pfn=0x%lx, num_physpages:0x%lx\n",
+				start_pfn, end_pfn, num_physpages);
+}
diff --git a/arch/loongarch/kernel/reset.c b/arch/loongarch/kernel/reset.c
new file mode 100644
index 000000000000..ef484ce43c5c
--- /dev/null
+++ b/arch/loongarch/kernel/reset.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/kernel.h>
+#include <linux/acpi.h>
+#include <linux/efi.h>
+#include <linux/export.h>
+#include <linux/pm.h>
+#include <linux/types.h>
+#include <linux/reboot.h>
+#include <linux/delay.h>
+#include <linux/console.h>
+
+#include <acpi/reboot.h>
+#include <asm/compiler.h>
+#include <asm/idle.h>
+#include <asm/loongarch.h>
+#include <asm/reboot.h>
+
+static void default_halt(void)
+{
+	local_irq_disable();
+	clear_csr_ecfg(ECFG0_IM);
+
+	pr_notice("\n\n** You can safely turn off the power now **\n\n");
+	console_flush_on_panic(CONSOLE_FLUSH_PENDING);
+
+	while (true) {
+		__arch_cpu_idle();
+	}
+}
+
+static void default_poweroff(void)
+{
+#ifdef CONFIG_EFI
+	efi.reset_system(EFI_RESET_SHUTDOWN, EFI_SUCCESS, 0, NULL);
+#endif
+	while (true) {
+		__arch_cpu_idle();
+	}
+}
+
+static void default_restart(void)
+{
+#ifdef CONFIG_EFI
+	if (efi_capsule_pending(NULL))
+		efi_reboot(REBOOT_WARM, NULL);
+	else
+		efi_reboot(REBOOT_COLD, NULL);
+#endif
+	if (!acpi_disabled)
+		acpi_reboot();
+
+	while (true) {
+		__arch_cpu_idle();
+	}
+}
+
+void (*pm_restart)(void);
+EXPORT_SYMBOL(pm_restart);
+
+void (*pm_power_off)(void);
+EXPORT_SYMBOL(pm_power_off);
+
+void machine_halt(void)
+{
+	default_halt();
+}
+
+void machine_power_off(void)
+{
+	pm_power_off();
+}
+
+void machine_restart(char *command)
+{
+	do_kernel_restart(command);
+	pm_restart();
+}
+
+static int __init loongarch_reboot_setup(void)
+{
+	pm_restart = default_restart;
+	pm_power_off = default_poweroff;
+
+	return 0;
+}
+
+arch_initcall(loongarch_reboot_setup);
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
new file mode 100644
index 000000000000..13c79c9ce558
--- /dev/null
+++ b/arch/loongarch/kernel/setup.c
@@ -0,0 +1,434 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1995 Linus Torvalds
+ * Copyright (C) 1995 Waldorf Electronics
+ * Copyright (C) 1994, 95, 96, 97, 98, 99, 2000, 01, 02, 03  Ralf Baechle
+ * Copyright (C) 1996 Stoned Elipot
+ * Copyright (C) 1999 Silicon Graphics, Inc.
+ * Copyright (C) 2000, 2001, 2002, 2007	 Maciej W. Rozycki
+ */
+#include <linux/init.h>
+#include <linux/acpi.h>
+#include <linux/dmi.h>
+#include <linux/efi.h>
+#include <linux/export.h>
+#include <linux/screen_info.h>
+#include <linux/memblock.h>
+#include <linux/initrd.h>
+#include <linux/ioport.h>
+#include <linux/root_dev.h>
+#include <linux/console.h>
+#include <linux/pfn.h>
+#include <linux/platform_device.h>
+#include <linux/sizes.h>
+#include <linux/device.h>
+#include <linux/dma-map-ops.h>
+#include <linux/swiotlb.h>
+
+#include <asm/addrspace.h>
+#include <asm/bootinfo.h>
+#include <asm/cache.h>
+#include <asm/cpu.h>
+#include <asm/dma.h>
+#include <asm/efi.h>
+#include <asm/fw.h>
+#include <asm/loongson.h>
+#include <asm/pgalloc.h>
+#include <asm/sections.h>
+#include <asm/setup.h>
+#include <asm/time.h>
+
+#define SMBIOS_BIOSSIZE_OFFSET		0x09
+#define SMBIOS_BIOSEXTERN_OFFSET	0x13
+#define SMBIOS_FREQLOW_OFFSET		0x16
+#define SMBIOS_FREQHIGH_OFFSET		0x17
+#define SMBIOS_FREQLOW_MASK		0xFF
+#define SMBIOS_CORE_PACKAGE_OFFSET	0x23
+#define LOONGSON_EFI_ENABLE		(1 << 3)
+
+#ifdef CONFIG_VT
+struct screen_info screen_info;
+#endif
+
+DEFINE_PER_CPU(unsigned long, kernelsp);
+unsigned long fw_arg0, fw_arg1, fw_arg2, fw_arg3;
+struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly;
+
+EXPORT_SYMBOL(cpu_data);
+
+struct loongson_board_info b_info;
+static const char dmi_empty_string[] = "        ";
+
+/*
+ * Setup information
+ *
+ * These are initialized so they are in the .data section
+ */
+
+char __initdata arcs_cmdline[COMMAND_LINE_SIZE];
+static char __initdata command_line[COMMAND_LINE_SIZE];
+
+static int num_standard_resources;
+static struct resource *standard_resources;
+
+static struct resource code_resource = { .name = "Kernel code", };
+static struct resource data_resource = { .name = "Kernel data", };
+static struct resource bss_resource  = { .name = "Kernel bss", };
+
+const char *get_system_type(void)
+{
+	return "generic-loongson-machine";
+}
+
+static const char *dmi_string_parse(const struct dmi_header *dm, u8 s)
+{
+	const u8 *bp = ((u8 *) dm) + dm->length;
+
+	if (s) {
+		s--;
+		while (s > 0 && *bp) {
+			bp += strlen(bp) + 1;
+			s--;
+		}
+
+		if (*bp != 0) {
+			size_t len = strlen(bp)+1;
+			size_t cmp_len = len > 8 ? 8 : len;
+
+			if (!memcmp(bp, dmi_empty_string, cmp_len))
+				return dmi_empty_string;
+
+			return bp;
+		}
+	}
+
+	return "";
+}
+
+static void __init parse_cpu_table(const struct dmi_header *dm)
+{
+	long freq_temp = 0;
+	char *dmi_data = (char *)dm;
+
+	freq_temp = ((*(dmi_data + SMBIOS_FREQHIGH_OFFSET) << 8) +
+			((*(dmi_data + SMBIOS_FREQLOW_OFFSET)) & SMBIOS_FREQLOW_MASK));
+	cpu_clock_freq = freq_temp * 1000000;
+
+	loongson_sysconf.cpuname = (void *)dmi_string_parse(dm, dmi_data[16]);
+	loongson_sysconf.cores_per_package = *(dmi_data + SMBIOS_CORE_PACKAGE_OFFSET);
+
+	pr_info("CpuClock = %llu\n", cpu_clock_freq);
+}
+
+static void __init parse_bios_table(const struct dmi_header *dm)
+{
+	int bios_extern;
+	char *dmi_data = (char *)dm;
+
+	bios_extern = *(dmi_data + SMBIOS_BIOSEXTERN_OFFSET);
+	b_info.bios_size = *(dmi_data + SMBIOS_BIOSSIZE_OFFSET);
+
+	if (bios_extern & LOONGSON_EFI_ENABLE)
+		set_bit(EFI_BOOT, &efi.flags);
+	else
+		clear_bit(EFI_BOOT, &efi.flags);
+}
+
+static void __init find_tokens(const struct dmi_header *dm, void *dummy)
+{
+	switch (dm->type) {
+	case 0x0: /* BIOS Information */
+		parse_bios_table(dm);
+		break;
+	case 0x4: /* Processor Information */
+		parse_cpu_table(dm);
+		break;
+	}
+}
+static void __init smbios_parse(void)
+{
+	b_info.bios_vendor = (void *)dmi_get_system_info(DMI_BIOS_VENDOR);
+	b_info.bios_version = (void *)dmi_get_system_info(DMI_BIOS_VERSION);
+	b_info.bios_release_date = (void *)dmi_get_system_info(DMI_BIOS_DATE);
+	b_info.board_vendor = (void *)dmi_get_system_info(DMI_BOARD_VENDOR);
+	b_info.board_name = (void *)dmi_get_system_info(DMI_BOARD_NAME);
+	dmi_walk(find_tokens, NULL);
+}
+
+/*
+ * Manage initrd
+ */
+#ifdef CONFIG_BLK_DEV_INITRD
+
+static unsigned long __init init_initrd(void)
+{
+	if (!phys_initrd_start || !phys_initrd_size)
+		return 0;
+
+	initrd_start = (unsigned long)phys_to_virt(phys_initrd_start);
+	initrd_end   = (unsigned long)phys_to_virt(phys_initrd_start + phys_initrd_size);
+
+	if (!initrd_start || initrd_end <= initrd_start)
+		goto disable;
+
+	if (initrd_start & ~PAGE_MASK) {
+		pr_err("initrd start must be page aligned\n");
+		goto disable;
+	}
+	if (initrd_start < PAGE_OFFSET) {
+		pr_err("initrd start < PAGE_OFFSET\n");
+		goto disable;
+	}
+
+	ROOT_DEV = Root_RAM0;
+
+	initrd_below_start_ok = 1;
+	memblock_reserve(phys_initrd_start, phys_initrd_size);
+
+	pr_info("Initial ramdisk at: 0x%lx (%lu bytes)\n",
+		initrd_start, initrd_end - initrd_start);
+
+	return 0;
+
+disable:
+	initrd_start = 0;
+	initrd_end = 0;
+	pr_cont(" - disabling initrd\n");
+
+	return 0;
+}
+
+#else  /* !CONFIG_BLK_DEV_INITRD */
+
+static unsigned long __init init_initrd(void)
+{
+	return 0;
+}
+
+#endif
+
+static int usermem __initdata;
+
+static int __init early_parse_mem(char *p)
+{
+	phys_addr_t start, size;
+
+	/*
+	 * If a user specifies memory size, we
+	 * blow away any automatically generated
+	 * size.
+	 */
+	if (usermem == 0) {
+		usermem = 1;
+		memblock_remove(memblock_start_of_DRAM(),
+			memblock_end_of_DRAM() - memblock_start_of_DRAM());
+	}
+	start = 0;
+	size = memparse(p, &p);
+	if (*p == '@')
+		start = memparse(p + 1, &p);
+	else {
+		pr_err("Invalid format!\n");
+		return -EINVAL;
+	}
+
+	memblock_add(start, size);
+
+	return 0;
+}
+early_param("mem", early_parse_mem);
+
+static void __init bootcmdline_append(const char *s, size_t max)
+{
+	if (!s[0] || !max)
+		return;
+
+	if (boot_command_line[0])
+		strlcat(boot_command_line, " ", COMMAND_LINE_SIZE);
+
+	strlcat(boot_command_line, s, max);
+}
+
+static void __init bootcmdline_init(char **cmdline_p)
+{
+	boot_command_line[0] = 0;
+
+	/*
+	 * Take arguments from the bootloader first. Early code should have
+	 * filled arcs_cmdline with arguments from the bootloader.
+	 */
+	bootcmdline_append(arcs_cmdline, COMMAND_LINE_SIZE);
+
+	strlcpy(command_line, boot_command_line, COMMAND_LINE_SIZE);
+	*cmdline_p = command_line;
+
+	parse_early_param();
+}
+
+void __init platform_init(void)
+{
+	efi_init();
+#ifdef CONFIG_ACPI_TABLE_UPGRADE
+	acpi_table_upgrade();
+#endif
+#ifdef CONFIG_ACPI
+	acpi_gbl_use_default_register_widths = false;
+	acpi_boot_table_init();
+	acpi_boot_init();
+#endif
+
+	fw_init_memory();
+	dmi_setup();
+	smbios_parse();
+	pr_info("The BIOS Version: %s\n", b_info.bios_version);
+
+	efi_runtime_init();
+}
+
+static void __init check_kernel_sections_mem(void)
+{
+	phys_addr_t start = __pa_symbol(&_text);
+	phys_addr_t size = __pa_symbol(&_end) - start;
+
+	if (!memblock_is_region_memory(start, size)) {
+		pr_info("Kernel sections are not in the memory maps\n");
+		memblock_add(start, size);
+	}
+}
+
+/*
+ * arch_mem_init - initialize memory management subsystem
+ */
+static void __init arch_mem_init(char **cmdline_p)
+{
+	if (usermem)
+		pr_info("User-defined physical RAM map overwrite\n");
+
+	check_kernel_sections_mem();
+
+	/*
+	 * To reduce the chance of a kernel panic when IO TLB memory cannot be
+	 * allocated under CONFIG_SWIOTLB, use as little low memory as possible
+	 * before swiotlb_init(): make sparse_init() allocate top-down, then
+	 * switch back to bottom-up allocation.
+	 */
+	memblock_set_bottom_up(false);
+	sparse_init();
+	memblock_set_bottom_up(true);
+
+	swiotlb_init(1);
+
+	dma_contiguous_reserve(PFN_PHYS(max_low_pfn));
+
+	memblock_dump_all();
+
+	early_memtest(PFN_PHYS(ARCH_PFN_OFFSET), PFN_PHYS(max_low_pfn));
+}
+
+static void __init resource_init(void)
+{
+	long i = 0;
+	size_t res_size;
+	struct resource *res;
+	struct memblock_region *region;
+
+	code_resource.start = __pa_symbol(&_text);
+	code_resource.end = __pa_symbol(&_etext) - 1;
+	data_resource.start = __pa_symbol(&_etext);
+	data_resource.end = __pa_symbol(&_edata) - 1;
+	bss_resource.start = __pa_symbol(&__bss_start);
+	bss_resource.end = __pa_symbol(&__bss_stop) - 1;
+
+	num_standard_resources = memblock.memory.cnt;
+	res_size = num_standard_resources * sizeof(*standard_resources);
+	standard_resources = memblock_alloc(res_size, SMP_CACHE_BYTES);
+
+	for_each_mem_region(region) {
+		res = &standard_resources[i++];
+		if (!memblock_is_nomap(region)) {
+			res->name  = "System RAM";
+			res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+			res->start = __pfn_to_phys(memblock_region_memory_base_pfn(region));
+			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
+		} else {
+			res->name  = "Reserved";
+			res->flags = IORESOURCE_MEM;
+			res->start = __pfn_to_phys(memblock_region_reserved_base_pfn(region));
+			res->end = __pfn_to_phys(memblock_region_reserved_end_pfn(region)) - 1;
+		}
+
+		request_resource(&iomem_resource, res);
+
+		/*
+		 *  We don't know which RAM region contains kernel data,
+		 *  so we try it repeatedly and let the resource manager
+		 *  test it.
+		 */
+		request_resource(res, &code_resource);
+		request_resource(res, &data_resource);
+		request_resource(res, &bss_resource);
+	}
+}
+
+static int __init reserve_memblock_reserved_regions(void)
+{
+	u64 i, j;
+
+	for (i = 0; i < num_standard_resources; ++i) {
+		struct resource *mem = &standard_resources[i];
+		phys_addr_t r_start, r_end, mem_size = resource_size(mem);
+
+		if (!memblock_is_region_reserved(mem->start, mem_size))
+			continue;
+
+		for_each_reserved_mem_range(j, &r_start, &r_end) {
+			resource_size_t start, end;
+
+			start = max(PFN_PHYS(PFN_DOWN(r_start)), mem->start);
+			end = min(PFN_PHYS(PFN_UP(r_end)) - 1, mem->end);
+
+			if (start > mem->end || end < mem->start)
+				continue;
+
+			reserve_region_with_split(mem, start, end, "Reserved");
+		}
+	}
+
+	return 0;
+}
+arch_initcall(reserve_memblock_reserved_regions);
+
+void __init setup_arch(char **cmdline_p)
+{
+	cpu_probe();
+
+	fw_init_cmdline();
+	fw_init_environ();
+	early_memblock_init();
+	bootcmdline_init(cmdline_p);
+
+	init_initrd();
+	platform_init();
+	pagetable_init();
+
+	arch_mem_init(cmdline_p);
+
+	resource_init();
+
+	paging_init();
+}
+
+static int __init register_gop_device(void)
+{
+	void *pd;
+
+	if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI)
+		return 0;
+	pd = platform_device_register_data(NULL, "efi-framebuffer", 0,
+			&screen_info, sizeof(screen_info));
+	return PTR_ERR_OR_ZERO(pd);
+}
+subsys_initcall(register_gop_device);
diff --git a/arch/loongarch/kernel/time.c b/arch/loongarch/kernel/time.c
new file mode 100644
index 000000000000..5d2b2c6712bc
--- /dev/null
+++ b/arch/loongarch/kernel/time.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Common time service routines for LoongArch machines.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/clockchips.h>
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/sched_clock.h>
+#include <linux/spinlock.h>
+
+#include <asm/cpu-features.h>
+#include <asm/loongarch.h>
+#include <asm/time.h>
+
+u64 cpu_clock_freq;
+EXPORT_SYMBOL(cpu_clock_freq);
+u64 const_clock_freq;
+EXPORT_SYMBOL(const_clock_freq);
+
+static DEFINE_RAW_SPINLOCK(state_lock);
+static DEFINE_PER_CPU(struct clock_event_device, constant_clockevent_device);
+
+static void constant_event_handler(struct clock_event_device *dev)
+{
+}
+
+irqreturn_t constant_timer_interrupt(int irq, void *data)
+{
+	int cpu = smp_processor_id();
+	struct clock_event_device *cd;
+
+	/* Clear Timer Interrupt */
+	write_csr_tintclear(CSR_TINTCLR_TI);
+	cd = &per_cpu(constant_clockevent_device, cpu);
+	cd->event_handler(cd);
+
+	return IRQ_HANDLED;
+}
+
+static int constant_set_state_oneshot(struct clock_event_device *evt)
+{
+	unsigned long timer_config;
+
+	raw_spin_lock(&state_lock);
+
+	timer_config = csr_readq(LOONGARCH_CSR_TCFG);
+	timer_config |= CSR_TCFG_EN;
+	timer_config &= ~CSR_TCFG_PERIOD;
+	csr_writeq(timer_config, LOONGARCH_CSR_TCFG);
+
+	raw_spin_unlock(&state_lock);
+
+	return 0;
+}
+
+static int constant_set_state_oneshot_stopped(struct clock_event_device *evt)
+{
+	unsigned long timer_config;
+
+	raw_spin_lock(&state_lock);
+
+	timer_config = csr_readq(LOONGARCH_CSR_TCFG);
+	timer_config &= ~CSR_TCFG_EN;
+	csr_writeq(timer_config, LOONGARCH_CSR_TCFG);
+
+	raw_spin_unlock(&state_lock);
+
+	return 0;
+}
+
+static int constant_set_state_periodic(struct clock_event_device *evt)
+{
+	unsigned long period;
+	unsigned long timer_config;
+
+	raw_spin_lock(&state_lock);
+
+	period = const_clock_freq / HZ;
+	timer_config = period & CSR_TCFG_VAL;
+	timer_config |= (CSR_TCFG_PERIOD | CSR_TCFG_EN);
+	csr_writeq(timer_config, LOONGARCH_CSR_TCFG);
+
+	raw_spin_unlock(&state_lock);
+
+	return 0;
+}
+
+static int constant_set_state_shutdown(struct clock_event_device *evt)
+{
+	return 0;
+}
+
+static int constant_timer_next_event(unsigned long delta, struct clock_event_device *evt)
+{
+	unsigned long timer_config;
+
+	delta &= CSR_TCFG_VAL;
+	timer_config = delta | CSR_TCFG_EN;
+	csr_writeq(timer_config, LOONGARCH_CSR_TCFG);
+
+	return 0;
+}
+
+static unsigned long __init get_loops_per_jiffy(void)
+{
+	unsigned long lpj = (unsigned long)const_clock_freq;
+
+	do_div(lpj, HZ);
+
+	return lpj;
+}
+
+static long init_timeval;
+
+void sync_counter(void)
+{
+	/* Ensure the counter begins at 0 */
+	csr_writeq(-init_timeval, LOONGARCH_CSR_CNTC);
+}
+
+int constant_clockevent_init(void)
+{
+	unsigned int irq;
+	unsigned int cpu = smp_processor_id();
+	unsigned long min_delta = 0x600;
+	unsigned long max_delta = (1UL << 48) - 1;
+	struct clock_event_device *cd;
+	static int timer_irq_installed = 0;
+
+	irq = get_timer_irq();
+
+	cd = &per_cpu(constant_clockevent_device, cpu);
+
+	cd->name = "Constant";
+	cd->features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_PERCPU;
+
+	cd->irq = irq;
+	cd->rating = 320;
+	cd->cpumask = cpumask_of(cpu);
+	cd->set_state_oneshot = constant_set_state_oneshot;
+	cd->set_state_oneshot_stopped = constant_set_state_oneshot_stopped;
+	cd->set_state_periodic = constant_set_state_periodic;
+	cd->set_state_shutdown = constant_set_state_shutdown;
+	cd->set_next_event = constant_timer_next_event;
+	cd->event_handler = constant_event_handler;
+
+	clockevents_config_and_register(cd, const_clock_freq, min_delta, max_delta);
+
+	if (timer_irq_installed)
+		return 0;
+
+	timer_irq_installed = 1;
+
+	sync_counter();
+
+	if (request_irq(irq, constant_timer_interrupt, IRQF_PERCPU | IRQF_TIMER, "timer", NULL))
+		pr_err("Failed to request irq %d (timer)\n", irq);
+
+	lpj_fine = get_loops_per_jiffy();
+	pr_info("Constant clock event device register\n");
+
+	return 0;
+}
+
+static u64 read_const_counter(struct clocksource *clk)
+{
+	return drdtime();
+}
+
+static u64 native_sched_clock(void)
+{
+	return read_const_counter(NULL);
+}
+
+static struct clocksource clocksource_const = {
+	.name = "Constant",
+	.rating = 400,
+	.read = read_const_counter,
+	.mask = CLOCKSOURCE_MASK(64),
+	.flags = CLOCK_SOURCE_IS_CONTINUOUS,
+	.mult = 0,
+	.shift = 10,
+};
+
+int __init constant_clocksource_init(void)
+{
+	int res;
+	unsigned long freq;
+
+	freq = const_clock_freq;
+
+	clocksource_const.mult =
+		clocksource_hz2mult(freq, clocksource_const.shift);
+
+	res = clocksource_register_hz(&clocksource_const, freq);
+
+	sched_clock_register(native_sched_clock, 64, freq);
+
+	pr_info("Constant clock source device register\n");
+
+	return res;
+}
+
+void __init time_init(void)
+{
+	if (!cpu_has_cpucfg)
+		const_clock_freq = cpu_clock_freq;
+	else
+		const_clock_freq = calc_const_freq();
+
+	init_timeval = drdtime() - csr_readq(LOONGARCH_CSR_CNTC);
+
+	constant_clockevent_init();
+	constant_clocksource_init();
+}
diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
new file mode 100644
index 000000000000..3b2cbb95875b
--- /dev/null
+++ b/arch/loongarch/kernel/topology.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/cpu.h>
+#include <linux/init.h>
+#include <linux/percpu.h>
+
+static struct cpu cpu_device;
+
+static int __init topology_init(void)
+{
+	return register_cpu(&cpu_device, 0);
+}
+
+subsys_initcall(topology_init);
diff --git a/include/linux/efi.h b/include/linux/efi.h
index ccd4d3f91c98..559fabdb6b7d 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -397,6 +397,7 @@ void efi_native_runtime_setup(void);
  * associated with ConOut
  */
 #define LINUX_EFI_ARM_SCREEN_INFO_TABLE_GUID	EFI_GUID(0xe03fc20a, 0x85dc, 0x406e,  0xb9, 0x0e, 0x4a, 0xb5, 0x02, 0x37, 0x1d, 0x95)
+#define LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID	EFI_GUID(0x07fd51a6, 0x9532, 0x926f,  0x51, 0xdc, 0x6a, 0x63, 0x60, 0x2f, 0x84, 0xb4)
 #define LINUX_EFI_ARM_CPU_STATE_TABLE_GUID	EFI_GUID(0xef79e4aa, 0x3c3d, 0x4989,  0xb9, 0x02, 0x07, 0xa9, 0x43, 0xe5, 0x50, 0xd2)
 #define LINUX_EFI_LOADER_ENTRY_GUID		EFI_GUID(0x4a67b082, 0x0a4c, 0x41cf,  0xb6, 0xc7, 0x44, 0x0b, 0x29, 0xbb, 0x8c, 0x4f)
 #define LINUX_EFI_RANDOM_SEED_TABLE_GUID	EFI_GUID(0x1ce1e5bc, 0x7ceb, 0x42f2,  0x81, 0xe5, 0x8a, 0xad, 0xf1, 0x80, 0xf5, 0x7b)
-- 
2.27.0



* [PATCH V9 10/24] LoongArch: Add exception/interrupt handling
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (8 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 09/24] LoongArch: Add boot and setup routines Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-05-01 16:27   ` Xi Ruoyao
  2022-04-30  9:05 ` [PATCH V9 11/24] LoongArch: Add process management Huacai Chen
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds the exception and interrupt handling mechanism for
LoongArch.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/include/asm/branch.h       |  21 +
 arch/loongarch/include/asm/bug.h          |  23 +
 arch/loongarch/include/asm/entry-common.h |  13 +
 arch/loongarch/include/asm/hardirq.h      |  24 +
 arch/loongarch/include/asm/hw_irq.h       |  17 +
 arch/loongarch/include/asm/irq.h          | 130 ++++
 arch/loongarch/include/asm/irq_regs.h     |  27 +
 arch/loongarch/include/asm/irqflags.h     |  78 +++
 arch/loongarch/include/asm/kdebug.h       |  23 +
 arch/loongarch/include/asm/stackframe.h   | 212 ++++++
 arch/loongarch/include/asm/stacktrace.h   |  74 +++
 arch/loongarch/include/uapi/asm/break.h   |  23 +
 arch/loongarch/kernel/access-helper.h     |  13 +
 arch/loongarch/kernel/genex.S             |  95 +++
 arch/loongarch/kernel/irq.c               | 131 ++++
 arch/loongarch/kernel/traps.c             | 753 ++++++++++++++++++++++
 16 files changed, 1657 insertions(+)
 create mode 100644 arch/loongarch/include/asm/branch.h
 create mode 100644 arch/loongarch/include/asm/bug.h
 create mode 100644 arch/loongarch/include/asm/entry-common.h
 create mode 100644 arch/loongarch/include/asm/hardirq.h
 create mode 100644 arch/loongarch/include/asm/hw_irq.h
 create mode 100644 arch/loongarch/include/asm/irq.h
 create mode 100644 arch/loongarch/include/asm/irq_regs.h
 create mode 100644 arch/loongarch/include/asm/irqflags.h
 create mode 100644 arch/loongarch/include/asm/kdebug.h
 create mode 100644 arch/loongarch/include/asm/stackframe.h
 create mode 100644 arch/loongarch/include/asm/stacktrace.h
 create mode 100644 arch/loongarch/include/uapi/asm/break.h
 create mode 100644 arch/loongarch/kernel/access-helper.h
 create mode 100644 arch/loongarch/kernel/genex.S
 create mode 100644 arch/loongarch/kernel/irq.c
 create mode 100644 arch/loongarch/kernel/traps.c

diff --git a/arch/loongarch/include/asm/branch.h b/arch/loongarch/include/asm/branch.h
new file mode 100644
index 000000000000..3f33c89f35b4
--- /dev/null
+++ b/arch/loongarch/include/asm/branch.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_BRANCH_H
+#define _ASM_BRANCH_H
+
+#include <asm/ptrace.h>
+
+static inline unsigned long exception_era(struct pt_regs *regs)
+{
+	return regs->csr_era;
+}
+
+static inline int compute_return_era(struct pt_regs *regs)
+{
+	regs->csr_era += 4;
+	return 0;
+}
+
+#endif /* _ASM_BRANCH_H */
diff --git a/arch/loongarch/include/asm/bug.h b/arch/loongarch/include/asm/bug.h
new file mode 100644
index 000000000000..bda49108a76d
--- /dev/null
+++ b/arch/loongarch/include/asm/bug.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_BUG_H
+#define __ASM_BUG_H
+
+#include <linux/compiler.h>
+
+#ifdef CONFIG_BUG
+
+#include <asm/break.h>
+
+static inline void __noreturn BUG(void)
+{
+	__asm__ __volatile__("break %0" : : "i" (BRK_BUG));
+	unreachable();
+}
+
+#define HAVE_ARCH_BUG
+
+#endif
+
+#include <asm-generic/bug.h>
+
+#endif /* __ASM_BUG_H */
diff --git a/arch/loongarch/include/asm/entry-common.h b/arch/loongarch/include/asm/entry-common.h
new file mode 100644
index 000000000000..0fe2a098ded9
--- /dev/null
+++ b/arch/loongarch/include/asm/entry-common.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef ARCH_LOONGARCH_ENTRY_COMMON_H
+#define ARCH_LOONGARCH_ENTRY_COMMON_H
+
+#include <linux/sched.h>
+#include <linux/processor.h>
+
+static inline bool on_thread_stack(void)
+{
+	return !(((unsigned long)(current->stack) ^ current_stack_pointer) & ~(THREAD_SIZE - 1));
+}
+
+#endif
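
The test in on_thread_stack() above relies on an XOR-and-mask idiom: two addresses lie in the same power-of-two-aligned block exactly when they differ only in the low offset bits. A minimal sketch, assuming the task stack is THREAD_SIZE-aligned (demo_same_block() is a hypothetical name used only for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: true when a and b fall in the same size-aligned
 * block. size must be a power of two.
 */
static bool demo_same_block(uint64_t a, uint64_t b, uint64_t size)
{
	/* XOR keeps the differing bits; the mask drops the in-block offset. */
	return !((a ^ b) & ~(size - 1));
}
```

With a 16 KiB block, 0x4000 and 0x7ff0 compare as the same block, while 0x4000 and 0x8000 do not.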
diff --git a/arch/loongarch/include/asm/hardirq.h b/arch/loongarch/include/asm/hardirq.h
new file mode 100644
index 000000000000..d32f83938880
--- /dev/null
+++ b/arch/loongarch/include/asm/hardirq.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_HARDIRQ_H
+#define _ASM_HARDIRQ_H
+
+#include <linux/cache.h>
+#include <linux/threads.h>
+#include <linux/irq.h>
+
+extern void ack_bad_irq(unsigned int irq);
+#define ack_bad_irq ack_bad_irq
+
+#define NR_IPI	2
+
+typedef struct {
+	unsigned int ipi_irqs[NR_IPI];
+	unsigned int __softirq_pending;
+} ____cacheline_aligned irq_cpustat_t;
+
+DECLARE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat);
+
+#endif /* _ASM_HARDIRQ_H */
diff --git a/arch/loongarch/include/asm/hw_irq.h b/arch/loongarch/include/asm/hw_irq.h
new file mode 100644
index 000000000000..af4f4e8fbd85
--- /dev/null
+++ b/arch/loongarch/include/asm/hw_irq.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_HW_IRQ_H
+#define __ASM_HW_IRQ_H
+
+#include <linux/atomic.h>
+
+extern atomic_t irq_err_count;
+
+/*
+ * interrupt-retrigger: NOP for now. This may not be appropriate for all
+ * machines, we'll see ...
+ */
+
+#endif /* __ASM_HW_IRQ_H */
diff --git a/arch/loongarch/include/asm/irq.h b/arch/loongarch/include/asm/irq.h
new file mode 100644
index 000000000000..cd95d0d4e10f
--- /dev/null
+++ b/arch/loongarch/include/asm/irq.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_IRQ_H
+#define _ASM_IRQ_H
+
+#include <linux/irqdomain.h>
+#include <linux/irqreturn.h>
+
+#define IRQ_STACK_SIZE			THREAD_SIZE
+#define IRQ_STACK_START			(IRQ_STACK_SIZE - 16)
+
+DECLARE_PER_CPU(unsigned long, irq_stack);
+
+/*
+ * The highest address on the IRQ stack contains a dummy frame which is
+ * structured as follows:
+ *
+ *   top ------------
+ *       | task sp  | <- irq_stack[cpu] + IRQ_STACK_START
+ *       ------------
+ *       |          | <- First frame of IRQ context
+ *       ------------
+ *
+ * task sp holds a copy of the task stack pointer where the struct pt_regs
+ * from exception entry can be found.
+ */
+
+static inline bool on_irq_stack(int cpu, unsigned long sp)
+{
+	unsigned long low = per_cpu(irq_stack, cpu);
+	unsigned long high = low + IRQ_STACK_SIZE;
+
+	return (low <= sp && sp <= high);
+}
+
+int get_ipi_irq(void);
+int get_pmc_irq(void);
+int get_timer_irq(void);
+void spurious_interrupt(void);
+
+#define NR_IRQS_LEGACY 16
+
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+void arch_trigger_cpumask_backtrace(const struct cpumask *mask, bool exclude_self);
+
+#define MAX_IO_PICS 2
+#define NR_IRQS	(64 + (256 * MAX_IO_PICS))
+
+#define CORES_PER_EIO_NODE	4
+
+#define LOONGSON_CPU_UART0_VEC		10 /* CPU UART0 */
+#define LOONGSON_CPU_THSENS_VEC		14 /* CPU Thsens */
+#define LOONGSON_CPU_HT0_VEC		16 /* CPU HT0 irq vector base number */
+#define LOONGSON_CPU_HT1_VEC		24 /* CPU HT1 irq vector base number */
+
+/* IRQ number definitions */
+#define LOONGSON_LPC_IRQ_BASE		0
+#define LOONGSON_LPC_LAST_IRQ		(LOONGSON_LPC_IRQ_BASE + 15)
+
+#define LOONGSON_CPU_IRQ_BASE		16
+#define LOONGSON_CPU_LAST_IRQ		(LOONGSON_CPU_IRQ_BASE + 14)
+
+#define LOONGSON_PCH_IRQ_BASE		64
+#define LOONGSON_PCH_ACPI_IRQ		(LOONGSON_PCH_IRQ_BASE + 47)
+#define LOONGSON_PCH_LAST_IRQ		(LOONGSON_PCH_IRQ_BASE + 64 - 1)
+
+#define LOONGSON_MSI_IRQ_BASE		(LOONGSON_PCH_IRQ_BASE + 64)
+#define LOONGSON_MSI_LAST_IRQ		(LOONGSON_PCH_IRQ_BASE + 256 - 1)
+
+#define GSI_MIN_LPC_IRQ		LOONGSON_LPC_IRQ_BASE
+#define GSI_MAX_LPC_IRQ		(LOONGSON_LPC_IRQ_BASE + 16 - 1)
+#define GSI_MIN_CPU_IRQ		LOONGSON_CPU_IRQ_BASE
+#define GSI_MAX_CPU_IRQ		(LOONGSON_CPU_IRQ_BASE + 48 - 1)
+#define GSI_MIN_PCH_IRQ		LOONGSON_PCH_IRQ_BASE
+#define GSI_MAX_PCH_IRQ		(LOONGSON_PCH_IRQ_BASE + 256 - 1)
+
+extern int find_pch_pic(u32 gsi);
+extern int eiointc_get_node(int id);
+
+static inline void eiointc_enable(void)
+{
+	uint64_t misc;
+
+	misc = iocsr_readq(LOONGARCH_IOCSR_MISC_FUNC);
+	misc |= IOCSR_MISC_FUNC_EXT_IOI_EN;
+	iocsr_writeq(misc, LOONGARCH_IOCSR_MISC_FUNC);
+}
+
+struct acpi_madt_lio_pic;
+struct acpi_madt_eio_pic;
+struct acpi_madt_ht_pic;
+struct acpi_madt_bio_pic;
+struct acpi_madt_msi_pic;
+struct acpi_madt_lpc_pic;
+
+struct irq_domain *loongarch_cpu_irq_init(void);
+
+struct irq_domain *liointc_acpi_init(struct irq_domain *parent,
+					struct acpi_madt_lio_pic *acpi_liointc);
+struct irq_domain *eiointc_acpi_init(struct irq_domain *parent,
+					struct acpi_madt_eio_pic *acpi_eiointc);
+
+struct irq_domain *htvec_acpi_init(struct irq_domain *parent,
+					struct acpi_madt_ht_pic *acpi_htvec);
+struct irq_domain *pch_lpc_acpi_init(struct irq_domain *parent,
+					struct acpi_madt_lpc_pic *acpi_pchlpc);
+struct irq_domain *pch_msi_acpi_init(struct irq_domain *parent,
+					struct acpi_madt_msi_pic *acpi_pchmsi);
+struct irq_domain *pch_pic_acpi_init(struct irq_domain *parent,
+					struct acpi_madt_bio_pic *acpi_pchpic);
+
+extern struct acpi_madt_lio_pic *acpi_liointc;
+extern struct acpi_madt_eio_pic *acpi_eiointc[MAX_IO_PICS];
+
+extern struct acpi_madt_ht_pic *acpi_htintc;
+extern struct acpi_madt_lpc_pic *acpi_pchlpc;
+extern struct acpi_madt_msi_pic *acpi_pchmsi[MAX_IO_PICS];
+extern struct acpi_madt_bio_pic *acpi_pchpic[MAX_IO_PICS];
+
+extern struct irq_domain *cpu_domain;
+extern struct irq_domain *liointc_domain;
+extern struct irq_domain *pch_lpc_domain;
+extern struct irq_domain *pch_msi_domain[MAX_IO_PICS];
+extern struct irq_domain *pch_pic_domain[MAX_IO_PICS];
+
+#include <asm-generic/irq.h>
+
+#endif /* _ASM_IRQ_H */
diff --git a/arch/loongarch/include/asm/irq_regs.h b/arch/loongarch/include/asm/irq_regs.h
new file mode 100644
index 000000000000..3d62d815bf6b
--- /dev/null
+++ b/arch/loongarch/include/asm/irq_regs.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_IRQ_REGS_H
+#define __ASM_IRQ_REGS_H
+
+#define ARCH_HAS_OWN_IRQ_REGS
+
+#include <linux/thread_info.h>
+
+static inline struct pt_regs *get_irq_regs(void)
+{
+	return current_thread_info()->regs;
+}
+
+static inline struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
+{
+	struct pt_regs *old_regs;
+
+	old_regs = get_irq_regs();
+	current_thread_info()->regs = new_regs;
+
+	return old_regs;
+}
+
+#endif /* __ASM_IRQ_REGS_H */
diff --git a/arch/loongarch/include/asm/irqflags.h b/arch/loongarch/include/asm/irqflags.h
new file mode 100644
index 000000000000..52121cd791fe
--- /dev/null
+++ b/arch/loongarch/include/asm/irqflags.h
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_IRQFLAGS_H
+#define _ASM_IRQFLAGS_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/compiler.h>
+#include <linux/stringify.h>
+#include <asm/compiler.h>
+#include <asm/loongarch.h>
+
+static inline void arch_local_irq_enable(void)
+{
+	u32 flags = CSR_CRMD_IE;
+	__asm__ __volatile__(
+		"csrxchg %[val], %[mask], %[reg]\n\t"
+		: [val] "+r" (flags)
+		: [mask] "r" (CSR_CRMD_IE), [reg] "i" (LOONGARCH_CSR_CRMD)
+		: "memory");
+}
+
+static inline void arch_local_irq_disable(void)
+{
+	u32 flags = 0;
+	__asm__ __volatile__(
+		"csrxchg %[val], %[mask], %[reg]\n\t"
+		: [val] "+r" (flags)
+		: [mask] "r" (CSR_CRMD_IE), [reg] "i" (LOONGARCH_CSR_CRMD)
+		: "memory");
+}
+
+static inline unsigned long arch_local_irq_save(void)
+{
+	u32 flags = 0;
+	__asm__ __volatile__(
+		"csrxchg %[val], %[mask], %[reg]\n\t"
+		: [val] "+r" (flags)
+		: [mask] "r" (CSR_CRMD_IE), [reg] "i" (LOONGARCH_CSR_CRMD)
+		: "memory");
+	return flags;
+}
+
+static inline void arch_local_irq_restore(unsigned long flags)
+{
+	__asm__ __volatile__(
+		"csrxchg %[val], %[mask], %[reg]\n\t"
+		: [val] "+r" (flags)
+		: [mask] "r" (CSR_CRMD_IE), [reg] "i" (LOONGARCH_CSR_CRMD)
+		: "memory");
+}
+
+static inline unsigned long arch_local_save_flags(void)
+{
+	u32 flags;
+	__asm__ __volatile__(
+		"csrrd %[val], %[reg]\n\t"
+		: [val] "=r" (flags)
+		: [reg] "i" (LOONGARCH_CSR_CRMD)
+		: "memory");
+	return flags;
+}
+
+static inline int arch_irqs_disabled_flags(unsigned long flags)
+{
+	return !(flags & CSR_CRMD_IE);
+}
+
+static inline int arch_irqs_disabled(void)
+{
+	return arch_irqs_disabled_flags(arch_local_save_flags());
+}
+
+#endif /* #ifndef __ASSEMBLY__ */
+
+#endif /* _ASM_IRQFLAGS_H */
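
The irqflags.h helpers above all lean on one property of csrxchg: it updates only the masked bits of the CSR and hands back the previous value. The sketch below models that in plain C so the save/restore nesting can be followed without hardware. demo_crmd, demo_csrxchg() and DEMO_CRMD_IE are hypothetical, and the IE bit position (bit 2) is an assumption taken from my reading of the ISA manual.

```c
#include <stdint.h>

#define DEMO_CRMD_IE (1u << 2)	/* assumed position of the IE bit */

static uint32_t demo_crmd;	/* stands in for the CRMD register */

/* Model of csrxchg: update only the masked bits, return the old value. */
static uint32_t demo_csrxchg(uint32_t val, uint32_t mask)
{
	uint32_t old = demo_crmd;

	demo_crmd = (old & ~mask) | (val & mask);
	return old;
}

static uint32_t demo_irq_save(void)
{
	return demo_csrxchg(0, DEMO_CRMD_IE);	/* clear IE, hand back old CRMD */
}

static void demo_irq_restore(uint32_t flags)
{
	demo_csrxchg(flags, DEMO_CRMD_IE);	/* put back only the saved IE bit */
}
```

Because the restore step writes through the IE mask, the other bits in the saved flags word are ignored, which is what lets nested save/restore pairs unwind in order.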
diff --git a/arch/loongarch/include/asm/kdebug.h b/arch/loongarch/include/asm/kdebug.h
new file mode 100644
index 000000000000..d721b4b82fae
--- /dev/null
+++ b/arch/loongarch/include/asm/kdebug.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_LOONGARCH_KDEBUG_H
+#define _ASM_LOONGARCH_KDEBUG_H
+
+#include <linux/notifier.h>
+
+enum die_val {
+	DIE_OOPS = 1,
+	DIE_RI,
+	DIE_FP,
+	DIE_SIMD,
+	DIE_TRAP,
+	DIE_PAGE_FAULT,
+	DIE_BREAK,
+	DIE_SSTEPBP,
+	DIE_UPROBE,
+	DIE_UPROBE_XOL,
+};
+
+#endif /* _ASM_LOONGARCH_KDEBUG_H */
diff --git a/arch/loongarch/include/asm/stackframe.h b/arch/loongarch/include/asm/stackframe.h
new file mode 100644
index 000000000000..fed198fbd51d
--- /dev/null
+++ b/arch/loongarch/include/asm/stackframe.h
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_STACKFRAME_H
+#define _ASM_STACKFRAME_H
+
+#include <linux/threads.h>
+
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/asm-offsets.h>
+#include <asm/loongarch.h>
+#include <asm/thread_info.h>
+
+/* Make the addition of cfi info a little easier. */
+	.macro cfi_rel_offset reg offset=0 docfi=0
+	.if \docfi
+	.cfi_rel_offset \reg, \offset
+	.endif
+	.endm
+
+	.macro cfi_st reg offset=0 docfi=0
+	cfi_rel_offset \reg, \offset, \docfi
+	LONG_S	\reg, sp, \offset
+	.endm
+
+	.macro cfi_restore reg offset=0 docfi=0
+	.if \docfi
+	.cfi_restore \reg
+	.endif
+	.endm
+
+	.macro cfi_ld reg offset=0 docfi=0
+	LONG_L	\reg, sp, \offset
+	cfi_restore \reg \offset \docfi
+	.endm
+
+	.macro BACKUP_T0T1
+	csrwr	t0, EXCEPTION_KS0
+	csrwr	t1, EXCEPTION_KS1
+	.endm
+
+	.macro RELOAD_T0T1
+	csrrd   t0, EXCEPTION_KS0
+	csrrd   t1, EXCEPTION_KS1
+	.endm
+
+	.macro	SAVE_TEMP docfi=0
+	RELOAD_T0T1
+	cfi_st	t0, PT_R12, \docfi
+	cfi_st	t1, PT_R13, \docfi
+	cfi_st	t2, PT_R14, \docfi
+	cfi_st	t3, PT_R15, \docfi
+	cfi_st	t4, PT_R16, \docfi
+	cfi_st	t5, PT_R17, \docfi
+	cfi_st	t6, PT_R18, \docfi
+	cfi_st	t7, PT_R19, \docfi
+	cfi_st	t8, PT_R20, \docfi
+	.endm
+
+	.macro	SAVE_STATIC docfi=0
+	cfi_st	s0, PT_R23, \docfi
+	cfi_st	s1, PT_R24, \docfi
+	cfi_st	s2, PT_R25, \docfi
+	cfi_st	s3, PT_R26, \docfi
+	cfi_st	s4, PT_R27, \docfi
+	cfi_st	s5, PT_R28, \docfi
+	cfi_st	s6, PT_R29, \docfi
+	cfi_st	s7, PT_R30, \docfi
+	cfi_st	s8, PT_R31, \docfi
+	.endm
+
+/*
+ * get_saved_sp returns the SP for the current CPU by looking in the
+ * kernelsp array for it. It stores the current sp in t0 and loads the
+ * new value in sp.
+ */
+	.macro	get_saved_sp docfi=0
+	la.abs	t1, kernelsp
+	move	t0, sp
+	.if \docfi
+	.cfi_register sp, t0
+	.endif
+	LONG_L	sp, t1, 0
+	.endm
+
+	.macro	set_saved_sp stackp temp temp2
+	la.abs	\temp, kernelsp
+	LONG_S	\stackp, \temp, 0
+	.endm
+
+	.macro	SAVE_SOME docfi=0
+	csrrd	t1, LOONGARCH_CSR_PRMD
+	andi	t1, t1, 0x3	/* extract the 2-bit PPLV field */
+	move	t0, sp
+	beqz	t1, 8f
+	/* Called from user mode, new stack. */
+	get_saved_sp docfi=\docfi
+8:
+	PTR_ADDIU sp, sp, -PT_SIZE
+	.if \docfi
+	.cfi_def_cfa sp, 0
+	.endif
+	cfi_st	t0, PT_R3, \docfi
+	cfi_rel_offset  sp, PT_R3, \docfi
+	LONG_S	zero, sp, PT_R0
+	csrrd	t0, LOONGARCH_CSR_PRMD
+	LONG_S	t0, sp, PT_PRMD
+	csrrd	t0, LOONGARCH_CSR_CRMD
+	LONG_S	t0, sp, PT_CRMD
+	csrrd	t0, LOONGARCH_CSR_EUEN
+	LONG_S  t0, sp, PT_EUEN
+	csrrd	t0, LOONGARCH_CSR_ECFG
+	LONG_S	t0, sp, PT_ECFG
+	csrrd	t0, LOONGARCH_CSR_ESTAT
+	PTR_S	t0, sp, PT_ESTAT
+	cfi_st	ra, PT_R1, \docfi
+	cfi_st	a0, PT_R4, \docfi
+	cfi_st	a1, PT_R5, \docfi
+	cfi_st	a2, PT_R6, \docfi
+	cfi_st	a3, PT_R7, \docfi
+	cfi_st	a4, PT_R8, \docfi
+	cfi_st	a5, PT_R9, \docfi
+	cfi_st	a6, PT_R10, \docfi
+	cfi_st	a7, PT_R11, \docfi
+	csrrd	ra, LOONGARCH_CSR_ERA
+	LONG_S	ra, sp, PT_ERA
+	.if \docfi
+	.cfi_rel_offset ra, PT_ERA
+	.endif
+	cfi_st	tp, PT_R2, \docfi
+	cfi_st	fp, PT_R22, \docfi
+
+	/* Set thread_info if we're coming from user mode */
+	csrrd	t0, LOONGARCH_CSR_PRMD
+	andi	t0, t0, 0x3	/* extract the 2-bit PPLV field */
+	beqz	t0, 9f
+
+	li.d	tp, ~_THREAD_MASK
+	and	tp, tp, sp
+	cfi_st  u0, PT_R21, \docfi
+	csrrd	u0, PERCPU_BASE_KS
+9:
+	.endm
+
+	.macro	SAVE_ALL docfi=0
+	SAVE_SOME \docfi
+	SAVE_TEMP \docfi
+	SAVE_STATIC \docfi
+	.endm
+
+	.macro	RESTORE_TEMP docfi=0
+	cfi_ld	t0, PT_R12, \docfi
+	cfi_ld	t1, PT_R13, \docfi
+	cfi_ld	t2, PT_R14, \docfi
+	cfi_ld	t3, PT_R15, \docfi
+	cfi_ld	t4, PT_R16, \docfi
+	cfi_ld	t5, PT_R17, \docfi
+	cfi_ld	t6, PT_R18, \docfi
+	cfi_ld	t7, PT_R19, \docfi
+	cfi_ld	t8, PT_R20, \docfi
+	.endm
+
+	.macro	RESTORE_STATIC docfi=0
+	cfi_ld	s0, PT_R23, \docfi
+	cfi_ld	s1, PT_R24, \docfi
+	cfi_ld	s2, PT_R25, \docfi
+	cfi_ld	s3, PT_R26, \docfi
+	cfi_ld	s4, PT_R27, \docfi
+	cfi_ld	s5, PT_R28, \docfi
+	cfi_ld	s6, PT_R29, \docfi
+	cfi_ld	s7, PT_R30, \docfi
+	cfi_ld	s8, PT_R31, \docfi
+	.endm
+
+	.macro	RESTORE_SOME docfi=0
+	LONG_L	a0, sp, PT_PRMD
+	andi    a0, a0, 0x3	/* extract the 2-bit PPLV field */
+	beqz    a0, 8f
+	cfi_ld  u0, PT_R21, \docfi
+8:
+	LONG_L	a0, sp, PT_ERA
+	csrwr	a0, LOONGARCH_CSR_ERA
+	LONG_L	a0, sp, PT_PRMD
+	csrwr	a0, LOONGARCH_CSR_PRMD
+	cfi_ld	ra, PT_R1, \docfi
+	cfi_ld	a0, PT_R4, \docfi
+	cfi_ld	a1, PT_R5, \docfi
+	cfi_ld	a2, PT_R6, \docfi
+	cfi_ld	a3, PT_R7, \docfi
+	cfi_ld	a4, PT_R8, \docfi
+	cfi_ld	a5, PT_R9, \docfi
+	cfi_ld	a6, PT_R10, \docfi
+	cfi_ld	a7, PT_R11, \docfi
+	cfi_ld	tp, PT_R2, \docfi
+	cfi_ld	fp, PT_R22, \docfi
+	.endm
+
+	.macro	RESTORE_SP_AND_RET docfi=0
+	cfi_ld	sp, PT_R3, \docfi
+	ertn
+	.endm
+
+	.macro	RESTORE_ALL_AND_RET docfi=0
+	RESTORE_STATIC \docfi
+	RESTORE_TEMP \docfi
+	RESTORE_SOME \docfi
+	RESTORE_SP_AND_RET \docfi
+	.endm
+
+#endif /* _ASM_STACKFRAME_H */
diff --git a/arch/loongarch/include/asm/stacktrace.h b/arch/loongarch/include/asm/stacktrace.h
new file mode 100644
index 000000000000..26483e396ad1
--- /dev/null
+++ b/arch/loongarch/include/asm/stacktrace.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_STACKTRACE_H
+#define _ASM_STACKTRACE_H
+
+#include <asm/asm.h>
+#include <asm/ptrace.h>
+#include <asm/loongarch.h>
+#include <linux/stringify.h>
+
+#define STR_LONG_L    __stringify(LONG_L)
+#define STR_LONG_S    __stringify(LONG_S)
+#define STR_LONGSIZE  __stringify(LONGSIZE)
+
+#define STORE_ONE_REG(r) \
+    STR_LONG_S   " $r" __stringify(r)", %1, "STR_LONGSIZE"*"__stringify(r)"\n\t"
+
+#define CSRRD_ONE_REG(reg) \
+    __stringify(csrrd) " %0, "__stringify(reg)"\n\t"
+
+static __always_inline void prepare_frametrace(struct pt_regs *regs)
+{
+	__asm__ __volatile__(
+		/* Save $r1 */
+		STORE_ONE_REG(1)
+		/* Use $r1 to save PC */
+		"pcaddi	$r1, 0\n\t"
+		STR_LONG_S " $r1, %0\n\t"
+		/* Restore $r1 */
+		STR_LONG_L " $r1, %1, "STR_LONGSIZE"\n\t"
+		STORE_ONE_REG(2)
+		STORE_ONE_REG(3)
+		STORE_ONE_REG(4)
+		STORE_ONE_REG(5)
+		STORE_ONE_REG(6)
+		STORE_ONE_REG(7)
+		STORE_ONE_REG(8)
+		STORE_ONE_REG(9)
+		STORE_ONE_REG(10)
+		STORE_ONE_REG(11)
+		STORE_ONE_REG(12)
+		STORE_ONE_REG(13)
+		STORE_ONE_REG(14)
+		STORE_ONE_REG(15)
+		STORE_ONE_REG(16)
+		STORE_ONE_REG(17)
+		STORE_ONE_REG(18)
+		STORE_ONE_REG(19)
+		STORE_ONE_REG(20)
+		STORE_ONE_REG(21)
+		STORE_ONE_REG(22)
+		STORE_ONE_REG(23)
+		STORE_ONE_REG(24)
+		STORE_ONE_REG(25)
+		STORE_ONE_REG(26)
+		STORE_ONE_REG(27)
+		STORE_ONE_REG(28)
+		STORE_ONE_REG(29)
+		STORE_ONE_REG(30)
+		STORE_ONE_REG(31)
+		: "=m" (regs->csr_era)
+		: "r" (regs->regs)
+		: "memory");
+	__asm__ __volatile__(CSRRD_ONE_REG(LOONGARCH_CSR_BADV) : "=r" (regs->csr_badvaddr));
+	__asm__ __volatile__(CSRRD_ONE_REG(LOONGARCH_CSR_CRMD) : "=r" (regs->csr_crmd));
+	__asm__ __volatile__(CSRRD_ONE_REG(LOONGARCH_CSR_PRMD) : "=r" (regs->csr_prmd));
+	__asm__ __volatile__(CSRRD_ONE_REG(LOONGARCH_CSR_EUEN) : "=r" (regs->csr_euen));
+	__asm__ __volatile__(CSRRD_ONE_REG(LOONGARCH_CSR_ECFG) : "=r" (regs->csr_ecfg));
+	__asm__ __volatile__(CSRRD_ONE_REG(LOONGARCH_CSR_ESTAT) : "=r" (regs->csr_estat));
+}
+
+#endif /* _ASM_STACKTRACE_H */
diff --git a/arch/loongarch/include/uapi/asm/break.h b/arch/loongarch/include/uapi/asm/break.h
new file mode 100644
index 000000000000..bb9b82ba59f2
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/break.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __UAPI_ASM_BREAK_H
+#define __UAPI_ASM_BREAK_H
+
+#define BRK_DEFAULT		0	/* Used as default */
+#define BRK_BUG			1	/* Used by BUG() */
+#define BRK_KDB			2	/* Used in KDB_ENTER() */
+#define BRK_MATHEMU		3	/* Used by FPU emulator */
+#define BRK_USERBP		4	/* User bp (used by debuggers) */
+#define BRK_SSTEPBP		5	/* Single-step bp (used by debuggers) */
+#define BRK_OVERFLOW		6	/* Overflow check */
+#define BRK_DIVZERO		7	/* Divide by zero check */
+#define BRK_RANGE		8	/* Range error check */
+#define BRK_MULOVFL		9	/* Multiply overflow */
+#define BRK_KPROBE_BP		10	/* Kprobe break */
+#define BRK_KPROBE_SSTEPBP	11	/* Kprobe single step break */
+#define BRK_UPROBE_BP		12	/* See <asm/uprobes.h> */
+#define BRK_UPROBE_XOLBP	13	/* See <asm/uprobes.h> */
+
+#endif /* __UAPI_ASM_BREAK_H */
diff --git a/arch/loongarch/kernel/access-helper.h b/arch/loongarch/kernel/access-helper.h
new file mode 100644
index 000000000000..4a35ca81bd08
--- /dev/null
+++ b/arch/loongarch/kernel/access-helper.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/uaccess.h>
+
+static inline int __get_inst(u32 *i, u32 *p, bool user)
+{
+	return user ? get_user(*i, (u32 __user *)p) : get_kernel_nofault(*i, p);
+}
+
+static inline int __get_addr(unsigned long *a, unsigned long *p, bool user)
+{
+	return user ? get_user(*a, (unsigned long __user *)p) : get_kernel_nofault(*a, p);
+}
diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
new file mode 100644
index 000000000000..93496852b3cc
--- /dev/null
+++ b/arch/loongarch/kernel/genex.S
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1994 - 2000, 2001, 2003 Ralf Baechle
+ * Copyright (C) 1999, 2000 Silicon Graphics, Inc.
+ * Copyright (C) 2002, 2007  Maciej W. Rozycki
+ * Copyright (C) 2001, 2012 MIPS Technologies, Inc.  All rights reserved.
+ */
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/loongarch.h>
+#include <asm/regdef.h>
+#include <asm/fpregdef.h>
+#include <asm/stackframe.h>
+#include <asm/thread_info.h>
+
+	.align	5
+SYM_FUNC_START(__arch_cpu_idle)
+	/* start of rollback region */
+	LONG_L	t0, tp, TI_FLAGS
+	nop
+	andi	t0, t0, _TIF_NEED_RESCHED
+	bnez	t0, 1f
+	nop
+	nop
+	nop
+	idle	0
+	/* end of rollback region */
+1:	jirl	zero, ra, 0
+SYM_FUNC_END(__arch_cpu_idle)
+
+SYM_FUNC_START(handle_vint)
+	BACKUP_T0T1
+	SAVE_ALL
+	la.abs	t1, __arch_cpu_idle
+	LONG_L  t0, sp, PT_ERA
+	/* 32 byte rollback region */
+	ori	t0, t0, 0x1f
+	xori	t0, t0, 0x1f
+	bne	t0, t1, 1f
+	LONG_S  t0, sp, PT_ERA
+1:	move	a0, sp
+	move	a1, sp
+	la.abs	t0, do_vint
+	jirl    ra, t0, 0
+	RESTORE_ALL_AND_RET
+SYM_FUNC_END(handle_vint)
+
+SYM_FUNC_START(except_vec_cex)
+	b	cache_parity_error
+SYM_FUNC_END(except_vec_cex)
+
+	.macro	build_prep_badv
+	csrrd	t0, LOONGARCH_CSR_BADV
+	PTR_S	t0, sp, PT_BVADDR
+	.endm
+
+	.macro	build_prep_fcsr
+	movfcsr2gr	a1, fcsr0
+	.endm
+
+	.macro	build_prep_none
+	.endm
+
+	.macro	BUILD_HANDLER exception handler prep
+	.align	5
+	SYM_FUNC_START(handle_\exception)
+	BACKUP_T0T1
+	SAVE_ALL
+	build_prep_\prep
+	move	a0, sp
+	la.abs	t0, do_\handler
+	jirl    ra, t0, 0
+	RESTORE_ALL_AND_RET
+	SYM_FUNC_END(handle_\exception)
+	.endm
+
+	BUILD_HANDLER ade ade badv
+	BUILD_HANDLER ale ale badv
+	BUILD_HANDLER bp bp none
+	BUILD_HANDLER fpe fpe fcsr
+	BUILD_HANDLER fpu fpu none
+	BUILD_HANDLER lsx lsx none
+	BUILD_HANDLER lasx lasx none
+	BUILD_HANDLER lbt lbt none
+	BUILD_HANDLER ri ri none
+	BUILD_HANDLER watch watch none
+	BUILD_HANDLER reserved reserved none	/* others */
+
+SYM_FUNC_START(handle_sys)
+	la.abs	t0, handle_syscall
+	jirl    zero, t0, 0
+SYM_FUNC_END(handle_sys)
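
The ERA fix-up at the top of handle_vint above may be easier to follow in C. The ori/xori pair with 0x1f rounds the saved ERA down to a 32-byte boundary. If that boundary is the start of __arch_cpu_idle, the interrupt landed inside the rollback region, and ERA is rewound so that TIF_NEED_RESCHED is checked again before the idle instruction runs. As I understand it, this closes the window in which a wakeup could arrive between the check and the idle. demo_fixup_era() is a hypothetical model, not kernel code.

```c
#include <stdint.h>

#define DEMO_ROLLBACK_SIZE 32u	/* matches the ".align 5" region */

/* Hypothetical model of the ERA fix-up done at the top of handle_vint. */
static uint64_t demo_fixup_era(uint64_t era, uint64_t idle_start)
{
	/* ori + xori with 0x1f clears the low five bits (round down to 32). */
	uint64_t base = (era | (DEMO_ROLLBACK_SIZE - 1)) ^ (DEMO_ROLLBACK_SIZE - 1);

	/* Interrupted inside the region: restart it so the flag is re-read. */
	return base == idle_start ? idle_start : era;
}
```

An ERA outside the region is returned unchanged, which mirrors the bne that skips the store to PT_ERA.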
diff --git a/arch/loongarch/kernel/irq.c b/arch/loongarch/kernel/irq.c
new file mode 100644
index 000000000000..48032ffd9331
--- /dev/null
+++ b/arch/loongarch/kernel/irq.c
@@ -0,0 +1,131 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/kernel.h>
+#include <linux/acpi.h>
+#include <linux/atomic.h>
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/kernel_stat.h>
+#include <linux/proc_fs.h>
+#include <linux/mm.h>
+#include <linux/random.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/kallsyms.h>
+#include <linux/uaccess.h>
+
+#include <asm/irq.h>
+#include <asm/loongson.h>
+#include <asm/setup.h>
+
+DEFINE_PER_CPU(unsigned long, irq_stack);
+
+struct acpi_madt_lio_pic *acpi_liointc;
+struct acpi_madt_eio_pic *acpi_eiointc[MAX_IO_PICS];
+
+struct acpi_madt_ht_pic *acpi_htintc;
+struct acpi_madt_lpc_pic *acpi_pchlpc;
+struct acpi_madt_msi_pic *acpi_pchmsi[MAX_IO_PICS];
+struct acpi_madt_bio_pic *acpi_pchpic[MAX_IO_PICS];
+
+struct irq_domain *cpu_domain;
+struct irq_domain *liointc_domain;
+struct irq_domain *pch_lpc_domain;
+struct irq_domain *pch_msi_domain[MAX_IO_PICS];
+struct irq_domain *pch_pic_domain[MAX_IO_PICS];
+
+int find_pch_pic(u32 gsi)
+{
+	int i, start, end;
+
+	/* Find the PCH_PIC that manages this GSI. */
+	for (i = 0; i < loongson_sysconf.nr_io_pics; i++) {
+		struct acpi_madt_bio_pic *irq_cfg = acpi_pchpic[i];
+
+		start = irq_cfg->gsi_base;
+		end   = irq_cfg->gsi_base + irq_cfg->size;
+		if (gsi >= start && gsi < end)
+			return i;
+	}
+
+	pr_err("ERROR: Unable to locate PCH_PIC for GSI %u\n", gsi);
+	return -1;
+}
+
+/*
+ * 'what should we do if we get a hw irq event on an illegal vector'.
+ * each architecture has to answer this themselves.
+ */
+void ack_bad_irq(unsigned int irq)
+{
+	pr_warn("Unexpected IRQ # %u\n", irq);
+}
+
+atomic_t irq_err_count;
+
+asmlinkage void spurious_interrupt(void)
+{
+	atomic_inc(&irq_err_count);
+}
+
+int arch_show_interrupts(struct seq_file *p, int prec)
+{
+	seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count));
+	return 0;
+}
+
+void __init setup_IRQ(void)
+{
+	int i;
+	struct irq_domain *parent_domain;
+
+	if (!acpi_eiointc[0])
+		cpu_data[0].options &= ~LOONGARCH_CPU_EXTIOI;
+
+	cpu_domain = loongarch_cpu_irq_init();
+	liointc_domain = liointc_acpi_init(cpu_domain, acpi_liointc);
+
+	if (cpu_has_extioi) {
+		pr_info("Using EIOINTC interrupt mode\n");
+		for (i = 0; i < loongson_sysconf.nr_io_pics; i++) {
+			parent_domain = eiointc_acpi_init(cpu_domain, acpi_eiointc[i]);
+			pch_pic_domain[i] = pch_pic_acpi_init(parent_domain, acpi_pchpic[i]);
+			pch_msi_domain[i] = pch_msi_acpi_init(parent_domain, acpi_pchmsi[i]);
+		}
+	} else {
+		pr_info("Using HTVECINTC interrupt mode\n");
+		parent_domain = htvec_acpi_init(liointc_domain, acpi_htintc);
+		pch_pic_domain[0] = pch_pic_acpi_init(parent_domain, acpi_pchpic[0]);
+		pch_msi_domain[0] = pch_msi_acpi_init(parent_domain, acpi_pchmsi[0]);
+	}
+
+	irq_set_default_host(pch_pic_domain[0]);
+	pch_lpc_domain = pch_lpc_acpi_init(pch_pic_domain[0], acpi_pchlpc);
+}
+
+void __init init_IRQ(void)
+{
+	int i;
+	unsigned int order = get_order(IRQ_STACK_SIZE);
+
+	clear_csr_ecfg(ECFG0_IM);
+	clear_csr_estat(ESTATF_IP);
+
+	setup_IRQ();
+
+	for (i = 0; i < NR_IRQS; i++)
+		irq_set_noprobe(i);
+
+	for_each_possible_cpu(i) {
+		void *s = (void *)__get_free_pages(GFP_KERNEL, order);
+
+		per_cpu(irq_stack, i) = (unsigned long)s;
+		pr_debug("CPU%d IRQ stack at 0x%lx - 0x%lx\n", i,
+			per_cpu(irq_stack, i), per_cpu(irq_stack, i) + IRQ_STACK_SIZE);
+	}
+
+	set_csr_ecfg(ECFGF_IP0 | ECFGF_IP1 | ECFGF_IP2 | ECFGF_IPI | ECFGF_PMC);
+}
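
The lookup in find_pch_pic() above is a scan over half-open GSI ranges. A standalone sketch of the same idea follows. struct demo_gsi_range and demo_find_range() are hypothetical stand-ins for the MADT entries, and the comparison is written as a subtraction so that it stays in unsigned arithmetic, which is a small departure from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the gsi_base/size fields used by find_pch_pic(). */
struct demo_gsi_range {
	uint32_t gsi_base;
	uint32_t size;
};

/* Return the index of the range that owns gsi, or -1 if none does. */
static int demo_find_range(const struct demo_gsi_range *r, int n, uint32_t gsi)
{
	for (int i = 0; i < n; i++) {
		/* Half-open interval: [gsi_base, gsi_base + size). */
		if (gsi >= r[i].gsi_base && gsi - r[i].gsi_base < r[i].size)
			return i;
	}
	return -1;
}
```

With two made-up ranges {64, 64} and {128, 64}, GSI 127 resolves to index 0, GSI 128 to index 1, and GSI 192 to -1.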
diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
new file mode 100644
index 000000000000..182680e08571
--- /dev/null
+++ b/arch/loongarch/kernel/traps.c
@@ -0,0 +1,753 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/bitops.h>
+#include <linux/bug.h>
+#include <linux/compiler.h>
+#include <linux/context_tracking.h>
+#include <linux/entry-common.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/extable.h>
+#include <linux/mm.h>
+#include <linux/sched/mm.h>
+#include <linux/sched/debug.h>
+#include <linux/smp.h>
+#include <linux/spinlock.h>
+#include <linux/kallsyms.h>
+#include <linux/memblock.h>
+#include <linux/interrupt.h>
+#include <linux/ptrace.h>
+#include <linux/kgdb.h>
+#include <linux/kdebug.h>
+#include <linux/kprobes.h>
+#include <linux/notifier.h>
+#include <linux/irq.h>
+#include <linux/perf_event.h>
+
+#include <asm/addrspace.h>
+#include <asm/bootinfo.h>
+#include <asm/branch.h>
+#include <asm/break.h>
+#include <asm/cpu.h>
+#include <asm/fpu.h>
+#include <asm/loongarch.h>
+#include <asm/mmu_context.h>
+#include <asm/pgtable.h>
+#include <asm/ptrace.h>
+#include <asm/sections.h>
+#include <asm/siginfo.h>
+#include <asm/stacktrace.h>
+#include <asm/tlb.h>
+#include <asm/types.h>
+
+#include "access-helper.h"
+
+extern asmlinkage void handle_ade(void);
+extern asmlinkage void handle_ale(void);
+extern asmlinkage void handle_sys(void);
+extern asmlinkage void handle_bp(void);
+extern asmlinkage void handle_ri(void);
+extern asmlinkage void handle_fpu(void);
+extern asmlinkage void handle_fpe(void);
+extern asmlinkage void handle_lbt(void);
+extern asmlinkage void handle_lsx(void);
+extern asmlinkage void handle_lasx(void);
+extern asmlinkage void handle_reserved(void);
+extern asmlinkage void handle_watch(void);
+extern asmlinkage void handle_vint(void);
+
+static void show_backtrace(struct task_struct *task, const struct pt_regs *regs,
+			   const char *loglvl, bool user)
+{
+	unsigned long addr;
+	unsigned long *sp = (unsigned long *)(regs->regs[3] & ~3);
+
+	printk("%sCall Trace:", loglvl);
+#ifdef CONFIG_KALLSYMS
+	printk("%s\n", loglvl);
+#endif
+	while (!kstack_end(sp)) {
+		if (__get_addr(&addr, sp++, user)) {
+			printk("%s (Bad stack address)", loglvl);
+			break;
+		}
+		if (__kernel_text_address(addr))
+			print_ip_sym(loglvl, addr);
+	}
+	printk("%s\n", loglvl);
+}
+
+static void show_stacktrace(struct task_struct *task,
+	const struct pt_regs *regs, const char *loglvl, bool user)
+{
+	int i;
+	const int field = 2 * sizeof(unsigned long);
+	unsigned long stackdata;
+	unsigned long *sp = (unsigned long *)regs->regs[3];
+
+	printk("%sStack :", loglvl);
+	i = 0;
+	while ((unsigned long) sp & (PAGE_SIZE - 1)) {
+		if (i && ((i % (64 / field)) == 0)) {
+			pr_cont("\n");
+			printk("%s       ", loglvl);
+		}
+		if (i > 39) {
+			pr_cont(" ...");
+			break;
+		}
+
+		if (__get_addr(&stackdata, sp++, user)) {
+			pr_cont(" (Bad stack address)");
+			break;
+		}
+
+		pr_cont(" %0*lx", field, stackdata);
+		i++;
+	}
+	pr_cont("\n");
+	show_backtrace(task, regs, loglvl, user);
+}
+
+void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl)
+{
+	struct pt_regs regs;
+
+	regs.csr_crmd = 0;
+	if (sp) {
+		regs.csr_era = 0;
+		regs.regs[1] = 0;
+		regs.regs[3] = (unsigned long)sp;
+	} else {
+		if (!task || task == current)
+			prepare_frametrace(&regs);
+		else {
+			regs.csr_era = task->thread.reg01;
+			regs.regs[1] = 0;
+			regs.regs[3] = task->thread.reg03;
+			regs.regs[22] = task->thread.reg22;
+		}
+	}
+
+	show_stacktrace(task, &regs, loglvl, false);
+}
+
+static void show_code(unsigned int *pc, bool user)
+{
+	long i;
+	unsigned int insn;
+
+	printk("Code:");
+
+	for (i = -3; i < 6; i++) {
+		if (__get_inst(&insn, pc + i, user)) {
+			pr_cont(" (Bad address in era)\n");
+			break;
+		}
+		pr_cont("%c%08x%c", (i?' ':'<'), insn, (i?' ':'>'));
+	}
+	pr_cont("\n");
+}
+
+static void __show_regs(const struct pt_regs *regs)
+{
+	const int field = 2 * sizeof(unsigned long);
+	unsigned int excsubcode;
+	unsigned int exccode;
+	int i;
+
+	show_regs_print_info(KERN_DEFAULT);
+
+	/*
+	 * Saved main processor registers
+	 */
+	for (i = 0; i < 32; ) {
+		if ((i % 4) == 0)
+			printk("$%2d   :", i);
+		pr_cont(" %0*lx", field, regs->regs[i]);
+
+		i++;
+		if ((i % 4) == 0)
+			pr_cont("\n");
+	}
+
+	/*
+	 * Saved csr registers
+	 */
+	printk("era   : %0*lx %pS\n", field, regs->csr_era,
+	       (void *) regs->csr_era);
+	printk("ra    : %0*lx %pS\n", field, regs->regs[1],
+	       (void *) regs->regs[1]);
+
+	printk("CSR crmd: %08lx	", regs->csr_crmd);
+	printk("CSR prmd: %08lx	", regs->csr_prmd);
+	printk("CSR euen: %08lx	", regs->csr_euen);
+	printk("CSR ecfg: %08lx	", regs->csr_ecfg);
+	printk("CSR estat: %08lx	", regs->csr_estat);
+
+	pr_cont("\n");
+
+	exccode = ((regs->csr_estat) & CSR_ESTAT_EXC) >> CSR_ESTAT_EXC_SHIFT;
+	excsubcode = ((regs->csr_estat) & CSR_ESTAT_ESUBCODE) >> CSR_ESTAT_ESUBCODE_SHIFT;
+	printk("ExcCode : %x (SubCode %x)\n", exccode, excsubcode);
+
+	if (exccode >= EXCCODE_TLBL && exccode <= EXCCODE_ALE)
+		printk("BadVA : %0*lx\n", field, regs->csr_badvaddr);
+
+	printk("PrId  : %08x (%s)\n", read_cpucfg(LOONGARCH_CPUCFG0),
+	       cpu_family_string());
+}
+
+void show_regs(struct pt_regs *regs)
+{
+	__show_regs(regs);
+	dump_stack();
+}
+
+void show_registers(struct pt_regs *regs)
+{
+	__show_regs(regs);
+	print_modules();
+	printk("Process %s (pid: %d, threadinfo=%p, task=%p)\n",
+	       current->comm, current->pid, current_thread_info(), current);
+
+	show_stacktrace(current, regs, KERN_DEFAULT, user_mode(regs));
+	show_code((void *)regs->csr_era, user_mode(regs));
+	printk("\n");
+}
+
+static DEFINE_RAW_SPINLOCK(die_lock);
+
+void __noreturn die(const char *str, struct pt_regs *regs)
+{
+	static int die_counter;
+	int sig = SIGSEGV;
+
+	oops_enter();
+
+	if (notify_die(DIE_OOPS, str, regs, 0, current->thread.trap_nr,
+		       SIGSEGV) == NOTIFY_STOP)
+		sig = 0;
+
+	console_verbose();
+	raw_spin_lock_irq(&die_lock);
+	bust_spinlocks(1);
+
+	printk("%s[#%d]:\n", str, ++die_counter);
+	show_registers(regs);
+	add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
+	raw_spin_unlock_irq(&die_lock);
+
+	oops_exit();
+
+	if (in_interrupt())
+		panic("Fatal exception in interrupt");
+
+	if (panic_on_oops)
+		panic("Fatal exception");
+
+	make_task_dead(sig);
+}
+
+static inline void setup_vint_size(unsigned int size)
+{
+	unsigned int vs;
+
+	vs = ilog2(size/4);
+
+	if (vs == 0 || vs > 7)
+		panic("vint_size %u is not supported yet", vs);
+
+	csr_xchgl(vs<<CSR_ECFG_VS_SHIFT, CSR_ECFG_VS, LOONGARCH_CSR_ECFG);
+}
+
+/*
+ * Send SIGFPE according to FCSR Cause bits, which must have already
+ * been masked against Enable bits.  This is important as Inexact can
+ * happen together with Overflow or Underflow, and `ptrace' can set
+ * any bits.
+ */
+void force_fcsr_sig(unsigned long fcsr, void __user *fault_addr,
+		     struct task_struct *tsk)
+{
+	int si_code = FPE_FLTUNK;
+
+	if (fcsr & FPU_CSR_INV_X)
+		si_code = FPE_FLTINV;
+	else if (fcsr & FPU_CSR_DIV_X)
+		si_code = FPE_FLTDIV;
+	else if (fcsr & FPU_CSR_OVF_X)
+		si_code = FPE_FLTOVF;
+	else if (fcsr & FPU_CSR_UDF_X)
+		si_code = FPE_FLTUND;
+	else if (fcsr & FPU_CSR_INE_X)
+		si_code = FPE_FLTRES;
+
+	force_sig_fault(SIGFPE, si_code, fault_addr);
+}
+
+int process_fpemu_return(int sig, void __user *fault_addr, unsigned long fcsr)
+{
+	int si_code;
+
+	switch (sig) {
+	case 0:
+		return 0;
+
+	case SIGFPE:
+		force_fcsr_sig(fcsr, fault_addr, current);
+		return 1;
+
+	case SIGBUS:
+		force_sig_fault(SIGBUS, BUS_ADRERR, fault_addr);
+		return 1;
+
+	case SIGSEGV:
+		mmap_read_lock(current->mm);
+		if (vma_lookup(current->mm, (unsigned long)fault_addr))
+			si_code = SEGV_ACCERR;
+		else
+			si_code = SEGV_MAPERR;
+		mmap_read_unlock(current->mm);
+		force_sig_fault(SIGSEGV, si_code, fault_addr);
+		return 1;
+
+	default:
+		force_sig(sig);
+		return 1;
+	}
+}
+
+/*
+ * Delayed fp exceptions when doing a lazy ctx switch
+ */
+asmlinkage void noinstr do_fpe(struct pt_regs *regs, unsigned long fcsr)
+{
+	int sig;
+	void __user *fault_addr;
+	irqentry_state_t state = irqentry_enter(regs);
+
+	if (notify_die(DIE_FP, "FP exception", regs, 0, current->thread.trap_nr,
+		       SIGFPE) == NOTIFY_STOP)
+		goto out;
+
+	/* Clear FCSR.Cause before enabling interrupts */
+	write_fcsr(LOONGARCH_FCSR0, fcsr & ~mask_fcsr_x(fcsr));
+	local_irq_enable();
+
+	die_if_kernel("FP exception in kernel code", regs);
+
+	sig = SIGFPE;
+	fault_addr = (void __user *) regs->csr_era;
+
+	/* Send a signal if required.  */
+	process_fpemu_return(sig, fault_addr, fcsr);
+
+out:
+	local_irq_disable();
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void noinstr do_ade(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	die_if_kernel("Kernel ade access", regs);
+	force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)regs->csr_badvaddr);
+
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void noinstr do_ale(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	die_if_kernel("Kernel ale access", regs);
+	force_sig_fault(SIGBUS, BUS_ADRALN, (void __user *)regs->csr_badvaddr);
+
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void noinstr do_bp(struct pt_regs *regs)
+{
+	bool user = user_mode(regs);
+	unsigned int opcode, bcode;
+	unsigned long era = exception_era(regs);
+	irqentry_state_t state = irqentry_enter(regs);
+
+	local_irq_enable();
+	current->thread.trap_nr = read_csr_excode();
+	if (__get_inst(&opcode, (u32 *)era, user))
+		goto out_sigsegv;
+
+	bcode = (opcode & 0x7fff);
+
+	/*
+	 * notify the kprobe handlers, if instruction is likely to
+	 * pertain to them.
+	 */
+	switch (bcode) {
+	case BRK_KPROBE_BP:
+		if (notify_die(DIE_BREAK, "Kprobe", regs, bcode,
+			       current->thread.trap_nr, SIGTRAP) == NOTIFY_STOP)
+			goto out;
+		else
+			break;
+	case BRK_KPROBE_SSTEPBP:
+		if (notify_die(DIE_SSTEPBP, "Kprobe_SingleStep", regs, bcode,
+			       current->thread.trap_nr, SIGTRAP) == NOTIFY_STOP)
+			goto out;
+		else
+			break;
+	case BRK_UPROBE_BP:
+		if (notify_die(DIE_UPROBE, "Uprobe", regs, bcode,
+			       current->thread.trap_nr, SIGTRAP) == NOTIFY_STOP)
+			goto out;
+		else
+			break;
+	case BRK_UPROBE_XOLBP:
+		if (notify_die(DIE_UPROBE_XOL, "Uprobe_XOL", regs, bcode,
+			       current->thread.trap_nr, SIGTRAP) == NOTIFY_STOP)
+			goto out;
+		else
+			break;
+	default:
+		if (notify_die(DIE_TRAP, "Break", regs, bcode,
+			       current->thread.trap_nr, SIGTRAP) == NOTIFY_STOP)
+			goto out;
+		else
+			break;
+	}
+
+	switch (bcode) {
+	case BRK_BUG:
+		die_if_kernel("Kernel bug detected", regs);
+		force_sig(SIGTRAP);
+		break;
+	case BRK_DIVZERO:
+		die_if_kernel("Break instruction in kernel code", regs);
+		force_sig_fault(SIGFPE, FPE_INTDIV, (void __user *)regs->csr_era);
+		break;
+	case BRK_OVERFLOW:
+		die_if_kernel("Break instruction in kernel code", regs);
+		force_sig_fault(SIGFPE, FPE_INTOVF, (void __user *)regs->csr_era);
+		break;
+	default:
+		die_if_kernel("Break instruction in kernel code", regs);
+		force_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *)regs->csr_era);
+		break;
+	}
+
+out:
+	local_irq_disable();
+	irqentry_exit(regs, state);
+	return;
+
+out_sigsegv:
+	force_sig(SIGSEGV);
+	goto out;
+}
+
+asmlinkage void noinstr do_watch(struct pt_regs *regs)
+{
+	pr_warn("Hardware watch point handler not implemented!\n");
+}
+
+asmlinkage void noinstr do_ri(struct pt_regs *regs)
+{
+	int status = -1;
+	unsigned int opcode = 0;
+	unsigned int __user *era = (unsigned int __user *)exception_era(regs);
+	unsigned long old_era = regs->csr_era;
+	unsigned long old_ra = regs->regs[1];
+	irqentry_state_t state = irqentry_enter(regs);
+
+	local_irq_enable();
+	current->thread.trap_nr = read_csr_excode();
+
+	if (notify_die(DIE_RI, "RI Fault", regs, 0, current->thread.trap_nr,
+		       SIGILL) == NOTIFY_STOP)
+		goto out;
+
+	die_if_kernel("Reserved instruction in kernel code", regs);
+
+	if (unlikely(compute_return_era(regs) < 0))
+		goto out;
+
+	if (unlikely(get_user(opcode, era) < 0)) {
+		status = SIGSEGV;
+		current->thread.error_code = 1;
+	}
+
+	if (status < 0)
+		status = SIGILL;
+
+	if (unlikely(status > 0)) {
+		regs->csr_era = old_era;		/* Undo skip-over.  */
+		regs->regs[1] = old_ra;
+		force_sig(status);
+	}
+
+out:
+	local_irq_disable();
+	irqentry_exit(regs, state);
+}
+
+static void init_restore_fp(void)
+{
+	if (!used_math()) {
+		/* First time FP context user. */
+		init_fpu();
+	} else {
+		/* This task has formerly used the FP context */
+		if (!is_fpu_owner())
+			own_fpu_inatomic(1);
+	}
+
+	BUG_ON(!is_fp_enabled());
+}
+
+asmlinkage void noinstr do_fpu(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	local_irq_enable();
+	die_if_kernel("do_fpu invoked from kernel context!", regs);
+
+	preempt_disable();
+	init_restore_fp();
+	preempt_enable();
+
+	local_irq_disable();
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void noinstr do_lsx(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	local_irq_enable();
+	force_sig(SIGILL);
+	local_irq_disable();
+
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void noinstr do_lasx(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	local_irq_enable();
+	force_sig(SIGILL);
+	local_irq_disable();
+
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void noinstr do_lbt(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	local_irq_enable();
+	force_sig(SIGILL);
+	local_irq_disable();
+
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void noinstr do_reserved(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	/*
+	 * Game over - no way to handle this if it ever occurs.  Most probably
+	 * caused by a new unknown cpu type or after another deadly
+	 * hard/software error.
+	 */
+	local_irq_enable();
+	show_regs(regs);
+	panic("Caught reserved exception %u - should not happen.", read_csr_excode());
+	local_irq_disable();
+
+	irqentry_exit(regs, state);
+}
+
+asmlinkage void cache_parity_error(void)
+{
+	const int field = 2 * sizeof(unsigned long);
+	unsigned int reg_val;
+
+	/* For the moment, report the problem and hang. */
+	pr_err("Cache error exception:\n");
+	pr_err("csr_merrera == %0*llx\n", field, csr_readq(LOONGARCH_CSR_MERRERA));
+	reg_val = csr_readl(LOONGARCH_CSR_MERRCTL);
+	pr_err("csr_merrctl == %08x\n", reg_val);
+
+	pr_err("Decoded csr_merrctl: %s cache fault in %s reference.\n",
+	       reg_val & (1<<30) ? "secondary" : "primary",
+	       reg_val & (1<<31) ? "data" : "insn");
+	if (((current_cpu_data.processor_id & 0xff0000) == PRID_COMP_LOONGSON)) {
+		pr_err("Error bits: %s%s%s%s%s%s%s%s\n",
+			reg_val & (1<<29) ? "ED " : "",
+			reg_val & (1<<28) ? "ET " : "",
+			reg_val & (1<<27) ? "ES " : "",
+			reg_val & (1<<26) ? "EE " : "",
+			reg_val & (1<<25) ? "EB " : "",
+			reg_val & (1<<24) ? "EI " : "",
+			reg_val & (1<<23) ? "E1 " : "",
+			reg_val & (1<<22) ? "E0 " : "");
+	} else {
+		pr_err("Error bits: %s%s%s%s%s%s%s\n",
+			reg_val & (1<<29) ? "ED " : "",
+			reg_val & (1<<28) ? "ET " : "",
+			reg_val & (1<<26) ? "EE " : "",
+			reg_val & (1<<25) ? "EB " : "",
+			reg_val & (1<<24) ? "EI " : "",
+			reg_val & (1<<23) ? "E1 " : "",
+			reg_val & (1<<22) ? "E0 " : "");
+	}
+	pr_err("IDX: 0x%08x\n", reg_val & ((1<<22)-1));
+
+	panic("Can't handle the cache error!");
+}
+
+asmlinkage void noinstr handle_loongarch_irq(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs;
+
+	irq_enter_rcu();
+	old_regs = set_irq_regs(regs);
+	handle_arch_irq(regs);
+	set_irq_regs(old_regs);
+	irq_exit_rcu();
+}
+
+asmlinkage void noinstr do_vint(struct pt_regs *regs, unsigned long sp)
+{
+	register int cpu;
+	register unsigned long stack;
+	irqentry_state_t state = irqentry_enter(regs);
+
+	cpu = smp_processor_id();
+
+	if (on_irq_stack(cpu, sp))
+		handle_loongarch_irq(regs);
+	else {
+		stack = per_cpu(irq_stack, cpu) + IRQ_STACK_START;
+
+		/* Save task's sp on IRQ stack for unwinding */
+		*(unsigned long *)stack = sp;
+
+		__asm__ __volatile__(
+		"move	$s0, $sp		\n" /* Preserve sp */
+		"move	$sp, %[stk]		\n" /* Switch stack */
+		"move	$a0, %[regs]		\n"
+		"bl	handle_loongarch_irq	\n"
+		"move	$sp, $s0		\n" /* Restore sp */
+		: /* No outputs */
+		: [stk] "r" (stack), [regs] "r" (regs)
+		: "$a0", "$a1", "$a2", "$a3", "$a4", "$a5", "$a6", "$a7", "$s0",
+		  "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7", "$t8",
+		  "memory");
+	}
+
+	irqentry_exit(regs, state);
+}
+
+extern void tlb_init(void);
+extern void cache_error_setup(void);
+
+unsigned long eentry;
+unsigned long tlbrentry;
+
+long exception_handlers[VECSIZE * 128 / sizeof(long)] __aligned(SZ_64K);
+
+static void configure_exception_vector(void)
+{
+	eentry    = (unsigned long)exception_handlers;
+	tlbrentry = (unsigned long)exception_handlers + 80*VECSIZE;
+
+	csr_writeq(eentry, LOONGARCH_CSR_EENTRY);
+	csr_writeq(eentry, LOONGARCH_CSR_MERRENTRY);
+	csr_writeq(tlbrentry, LOONGARCH_CSR_TLBRENTRY);
+}
+
+void per_cpu_trap_init(int cpu)
+{
+	unsigned int i;
+
+	setup_vint_size(VECSIZE);
+
+	configure_exception_vector();
+
+	if (!cpu_data[cpu].asid_cache)
+		cpu_data[cpu].asid_cache = asid_first_version(cpu);
+
+	mmgrab(&init_mm);
+	current->active_mm = &init_mm;
+	BUG_ON(current->mm);
+	enter_lazy_tlb(&init_mm, current);
+
+	/* Initialise exception handlers */
+	if (cpu == 0)
+		for (i = 0; i < 64; i++)
+			set_handler(i * VECSIZE, handle_reserved, VECSIZE);
+
+	tlb_init();
+	cpu_cache_init();
+}
+
+/* Install CPU exception handler */
+void set_handler(unsigned long offset, void *addr, unsigned long size)
+{
+	memcpy((void *)(eentry + offset), addr, size);
+	local_flush_icache_range(eentry + offset, eentry + offset + size);
+}
+
+static const char panic_null_cerr[] =
+	"Trying to set NULL cache error exception handler\n";
+
+/*
+ * Install uncached CPU exception handler.
+ * This is suitable only for the cache error exception which is the only
+ * exception handler that is being run uncached.
+ */
+void set_merr_handler(unsigned long offset, void *addr, unsigned long size)
+{
+	unsigned long uncached_eentry = TO_UNCAC(__pa(eentry));
+
+	if (!addr)
+		panic(panic_null_cerr);
+
+	memcpy((void *)(uncached_eentry + offset), addr, size);
+}
+
+void __init trap_init(void)
+{
+	long i;
+
+	/* Set interrupt vector handler */
+	for (i = EXCCODE_INT_START; i < EXCCODE_INT_END; i++)
+		set_handler(i * VECSIZE, handle_vint, VECSIZE);
+
+	set_handler(EXCCODE_ADE * VECSIZE, handle_ade, VECSIZE);
+	set_handler(EXCCODE_ALE * VECSIZE, handle_ale, VECSIZE);
+	set_handler(EXCCODE_SYS * VECSIZE, handle_sys, VECSIZE);
+	set_handler(EXCCODE_BP * VECSIZE, handle_bp, VECSIZE);
+	set_handler(EXCCODE_INE * VECSIZE, handle_ri, VECSIZE);
+	set_handler(EXCCODE_IPE * VECSIZE, handle_ri, VECSIZE);
+	set_handler(EXCCODE_FPDIS * VECSIZE, handle_fpu, VECSIZE);
+	set_handler(EXCCODE_LSXDIS * VECSIZE, handle_lsx, VECSIZE);
+	set_handler(EXCCODE_LASXDIS * VECSIZE, handle_lasx, VECSIZE);
+	set_handler(EXCCODE_FPE * VECSIZE, handle_fpe, VECSIZE);
+	set_handler(EXCCODE_BTDIS * VECSIZE, handle_lbt, VECSIZE);
+	set_handler(EXCCODE_WATCH * VECSIZE, handle_watch, VECSIZE);
+
+	cache_error_setup();
+
+	local_flush_icache_range(eentry, eentry + 0x400);
+}
-- 
2.27.0



* [PATCH V9 11/24] LoongArch: Add process management
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (9 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 10/24] LoongArch: Add exception/interrupt handling Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 12/24] LoongArch: Add memory management Huacai Chen
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds process management support for LoongArch, including:
thread info definition, context switch and process tracing.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
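Note for reviewers (not part of the commit): the ASID versioning in
mmu_context.h can be modelled in user space as below. This is only a
sketch under assumptions: the 10-bit hardware ASID width and every
model_* name are hypothetical, since the kernel takes the real mask
from cpu_asid_mask() at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hardware ASID width; the kernel uses cpu_asid_mask(). */
#define MODEL_ASID_BITS	10
#define MODEL_ASID_MASK	((UINT64_C(1) << MODEL_ASID_BITS) - 1)

/* Per-CPU state of the model: last ASID handed out, plus a flush counter. */
struct model_cpu {
	uint64_t asid_cache;
	unsigned int tlb_flushes;
};

/* Upper bits not used by hardware act as a software version number. */
static uint64_t model_version_mask(void)
{
	return ~MODEL_ASID_MASK;
}

/* First non-zero version, so that a zeroed context is never valid. */
static uint64_t model_first_version(void)
{
	return MODEL_ASID_MASK + 1;
}

/* A context is valid only if its version matches the CPU's current one. */
static bool model_asid_valid(const struct model_cpu *cpu, uint64_t ctx)
{
	return ((ctx ^ cpu->asid_cache) & model_version_mask()) == 0;
}

/* Hand out the next ASID; count a TLB flush when the hardware bits wrap. */
static uint64_t model_get_new_context(struct model_cpu *cpu)
{
	uint64_t asid = cpu->asid_cache + 1;

	assert(MODEL_ASID_BITS < 64);
	if (!(asid & MODEL_ASID_MASK))
		cpu->tlb_flushes++;	/* start new asid cycle */

	cpu->asid_cache = asid;
	return asid;
}
```

Under these assumptions, a context of 0 (as set by init_new_context())
never matches the first version, so the first switch to an mm always
allocates a fresh ASID. Once the hardware bits wrap, the version
changes and every older context becomes invalid.
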
 arch/loongarch/include/asm/fpu.h         | 129 +++++++
 arch/loongarch/include/asm/idle.h        |   9 +
 arch/loongarch/include/asm/mmu.h         |  16 +
 arch/loongarch/include/asm/mmu_context.h | 152 ++++++++
 arch/loongarch/include/asm/processor.h   | 207 +++++++++++
 arch/loongarch/include/asm/ptrace.h      | 152 ++++++++
 arch/loongarch/include/asm/switch_to.h   |  37 ++
 arch/loongarch/include/asm/thread_info.h | 106 ++++++
 arch/loongarch/include/uapi/asm/ptrace.h |  52 +++
 arch/loongarch/kernel/fpu.S              | 264 ++++++++++++++
 arch/loongarch/kernel/idle.c             |  16 +
 arch/loongarch/kernel/process.c          | 260 ++++++++++++++
 arch/loongarch/kernel/ptrace.c           | 431 +++++++++++++++++++++++
 arch/loongarch/kernel/switch.S           |  35 ++
 14 files changed, 1866 insertions(+)
 create mode 100644 arch/loongarch/include/asm/fpu.h
 create mode 100644 arch/loongarch/include/asm/idle.h
 create mode 100644 arch/loongarch/include/asm/mmu.h
 create mode 100644 arch/loongarch/include/asm/mmu_context.h
 create mode 100644 arch/loongarch/include/asm/processor.h
 create mode 100644 arch/loongarch/include/asm/ptrace.h
 create mode 100644 arch/loongarch/include/asm/switch_to.h
 create mode 100644 arch/loongarch/include/asm/thread_info.h
 create mode 100644 arch/loongarch/include/uapi/asm/ptrace.h
 create mode 100644 arch/loongarch/kernel/fpu.S
 create mode 100644 arch/loongarch/kernel/idle.c
 create mode 100644 arch/loongarch/kernel/process.c
 create mode 100644 arch/loongarch/kernel/ptrace.c
 create mode 100644 arch/loongarch/kernel/switch.S
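
A second reviewer note, again only a sketch: the interplay between
mask_fcsr_x() in fpu.h and the priority order used by force_fcsr_sig()
can be tried out in user space. The FCSR0 bit layout used here (Enable
in bits 0-4, Cause in bits 24-28, Inexact in the lowest position) and
all MODEL_*/model_* names are assumptions for illustration; the kernel
derives the shift from ffs(FPU_CSR_ALL_X) - ffs(FPU_CSR_ALL_E).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: Enable bits 0-4, Cause bits 24-28 (hypothetical). */
#define MODEL_ALL_E	UINT32_C(0x0000001f)
#define MODEL_X_SHIFT	24
#define MODEL_INE_X	(UINT32_C(0x01) << MODEL_X_SHIFT)	/* Inexact */
#define MODEL_UDF_X	(UINT32_C(0x02) << MODEL_X_SHIFT)	/* Underflow */
#define MODEL_OVF_X	(UINT32_C(0x04) << MODEL_X_SHIFT)	/* Overflow */
#define MODEL_DIV_X	(UINT32_C(0x08) << MODEL_X_SHIFT)	/* Divide by zero */
#define MODEL_INV_X	(UINT32_C(0x10) << MODEL_X_SHIFT)	/* Invalid */

/* Stand-ins for the FPE_* si_code values, to keep the model portable. */
enum model_code {
	MODEL_FLTUNK,
	MODEL_FLTINV,
	MODEL_FLTDIV,
	MODEL_FLTOVF,
	MODEL_FLTUND,
	MODEL_FLTRES,
};

/* Keep only the Cause bits whose matching Enable bit is set. */
static uint32_t model_mask_fcsr_x(uint32_t fcsr)
{
	return fcsr & ((fcsr & MODEL_ALL_E) << MODEL_X_SHIFT);
}

/* Same priority order as force_fcsr_sig(): the first match wins. */
static enum model_code model_si_code(uint32_t masked)
{
	if (masked & MODEL_INV_X)
		return MODEL_FLTINV;
	if (masked & MODEL_DIV_X)
		return MODEL_FLTDIV;
	if (masked & MODEL_OVF_X)
		return MODEL_FLTOVF;
	if (masked & MODEL_UDF_X)
		return MODEL_FLTUND;
	if (masked & MODEL_INE_X)
		return MODEL_FLTRES;
	return MODEL_FLTUNK;
}
```

This shows why the masking matters: with Overflow and Inexact both
raised, the reported code depends on which of the two is enabled, and
an unmasked value could report an exception the task never enabled.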

diff --git a/arch/loongarch/include/asm/fpu.h b/arch/loongarch/include/asm/fpu.h
new file mode 100644
index 000000000000..3e1c6f01aec5
--- /dev/null
+++ b/arch/loongarch/include/asm/fpu.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_FPU_H
+#define _ASM_FPU_H
+
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+#include <linux/ptrace.h>
+#include <linux/thread_info.h>
+#include <linux/bitops.h>
+
+#include <asm/cpu.h>
+#include <asm/cpu-features.h>
+#include <asm/current.h>
+#include <asm/loongarch.h>
+#include <asm/processor.h>
+#include <asm/ptrace.h>
+
+struct sigcontext;
+
+extern void _init_fpu(unsigned int);
+extern void _save_fp(struct loongarch_fpu *);
+extern void _restore_fp(struct loongarch_fpu *);
+
+/*
+ * Mask the FCSR Cause bits according to the Enable bits, observing
+ * that Unimplemented is always enabled.
+ */
+static inline unsigned long mask_fcsr_x(unsigned long fcsr)
+{
+	return fcsr & ((fcsr & FPU_CSR_ALL_E) <<
+			(ffs(FPU_CSR_ALL_X) - ffs(FPU_CSR_ALL_E)));
+}
+
+static inline int is_fp_enabled(void)
+{
+	return (csr_readl(LOONGARCH_CSR_EUEN) & CSR_EUEN_FPEN) ?
+		1 : 0;
+}
+
+#define enable_fpu()		set_csr_euen(CSR_EUEN_FPEN)
+
+#define disable_fpu()		clear_csr_euen(CSR_EUEN_FPEN)
+
+#define clear_fpu_owner()	clear_thread_flag(TIF_USEDFPU)
+
+static inline int is_fpu_owner(void)
+{
+	return test_thread_flag(TIF_USEDFPU);
+}
+
+static inline void __own_fpu(void)
+{
+	enable_fpu();
+	set_thread_flag(TIF_USEDFPU);
+	KSTK_EUEN(current) |= CSR_EUEN_FPEN;
+}
+
+static inline void own_fpu_inatomic(int restore)
+{
+	if (cpu_has_fpu && !is_fpu_owner()) {
+		__own_fpu();
+		if (restore)
+			_restore_fp(&current->thread.fpu);
+	}
+}
+
+static inline void own_fpu(int restore)
+{
+	preempt_disable();
+	own_fpu_inatomic(restore);
+	preempt_enable();
+}
+
+static inline void lose_fpu_inatomic(int save, struct task_struct *tsk)
+{
+	if (is_fpu_owner()) {
+		if (save)
+			_save_fp(&tsk->thread.fpu);
+		disable_fpu();
+		clear_tsk_thread_flag(tsk, TIF_USEDFPU);
+	}
+	KSTK_EUEN(tsk) &= ~(CSR_EUEN_FPEN | CSR_EUEN_LSXEN | CSR_EUEN_LASXEN);
+}
+
+static inline void lose_fpu(int save)
+{
+	preempt_disable();
+	lose_fpu_inatomic(save, current);
+	preempt_enable();
+}
+
+static inline void init_fpu(void)
+{
+	unsigned int fcsr = current->thread.fpu.fcsr;
+
+	__own_fpu();
+	_init_fpu(fcsr);
+	set_used_math();
+}
+
+static inline void save_fp(struct task_struct *tsk)
+{
+	if (cpu_has_fpu)
+		_save_fp(&tsk->thread.fpu);
+}
+
+static inline void restore_fp(struct task_struct *tsk)
+{
+	if (cpu_has_fpu)
+		_restore_fp(&tsk->thread.fpu);
+}
+
+static inline union fpureg *get_fpu_regs(struct task_struct *tsk)
+{
+	if (tsk == current) {
+		preempt_disable();
+		if (is_fpu_owner())
+			_save_fp(&current->thread.fpu);
+		preempt_enable();
+	}
+
+	return tsk->thread.fpu.fpr;
+}
+
+#endif /* _ASM_FPU_H */
diff --git a/arch/loongarch/include/asm/idle.h b/arch/loongarch/include/asm/idle.h
new file mode 100644
index 000000000000..f7f2b7dbf958
--- /dev/null
+++ b/arch/loongarch/include/asm/idle.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_IDLE_H
+#define __ASM_IDLE_H
+
+#include <linux/linkage.h>
+
+extern asmlinkage void __arch_cpu_idle(void);
+
+#endif /* __ASM_IDLE_H */
diff --git a/arch/loongarch/include/asm/mmu.h b/arch/loongarch/include/asm/mmu.h
new file mode 100644
index 000000000000..0cc2d0803537
--- /dev/null
+++ b/arch/loongarch/include/asm/mmu.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_MMU_H
+#define __ASM_MMU_H
+
+#include <linux/atomic.h>
+#include <linux/spinlock.h>
+
+typedef struct {
+	u64 asid[NR_CPUS];
+	void *vdso;
+} mm_context_t;
+
+#endif /* __ASM_MMU_H */
diff --git a/arch/loongarch/include/asm/mmu_context.h b/arch/loongarch/include/asm/mmu_context.h
new file mode 100644
index 000000000000..ccd3c8a47cad
--- /dev/null
+++ b/arch/loongarch/include/asm/mmu_context.h
@@ -0,0 +1,152 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Switch an MMU context.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_MMU_CONTEXT_H
+#define _ASM_MMU_CONTEXT_H
+
+#include <linux/errno.h>
+#include <linux/sched.h>
+#include <linux/mm_types.h>
+#include <linux/smp.h>
+#include <linux/slab.h>
+
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+#include <asm-generic/mm_hooks.h>
+
+/*
+ *  All upper bits that are unused by hardware are treated
+ *  as a software asid extension.
+ */
+static inline u64 asid_version_mask(unsigned int cpu)
+{
+	return ~(u64)(cpu_asid_mask(&cpu_data[cpu]));
+}
+
+static inline u64 asid_first_version(unsigned int cpu)
+{
+	return cpu_asid_mask(&cpu_data[cpu]) + 1;
+}
+
+#define cpu_context(cpu, mm)	((mm)->context.asid[cpu])
+#define asid_cache(cpu)		(cpu_data[cpu].asid_cache)
+#define cpu_asid(cpu, mm)	(cpu_context((cpu), (mm)) & cpu_asid_mask(&cpu_data[cpu]))
+
+static inline int asid_valid(struct mm_struct *mm, unsigned int cpu)
+{
+	if ((cpu_context(cpu, mm) ^ asid_cache(cpu)) & asid_version_mask(cpu))
+		return 0;
+
+	return 1;
+}
+
+static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
+{
+}
+
+/* Normal, classic get_new_mmu_context */
+static inline void
+get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
+{
+	u64 asid = asid_cache(cpu);
+
+	if (!((++asid) & cpu_asid_mask(&cpu_data[cpu])))
+		local_flush_tlb_user();	/* start new asid cycle */
+
+	cpu_context(cpu, mm) = asid_cache(cpu) = asid;
+}
+
+/*
+ * Initialize the context related info for a new mm_struct
+ * instance.
+ */
+static inline int
+init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+	int i;
+
+	for_each_possible_cpu(i)
+		cpu_context(i, mm) = 0;
+
+	return 0;
+}
+
+static inline void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
+				      struct task_struct *tsk)
+{
+	unsigned int cpu = smp_processor_id();
+
+	/* Check if our ASID is of an older version and thus invalid */
+	if (!asid_valid(next, cpu))
+		get_new_mmu_context(next, cpu);
+
+	write_csr_asid(cpu_asid(cpu, next));
+
+	if (next != &init_mm)
+		csr_writeq((unsigned long)next->pgd, LOONGARCH_CSR_PGDL);
+	else
+		csr_writeq((unsigned long)invalid_pg_dir, LOONGARCH_CSR_PGDL);
+
+	/*
+	 * Mark this CPU as active in next's cpumask, so that possible
+	 * IPI tlb flush routines are not misled.
+	 */
+	cpumask_set_cpu(cpu, mm_cpumask(next));
+}
+
+#define switch_mm_irqs_off switch_mm_irqs_off
+
+static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+			     struct task_struct *tsk)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	switch_mm_irqs_off(prev, next, tsk);
+	local_irq_restore(flags);
+}
+
+/*
+ * Destroy context related info for an mm_struct that is about
+ * to be put to rest.
+ */
+static inline void destroy_context(struct mm_struct *mm)
+{
+}
+
+#define activate_mm(prev, next)	switch_mm(prev, next, current)
+#define deactivate_mm(task, mm)	do { } while (0)
+
+/*
+ * If mm is currently active, we can't really drop it.
+ * Instead, we will get a new one for it.
+ */
+static inline void
+drop_mmu_context(struct mm_struct *mm, unsigned int cpu)
+{
+	int asid;
+	unsigned long flags;
+
+	local_irq_save(flags);
+
+	asid = read_csr_asid() & cpu_asid_mask(&current_cpu_data);
+
+	if (asid == cpu_asid(cpu, mm)) {
+		if (!current->mm || (current->mm == mm)) {
+			get_new_mmu_context(mm, cpu);
+			write_csr_asid(cpu_asid(cpu, mm));
+			goto out;
+		}
+	}
+
+	/* Will get a new context next time */
+	cpu_context(cpu, mm) = 0;
+	cpumask_clear_cpu(cpu, mm_cpumask(mm));
+out:
+	local_irq_restore(flags);
+}
+
+#endif /* _ASM_MMU_CONTEXT_H */
diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
new file mode 100644
index 000000000000..2631a089545f
--- /dev/null
+++ b/arch/loongarch/include/asm/processor.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_PROCESSOR_H
+#define _ASM_PROCESSOR_H
+
+#include <linux/atomic.h>
+#include <linux/cpumask.h>
+#include <linux/sizes.h>
+
+#include <asm/cpu.h>
+#include <asm/cpu-info.h>
+#include <asm/loongarch.h>
+#include <asm/vdso/processor.h>
+#include <uapi/asm/ptrace.h>
+#include <uapi/asm/sigcontext.h>
+
+#ifdef CONFIG_32BIT
+
+#define TASK_SIZE	0x80000000UL
+#define STACK_TOP_MAX	TASK_SIZE
+
+#define TASK_IS_32BIT_ADDR 1
+
+#endif
+
+#ifdef CONFIG_64BIT
+
+#define TASK_SIZE32	0x100000000UL
+#define TASK_SIZE64     (0x1UL << ((cpu_vabits > VA_BITS) ? VA_BITS : cpu_vabits))
+
+#define TASK_SIZE	(test_thread_flag(TIF_32BIT_ADDR) ? TASK_SIZE32 : TASK_SIZE64)
+#define STACK_TOP_MAX	TASK_SIZE64
+
+#define TASK_SIZE_OF(tsk)						\
+	(test_tsk_thread_flag(tsk, TIF_32BIT_ADDR) ? TASK_SIZE32 : TASK_SIZE64)
+
+#define TASK_IS_32BIT_ADDR test_thread_flag(TIF_32BIT_ADDR)
+
+#endif
+
+#define VDSO_RANDOMIZE_SIZE	(TASK_IS_32BIT_ADDR ? SZ_1M : SZ_64M)
+
+unsigned long stack_top(void);
+#define STACK_TOP stack_top()
+
+/*
+ * This decides where the kernel will search for a free chunk of vm
+ * space during mmap's.
+ */
+#define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE / 3)
+
+#define FPU_REG_WIDTH		256
+#define FPU_ALIGN		__attribute__((aligned(32)))
+
+union fpureg {
+	__u32	val32[FPU_REG_WIDTH / 32];
+	__u64	val64[FPU_REG_WIDTH / 64];
+};
+
+#define FPR_IDX(width, idx)	(idx)
+
+#define BUILD_FPR_ACCESS(width) \
+static inline u##width get_fpr##width(union fpureg *fpr, unsigned int idx) \
+{									\
+	return fpr->val##width[FPR_IDX(width, idx)];			\
+}									\
+									\
+static inline void set_fpr##width(union fpureg *fpr, unsigned int idx,	\
+				  u##width val)				\
+{									\
+	fpr->val##width[FPR_IDX(width, idx)] = val;			\
+}
+
+BUILD_FPR_ACCESS(32)
+BUILD_FPR_ACCESS(64)
+
+struct loongarch_fpu {
+	unsigned int	fcsr;
+	unsigned int	vcsr;
+	uint64_t	fcc;	/* 8x8 */
+	union fpureg	fpr[NUM_FPU_REGS];
+};
+
+#define INIT_CPUMASK { \
+	{0,} \
+}
+
+#define ARCH_MIN_TASKALIGN	32
+
+struct loongarch_vdso_info;
+
+/*
+ * If you change thread_struct remember to change the #defines below too!
+ */
+struct thread_struct {
+	/* Main processor registers. */
+	unsigned long reg01, reg03, reg22; /* ra sp fp */
+	unsigned long reg23, reg24, reg25, reg26; /* s0-s3 */
+	unsigned long reg27, reg28, reg29, reg30, reg31; /* s4-s8 */
+
+	/* CSR registers */
+	unsigned long csr_prmd;
+	unsigned long csr_crmd;
+	unsigned long csr_euen;
+	unsigned long csr_ecfg;
+	unsigned long csr_badvaddr;	/* Last user fault */
+
+	/* Scratch registers */
+	unsigned long scr0;
+	unsigned long scr1;
+	unsigned long scr2;
+	unsigned long scr3;
+
+	/* Eflags register */
+	unsigned long eflags;
+
+	/* Other stuff associated with the thread. */
+	unsigned long trap_nr;
+	unsigned long error_code;
+	struct loongarch_vdso_info *vdso;
+
+	/*
+	 * FPU & vector registers, must be last because
+	 * they are conditionally copied at fork().
+	 */
+	struct loongarch_fpu fpu FPU_ALIGN;
+};
+
+#define INIT_THREAD  {						\
+	/*							\
+	 * Main processor registers				\
+	 */							\
+	.reg01			= 0,				\
+	.reg03			= 0,				\
+	.reg22			= 0,				\
+	.reg23			= 0,				\
+	.reg24			= 0,				\
+	.reg25			= 0,				\
+	.reg26			= 0,				\
+	.reg27			= 0,				\
+	.reg28			= 0,				\
+	.reg29			= 0,				\
+	.reg30			= 0,				\
+	.reg31			= 0,				\
+	.csr_crmd		= 0,				\
+	.csr_prmd		= 0,				\
+	.csr_euen		= 0,				\
+	.csr_ecfg		= 0,				\
+	.csr_badvaddr		= 0,				\
+	/*							\
+	 * Other stuff associated with the process		\
+	 */							\
+	.trap_nr		= 0,				\
+	.error_code		= 0,				\
+	/*							\
+	 * FPU & vector registers				\
+	 */							\
+	.fpu			= {				\
+		.fcsr		= 0,				\
+		.vcsr		= 0,				\
+		.fcc		= 0,				\
+		.fpr		= {{{0,},},},			\
+	},							\
+}
+
+struct task_struct;
+
+/* Free all resources held by a thread. */
+#define release_thread(thread) do { } while (0)
+
+enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_HALT, IDLE_NOMWAIT, IDLE_POLL};
+
+extern unsigned long		boot_option_idle_override;
+/*
+ * Do necessary setup to start up a newly executed thread.
+ */
+extern void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp);
+
+static inline void flush_thread(void)
+{
+}
+
+unsigned long __get_wchan(struct task_struct *p);
+
+#define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
+			 THREAD_SIZE - 32 - sizeof(struct pt_regs))
+#define task_pt_regs(tsk) ((struct pt_regs *)__KSTK_TOS(tsk))
+#define KSTK_EIP(tsk) (task_pt_regs(tsk)->csr_era)
+#define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[3])
+#define KSTK_EUEN(tsk) (task_pt_regs(tsk)->csr_euen)
+#define KSTK_ECFG(tsk) (task_pt_regs(tsk)->csr_ecfg)
+
+#define return_address() ({__asm__ __volatile__("":::"$1"); __builtin_return_address(0);})
+
+#ifdef CONFIG_CPU_HAS_PREFETCH
+
+#define ARCH_HAS_PREFETCH
+#define prefetch(x) __builtin_prefetch((x), 0, 1)
+
+#define ARCH_HAS_PREFETCHW
+#define prefetchw(x) __builtin_prefetch((x), 1, 1)
+
+#endif
+
+#endif /* _ASM_PROCESSOR_H */
diff --git a/arch/loongarch/include/asm/ptrace.h b/arch/loongarch/include/asm/ptrace.h
new file mode 100644
index 000000000000..17838c6b7ccd
--- /dev/null
+++ b/arch/loongarch/include/asm/ptrace.h
@@ -0,0 +1,152 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_PTRACE_H
+#define _ASM_PTRACE_H
+
+#include <asm/page.h>
+#include <asm/thread_info.h>
+#include <uapi/asm/ptrace.h>
+
+/*
+ * This struct defines the way the registers are stored on the stack during
+ * a system call/exception. If you add a register here, please also add it to
+ * regoffset_table[] in arch/loongarch/kernel/ptrace.c.
+ */
+struct pt_regs {
+	/* Main processor registers. */
+	unsigned long regs[32];
+
+	/* Original syscall arg0. */
+	unsigned long orig_a0;
+
+	/* Special CSR registers. */
+	unsigned long csr_era;
+	unsigned long csr_badvaddr;
+	unsigned long csr_crmd;
+	unsigned long csr_prmd;
+	unsigned long csr_euen;
+	unsigned long csr_ecfg;
+	unsigned long csr_estat;
+	unsigned long __last[0];
+} __aligned(8);
+
+static inline int regs_irqs_disabled(struct pt_regs *regs)
+{
+	return arch_irqs_disabled_flags(regs->csr_prmd);
+}
+
+static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
+{
+	return regs->regs[3];
+}
+
+/*
+ * Don't use asm-generic/ptrace.h: it defines FP accessors that don't make
+ * sense on LoongArch. We would rather get an error if they are invoked.
+ */
+
+static inline void instruction_pointer_set(struct pt_regs *regs, unsigned long val)
+{
+	regs->csr_era = val;
+}
+
+/* Query offset/name of register from its name/offset */
+extern int regs_query_register_offset(const char *name);
+#define MAX_REG_OFFSET (offsetof(struct pt_regs, __last))
+
+/**
+ * regs_get_register() - get register value from its offset
+ * @regs:       pt_regs from which register value is gotten.
+ * @offset:     offset number of the register.
+ *
+ * regs_get_register() returns the value of a register. @offset is the
+ * byte offset of the register within the struct pt_regs pointed to by @regs.
+ * If @offset is MAX_REG_OFFSET or bigger, this returns 0.
+ */
+static inline unsigned long regs_get_register(struct pt_regs *regs, unsigned int offset)
+{
+	if (unlikely(offset >= MAX_REG_OFFSET))
+		return 0;
+
+	return *(unsigned long *)((unsigned long)regs + offset);
+}
+
+/**
+ * regs_within_kernel_stack() - check the address in the stack
+ * @regs:       pt_regs which contains kernel stack pointer.
+ * @addr:       address which is checked.
+ *
+ * regs_within_kernel_stack() checks whether @addr is within the kernel stack page(s).
+ * If @addr is within the kernel stack, it returns true. If not, returns false.
+ */
+static inline int regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
+{
+	return ((addr & ~(THREAD_SIZE - 1))  ==
+		(kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
+}
+
+/**
+ * regs_get_kernel_stack_nth() - get Nth entry of the stack
+ * @regs:       pt_regs which contains kernel stack pointer.
+ * @n:          stack entry number.
+ *
+ * regs_get_kernel_stack_nth() returns @n th entry of the kernel stack which
+ * is specified by @regs. If the @n th entry is NOT in the kernel stack,
+ * this returns 0.
+ */
+static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n)
+{
+	unsigned long *addr = (unsigned long *)kernel_stack_pointer(regs);
+
+	addr += n;
+	if (regs_within_kernel_stack(regs, (unsigned long)addr))
+		return *addr;
+	else
+		return 0;
+}
+
+struct task_struct;
+
+/*
+ * Does the process account for user or for system time?
+ */
+#define user_mode(regs) (((regs)->csr_prmd & PLV_MASK) == PLV_USER)
+
+static inline long regs_return_value(struct pt_regs *regs)
+{
+	return regs->regs[4];
+}
+
+#define instruction_pointer(regs) ((regs)->csr_era)
+#define profile_pc(regs) instruction_pointer(regs)
+
+extern void die(const char *, struct pt_regs *) __noreturn;
+
+static inline void die_if_kernel(const char *str, struct pt_regs *regs)
+{
+	if (unlikely(!user_mode(regs)))
+		die(str, regs);
+}
+
+#define current_pt_regs()						\
+({									\
+	unsigned long sp = (unsigned long)__builtin_frame_address(0);	\
+	(struct pt_regs *)((sp | (THREAD_SIZE - 1)) + 1 - 32) - 1;	\
+})
+
+/* Helpers for working with the user stack pointer */
+
+static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+{
+	return regs->regs[3];
+}
+
+static inline void user_stack_pointer_set(struct pt_regs *regs,
+	unsigned long val)
+{
+	regs->regs[3] = val;
+}
+
+#endif /* _ASM_PTRACE_H */
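A note on the stack arithmetic used by current_pt_regs() and __KSTK_TOS(): both place the saved struct pt_regs just below a 32-byte gap at the top of the THREAD_SIZE-aligned kernel stack. The user-space sketch below reproduces only that arithmetic. The struct is a stand-in with the same field list, THREAD_SIZE is taken as 16K from thread_info.h, and the helper names are illustrative and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define THREAD_SIZE	0x4000UL	/* SZ_16K, as in asm/thread_info.h */
#define STACK_TOP_GAP	32UL		/* the "- 32" used by __KSTK_TOS() */

/* Stand-in with the same fields as the kernel's struct pt_regs. */
struct pt_regs_sketch {
	uint64_t regs[32];
	uint64_t orig_a0;
	uint64_t csr_era, csr_badvaddr, csr_crmd, csr_prmd;
	uint64_t csr_euen, csr_ecfg, csr_estat;
};

/* Mirrors __KSTK_TOS(): computed from the base of the stack area. */
static inline uintptr_t regs_from_stack_base(uintptr_t stack_base)
{
	return stack_base + THREAD_SIZE - STACK_TOP_GAP -
	       sizeof(struct pt_regs_sketch);
}

/* Mirrors current_pt_regs(): computed from any address inside the stack. */
static inline uintptr_t regs_from_sp(uintptr_t sp)
{
	/* Round up to the end of the THREAD_SIZE-aligned stack. */
	uintptr_t top = (sp | (THREAD_SIZE - 1)) + 1;

	return top - STACK_TOP_GAP - sizeof(struct pt_regs_sketch);
}
```

Both helpers should agree for every address inside one stack, which is why copy_thread() and current_pt_regs() can locate the same frame independently.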
diff --git a/arch/loongarch/include/asm/switch_to.h b/arch/loongarch/include/asm/switch_to.h
new file mode 100644
index 000000000000..2a8d04375574
--- /dev/null
+++ b/arch/loongarch/include/asm/switch_to.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_SWITCH_TO_H
+#define _ASM_SWITCH_TO_H
+
+#include <asm/cpu-features.h>
+#include <asm/fpu.h>
+
+struct task_struct;
+
+/**
+ * __switch_to - switch execution of a task
+ * @prev:	The task previously executed.
+ * @next:	The task to begin executing.
+ * @next_ti:	task_thread_info(next).
+ *
+ * This function is used whilst scheduling to save the context of prev & load
+ * the context of next. Returns prev.
+ */
+extern asmlinkage struct task_struct *__switch_to(struct task_struct *prev,
+			struct task_struct *next, struct thread_info *next_ti);
+
+/*
+ * For newly created kernel threads switch_to() will return to
+ * ret_from_kernel_thread, newly created user threads to ret_from_fork.
+ * That is, everything following __switch_to() will be skipped for new threads.
+ * So everything that matters to new threads should be placed before __switch_to().
+ */
+#define switch_to(prev, next, last)					\
+do {									\
+	lose_fpu_inatomic(1, prev);					\
+	(last) = __switch_to(prev, next, task_thread_info(next));	\
+} while (0)
+
+#endif /* _ASM_SWITCH_TO_H */
diff --git a/arch/loongarch/include/asm/thread_info.h b/arch/loongarch/include/asm/thread_info.h
new file mode 100644
index 000000000000..99beb11c2fa8
--- /dev/null
+++ b/arch/loongarch/include/asm/thread_info.h
@@ -0,0 +1,106 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * thread_info.h: LoongArch low-level thread information
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef _ASM_THREAD_INFO_H
+#define _ASM_THREAD_INFO_H
+
+#ifdef __KERNEL__
+
+#ifndef __ASSEMBLY__
+
+#include <asm/processor.h>
+
+/*
+ * low level task data that entry.S needs immediate access to
+ * - this struct should fit entirely inside of one cache line
+ * - this struct shares the supervisor stack pages
+ * - if the contents of this structure are changed, the assembly constants
+ *   must also be changed
+ */
+struct thread_info {
+	struct task_struct	*task;		/* main task structure */
+	unsigned long		flags;		/* low level flags */
+	unsigned long		tp_value;	/* thread pointer */
+	__u32			cpu;		/* current CPU */
+	int			preempt_count;	/* 0 => preemptible, <0 => BUG */
+	struct pt_regs		*regs;
+	unsigned long		syscall;	/* syscall number */
+	unsigned long		syscall_work;	/* SYSCALL_WORK_ flags */
+};
+
+/*
+ * macros/functions for gaining access to the thread information structure
+ */
+#define INIT_THREAD_INFO(tsk)			\
+{						\
+	.task		= &tsk,			\
+	.flags		= 0,			\
+	.cpu		= 0,			\
+	.preempt_count	= INIT_PREEMPT_COUNT,	\
+}
+
+/* How to get the thread information struct from C. */
+register struct thread_info *__current_thread_info __asm__("$r2");
+
+static inline struct thread_info *current_thread_info(void)
+{
+	return __current_thread_info;
+}
+
+register unsigned long current_stack_pointer __asm__("$r3");
+
+#endif /* !__ASSEMBLY__ */
+
+/* thread information allocation */
+#define THREAD_SIZE		SZ_16K
+#define THREAD_MASK		(THREAD_SIZE - 1UL)
+#define THREAD_SIZE_ORDER	ilog2(THREAD_SIZE / PAGE_SIZE)
+/*
+ * thread information flags
+ * - these are process state flags that various assembly files may need to
+ *   access
+ * - pending work-to-be-done flags are in LSW
+ * - other flags in MSW
+ */
+#define TIF_SIGPENDING		1	/* signal pending */
+#define TIF_NEED_RESCHED	2	/* rescheduling necessary */
+#define TIF_NOTIFY_RESUME	3	/* callback before returning to user */
+#define TIF_NOTIFY_SIGNAL	4	/* signal notifications exist */
+#define TIF_RESTORE_SIGMASK	5	/* restore signal mask in do_signal() */
+#define TIF_NOHZ		6	/* in adaptive nohz mode */
+#define TIF_UPROBE		7	/* breakpointed or singlestepping */
+#define TIF_USEDFPU		8	/* FPU was used by this task this quantum (SMP) */
+#define TIF_USEDSIMD		9	/* SIMD has been used this quantum */
+#define TIF_MEMDIE		10	/* is terminating due to OOM killer */
+#define TIF_FIXADE		11	/* Fix address errors in software */
+#define TIF_LOGADE		12	/* Log address errors to syslog */
+#define TIF_32BIT_REGS		13	/* 32-bit general purpose registers */
+#define TIF_32BIT_ADDR		14	/* 32-bit address space */
+#define TIF_LOAD_WATCH		15	/* If set, load watch registers */
+#define TIF_SINGLESTEP		16	/* Single Step */
+#define TIF_LSX_CTX_LIVE	17	/* LSX context must be preserved */
+#define TIF_LASX_CTX_LIVE	18	/* LASX context must be preserved */
+
+#define _TIF_SIGPENDING		(1<<TIF_SIGPENDING)
+#define _TIF_NEED_RESCHED	(1<<TIF_NEED_RESCHED)
+#define _TIF_NOTIFY_RESUME	(1<<TIF_NOTIFY_RESUME)
+#define _TIF_NOTIFY_SIGNAL	(1<<TIF_NOTIFY_SIGNAL)
+#define _TIF_NOHZ		(1<<TIF_NOHZ)
+#define _TIF_UPROBE		(1<<TIF_UPROBE)
+#define _TIF_USEDFPU		(1<<TIF_USEDFPU)
+#define _TIF_USEDSIMD		(1<<TIF_USEDSIMD)
+#define _TIF_FIXADE		(1<<TIF_FIXADE)
+#define _TIF_LOGADE		(1<<TIF_LOGADE)
+#define _TIF_32BIT_REGS		(1<<TIF_32BIT_REGS)
+#define _TIF_32BIT_ADDR		(1<<TIF_32BIT_ADDR)
+#define _TIF_LOAD_WATCH		(1<<TIF_LOAD_WATCH)
+#define _TIF_SINGLESTEP		(1<<TIF_SINGLESTEP)
+#define _TIF_LSX_CTX_LIVE	(1<<TIF_LSX_CTX_LIVE)
+#define _TIF_LASX_CTX_LIVE	(1<<TIF_LASX_CTX_LIVE)
+
+#endif /* __KERNEL__ */
+#endif /* _ASM_THREAD_INFO_H */
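To illustrate how the TIF_* bit numbers relate to the _TIF_* masks above, here is a small user-space sketch. It copies a few bit numbers from the list, derives masks by shifting, and shows set/clear/test on a plain flags word. The helper names and the choice of "pending work" mask are assumptions for illustration, and the kernel's own thread-flag helpers are atomic, which this sketch is not.

```c
#include <stdbool.h>

/* Bit numbers copied from the TIF_* list (subset). */
#define TIF_SIGPENDING		1
#define TIF_NEED_RESCHED	2
#define TIF_NOTIFY_RESUME	3
#define TIF_USEDFPU		8

/* Same derivation as the _TIF_* macros: mask = 1 << bit number. */
#define TIF_MASK(bit)		(1UL << (bit))

static inline void flag_set(unsigned long *flags, int bit)
{
	*flags |= TIF_MASK(bit);
}

static inline void flag_clear(unsigned long *flags, int bit)
{
	*flags &= ~TIF_MASK(bit);
}

static inline bool flag_test(unsigned long flags, int bit)
{
	return (flags & TIF_MASK(bit)) != 0;
}

/* Illustrative only: test several low "work to do" bits in one go. */
static inline bool work_pending(unsigned long flags)
{
	return (flags & (TIF_MASK(TIF_SIGPENDING) |
			 TIF_MASK(TIF_NEED_RESCHED) |
			 TIF_MASK(TIF_NOTIFY_RESUME))) != 0;
}
```

Keeping the pending-work flags in the low bits lets assembly test them with a single mask and branch.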
diff --git a/arch/loongarch/include/uapi/asm/ptrace.h b/arch/loongarch/include/uapi/asm/ptrace.h
new file mode 100644
index 000000000000..083193f4a5d5
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/ptrace.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _UAPI_ASM_PTRACE_H
+#define _UAPI_ASM_PTRACE_H
+
+#include <linux/types.h>
+
+#ifndef __KERNEL__
+#include <stdint.h>
+#endif
+
+/*
+ * For PTRACE_{POKE,PEEK}USR. 0 - 31 are GPRs,
+ * 32 is syscall's original ARG0, 33 is PC, 34 is BADVADDR.
+ */
+#define GPR_BASE	0
+#define GPR_NUM		32
+#define GPR_END		(GPR_BASE + GPR_NUM - 1)
+#define ARG0		(GPR_END + 1)
+#define PC		(GPR_END + 2)
+#define BADVADDR	(GPR_END + 3)
+
+#define NUM_FPU_REGS	32
+
+struct user_pt_regs {
+	/* Main processor registers. */
+	unsigned long regs[32];
+
+	/* Original syscall arg0. */
+	unsigned long orig_a0;
+
+	/* Special CSR registers. */
+	unsigned long csr_era;
+	unsigned long csr_badv;
+	unsigned long reserved[10];
+} __attribute__((aligned(8)));
+
+struct user_fp_state {
+	uint64_t    fpr[32];
+	uint64_t    fcc;
+	uint32_t    fcsr;
+};
+
+#define PTRACE_SYSEMU			0x1f
+#define PTRACE_SYSEMU_SINGLESTEP	0x20
+
+#endif /* _UAPI_ASM_PTRACE_H */
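The register indices defined above line up with the field order of struct user_pt_regs: slots 0 to 31 are the GPRs, followed by orig_a0, csr_era and csr_badv. The sketch below checks that correspondence in user space. regs_index_to_offset() is a hypothetical helper, and whether PTRACE_PEEKUSR itself takes a slot index or a byte offset is not shown in this section, so treat the mapping as an illustration of the layout only.

```c
#include <stddef.h>

#define GPR_BASE	0
#define GPR_NUM		32
#define GPR_END		(GPR_BASE + GPR_NUM - 1)
#define ARG0		(GPR_END + 1)
#define PC		(GPR_END + 2)
#define BADVADDR	(GPR_END + 3)

/* Same layout as the uapi struct above. */
struct user_pt_regs {
	unsigned long regs[32];
	unsigned long orig_a0;
	unsigned long csr_era;
	unsigned long csr_badv;
	unsigned long reserved[10];
} __attribute__((aligned(8)));

/*
 * Hypothetical helper: byte offset inside struct user_pt_regs for a
 * slot index, or -1 if the index is past BADVADDR.
 */
static inline long regs_index_to_offset(unsigned int idx)
{
	if (idx > BADVADDR)
		return -1;
	return (long)(idx * sizeof(unsigned long));
}
```

The reserved[10] tail leaves room to expose more registers later without changing the size of the structure.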
diff --git a/arch/loongarch/kernel/fpu.S b/arch/loongarch/kernel/fpu.S
new file mode 100644
index 000000000000..7ff9b91043b1
--- /dev/null
+++ b/arch/loongarch/kernel/fpu.S
@@ -0,0 +1,264 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Lu Zeng <zenglu@loongson.cn>
+ *         Pei Huang <huangpei@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/asm-offsets.h>
+#include <asm/errno.h>
+#include <asm/export.h>
+#include <asm/fpregdef.h>
+#include <asm/loongarch.h>
+#include <asm/regdef.h>
+
+#undef v0
+#undef v1
+
+#define FPU_REG_WIDTH		8
+#define LSX_REG_WIDTH		16
+#define LASX_REG_WIDTH		32
+
+	.macro	EX insn, reg, src, offs
+.ex\@:	\insn	\reg, \src, \offs
+	.section __ex_table,"a"
+	PTR	.ex\@, fault
+	.previous
+	.endm
+
+	.macro sc_save_fp base
+	EX	fst.d $f0,  \base, (0 * FPU_REG_WIDTH)
+	EX	fst.d $f1,  \base, (1 * FPU_REG_WIDTH)
+	EX	fst.d $f2,  \base, (2 * FPU_REG_WIDTH)
+	EX	fst.d $f3,  \base, (3 * FPU_REG_WIDTH)
+	EX	fst.d $f4,  \base, (4 * FPU_REG_WIDTH)
+	EX	fst.d $f5,  \base, (5 * FPU_REG_WIDTH)
+	EX	fst.d $f6,  \base, (6 * FPU_REG_WIDTH)
+	EX	fst.d $f7,  \base, (7 * FPU_REG_WIDTH)
+	EX	fst.d $f8,  \base, (8 * FPU_REG_WIDTH)
+	EX	fst.d $f9,  \base, (9 * FPU_REG_WIDTH)
+	EX	fst.d $f10, \base, (10 * FPU_REG_WIDTH)
+	EX	fst.d $f11, \base, (11 * FPU_REG_WIDTH)
+	EX	fst.d $f12, \base, (12 * FPU_REG_WIDTH)
+	EX	fst.d $f13, \base, (13 * FPU_REG_WIDTH)
+	EX	fst.d $f14, \base, (14 * FPU_REG_WIDTH)
+	EX	fst.d $f15, \base, (15 * FPU_REG_WIDTH)
+	EX	fst.d $f16, \base, (16 * FPU_REG_WIDTH)
+	EX	fst.d $f17, \base, (17 * FPU_REG_WIDTH)
+	EX	fst.d $f18, \base, (18 * FPU_REG_WIDTH)
+	EX	fst.d $f19, \base, (19 * FPU_REG_WIDTH)
+	EX	fst.d $f20, \base, (20 * FPU_REG_WIDTH)
+	EX	fst.d $f21, \base, (21 * FPU_REG_WIDTH)
+	EX	fst.d $f22, \base, (22 * FPU_REG_WIDTH)
+	EX	fst.d $f23, \base, (23 * FPU_REG_WIDTH)
+	EX	fst.d $f24, \base, (24 * FPU_REG_WIDTH)
+	EX	fst.d $f25, \base, (25 * FPU_REG_WIDTH)
+	EX	fst.d $f26, \base, (26 * FPU_REG_WIDTH)
+	EX	fst.d $f27, \base, (27 * FPU_REG_WIDTH)
+	EX	fst.d $f28, \base, (28 * FPU_REG_WIDTH)
+	EX	fst.d $f29, \base, (29 * FPU_REG_WIDTH)
+	EX	fst.d $f30, \base, (30 * FPU_REG_WIDTH)
+	EX	fst.d $f31, \base, (31 * FPU_REG_WIDTH)
+	.endm
+
+	.macro sc_restore_fp base
+	EX	fld.d $f0,  \base, (0 * FPU_REG_WIDTH)
+	EX	fld.d $f1,  \base, (1 * FPU_REG_WIDTH)
+	EX	fld.d $f2,  \base, (2 * FPU_REG_WIDTH)
+	EX	fld.d $f3,  \base, (3 * FPU_REG_WIDTH)
+	EX	fld.d $f4,  \base, (4 * FPU_REG_WIDTH)
+	EX	fld.d $f5,  \base, (5 * FPU_REG_WIDTH)
+	EX	fld.d $f6,  \base, (6 * FPU_REG_WIDTH)
+	EX	fld.d $f7,  \base, (7 * FPU_REG_WIDTH)
+	EX	fld.d $f8,  \base, (8 * FPU_REG_WIDTH)
+	EX	fld.d $f9,  \base, (9 * FPU_REG_WIDTH)
+	EX	fld.d $f10, \base, (10 * FPU_REG_WIDTH)
+	EX	fld.d $f11, \base, (11 * FPU_REG_WIDTH)
+	EX	fld.d $f12, \base, (12 * FPU_REG_WIDTH)
+	EX	fld.d $f13, \base, (13 * FPU_REG_WIDTH)
+	EX	fld.d $f14, \base, (14 * FPU_REG_WIDTH)
+	EX	fld.d $f15, \base, (15 * FPU_REG_WIDTH)
+	EX	fld.d $f16, \base, (16 * FPU_REG_WIDTH)
+	EX	fld.d $f17, \base, (17 * FPU_REG_WIDTH)
+	EX	fld.d $f18, \base, (18 * FPU_REG_WIDTH)
+	EX	fld.d $f19, \base, (19 * FPU_REG_WIDTH)
+	EX	fld.d $f20, \base, (20 * FPU_REG_WIDTH)
+	EX	fld.d $f21, \base, (21 * FPU_REG_WIDTH)
+	EX	fld.d $f22, \base, (22 * FPU_REG_WIDTH)
+	EX	fld.d $f23, \base, (23 * FPU_REG_WIDTH)
+	EX	fld.d $f24, \base, (24 * FPU_REG_WIDTH)
+	EX	fld.d $f25, \base, (25 * FPU_REG_WIDTH)
+	EX	fld.d $f26, \base, (26 * FPU_REG_WIDTH)
+	EX	fld.d $f27, \base, (27 * FPU_REG_WIDTH)
+	EX	fld.d $f28, \base, (28 * FPU_REG_WIDTH)
+	EX	fld.d $f29, \base, (29 * FPU_REG_WIDTH)
+	EX	fld.d $f30, \base, (30 * FPU_REG_WIDTH)
+	EX	fld.d $f31, \base, (31 * FPU_REG_WIDTH)
+	.endm
+
+	.macro sc_save_fcc base, tmp0, tmp1
+	movcf2gr	\tmp0, $fcc0
+	move	\tmp1, \tmp0
+	movcf2gr	\tmp0, $fcc1
+	bstrins.d	\tmp1, \tmp0, 15, 8
+	movcf2gr	\tmp0, $fcc2
+	bstrins.d	\tmp1, \tmp0, 23, 16
+	movcf2gr	\tmp0, $fcc3
+	bstrins.d	\tmp1, \tmp0, 31, 24
+	movcf2gr	\tmp0, $fcc4
+	bstrins.d	\tmp1, \tmp0, 39, 32
+	movcf2gr	\tmp0, $fcc5
+	bstrins.d	\tmp1, \tmp0, 47, 40
+	movcf2gr	\tmp0, $fcc6
+	bstrins.d	\tmp1, \tmp0, 55, 48
+	movcf2gr	\tmp0, $fcc7
+	bstrins.d	\tmp1, \tmp0, 63, 56
+	EX	st.d \tmp1, \base, 0
+	.endm
+
+	.macro sc_restore_fcc base, tmp0, tmp1
+	EX	ld.d \tmp0, \base, 0
+	bstrpick.d	\tmp1, \tmp0, 7, 0
+	movgr2cf	$fcc0, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 15, 8
+	movgr2cf	$fcc1, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 23, 16
+	movgr2cf	$fcc2, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 31, 24
+	movgr2cf	$fcc3, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 39, 32
+	movgr2cf	$fcc4, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 47, 40
+	movgr2cf	$fcc5, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 55, 48
+	movgr2cf	$fcc6, \tmp1
+	bstrpick.d	\tmp1, \tmp0, 63, 56
+	movgr2cf	$fcc7, \tmp1
+	.endm
+
+	.macro sc_save_fcsr base, tmp0
+	movfcsr2gr	\tmp0, fcsr0
+	EX	st.w \tmp0, \base, 0
+	.endm
+
+	.macro sc_restore_fcsr base, tmp0
+	EX	ld.w \tmp0, \base, 0
+	movgr2fcsr	fcsr0, \tmp0
+	.endm
+
+	.macro sc_save_vcsr base, tmp0
+	movfcsr2gr	\tmp0, vcsr16
+	EX	st.w \tmp0, \base, 0
+	.endm
+
+	.macro sc_restore_vcsr base, tmp0
+	EX	ld.w \tmp0, \base, 0
+	movgr2fcsr	vcsr16, \tmp0
+	.endm
+
+/*
+ * Save a thread's fp context.
+ */
+SYM_FUNC_START(_save_fp)
+	fpu_save_csr	a0 t1
+	fpu_save_double a0 t1			# clobbers t1
+	fpu_save_cc	a0 t1 t2		# clobbers t1, t2
+	jirl zero, ra, 0
+SYM_FUNC_END(_save_fp)
+EXPORT_SYMBOL(_save_fp)
+
+/*
+ * Restore a thread's fp context.
+ */
+SYM_FUNC_START(_restore_fp)
+	fpu_restore_double a0 t1		# clobbers t1
+	fpu_restore_csr	a0 t1
+	fpu_restore_cc	a0 t1 t2		# clobbers t1, t2
+	jirl zero, ra, 0
+SYM_FUNC_END(_restore_fp)
+
+/*
+ * Load the FPU with signalling NaNs. The bit pattern we're using has
+ * the property that it represents a signalling NaN no matter whether it
+ * is considered as single or as double precision.
+ *
+ * The value to initialize fcsr0 to comes in $a0.
+ */
+
+SYM_FUNC_START(_init_fpu)
+	li.w	t1, CSR_EUEN_FPEN
+	csrxchg	t1, t1, LOONGARCH_CSR_EUEN
+
+	movgr2fcsr	fcsr0, a0
+
+	li.w	t1, -1				# SNaN
+
+	movgr2fr.d	$f0, t1
+	movgr2fr.d	$f1, t1
+	movgr2fr.d	$f2, t1
+	movgr2fr.d	$f3, t1
+	movgr2fr.d	$f4, t1
+	movgr2fr.d	$f5, t1
+	movgr2fr.d	$f6, t1
+	movgr2fr.d	$f7, t1
+	movgr2fr.d	$f8, t1
+	movgr2fr.d	$f9, t1
+	movgr2fr.d	$f10, t1
+	movgr2fr.d	$f11, t1
+	movgr2fr.d	$f12, t1
+	movgr2fr.d	$f13, t1
+	movgr2fr.d	$f14, t1
+	movgr2fr.d	$f15, t1
+	movgr2fr.d	$f16, t1
+	movgr2fr.d	$f17, t1
+	movgr2fr.d	$f18, t1
+	movgr2fr.d	$f19, t1
+	movgr2fr.d	$f20, t1
+	movgr2fr.d	$f21, t1
+	movgr2fr.d	$f22, t1
+	movgr2fr.d	$f23, t1
+	movgr2fr.d	$f24, t1
+	movgr2fr.d	$f25, t1
+	movgr2fr.d	$f26, t1
+	movgr2fr.d	$f27, t1
+	movgr2fr.d	$f28, t1
+	movgr2fr.d	$f29, t1
+	movgr2fr.d	$f30, t1
+	movgr2fr.d	$f31, t1
+
+	jirl zero, ra, 0
+SYM_FUNC_END(_init_fpu)
+
+/*
+ * a0: fpregs
+ * a1: fcc
+ * a2: fcsr
+ */
+SYM_FUNC_START(_save_fp_context)
+	sc_save_fcc a1 t1 t2
+	sc_save_fcsr a2 t1
+	sc_save_fp a0
+	li.w	a0, 0					# success
+	jirl zero, ra, 0
+SYM_FUNC_END(_save_fp_context)
+
+/*
+ * a0: fpregs
+ * a1: fcc
+ * a2: fcsr
+ */
+SYM_FUNC_START(_restore_fp_context)
+	sc_restore_fp a0
+	sc_restore_fcc a1 t1 t2
+	sc_restore_fcsr a2 t1
+	li.w	a0, 0					# success
+	jirl zero, ra, 0
+SYM_FUNC_END(_restore_fp_context)
+
+SYM_FUNC_START(fault)
+	li.w	a0, -EFAULT				# failure
+	jirl zero, ra, 0
+SYM_FUNC_END(fault)
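For readers less familiar with the bstrins.d/bstrpick.d sequences in sc_save_fcc and sc_restore_fcc: the eight condition flags $fcc0..$fcc7 are stored in one 64-bit word, one flag per byte. The C sketch below models that packing. The function names are illustrative, and keeping only bit 0 of each byte on unpack is an assumption about how the flag value is consumed on restore.

```c
#include <stdint.h>

#define NUM_FCC	8	/* $fcc0 .. $fcc7 */

/* Pack eight 1-bit condition flags, one per byte, like sc_save_fcc. */
static inline uint64_t fcc_pack(const uint8_t fcc[NUM_FCC])
{
	uint64_t word = 0;
	int i;

	/* Flag i lands in bits [8*i+7 : 8*i], as bstrins.d does. */
	for (i = 0; i < NUM_FCC; i++)
		word |= (uint64_t)(fcc[i] & 1) << (8 * i);

	return word;
}

/* Unpack: pick byte i (as bstrpick.d does) and keep its low bit. */
static inline void fcc_unpack(uint64_t word, uint8_t fcc[NUM_FCC])
{
	int i;

	for (i = 0; i < NUM_FCC; i++)
		fcc[i] = (uint8_t)((word >> (8 * i)) & 1);
}
```

Spending a whole byte per flag wastes space but keeps each save or restore step to a single bit-field instruction.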
diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
new file mode 100644
index 000000000000..1a65d0527d25
--- /dev/null
+++ b/arch/loongarch/kernel/idle.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * LoongArch idle loop support.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/cpu.h>
+#include <linux/irqflags.h>
+#include <asm/cpu.h>
+#include <asm/idle.h>
+
+void __cpuidle arch_cpu_idle(void)
+{
+	raw_local_irq_enable();
+	__arch_cpu_idle(); /* idle instruction needs irq enabled */
+}
diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c
new file mode 100644
index 000000000000..8ac0dcf18be3
--- /dev/null
+++ b/arch/loongarch/kernel/process.c
@@ -0,0 +1,260 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1994 - 1999, 2000 by Ralf Baechle and others.
+ * Copyright (C) 2005, 2006 by Ralf Baechle (ralf@linux-mips.org)
+ * Copyright (C) 1999, 2000 Silicon Graphics, Inc.
+ * Copyright (C) 2004 Thiemo Seufer
+ * Copyright (C) 2013  Imagination Technologies Ltd.
+ */
+#include <linux/cpu.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/sched.h>
+#include <linux/sched/debug.h>
+#include <linux/sched/task.h>
+#include <linux/sched/task_stack.h>
+#include <linux/mm.h>
+#include <linux/stddef.h>
+#include <linux/unistd.h>
+#include <linux/export.h>
+#include <linux/ptrace.h>
+#include <linux/mman.h>
+#include <linux/personality.h>
+#include <linux/sys.h>
+#include <linux/completion.h>
+#include <linux/kallsyms.h>
+#include <linux/random.h>
+#include <linux/prctl.h>
+#include <linux/nmi.h>
+
+#include <asm/asm.h>
+#include <asm/bootinfo.h>
+#include <asm/cpu.h>
+#include <asm/elf.h>
+#include <asm/fpu.h>
+#include <asm/io.h>
+#include <asm/irq.h>
+#include <asm/irq_regs.h>
+#include <asm/loongarch.h>
+#include <asm/pgtable.h>
+#include <asm/processor.h>
+#include <asm/reg.h>
+#include <asm/vdso.h>
+
+/*
+ * Idle related variables and functions
+ */
+
+unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE;
+EXPORT_SYMBOL(boot_option_idle_override);
+
+asmlinkage void ret_from_fork(void);
+asmlinkage void ret_from_kernel_thread(void);
+
+void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp)
+{
+	unsigned long crmd;
+	unsigned long prmd;
+	unsigned long euen;
+
+	/* New thread loses kernel privileges. */
+	crmd = regs->csr_crmd & ~(PLV_MASK);
+	crmd |= PLV_USER;
+	regs->csr_crmd = crmd;
+
+	prmd = regs->csr_prmd & ~(PLV_MASK);
+	prmd |= PLV_USER;
+	regs->csr_prmd = prmd;
+
+	euen = regs->csr_euen & ~(CSR_EUEN_FPEN);
+	regs->csr_euen = euen;
+	lose_fpu(0);
+
+	clear_thread_flag(TIF_LSX_CTX_LIVE);
+	clear_thread_flag(TIF_LASX_CTX_LIVE);
+	clear_used_math();
+	regs->csr_era = pc;
+	regs->regs[3] = sp;
+}
+
+void exit_thread(struct task_struct *tsk)
+{
+}
+
+int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+{
+	/*
+	 * Save any process state which is live in hardware registers to the
+	 * parent context prior to duplication. This prevents the new child
+	 * state becoming stale if the parent is preempted before copy_thread()
+	 * gets a chance to save the parent's live hardware registers to the
+	 * child context.
+	 */
+	preempt_disable();
+
+	if (is_fpu_owner())
+		save_fp(current);
+
+	preempt_enable();
+
+	if (used_math())
+		memcpy(dst, src, sizeof(struct task_struct));
+	else
+		memcpy(dst, src, offsetof(struct task_struct, thread.fpu.fpr));
+
+	return 0;
+}
+
+/*
+ * Copy architecture-specific thread state
+ */
+int copy_thread(unsigned long clone_flags, unsigned long usp,
+	unsigned long kthread_arg, struct task_struct *p, unsigned long tls)
+{
+	unsigned long childksp;
+	struct pt_regs *childregs, *regs = current_pt_regs();
+
+	childksp = (unsigned long)task_stack_page(p) + THREAD_SIZE - 32;
+
+	/* set up new TSS. */
+	childregs = (struct pt_regs *) childksp - 1;
+	/*  Put the stack after the struct pt_regs.  */
+	childksp = (unsigned long) childregs;
+	p->thread.csr_euen = 0;
+	p->thread.csr_crmd = csr_readl(LOONGARCH_CSR_CRMD);
+	p->thread.csr_prmd = csr_readl(LOONGARCH_CSR_PRMD);
+	p->thread.csr_ecfg = csr_readl(LOONGARCH_CSR_ECFG);
+	if (unlikely(p->flags & (PF_KTHREAD | PF_IO_WORKER))) {
+		/* kernel thread */
+		p->thread.reg23 = usp; /* fn */
+		p->thread.reg24 = kthread_arg;
+		p->thread.reg03 = childksp;
+		p->thread.reg01 = (unsigned long) ret_from_kernel_thread;
+		memset(childregs, 0, sizeof(struct pt_regs));
+		childregs->csr_euen = p->thread.csr_euen;
+		childregs->csr_crmd = p->thread.csr_crmd;
+		childregs->csr_prmd = p->thread.csr_prmd;
+		childregs->csr_ecfg = p->thread.csr_ecfg;
+		return 0;
+	}
+
+	/* user thread */
+	*childregs = *regs;
+	childregs->regs[4] = 0; /* Child gets zero as return value */
+	if (usp)
+		childregs->regs[3] = usp;
+
+	p->thread.reg03 = (unsigned long) childregs;
+	p->thread.reg01 = (unsigned long) ret_from_fork;
+
+	/*
+	 * New tasks lose permission to use the fpu. This accelerates context
+	 * switching for most programs since they don't use the fpu.
+	 */
+	childregs->csr_euen = 0;
+
+	clear_tsk_thread_flag(p, TIF_USEDFPU);
+	clear_tsk_thread_flag(p, TIF_USEDSIMD);
+	clear_tsk_thread_flag(p, TIF_LSX_CTX_LIVE);
+	clear_tsk_thread_flag(p, TIF_LASX_CTX_LIVE);
+
+	if (clone_flags & CLONE_SETTLS)
+		childregs->regs[2] = tls;
+
+	return 0;
+}
+
+unsigned long __get_wchan(struct task_struct *task)
+{
+	return 0;
+}
+
+unsigned long stack_top(void)
+{
+	unsigned long top = TASK_SIZE & PAGE_MASK;
+
+	/* Space for the VDSO & data page */
+	top -= PAGE_ALIGN(current->thread.vdso->size);
+	top -= PAGE_SIZE;
+
+	/* Space to randomize the VDSO base */
+	if (current->flags & PF_RANDOMIZE)
+		top -= VDSO_RANDOMIZE_SIZE;
+
+	return top;
+}
+
+/*
+ * Don't forget that the stack pointer must be aligned on an 8-byte
+ * boundary for the 32-bit ABI and a 16-byte boundary for the 64-bit ABI.
+ */
+unsigned long arch_align_stack(unsigned long sp)
+{
+	if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
+		sp -= get_random_int() & ~PAGE_MASK;
+
+	return sp & ALMASK;
+}
+
+static DEFINE_PER_CPU(call_single_data_t, backtrace_csd);
+static struct cpumask backtrace_csd_busy;
+
+static void handle_backtrace(void *info)
+{
+	nmi_cpu_backtrace(get_irq_regs());
+	cpumask_clear_cpu(smp_processor_id(), &backtrace_csd_busy);
+}
+
+static void raise_backtrace(cpumask_t *mask)
+{
+	call_single_data_t *csd;
+	int cpu;
+
+	for_each_cpu(cpu, mask) {
+		/*
+		 * If we previously sent an IPI to the target CPU & it hasn't
+		 * cleared its bit in the busy cpumask then it didn't handle
+		 * our previous IPI & it's not safe for us to reuse the
+		 * call_single_data_t.
+		 */
+		if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
+			pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
+				cpu);
+			continue;
+		}
+
+		csd = &per_cpu(backtrace_csd, cpu);
+		csd->func = handle_backtrace;
+		smp_call_function_single_async(cpu, csd);
+	}
+}
+
+void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+{
+	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
+}
+
+#ifdef CONFIG_64BIT
+void loongarch_dump_regs64(u64 *uregs, const struct pt_regs *regs)
+{
+	unsigned int i;
+
+	for (i = LOONGARCH_EF_R1; i <= LOONGARCH_EF_R31; i++) {
+		uregs[i] = regs->regs[i - LOONGARCH_EF_R0];
+	}
+
+	uregs[LOONGARCH_EF_ORIG_A0] = regs->orig_a0;
+	uregs[LOONGARCH_EF_CSR_ERA] = regs->csr_era;
+	uregs[LOONGARCH_EF_CSR_BADV] = regs->csr_badvaddr;
+	uregs[LOONGARCH_EF_CSR_CRMD] = regs->csr_crmd;
+	uregs[LOONGARCH_EF_CSR_PRMD] = regs->csr_prmd;
+	uregs[LOONGARCH_EF_CSR_EUEN] = regs->csr_euen;
+	uregs[LOONGARCH_EF_CSR_ECFG] = regs->csr_ecfg;
+	uregs[LOONGARCH_EF_CSR_ESTAT] = regs->csr_estat;
+}
+#endif /* CONFIG_64BIT */
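As an illustration of arch_align_stack() above: the stack pointer is first lowered by a random sub-page amount (when randomization is enabled) and then rounded down to the ABI alignment. The sketch below models that with the random value passed in so the result is reproducible. The 16 KiB page size and the 16-byte ALMASK value are assumptions, since neither definition appears in this part of the patch, and align_stack_sketch() is a hypothetical name.

```c
#include <stdint.h>

#define PAGE_SIZE_SKETCH	0x4000UL	/* assumed 16 KiB pages */
#define ALMASK_SKETCH		(~15UL)		/* assumed 16-byte alignment */

/*
 * Model of arch_align_stack(): rnd stands in for get_random_int(),
 * randomize for the personality/randomize_va_space check.
 */
static inline unsigned long align_stack_sketch(unsigned long sp,
					       unsigned int rnd,
					       int randomize)
{
	if (randomize)
		sp -= rnd & (PAGE_SIZE_SKETCH - 1);	/* ~PAGE_MASK */

	return sp & ALMASK_SKETCH;
}
```

Under these assumptions the result is always 16-byte aligned and never more than one page plus 15 bytes below the input.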
diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
new file mode 100644
index 000000000000..e6ab87948e1d
--- /dev/null
+++ b/arch/loongarch/kernel/ptrace.c
@@ -0,0 +1,431 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1992 Ross Biro
+ * Copyright (C) Linus Torvalds
+ * Copyright (C) 1994, 95, 96, 97, 98, 2000 Ralf Baechle
+ * Copyright (C) 1996 David S. Miller
+ * Kevin D. Kissell, kevink@mips.com and Carsten Langgaard, carstenl@mips.com
+ * Copyright (C) 1999 MIPS Technologies, Inc.
+ * Copyright (C) 2000 Ulf Carlsson
+ */
+#include <linux/kernel.h>
+#include <linux/audit.h>
+#include <linux/compiler.h>
+#include <linux/context_tracking.h>
+#include <linux/elf.h>
+#include <linux/errno.h>
+#include <linux/mm.h>
+#include <linux/ptrace.h>
+#include <linux/regset.h>
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+#include <linux/smp.h>
+#include <linux/stddef.h>
+#include <linux/seccomp.h>
+#include <linux/uaccess.h>
+
+#include <asm/byteorder.h>
+#include <asm/cpu.h>
+#include <asm/cpu-info.h>
+#include <asm/fpu.h>
+#include <asm/loongarch.h>
+#include <asm/page.h>
+#include <asm/pgtable.h>
+#include <asm/processor.h>
+#include <asm/reg.h>
+#include <asm/syscall.h>
+
+static void init_fp_ctx(struct task_struct *target)
+{
+	/* The target already has context */
+	if (tsk_used_math(target))
+		return;
+
+	/* Begin with data registers set to all 1s... */
+	memset(&target->thread.fpu.fpr, ~0, sizeof(target->thread.fpu.fpr));
+	set_stopped_child_used_math(target);
+}
+
+/*
+ * Called by kernel/ptrace.c when detaching.
+ *
+ * Make sure single step bits etc are not set.
+ */
+void ptrace_disable(struct task_struct *child)
+{
+	/* Don't load the watchpoint registers for the ex-child. */
+	clear_tsk_thread_flag(child, TIF_LOAD_WATCH);
+	clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+}
+
+/* regset get/set implementations */
+
+static int gpr_get(struct task_struct *target,
+		   const struct user_regset *regset,
+		   struct membuf to)
+{
+	int r;
+	struct pt_regs *regs = task_pt_regs(target);
+
+	r = membuf_write(&to, &regs->regs, sizeof(u64) * GPR_NUM);
+	r = membuf_write(&to, &regs->orig_a0, sizeof(u64));
+	r = membuf_write(&to, &regs->csr_era, sizeof(u64));
+	r = membuf_write(&to, &regs->csr_badvaddr, sizeof(u64));
+
+	return r;
+}
+
+static int gpr_set(struct task_struct *target,
+		   const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   const void *kbuf, const void __user *ubuf)
+{
+	int err;
+	int a0_start = sizeof(u64) * GPR_NUM;
+	int era_start = a0_start + sizeof(u64);
+	int badvaddr_start = era_start + sizeof(u64);
+	struct pt_regs *regs = task_pt_regs(target);
+
+	err = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+				 &regs->regs,
+				 0, a0_start);
+	err |= user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+				 &regs->orig_a0,
+				 a0_start, a0_start + sizeof(u64));
+	err |= user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+				 &regs->csr_era,
+				 era_start, era_start + sizeof(u64));
+	err |= user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+				 &regs->csr_badvaddr,
+				 badvaddr_start, badvaddr_start + sizeof(u64));
+
+	return err;
+}
+
+
+/*
+ * Get the general floating-point registers.
+ */
+static int gfpr_get(struct task_struct *target, struct membuf *to)
+{
+	return membuf_write(to, &target->thread.fpu.fpr,
+			    sizeof(elf_fpreg_t) * NUM_FPU_REGS);
+}
+
+static int gfpr_get_simd(struct task_struct *target, struct membuf *to)
+{
+	int i, r;
+	u64 fpr_val;
+
+	BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t));
+	for (i = 0; i < NUM_FPU_REGS; i++) {
+		fpr_val = get_fpr64(&target->thread.fpu.fpr[i], 0);
+		r = membuf_write(to, &fpr_val, sizeof(elf_fpreg_t));
+	}
+
+	return r;
+}
+
+/*
+ * Choose the appropriate helper for general registers, and then copy
+ * the FCC and FCSR registers separately.
+ */
+static int fpr_get(struct task_struct *target,
+		   const struct user_regset *regset,
+		   struct membuf to)
+{
+	int r;
+
+	if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
+		r = gfpr_get(target, &to);
+	else
+		r = gfpr_get_simd(target, &to);
+
+	r = membuf_write(&to, &target->thread.fpu.fcc, sizeof(target->thread.fpu.fcc));
+	r = membuf_write(&to, &target->thread.fpu.fcsr, sizeof(target->thread.fpu.fcsr));
+
+	return r;
+}
+
+static int gfpr_set(struct task_struct *target,
+		    unsigned int *pos, unsigned int *count,
+		    const void **kbuf, const void __user **ubuf)
+{
+	return user_regset_copyin(pos, count, kbuf, ubuf,
+				  &target->thread.fpu.fpr,
+				  0, NUM_FPU_REGS * sizeof(elf_fpreg_t));
+}
+
+static int gfpr_set_simd(struct task_struct *target,
+		       unsigned int *pos, unsigned int *count,
+		       const void **kbuf, const void __user **ubuf)
+{
+	int i, err;
+	u64 fpr_val;
+
+	BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t));
+	for (i = 0; i < NUM_FPU_REGS && *count > 0; i++) {
+		err = user_regset_copyin(pos, count, kbuf, ubuf,
+					 &fpr_val, i * sizeof(elf_fpreg_t),
+					 (i + 1) * sizeof(elf_fpreg_t));
+		if (err)
+			return err;
+		set_fpr64(&target->thread.fpu.fpr[i], 0, fpr_val);
+	}
+
+	return 0;
+}
+
+/*
+ * Choose the appropriate helper for general registers, and then copy
+ * the FCC register separately.
+ */
+static int fpr_set(struct task_struct *target,
+		   const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   const void *kbuf, const void __user *ubuf)
+{
+	const int fcc_start = NUM_FPU_REGS * sizeof(elf_fpreg_t);
+	const int fcc_end = fcc_start + sizeof(u64);
+	int err;
+
+	BUG_ON(count % sizeof(elf_fpreg_t));
+	if (pos + count > sizeof(elf_fpregset_t))
+		return -EIO;
+
+	init_fp_ctx(target);
+
+	if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
+		err = gfpr_set(target, &pos, &count, &kbuf, &ubuf);
+	else
+		err = gfpr_set_simd(target, &pos, &count, &kbuf, &ubuf);
+	if (err)
+		return err;
+
+	if (count > 0)
+		err |= user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+					  &target->thread.fpu.fcc,
+					  fcc_start, fcc_end);
+
+	return err;
+}
+
+static int cfg_get(struct task_struct *target,
+		   const struct user_regset *regset,
+		   struct membuf to)
+{
+	int i, r = 0;
+	u32 cfg_val;
+
+	i = 0;
+	while (to.left > 0) {
+		cfg_val = read_cpucfg(i++);
+		r = membuf_write(&to, &cfg_val, sizeof(u32));
+	}
+
+	return r;
+}
+
+/*
+ * CFG registers are read-only.
+ */
+static int cfg_set(struct task_struct *target,
+		   const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   const void *kbuf, const void __user *ubuf)
+{
+	return 0;
+}
+
+struct pt_regs_offset {
+	const char *name;
+	int offset;
+};
+
+#define REG_OFFSET_NAME(n, r) {.name = #n, .offset = offsetof(struct pt_regs, r)}
+#define REG_OFFSET_END {.name = NULL, .offset = 0}
+
+static const struct pt_regs_offset regoffset_table[] = {
+	REG_OFFSET_NAME(r0, regs[0]),
+	REG_OFFSET_NAME(r1, regs[1]),
+	REG_OFFSET_NAME(r2, regs[2]),
+	REG_OFFSET_NAME(r3, regs[3]),
+	REG_OFFSET_NAME(r4, regs[4]),
+	REG_OFFSET_NAME(r5, regs[5]),
+	REG_OFFSET_NAME(r6, regs[6]),
+	REG_OFFSET_NAME(r7, regs[7]),
+	REG_OFFSET_NAME(r8, regs[8]),
+	REG_OFFSET_NAME(r9, regs[9]),
+	REG_OFFSET_NAME(r10, regs[10]),
+	REG_OFFSET_NAME(r11, regs[11]),
+	REG_OFFSET_NAME(r12, regs[12]),
+	REG_OFFSET_NAME(r13, regs[13]),
+	REG_OFFSET_NAME(r14, regs[14]),
+	REG_OFFSET_NAME(r15, regs[15]),
+	REG_OFFSET_NAME(r16, regs[16]),
+	REG_OFFSET_NAME(r17, regs[17]),
+	REG_OFFSET_NAME(r18, regs[18]),
+	REG_OFFSET_NAME(r19, regs[19]),
+	REG_OFFSET_NAME(r20, regs[20]),
+	REG_OFFSET_NAME(r21, regs[21]),
+	REG_OFFSET_NAME(r22, regs[22]),
+	REG_OFFSET_NAME(r23, regs[23]),
+	REG_OFFSET_NAME(r24, regs[24]),
+	REG_OFFSET_NAME(r25, regs[25]),
+	REG_OFFSET_NAME(r26, regs[26]),
+	REG_OFFSET_NAME(r27, regs[27]),
+	REG_OFFSET_NAME(r28, regs[28]),
+	REG_OFFSET_NAME(r29, regs[29]),
+	REG_OFFSET_NAME(r30, regs[30]),
+	REG_OFFSET_NAME(r31, regs[31]),
+	REG_OFFSET_NAME(orig_a0, orig_a0),
+	REG_OFFSET_NAME(csr_era, csr_era),
+	REG_OFFSET_NAME(csr_badvaddr, csr_badvaddr),
+	REG_OFFSET_NAME(csr_crmd, csr_crmd),
+	REG_OFFSET_NAME(csr_prmd, csr_prmd),
+	REG_OFFSET_NAME(csr_euen, csr_euen),
+	REG_OFFSET_NAME(csr_ecfg, csr_ecfg),
+	REG_OFFSET_NAME(csr_estat, csr_estat),
+	REG_OFFSET_END,
+};
+
+/**
+ * regs_query_register_offset() - query register offset from its name
+ * @name:       the name of a register
+ *
+ * regs_query_register_offset() returns the offset of a register in struct
+ * pt_regs from its name. If the name is invalid, this returns -EINVAL.
+ */
+int regs_query_register_offset(const char *name)
+{
+	const struct pt_regs_offset *roff;
+
+	for (roff = regoffset_table; roff->name != NULL; roff++)
+		if (!strcmp(roff->name, name))
+			return roff->offset;
+	return -EINVAL;
+}
+
+enum loongarch_regset {
+	REGSET_GPR,
+	REGSET_FPR,
+	REGSET_CPUCFG,
+};
+
+static const struct user_regset loongarch64_regsets[] = {
+	[REGSET_GPR] = {
+		.core_note_type	= NT_PRSTATUS,
+		.n		= ELF_NGREG,
+		.size		= sizeof(elf_greg_t),
+		.align		= sizeof(elf_greg_t),
+		.regset_get	= gpr_get,
+		.set		= gpr_set,
+	},
+	[REGSET_FPR] = {
+		.core_note_type	= NT_PRFPREG,
+		.n		= ELF_NFPREG,
+		.size		= sizeof(elf_fpreg_t),
+		.align		= sizeof(elf_fpreg_t),
+		.regset_get	= fpr_get,
+		.set		= fpr_set,
+	},
+	[REGSET_CPUCFG] = {
+		.core_note_type	= NT_LOONGARCH_CPUCFG,
+		.n		= 64,
+		.size		= sizeof(u32),
+		.align		= sizeof(u32),
+		.regset_get	= cfg_get,
+		.set		= cfg_set,
+	},
+};
+
+static const struct user_regset_view user_loongarch64_view = {
+	.name		= "loongarch64",
+	.e_machine	= ELF_ARCH,
+	.regsets	= loongarch64_regsets,
+	.n		= ARRAY_SIZE(loongarch64_regsets),
+};
+
+
+const struct user_regset_view *task_user_regset_view(struct task_struct *task)
+{
+	return &user_loongarch64_view;
+}
+
+static inline int read_user(struct task_struct *target, unsigned long addr,
+			    unsigned long __user *data)
+{
+	unsigned long tmp = 0;
+
+	switch (addr) {
+	case 0 ... 31:
+		tmp = task_pt_regs(target)->regs[addr];
+		break;
+	case ARG0:
+		tmp = task_pt_regs(target)->orig_a0;
+		break;
+	case PC:
+		tmp = task_pt_regs(target)->csr_era;
+		break;
+	case BADVADDR:
+		tmp = task_pt_regs(target)->csr_badvaddr;
+		break;
+	default:
+		return -EIO;
+	}
+
+	return put_user(tmp, data);
+}
+
+static inline int write_user(struct task_struct *target, unsigned long addr,
+			    unsigned long data)
+{
+	switch (addr) {
+	case 0 ... 31:
+		task_pt_regs(target)->regs[addr] = data;
+		break;
+	case ARG0:
+		task_pt_regs(target)->orig_a0 = data;
+		break;
+	case PC:
+		task_pt_regs(target)->csr_era = data;
+		break;
+	case BADVADDR:
+		task_pt_regs(target)->csr_badvaddr = data;
+		break;
+	default:
+		return -EIO;
+	}
+
+	return 0;
+}
+
+long arch_ptrace(struct task_struct *child, long request,
+		 unsigned long addr, unsigned long data)
+{
+	int ret;
+	unsigned long __user *datap = (void __user *) data;
+
+	switch (request) {
+	case PTRACE_PEEKUSR:
+		ret = read_user(child, addr, datap);
+		break;
+
+	case PTRACE_POKEUSR:
+		ret = write_user(child, addr, data);
+		break;
+
+	default:
+		ret = ptrace_request(child, request, addr, data);
+		break;
+	}
+
+	return ret;
+}
diff --git a/arch/loongarch/kernel/switch.S b/arch/loongarch/kernel/switch.S
new file mode 100644
index 000000000000..b864fde4a808
--- /dev/null
+++ b/arch/loongarch/kernel/switch.S
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/asm-offsets.h>
+#include <asm/loongarch.h>
+#include <asm/regdef.h>
+#include <asm/stackframe.h>
+#include <asm/thread_info.h>
+
+/*
+ * task_struct *__switch_to(task_struct *prev, task_struct *next,
+ *			    struct thread_info *next_ti)
+ */
+	.align	5
+SYM_FUNC_START(__switch_to)
+	csrrd	t1, LOONGARCH_CSR_PRMD
+	stptr.d	t1, a0, THREAD_CSRPRMD
+
+	cpu_save_nonscratch a0
+	stptr.d	ra, a0, THREAD_REG01
+	move	tp, a2
+	cpu_restore_nonscratch a1
+
+	li.w	t0, _THREAD_SIZE - 32
+	PTR_ADDU	t0, t0, tp
+	set_saved_sp	t0, t1, t2
+
+	ldptr.d	t1, a1, THREAD_CSRPRMD
+	csrwr	t1, LOONGARCH_CSR_PRMD
+
+	jr	ra
+SYM_FUNC_END(__switch_to)
-- 
2.27.0
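
Reader's note on the ptrace code above: gpr_set() copies the NT_PRSTATUS
payload in four consecutive byte windows (regs[], orig_a0, csr_era,
csr_badvaddr). The following is a minimal userspace sketch of that layout,
not kernel code. It assumes GPR_NUM is 32 (implied by regs[0]..regs[31] but
not defined in this hunk); struct la_gpr_regset and the three helper
functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32 general-purpose registers, as implied by regs[0..31]. */
#define GPR_NUM 32

/* Hypothetical mirror of the byte layout that gpr_set() walks through. */
struct la_gpr_regset {
	uint64_t regs[GPR_NUM];
	uint64_t orig_a0;
	uint64_t csr_era;
	uint64_t csr_badvaddr;
};

/* Same arithmetic as the local variables at the top of gpr_set(). */
static size_t a0_start(void)
{
	return sizeof(uint64_t) * GPR_NUM;
}

static size_t era_start(void)
{
	return a0_start() + sizeof(uint64_t);
}

static size_t badvaddr_start(void)
{
	return era_start() + sizeof(uint64_t);
}

/* Compile-time check that the mirror struct has no padding before orig_a0. */
static_assert(offsetof(struct la_gpr_regset, orig_a0) ==
	      sizeof(uint64_t) * GPR_NUM, "orig_a0 must follow regs[]");

/* Returns 1 when the struct offsets match the windows used by gpr_set(). */
static int layout_matches(void)
{
	return offsetof(struct la_gpr_regset, orig_a0) == a0_start() &&
	       offsetof(struct la_gpr_regset, csr_era) == era_start() &&
	       offsetof(struct la_gpr_regset, csr_badvaddr) == badvaddr_start();
}
```

Under these assumptions the windows start at byte 256 (orig_a0), 264
(csr_era) and 272 (csr_badvaddr).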


* [PATCH V9 12/24] LoongArch: Add memory management
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

Add memory management support for LoongArch, including cache and TLB
management, page fault handling, and ioremap/mmap support.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
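Reader's note: the page-table geometry in page.h and pgtable.h follows from
PAGE_SHIFT alone, because every table occupies one page (all *_ORDER values
are 0) and holds 8-byte entries. The sketch below restates those formulas as
plain C so the resulting sizes are easy to check. hpage_shift(), va_bits()
and geometry_selfcheck() are hypothetical helper names, not part of the
patch.

```c
#include <assert.h>

/* HPAGE_SHIFT from page.h: PAGE_SHIFT + PAGE_SHIFT - 3. */
static unsigned int hpage_shift(unsigned int page_shift)
{
	return page_shift + page_shift - 3;
}

/*
 * VA_BITS from pgtable.h for CONFIG_PGTABLE_LEVELS of 2, 3 or 4.
 * Each level indexes PAGE_SIZE / 8 entries, so it contributes
 * (page_shift - 3) bits on top of the in-page offset.
 */
static unsigned int va_bits(unsigned int page_shift, unsigned int levels)
{
	return page_shift + levels * (page_shift - 3);
}

/* Spot checks for the three page sizes selectable in page.h. */
static void geometry_selfcheck(void)
{
	assert(hpage_shift(12) == 21);	/* 4KB pages: 2MB huge pages */
	assert(hpage_shift(14) == 25);	/* 16KB pages: 32MB huge pages */
	assert(hpage_shift(16) == 29);	/* 64KB pages: 512MB huge pages */
}
```

For example, 16KB pages with 3 levels give va_bits(14, 3) == 47, and 4KB
pages with 4 levels give va_bits(12, 4) == 48.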
 arch/loongarch/include/asm/cache.h        |  13 +
 arch/loongarch/include/asm/cacheflush.h   |  80 ++++
 arch/loongarch/include/asm/cacheops.h     |  37 ++
 arch/loongarch/include/asm/fixmap.h       |  13 +
 arch/loongarch/include/asm/hugetlb.h      |  79 ++++
 arch/loongarch/include/asm/page.h         | 113 +++++
 arch/loongarch/include/asm/pgalloc.h      | 103 +++++
 arch/loongarch/include/asm/pgtable-bits.h | 139 ++++++
 arch/loongarch/include/asm/pgtable.h      | 539 ++++++++++++++++++++++
 arch/loongarch/include/asm/shmparam.h     |  12 +
 arch/loongarch/include/asm/sparsemem.h    |  23 +
 arch/loongarch/include/asm/tlb.h          | 216 +++++++++
 arch/loongarch/include/asm/tlbflush.h     |  35 ++
 arch/loongarch/include/asm/vmalloc.h      |   4 +
 arch/loongarch/mm/cache.c                 | 140 ++++++
 arch/loongarch/mm/extable.c               |  22 +
 arch/loongarch/mm/fault.c                 | 261 +++++++++++
 arch/loongarch/mm/hugetlbpage.c           |  87 ++++
 arch/loongarch/mm/init.c                  | 165 +++++++
 arch/loongarch/mm/ioremap.c               |  27 ++
 arch/loongarch/mm/maccess.c               |  10 +
 arch/loongarch/mm/mmap.c                  | 125 +++++
 arch/loongarch/mm/page.S                  |  84 ++++
 arch/loongarch/mm/pgtable.c               | 130 ++++++
 arch/loongarch/mm/tlb.c                   | 282 +++++++++++
 arch/loongarch/mm/tlbex.S                 | 477 +++++++++++++++++++
 26 files changed, 3216 insertions(+)
 create mode 100644 arch/loongarch/include/asm/cache.h
 create mode 100644 arch/loongarch/include/asm/cacheflush.h
 create mode 100644 arch/loongarch/include/asm/cacheops.h
 create mode 100644 arch/loongarch/include/asm/fixmap.h
 create mode 100644 arch/loongarch/include/asm/hugetlb.h
 create mode 100644 arch/loongarch/include/asm/page.h
 create mode 100644 arch/loongarch/include/asm/pgalloc.h
 create mode 100644 arch/loongarch/include/asm/pgtable-bits.h
 create mode 100644 arch/loongarch/include/asm/pgtable.h
 create mode 100644 arch/loongarch/include/asm/shmparam.h
 create mode 100644 arch/loongarch/include/asm/sparsemem.h
 create mode 100644 arch/loongarch/include/asm/tlb.h
 create mode 100644 arch/loongarch/include/asm/tlbflush.h
 create mode 100644 arch/loongarch/include/asm/vmalloc.h
 create mode 100644 arch/loongarch/mm/cache.c
 create mode 100644 arch/loongarch/mm/extable.c
 create mode 100644 arch/loongarch/mm/fault.c
 create mode 100644 arch/loongarch/mm/hugetlbpage.c
 create mode 100644 arch/loongarch/mm/init.c
 create mode 100644 arch/loongarch/mm/ioremap.c
 create mode 100644 arch/loongarch/mm/maccess.c
 create mode 100644 arch/loongarch/mm/mmap.c
 create mode 100644 arch/loongarch/mm/page.S
 create mode 100644 arch/loongarch/mm/pgtable.c
 create mode 100644 arch/loongarch/mm/tlb.c
 create mode 100644 arch/loongarch/mm/tlbex.S
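
Reader's note: cacheops.h builds each cacop code by OR-ing a cache selector
(low 2 bits) with an operation (next 3 bits). The sketch below copies the
constants from this patch and splits a composed code back into its two
fields. cacheop_cache() and cacheop_op() are hypothetical helpers added only
for illustration.

```c
#include <assert.h>

/* Field masks and values copied from cacheops.h in this patch. */
#define CacheOp_Cache		0x03
#define CacheOp_Op		0x1c

#define Cache_I			0x00
#define Cache_D			0x01
#define Cache_V			0x02
#define Cache_S			0x03

#define Index_Invalidate	0x08
#define Hit_Writeback_Inv	0x10

#define Index_Invalidate_I	(Cache_I | Index_Invalidate)
#define Hit_Writeback_Inv_S	(Cache_S | Hit_Writeback_Inv)

/* Extract the cache selector field from a composed code. */
static unsigned int cacheop_cache(unsigned int code)
{
	return code & CacheOp_Cache;
}

/* Extract the operation field from a composed code. */
static unsigned int cacheop_op(unsigned int code)
{
	return code & CacheOp_Op;
}

/* The composed values are plain constants, so they can be checked early. */
static_assert(Index_Invalidate_I == 0x08, "I-cache index invalidate");
static_assert(Hit_Writeback_Inv_S == 0x13, "S-cache hit writeback-invalidate");
```

Splitting Hit_Writeback_Inv_S this way yields Cache_S for the selector and
Hit_Writeback_Inv for the operation.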

diff --git a/arch/loongarch/include/asm/cache.h b/arch/loongarch/include/asm/cache.h
new file mode 100644
index 000000000000..1b6d09617199
--- /dev/null
+++ b/arch/loongarch/include/asm/cache.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_CACHE_H
+#define _ASM_CACHE_H
+
+#define L1_CACHE_SHIFT		CONFIG_L1_CACHE_SHIFT
+#define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
+
+#define __read_mostly __section(".data..read_mostly")
+
+#endif /* _ASM_CACHE_H */
diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
new file mode 100644
index 000000000000..670900141b7c
--- /dev/null
+++ b/arch/loongarch/include/asm/cacheflush.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_CACHEFLUSH_H
+#define _ASM_CACHEFLUSH_H
+
+#include <linux/mm.h>
+#include <asm/cpu-features.h>
+#include <asm/cacheops.h>
+
+extern void local_flush_icache_range(unsigned long start, unsigned long end);
+
+#define flush_icache_range	local_flush_icache_range
+#define flush_icache_user_range	local_flush_icache_range
+
+#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
+
+#define flush_cache_all()				do { } while (0)
+#define flush_cache_mm(mm)				do { } while (0)
+#define flush_cache_dup_mm(mm)				do { } while (0)
+#define flush_cache_range(vma, start, end)		do { } while (0)
+#define flush_cache_page(vma, vmaddr, pfn)		do { } while (0)
+#define flush_cache_vmap(start, end)			do { } while (0)
+#define flush_cache_vunmap(start, end)			do { } while (0)
+#define flush_icache_page(vma, page)			do { } while (0)
+#define flush_icache_user_page(vma, page, addr, len)	do { } while (0)
+#define flush_dcache_page(page)				do { } while (0)
+#define flush_dcache_mmap_lock(mapping)			do { } while (0)
+#define flush_dcache_mmap_unlock(mapping)		do { } while (0)
+
+#define cache_op(op, addr)						\
+	__asm__ __volatile__(						\
+	"	cacop	%0, %1					\n"	\
+	:								\
+	: "i" (op), "ZC" (*(unsigned char *)(addr)))
+
+static inline void flush_icache_line_indexed(unsigned long addr)
+{
+	cache_op(Index_Invalidate_I, addr);
+}
+
+static inline void flush_dcache_line_indexed(unsigned long addr)
+{
+	cache_op(Index_Writeback_Inv_D, addr);
+}
+
+static inline void flush_vcache_line_indexed(unsigned long addr)
+{
+	cache_op(Index_Writeback_Inv_V, addr);
+}
+
+static inline void flush_scache_line_indexed(unsigned long addr)
+{
+	cache_op(Index_Writeback_Inv_S, addr);
+}
+
+static inline void flush_icache_line(unsigned long addr)
+{
+	cache_op(Hit_Invalidate_I, addr);
+}
+
+static inline void flush_dcache_line(unsigned long addr)
+{
+	cache_op(Hit_Writeback_Inv_D, addr);
+}
+
+static inline void flush_vcache_line(unsigned long addr)
+{
+	cache_op(Hit_Writeback_Inv_V, addr);
+}
+
+static inline void flush_scache_line(unsigned long addr)
+{
+	cache_op(Hit_Writeback_Inv_S, addr);
+}
+
+#include <asm-generic/cacheflush.h>
+
+#endif /* _ASM_CACHEFLUSH_H */
diff --git a/arch/loongarch/include/asm/cacheops.h b/arch/loongarch/include/asm/cacheops.h
new file mode 100644
index 000000000000..dc280efecebd
--- /dev/null
+++ b/arch/loongarch/include/asm/cacheops.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Cache operations for the cacop instruction.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_CACHEOPS_H
+#define __ASM_CACHEOPS_H
+
+/*
+ * Most cache ops are split into a 2-bit field identifying the cache, and a
+ * 3-bit field identifying the cache operation.
+ */
+#define CacheOp_Cache			0x03
+#define CacheOp_Op			0x1c
+
+#define Cache_I				0x00
+#define Cache_D				0x01
+#define Cache_V				0x02
+#define Cache_S				0x03
+
+#define Index_Invalidate		0x08
+#define Index_Writeback_Inv		0x08
+#define Hit_Invalidate			0x10
+#define Hit_Writeback_Inv		0x10
+#define CacheOp_User_Defined		0x18
+
+#define Index_Invalidate_I		(Cache_I | Index_Invalidate)
+#define Index_Writeback_Inv_D		(Cache_D | Index_Writeback_Inv)
+#define Index_Writeback_Inv_V		(Cache_V | Index_Writeback_Inv)
+#define Index_Writeback_Inv_S		(Cache_S | Index_Writeback_Inv)
+#define Hit_Invalidate_I		(Cache_I | Hit_Invalidate)
+#define Hit_Writeback_Inv_D		(Cache_D | Hit_Writeback_Inv)
+#define Hit_Writeback_Inv_V		(Cache_V | Hit_Writeback_Inv)
+#define Hit_Writeback_Inv_S		(Cache_S | Hit_Writeback_Inv)
+
+#endif	/* __ASM_CACHEOPS_H */
diff --git a/arch/loongarch/include/asm/fixmap.h b/arch/loongarch/include/asm/fixmap.h
new file mode 100644
index 000000000000..b3541dfa2013
--- /dev/null
+++ b/arch/loongarch/include/asm/fixmap.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * fixmap.h: compile-time virtual memory allocation
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef _ASM_FIXMAP_H
+#define _ASM_FIXMAP_H
+
+#define NR_FIX_BTMAPS 64
+
+#endif
diff --git a/arch/loongarch/include/asm/hugetlb.h b/arch/loongarch/include/asm/hugetlb.h
new file mode 100644
index 000000000000..960ee06d7ffd
--- /dev/null
+++ b/arch/loongarch/include/asm/hugetlb.h
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASM_HUGETLB_H
+#define __ASM_HUGETLB_H
+
+#include <asm/page.h>
+
+uint64_t pmd_to_entrylo(unsigned long pmd_val);
+
+#define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
+static inline int prepare_hugepage_range(struct file *file,
+					 unsigned long addr,
+					 unsigned long len)
+{
+	unsigned long task_size = STACK_TOP;
+	struct hstate *h = hstate_file(file);
+
+	if (len & ~huge_page_mask(h))
+		return -EINVAL;
+	if (addr & ~huge_page_mask(h))
+		return -EINVAL;
+	if (len > task_size)
+		return -ENOMEM;
+	if (task_size - len < addr)
+		return -EINVAL;
+	return 0;
+}
+
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
+static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+					    unsigned long addr, pte_t *ptep)
+{
+	pte_t clear;
+	pte_t pte = *ptep;
+
+	pte_val(clear) = (unsigned long)invalid_pte_table;
+	set_pte_at(mm, addr, ptep, clear);
+	return pte;
+}
+
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
+static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
+					 unsigned long addr, pte_t *ptep)
+{
+	flush_tlb_page(vma, addr & huge_page_mask(hstate_vma(vma)));
+}
+
+#define __HAVE_ARCH_HUGE_PTE_NONE
+static inline int huge_pte_none(pte_t pte)
+{
+	unsigned long val = pte_val(pte) & ~_PAGE_GLOBAL;
+	return !val || (val == (unsigned long)invalid_pte_table);
+}
+
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
+static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+					     unsigned long addr,
+					     pte_t *ptep, pte_t pte,
+					     int dirty)
+{
+	int changed = !pte_same(*ptep, pte);
+
+	if (changed) {
+		set_pte_at(vma->vm_mm, addr, ptep, pte);
+		/*
+		 * There could be some standard sized pages in there,
+		 * get them all.
+		 */
+		flush_tlb_range(vma, addr, addr + HPAGE_SIZE);
+	}
+	return changed;
+}
+
+#include <asm-generic/hugetlb.h>
+
+#endif /* __ASM_HUGETLB_H */
diff --git a/arch/loongarch/include/asm/page.h b/arch/loongarch/include/asm/page.h
new file mode 100644
index 000000000000..5bd2330d9576
--- /dev/null
+++ b/arch/loongarch/include/asm/page.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_PAGE_H
+#define _ASM_PAGE_H
+
+#include <linux/const.h>
+
+/*
+ * PAGE_SHIFT determines the page size
+ */
+#ifdef CONFIG_PAGE_SIZE_4KB
+#define PAGE_SHIFT	12
+#endif
+#ifdef CONFIG_PAGE_SIZE_16KB
+#define PAGE_SHIFT	14
+#endif
+#ifdef CONFIG_PAGE_SIZE_64KB
+#define PAGE_SHIFT	16
+#endif
+#define PAGE_SIZE	(_AC(1, UL) << PAGE_SHIFT)
+#define PAGE_MASK	(~(PAGE_SIZE - 1))
+
+#define HPAGE_SHIFT	(PAGE_SHIFT + PAGE_SHIFT - 3)
+#define HPAGE_SIZE	(_AC(1, UL) << HPAGE_SHIFT)
+#define HPAGE_MASK	(~(HPAGE_SIZE - 1))
+#define HUGETLB_PAGE_ORDER	(HPAGE_SHIFT - PAGE_SHIFT)
+
+#ifndef __ASSEMBLY__
+
+#include <linux/kernel.h>
+#include <linux/pfn.h>
+
+/*
+ * It's normally defined only for FLATMEM config but it's
+ * used in our early mem init code for all memory models.
+ * So always define it.
+ */
+#define ARCH_PFN_OFFSET	PFN_UP(PHYS_OFFSET)
+
+extern void clear_page(void *page);
+extern void copy_page(void *to, void *from);
+
+#define clear_user_page(page, vaddr, pg)	clear_page(page)
+#define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
+
+extern unsigned long shm_align_mask;
+
+struct page;
+struct vm_area_struct;
+void copy_user_highpage(struct page *to, struct page *from,
+	      unsigned long vaddr, struct vm_area_struct *vma);
+
+#define __HAVE_ARCH_COPY_USER_HIGHPAGE
+
+typedef struct { unsigned long pte; } pte_t;
+#define pte_val(x)	((x).pte)
+#define __pte(x)	((pte_t) { (x) })
+typedef struct page *pgtable_t;
+
+typedef struct { unsigned long pgd; } pgd_t;
+#define pgd_val(x)	((x).pgd)
+#define __pgd(x)	((pgd_t) { (x) })
+
+/*
+ * Manipulate page protection bits
+ */
+typedef struct { unsigned long pgprot; } pgprot_t;
+#define pgprot_val(x)	((x).pgprot)
+#define __pgprot(x)	((pgprot_t) { (x) })
+#define pte_pgprot(x)	__pgprot(pte_val(x) & ~_PFN_MASK)
+
+#define ptep_buddy(x)	((pte_t *)((unsigned long)(x) ^ sizeof(pte_t)))
+
+/*
+ * __pa()/__va() should be used only during mem init.
+ */
+#define __pa(x)		PHYSADDR(x)
+#define __va(x)		((void *)((unsigned long)(x) + PAGE_OFFSET - PHYS_OFFSET))
+
+#define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
+
+#ifdef CONFIG_FLATMEM
+
+static inline int pfn_valid(unsigned long pfn)
+{
+	/* avoid <linux/mm.h> include hell */
+	extern unsigned long max_mapnr;
+	unsigned long pfn_offset = ARCH_PFN_OFFSET;
+
+	return pfn >= pfn_offset && pfn < max_mapnr;
+}
+
+#endif
+
+#define virt_to_pfn(kaddr)	PFN_DOWN(virt_to_phys((void *)(kaddr)))
+#define virt_to_page(kaddr)	pfn_to_page(virt_to_pfn(kaddr))
+
+extern int __virt_addr_valid(volatile void *kaddr);
+#define virt_addr_valid(kaddr)	__virt_addr_valid((volatile void *)(kaddr))
+
+#define VM_DATA_DEFAULT_FLAGS \
+	(VM_READ | VM_WRITE | \
+	 ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
+	 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+
+#include <asm-generic/memory_model.h>
+#include <asm-generic/getorder.h>
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_PAGE_H */
diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h
new file mode 100644
index 000000000000..b0a57b25c131
--- /dev/null
+++ b/arch/loongarch/include/asm/pgalloc.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_PGALLOC_H
+#define _ASM_PGALLOC_H
+
+#include <linux/mm.h>
+#include <linux/sched.h>
+
+#define __HAVE_ARCH_PMD_ALLOC_ONE
+#define __HAVE_ARCH_PUD_ALLOC_ONE
+#include <asm-generic/pgalloc.h>
+
+static inline void pmd_populate_kernel(struct mm_struct *mm,
+				       pmd_t *pmd, pte_t *pte)
+{
+	set_pmd(pmd, __pmd((unsigned long)pte));
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, pgtable_t pte)
+{
+	set_pmd(pmd, __pmd((unsigned long)page_address(pte)));
+}
+
+#ifndef __PAGETABLE_PMD_FOLDED
+
+static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+{
+	set_pud(pud, __pud((unsigned long)pmd));
+}
+#endif
+
+#ifndef __PAGETABLE_PUD_FOLDED
+
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4d, pud_t *pud)
+{
+	set_p4d(p4d, __p4d((unsigned long)pud));
+}
+
+#endif /* __PAGETABLE_PUD_FOLDED */
+
+extern void pagetable_init(void);
+
+/*
+ * Initialize a new pmd table with invalid pointers.
+ */
+extern void pmd_init(unsigned long page, unsigned long pagetable);
+
+/*
+ * Initialize a new pgd / pmd table with invalid pointers.
+ */
+extern void pgd_init(unsigned long page);
+extern pgd_t *pgd_alloc(struct mm_struct *mm);
+
+#define __pte_free_tlb(tlb, pte, address)			\
+do {							\
+	pgtable_pte_page_dtor(pte);			\
+	tlb_remove_page((tlb), pte);			\
+} while (0)
+
+#ifndef __PAGETABLE_PMD_FOLDED
+
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
+{
+	pmd_t *pmd;
+	struct page *pg;
+
+	pg = alloc_pages(GFP_KERNEL_ACCOUNT, PMD_ORDER);
+	if (!pg)
+		return NULL;
+
+	if (!pgtable_pmd_page_ctor(pg)) {
+		__free_pages(pg, PMD_ORDER);
+		return NULL;
+	}
+
+	pmd = (pmd_t *)page_address(pg);
+	pmd_init((unsigned long)pmd, (unsigned long)invalid_pte_table);
+	return pmd;
+}
+
+#define __pmd_free_tlb(tlb, x, addr)	pmd_free((tlb)->mm, x)
+
+#endif
+
+#ifndef __PAGETABLE_PUD_FOLDED
+
+static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long address)
+{
+	pud_t *pud;
+
+	pud = (pud_t *) __get_free_pages(GFP_KERNEL, PUD_ORDER);
+	if (pud)
+		pud_init((unsigned long)pud, (unsigned long)invalid_pmd_table);
+	return pud;
+}
+
+#define __pud_free_tlb(tlb, x, addr)	pud_free((tlb)->mm, x)
+
+#endif /* __PAGETABLE_PUD_FOLDED */
+
+#endif /* _ASM_PGALLOC_H */
diff --git a/arch/loongarch/include/asm/pgtable-bits.h b/arch/loongarch/include/asm/pgtable-bits.h
new file mode 100644
index 000000000000..62fddbed56d2
--- /dev/null
+++ b/arch/loongarch/include/asm/pgtable-bits.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_PGTABLE_BITS_H
+#define _ASM_PGTABLE_BITS_H
+
+/* Page table bits */
+
+#define	_PAGE_VALID_SHIFT	0
+#define	_PAGE_ACCESSED_SHIFT	0  /* Reuse Valid for Accessed */
+#define	_PAGE_DIRTY_SHIFT	1
+#define	_PAGE_PLV_SHIFT		2  /* 2~3, two bits */
+#define	_CACHE_SHIFT		4  /* 4~5, two bits */
+#define	_PAGE_GLOBAL_SHIFT	6
+#define	_PAGE_HUGE_SHIFT	6  /* HUGE is a PMD bit */
+#define	_PAGE_PRESENT_SHIFT	7
+#define	_PAGE_WRITE_SHIFT	8
+#define	_PAGE_MODIFIED_SHIFT	9
+#define	_PAGE_PROTNONE_SHIFT	10
+#define	_PAGE_SPECIAL_SHIFT	11
+#define	_PAGE_HGLOBAL_SHIFT	12 /* HGlobal is a PMD bit */
+#define	_PAGE_PFN_SHIFT		12
+#define	_PAGE_PFN_END_SHIFT	48
+#define	_PAGE_NO_READ_SHIFT	61
+#define	_PAGE_NO_EXEC_SHIFT	62
+#define	_PAGE_RPLV_SHIFT	63
+
+/* Used only by software */
+#define _PAGE_PRESENT		(_ULCAST_(1) << _PAGE_PRESENT_SHIFT)
+#define _PAGE_WRITE		(_ULCAST_(1) << _PAGE_WRITE_SHIFT)
+#define _PAGE_ACCESSED		(_ULCAST_(1) << _PAGE_ACCESSED_SHIFT)
+#define _PAGE_MODIFIED		(_ULCAST_(1) << _PAGE_MODIFIED_SHIFT)
+#define _PAGE_PROTNONE		(_ULCAST_(1) << _PAGE_PROTNONE_SHIFT)
+#define _PAGE_SPECIAL		(_ULCAST_(1) << _PAGE_SPECIAL_SHIFT)
+
+/* Used by TLB hardware (placed in EntryLo*) */
+#define _PAGE_VALID		(_ULCAST_(1) << _PAGE_VALID_SHIFT)
+#define _PAGE_DIRTY		(_ULCAST_(1) << _PAGE_DIRTY_SHIFT)
+#define _PAGE_PLV		(_ULCAST_(3) << _PAGE_PLV_SHIFT)
+#define _PAGE_GLOBAL		(_ULCAST_(1) << _PAGE_GLOBAL_SHIFT)
+#define _PAGE_HUGE		(_ULCAST_(1) << _PAGE_HUGE_SHIFT)
+#define _PAGE_HGLOBAL		(_ULCAST_(1) << _PAGE_HGLOBAL_SHIFT)
+#define _PAGE_NO_READ		(_ULCAST_(1) << _PAGE_NO_READ_SHIFT)
+#define _PAGE_NO_EXEC		(_ULCAST_(1) << _PAGE_NO_EXEC_SHIFT)
+#define _PAGE_RPLV		(_ULCAST_(1) << _PAGE_RPLV_SHIFT)
+#define _CACHE_MASK		(_ULCAST_(3) << _CACHE_SHIFT)
+#define _PFN_SHIFT		(PAGE_SHIFT - 12 + _PAGE_PFN_SHIFT)
+
+#define _PAGE_USER	(PLV_USER << _PAGE_PLV_SHIFT)
+#define _PAGE_KERN	(PLV_KERN << _PAGE_PLV_SHIFT)
+
+#define _PFN_MASK (~((_ULCAST_(1) << (_PFN_SHIFT)) - 1) & \
+		  ((_ULCAST_(1) << (_PAGE_PFN_END_SHIFT)) - 1))
+
+/*
+ * Cache attributes
+ */
+
+#ifndef _CACHE_SUC
+#define _CACHE_SUC			(0<<_CACHE_SHIFT) /* Strong-ordered UnCached */
+#endif
+#ifndef _CACHE_CC
+#define _CACHE_CC			(1<<_CACHE_SHIFT) /* Coherent Cached */
+#endif
+#ifndef _CACHE_WUC
+#define _CACHE_WUC			(2<<_CACHE_SHIFT) /* Weak-ordered UnCached */
+#endif
+
+#define __READABLE	(_PAGE_VALID)
+#define __WRITEABLE	(_PAGE_DIRTY | _PAGE_WRITE)
+
+#define _PAGE_CHG_MASK	(_PAGE_MODIFIED | _PAGE_SPECIAL | _PFN_MASK | _CACHE_MASK | _PAGE_PLV)
+#define _HPAGE_CHG_MASK	(_PAGE_MODIFIED | _PAGE_SPECIAL | _PFN_MASK | _CACHE_MASK | _PAGE_PLV | _PAGE_HUGE)
+
+#define PAGE_NONE	__pgprot(_PAGE_PROTNONE | _PAGE_NO_READ | \
+				 _PAGE_USER | _CACHE_CC)
+#define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_WRITE | \
+				 _PAGE_USER | _CACHE_CC)
+#define PAGE_READONLY	__pgprot(_PAGE_PRESENT | _PAGE_USER | _CACHE_CC)
+
+#define PAGE_KERNEL	__pgprot(_PAGE_PRESENT | __READABLE | __WRITEABLE | \
+				 _PAGE_GLOBAL | _PAGE_KERN | _CACHE_CC)
+#define PAGE_KERNEL_SUC __pgprot(_PAGE_PRESENT | __READABLE | __WRITEABLE | \
+				 _PAGE_GLOBAL | _PAGE_KERN |  _CACHE_SUC)
+#define PAGE_KERNEL_WUC __pgprot(_PAGE_PRESENT | __READABLE | __WRITEABLE | \
+				 _PAGE_GLOBAL | _PAGE_KERN |  _CACHE_WUC)
+
+#define __P000 __pgprot(_CACHE_CC | _PAGE_USER | _PAGE_PROTNONE | _PAGE_NO_EXEC | _PAGE_NO_READ)
+#define __P001 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC)
+#define __P010 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC)
+#define __P011 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC)
+#define __P100 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT)
+#define __P101 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT)
+#define __P110 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT)
+#define __P111 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT)
+
+#define __S000 __pgprot(_CACHE_CC | _PAGE_USER | _PAGE_PROTNONE | _PAGE_NO_EXEC | _PAGE_NO_READ)
+#define __S001 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC)
+#define __S010 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE)
+#define __S011 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE)
+#define __S100 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT)
+#define __S101 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT)
+#define __S110 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_WRITE)
+#define __S111 __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_WRITE)
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Macro to mark a page protection value as "uncacheable". Note
+ * that "protection" is really a misnomer here as the protection value
+ * contains the memory attribute bits, dirty bits, and various other
+ * bits as well.
+ */
+#define pgprot_noncached pgprot_noncached
+
+static inline pgprot_t pgprot_noncached(pgprot_t _prot)
+{
+	unsigned long prot = pgprot_val(_prot);
+
+	prot = (prot & ~_CACHE_MASK) | _CACHE_SUC;
+
+	return __pgprot(prot);
+}
+
+#define pgprot_writecombine pgprot_writecombine
+
+static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
+{
+	unsigned long prot = pgprot_val(_prot);
+
+	prot = (prot & ~_CACHE_MASK) | _CACHE_WUC;
+
+	return __pgprot(prot);
+}
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_PGTABLE_BITS_H */
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
new file mode 100644
index 000000000000..ae8f3ef61091
--- /dev/null
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -0,0 +1,539 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1994, 95, 96, 97, 98, 99, 2000, 2003 Ralf Baechle
+ * Copyright (C) 1999, 2000, 2001 Silicon Graphics, Inc.
+ */
+#ifndef _ASM_PGTABLE_H
+#define _ASM_PGTABLE_H
+
+#include <linux/compiler.h>
+#include <asm/addrspace.h>
+#include <asm/pgtable-bits.h>
+
+#if CONFIG_PGTABLE_LEVELS == 2
+#include <asm-generic/pgtable-nopmd.h>
+#elif CONFIG_PGTABLE_LEVELS == 3
+#include <asm-generic/pgtable-nopud.h>
+#else
+#include <asm-generic/pgtable-nop4d.h>
+#endif
+
+#define PGD_ORDER		0
+#define PUD_ORDER		0
+#define PMD_ORDER		0
+#define PTE_ORDER		0
+
+#if CONFIG_PGTABLE_LEVELS == 2
+#define PGDIR_SHIFT	(PAGE_SHIFT + (PAGE_SHIFT + PTE_ORDER - 3))
+#elif CONFIG_PGTABLE_LEVELS == 3
+#define PMD_SHIFT	(PAGE_SHIFT + (PAGE_SHIFT + PTE_ORDER - 3))
+#define PMD_SIZE	(1UL << PMD_SHIFT)
+#define PMD_MASK	(~(PMD_SIZE-1))
+#define PGDIR_SHIFT	(PMD_SHIFT + (PAGE_SHIFT + PMD_ORDER - 3))
+#elif CONFIG_PGTABLE_LEVELS == 4
+#define PMD_SHIFT	(PAGE_SHIFT + (PAGE_SHIFT + PTE_ORDER - 3))
+#define PMD_SIZE	(1UL << PMD_SHIFT)
+#define PMD_MASK	(~(PMD_SIZE-1))
+#define PUD_SHIFT	(PMD_SHIFT + (PAGE_SHIFT + PMD_ORDER - 3))
+#define PUD_SIZE	(1UL << PUD_SHIFT)
+#define PUD_MASK	(~(PUD_SIZE-1))
+#define PGDIR_SHIFT	(PUD_SHIFT + (PAGE_SHIFT + PUD_ORDER - 3))
+#endif
+
+#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK	(~(PGDIR_SIZE-1))
+
+#define VA_BITS		(PGDIR_SHIFT + (PAGE_SHIFT + PGD_ORDER - 3))
+
+#define PTRS_PER_PGD	((PAGE_SIZE << PGD_ORDER) >> 3)
+#if CONFIG_PGTABLE_LEVELS > 3
+#define PTRS_PER_PUD	((PAGE_SIZE << PUD_ORDER) >> 3)
+#endif
+#if CONFIG_PGTABLE_LEVELS > 2
+#define PTRS_PER_PMD	((PAGE_SIZE << PMD_ORDER) >> 3)
+#endif
+#define PTRS_PER_PTE	((PAGE_SIZE << PTE_ORDER) >> 3)
+
+#define USER_PTRS_PER_PGD	((TASK_SIZE64 / PGDIR_SIZE) ? (TASK_SIZE64 / PGDIR_SIZE) : 1)
+
+#ifndef __ASSEMBLY__
+
+#include <linux/mm_types.h>
+#include <linux/mmzone.h>
+#include <asm/fixmap.h>
+#include <asm/io.h>
+
+struct mm_struct;
+struct vm_area_struct;
+
+/*
+ * ZERO_PAGE is a global shared page that is always zero; used
+ * for zero-mapped memory areas, etc.
+ */
+
+extern unsigned long empty_zero_page;
+extern unsigned long zero_page_mask;
+
+#define ZERO_PAGE(vaddr) \
+	(virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))
+#define __HAVE_COLOR_ZERO_PAGE
+
+/*
+ * TLB refill handlers may also map the vmalloc area into xkvrange.
+ * Avoid the first couple of pages so NULL pointer dereferences will
+ * still reliably trap.
+ */
+#define MODULES_VADDR	(vm_map_base + PCI_IOSIZE + (2 * PAGE_SIZE))
+#define MODULES_END	(MODULES_VADDR + SZ_256M)
+
+#define VMALLOC_START	MODULES_END
+#define VMALLOC_END	\
+	(vm_map_base +	\
+	 min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE)
+
+#define pte_ERROR(e) \
+	pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
+#ifndef __PAGETABLE_PMD_FOLDED
+#define pmd_ERROR(e) \
+	pr_err("%s:%d: bad pmd %016lx.\n", __FILE__, __LINE__, pmd_val(e))
+#endif
+#ifndef __PAGETABLE_PUD_FOLDED
+#define pud_ERROR(e) \
+	pr_err("%s:%d: bad pud %016lx.\n", __FILE__, __LINE__, pud_val(e))
+#endif
+#define pgd_ERROR(e) \
+	pr_err("%s:%d: bad pgd %016lx.\n", __FILE__, __LINE__, pgd_val(e))
+
+extern pte_t invalid_pte_table[PTRS_PER_PTE];
+
+#ifndef __PAGETABLE_PUD_FOLDED
+
+typedef struct { unsigned long pud; } pud_t;
+#define pud_val(x)	((x).pud)
+#define __pud(x)	((pud_t) { (x) })
+
+extern pud_t invalid_pud_table[PTRS_PER_PUD];
+
+/*
+ * Empty pgd/p4d entries point to the invalid_pud_table.
+ */
+static inline int p4d_none(p4d_t p4d)
+{
+	return p4d_val(p4d) == (unsigned long)invalid_pud_table;
+}
+
+static inline int p4d_bad(p4d_t p4d)
+{
+	return p4d_val(p4d) & ~PAGE_MASK;
+}
+
+static inline int p4d_present(p4d_t p4d)
+{
+	return p4d_val(p4d) != (unsigned long)invalid_pud_table;
+}
+
+static inline void p4d_clear(p4d_t *p4dp)
+{
+	p4d_val(*p4dp) = (unsigned long)invalid_pud_table;
+}
+
+static inline pud_t *p4d_pgtable(p4d_t p4d)
+{
+	return (pud_t *)p4d_val(p4d);
+}
+
+static inline void set_p4d(p4d_t *p4d, p4d_t p4dval)
+{
+	*p4d = p4dval;
+}
+
+#define p4d_phys(p4d)		virt_to_phys((void *)p4d_val(p4d))
+#define p4d_page(p4d)		(pfn_to_page(p4d_phys(p4d) >> PAGE_SHIFT))
+
+#endif
+
+#ifndef __PAGETABLE_PMD_FOLDED
+
+typedef struct { unsigned long pmd; } pmd_t;
+#define pmd_val(x)	((x).pmd)
+#define __pmd(x)	((pmd_t) { (x) })
+
+extern pmd_t invalid_pmd_table[PTRS_PER_PMD];
+
+/*
+ * Empty pud entries point to the invalid_pmd_table.
+ */
+static inline int pud_none(pud_t pud)
+{
+	return pud_val(pud) == (unsigned long)invalid_pmd_table;
+}
+
+static inline int pud_bad(pud_t pud)
+{
+	return pud_val(pud) & ~PAGE_MASK;
+}
+
+static inline int pud_present(pud_t pud)
+{
+	return pud_val(pud) != (unsigned long)invalid_pmd_table;
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+	pud_val(*pudp) = ((unsigned long)invalid_pmd_table);
+}
+
+static inline pmd_t *pud_pgtable(pud_t pud)
+{
+	return (pmd_t *)pud_val(pud);
+}
+
+#define set_pud(pudptr, pudval) do { *(pudptr) = (pudval); } while (0)
+
+#define pud_phys(pud)		virt_to_phys((void *)pud_val(pud))
+#define pud_page(pud)		(pfn_to_page(pud_phys(pud) >> PAGE_SHIFT))
+
+#endif
+
+/*
+ * Empty pmd entries point to the invalid_pte_table.
+ */
+static inline int pmd_none(pmd_t pmd)
+{
+	return pmd_val(pmd) == (unsigned long)invalid_pte_table;
+}
+
+static inline int pmd_bad(pmd_t pmd)
+{
+	/* pmd_huge(pmd) but inline */
+	if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
+		return 0;
+
+	if (unlikely(pmd_val(pmd) & ~PAGE_MASK))
+		return 1;
+
+	return 0;
+}
+
+static inline int pmd_present(pmd_t pmd)
+{
+	if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
+		return !!(pmd_val(pmd) & (_PAGE_PRESENT | _PAGE_PROTNONE));
+
+	return pmd_val(pmd) != (unsigned long)invalid_pte_table;
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	pmd_val(*pmdp) = ((unsigned long)invalid_pte_table);
+}
+
+#define set_pmd(pmdptr, pmdval) do { *(pmdptr) = (pmdval); } while (0)
+
+#define pmd_phys(pmd)		virt_to_phys((void *)pmd_val(pmd))
+
+#ifndef CONFIG_TRANSPARENT_HUGEPAGE
+#define pmd_page(pmd)		(pfn_to_page(pmd_phys(pmd) >> PAGE_SHIFT))
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE  */
+
+#define pmd_page_vaddr(pmd)	pmd_val(pmd)
+
+extern pmd_t mk_pmd(struct page *page, pgprot_t prot);
+extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd);
+
+#define pte_page(x)		pfn_to_page(pte_pfn(x))
+#define pte_pfn(x)		((unsigned long)(((x).pte & _PFN_MASK) >> _PFN_SHIFT))
+#define pfn_pte(pfn, prot)	__pte(((pfn) << _PFN_SHIFT) | pgprot_val(prot))
+#define pfn_pmd(pfn, prot)	__pmd(((pfn) << _PFN_SHIFT) | pgprot_val(prot))
+
+/*
+ * Initialize a new pgd / pud / pmd table with invalid pointers.
+ */
+extern void pgd_init(unsigned long page);
+extern void pud_init(unsigned long page, unsigned long pagetable);
+extern void pmd_init(unsigned long page, unsigned long pagetable);
+
+/*
+ * Non-present pages:  high 40 bits are offset, next 8 bits type,
+ * low 16 bits zero.
+ */
+static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
+{ pte_t pte; pte_val(pte) = (type << 16) | (offset << 24); return pte; }
+
+#define __swp_type(x)		(((x).val >> 16) & 0xff)
+#define __swp_offset(x)		((x).val >> 24)
+#define __swp_entry(type, offset) ((swp_entry_t) { pte_val(mk_swap_pte((type), (offset))) })
+#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
+#define __swp_entry_to_pte(x)	((pte_t) { (x).val })
+#define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) })
+#define __swp_entry_to_pmd(x)	((pmd_t) { (x).val | _PAGE_HUGE })
+
+extern void paging_init(void);
+
+#define pte_none(pte)		(!(pte_val(pte) & ~_PAGE_GLOBAL))
+#define pte_present(pte)	(pte_val(pte) & (_PAGE_PRESENT | _PAGE_PROTNONE))
+#define pte_no_exec(pte)	(pte_val(pte) & _PAGE_NO_EXEC)
+
+static inline void set_pte(pte_t *ptep, pte_t pteval)
+{
+	*ptep = pteval;
+	if (pte_val(pteval) & _PAGE_GLOBAL) {
+		pte_t *buddy = ptep_buddy(ptep);
+		/*
+		 * Make sure the buddy is global too (if it's !none,
+		 * it better already be global)
+		 */
+		if (pte_none(*buddy))
+			pte_val(*buddy) = pte_val(*buddy) | _PAGE_GLOBAL;
+	}
+}
+
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+			      pte_t *ptep, pte_t pteval)
+{
+	set_pte(ptep, pteval);
+}
+
+static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+{
+	/* Preserve global status for the pair */
+	if (pte_val(*ptep_buddy(ptep)) & _PAGE_GLOBAL)
+		set_pte_at(mm, addr, ptep, __pte(_PAGE_GLOBAL));
+	else
+		set_pte_at(mm, addr, ptep, __pte(0));
+}
+
+#define PGD_T_LOG2	(__builtin_ffs(sizeof(pgd_t)) - 1)
+#define PMD_T_LOG2	(__builtin_ffs(sizeof(pmd_t)) - 1)
+#define PTE_T_LOG2	(__builtin_ffs(sizeof(pte_t)) - 1)
+
+extern pgd_t swapper_pg_dir[];
+extern pgd_t invalid_pg_dir[];
+
+/*
+ * The following only work if pte_present() is true.
+ * Undefined behaviour if not.
+ */
+static inline int pte_write(pte_t pte)	{ return pte_val(pte) & _PAGE_WRITE; }
+static inline int pte_young(pte_t pte)	{ return pte_val(pte) & _PAGE_ACCESSED; }
+static inline int pte_dirty(pte_t pte)	{ return pte_val(pte) & _PAGE_MODIFIED; }
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	pte_val(pte) &= ~_PAGE_ACCESSED;
+	return pte;
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	pte_val(pte) |= _PAGE_ACCESSED;
+	return pte;
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_MODIFIED);
+	return pte;
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	pte_val(pte) |= (_PAGE_DIRTY | _PAGE_MODIFIED);
+	return pte;
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	pte_val(pte) |= (_PAGE_WRITE | _PAGE_DIRTY);
+	return pte;
+}
+
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	pte_val(pte) &= ~(_PAGE_WRITE | _PAGE_DIRTY);
+	return pte;
+}
+
+static inline int pte_huge(pte_t pte)	{ return pte_val(pte) & _PAGE_HUGE; }
+
+static inline pte_t pte_mkhuge(pte_t pte)
+{
+	pte_val(pte) |= _PAGE_HUGE;
+	return pte;
+}
+
+#if defined(CONFIG_ARCH_HAS_PTE_SPECIAL)
+static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
+static inline pte_t pte_mkspecial(pte_t pte)	{ pte_val(pte) |= _PAGE_SPECIAL; return pte; }
+#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
+
+#define pte_accessible pte_accessible
+static inline unsigned long pte_accessible(struct mm_struct *mm, pte_t a)
+{
+	if (pte_val(a) & _PAGE_PRESENT)
+		return true;
+
+	if ((pte_val(a) & _PAGE_PROTNONE) &&
+			atomic_read(&mm->tlb_flush_pending))
+		return true;
+
+	return false;
+}
+
+/*
+ * Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ */
+#define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
+
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) |
+		     (pgprot_val(newprot) & ~_PAGE_CHG_MASK));
+}
+
+extern void __update_tlb(struct vm_area_struct *vma,
+			unsigned long address, pte_t *ptep);
+
+static inline void update_mmu_cache(struct vm_area_struct *vma,
+			unsigned long address, pte_t *ptep)
+{
+	__update_tlb(vma, address, ptep);
+}
+
+static inline void update_mmu_cache_pmd(struct vm_area_struct *vma,
+			unsigned long address, pmd_t *pmdp)
+{
+	__update_tlb(vma, address, (pte_t *)pmdp);
+}
+
+#define kern_addr_valid(addr)	(1)
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+
+/* We don't have hardware dirty/accessed bits, generic_pmdp_establish is fine. */
+#define pmdp_establish generic_pmdp_establish
+
+static inline int pmd_trans_huge(pmd_t pmd)
+{
+	return !!(pmd_val(pmd) & _PAGE_HUGE) && pmd_present(pmd);
+}
+
+static inline pmd_t pmd_mkhuge(pmd_t pmd)
+{
+	pmd_val(pmd) = (pmd_val(pmd) & ~(_PAGE_GLOBAL)) |
+		((pmd_val(pmd) & _PAGE_GLOBAL) << (_PAGE_HGLOBAL_SHIFT - _PAGE_GLOBAL_SHIFT));
+	pmd_val(pmd) |= _PAGE_HUGE;
+
+	return pmd;
+}
+
+#define pmd_write pmd_write
+static inline int pmd_write(pmd_t pmd)
+{
+	return !!(pmd_val(pmd) & _PAGE_WRITE);
+}
+
+static inline pmd_t pmd_mkwrite(pmd_t pmd)
+{
+	pmd_val(pmd) |= (_PAGE_WRITE | _PAGE_DIRTY);
+	return pmd;
+}
+
+static inline pmd_t pmd_wrprotect(pmd_t pmd)
+{
+	pmd_val(pmd) &= ~(_PAGE_WRITE | _PAGE_DIRTY);
+	return pmd;
+}
+
+static inline int pmd_dirty(pmd_t pmd)
+{
+	return !!(pmd_val(pmd) & _PAGE_MODIFIED);
+}
+
+static inline pmd_t pmd_mkclean(pmd_t pmd)
+{
+	pmd_val(pmd) &= ~(_PAGE_DIRTY | _PAGE_MODIFIED);
+	return pmd;
+}
+
+static inline pmd_t pmd_mkdirty(pmd_t pmd)
+{
+	pmd_val(pmd) |= (_PAGE_DIRTY | _PAGE_MODIFIED);
+	return pmd;
+}
+
+static inline int pmd_young(pmd_t pmd)
+{
+	return !!(pmd_val(pmd) & _PAGE_ACCESSED);
+}
+
+static inline pmd_t pmd_mkold(pmd_t pmd)
+{
+	pmd_val(pmd) &= ~_PAGE_ACCESSED;
+	return pmd;
+}
+
+static inline pmd_t pmd_mkyoung(pmd_t pmd)
+{
+	pmd_val(pmd) |= _PAGE_ACCESSED;
+	return pmd;
+}
+
+static inline unsigned long pmd_pfn(pmd_t pmd)
+{
+	return (pmd_val(pmd) & _PFN_MASK) >> _PFN_SHIFT;
+}
+
+static inline struct page *pmd_page(pmd_t pmd)
+{
+	if (pmd_trans_huge(pmd))
+		return pfn_to_page(pmd_pfn(pmd));
+
+	return pfn_to_page(pmd_phys(pmd) >> PAGE_SHIFT);
+}
+
+static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
+{
+	pmd_val(pmd) = (pmd_val(pmd) & _HPAGE_CHG_MASK) |
+				(pgprot_val(newprot) & ~_HPAGE_CHG_MASK);
+	return pmd;
+}
+
+static inline pmd_t pmd_mkinvalid(pmd_t pmd)
+{
+	pmd_val(pmd) &= ~(_PAGE_PRESENT | _PAGE_VALID | _PAGE_DIRTY | _PAGE_PROTNONE);
+
+	return pmd;
+}
+
+/*
+ * The generic version pmdp_huge_get_and_clear uses a version of pmd_clear() with a
+ * different prototype.
+ */
+#define __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR
+static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
+					    unsigned long address, pmd_t *pmdp)
+{
+	pmd_t old = *pmdp;
+
+	pmd_clear(pmdp);
+
+	return old;
+}
+
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+/*
+ * We provide our own get_unmapped_area to cope with the virtual aliasing
+ * constraints placed on us by the cache architecture.
+ */
+#define HAVE_ARCH_UNMAPPED_AREA
+#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_PGTABLE_H */
diff --git a/arch/loongarch/include/asm/shmparam.h b/arch/loongarch/include/asm/shmparam.h
new file mode 100644
index 000000000000..c9554f48d2df
--- /dev/null
+++ b/arch/loongarch/include/asm/shmparam.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_SHMPARAM_H
+#define _ASM_SHMPARAM_H
+
+#define __ARCH_FORCE_SHMLBA	1
+
+#define	SHMLBA	SZ_64K		 /* attach addr a multiple of this */
+
+#endif /* _ASM_SHMPARAM_H */
diff --git a/arch/loongarch/include/asm/sparsemem.h b/arch/loongarch/include/asm/sparsemem.h
new file mode 100644
index 000000000000..3d18cdf1b069
--- /dev/null
+++ b/arch/loongarch/include/asm/sparsemem.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LOONGARCH_SPARSEMEM_H
+#define _LOONGARCH_SPARSEMEM_H
+
+#ifdef CONFIG_SPARSEMEM
+
+/*
+ * SECTION_SIZE_BITS		2^N: how big each section will be
+ * MAX_PHYSMEM_BITS		2^N: how much memory we can have in that space
+ */
+#define SECTION_SIZE_BITS	29 /* 2^29 = Largest Huge Page Size */
+#define MAX_PHYSMEM_BITS	48
+
+#endif /* CONFIG_SPARSEMEM */
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int memory_add_physaddr_to_nid(u64 addr);
+#define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
+#endif
+
+#define INIT_MEMBLOCK_RESERVED_REGIONS	(INIT_MEMBLOCK_REGIONS + NR_CPUS)
+
+#endif /* _LOONGARCH_SPARSEMEM_H */
diff --git a/arch/loongarch/include/asm/tlb.h b/arch/loongarch/include/asm/tlb.h
new file mode 100644
index 000000000000..a9dda11c494b
--- /dev/null
+++ b/arch/loongarch/include/asm/tlb.h
@@ -0,0 +1,216 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_TLB_H
+#define __ASM_TLB_H
+
+#include <linux/mm_types.h>
+#include <asm/cpu-features.h>
+#include <asm/loongarch.h>
+
+/*
+ * TLB Invalidate Flush
+ */
+static inline void tlbclr(void)
+{
+	__asm__ __volatile__("tlbclr");
+}
+
+static inline void tlbflush(void)
+{
+	__asm__ __volatile__("tlbflush");
+}
+
+/*
+ * TLB R/W operations.
+ */
+static inline void tlb_probe(void)
+{
+	__asm__ __volatile__("tlbsrch");
+}
+
+static inline void tlb_read(void)
+{
+	__asm__ __volatile__("tlbrd");
+}
+
+static inline void tlb_write_indexed(void)
+{
+	__asm__ __volatile__("tlbwr");
+}
+
+static inline void tlb_write_random(void)
+{
+	__asm__ __volatile__("tlbfill");
+}
+
+/*
+ * Guest TLB Invalidate Flush
+ */
+static inline void guest_tlbflush(void)
+{
+	__asm__ __volatile__(
+		".word 0x6482401\n\t");
+}
+
+/*
+ * Guest TLB R/W operations.
+ */
+static inline void guest_tlb_probe(void)
+{
+	__asm__ __volatile__(
+		".word 0x6482801\n\t");
+}
+
+static inline void guest_tlb_read(void)
+{
+	__asm__ __volatile__(
+		".word 0x6482c01\n\t");
+}
+
+static inline void guest_tlb_write_indexed(void)
+{
+	__asm__ __volatile__(
+		".word 0x6483001\n\t");
+}
+
+static inline void guest_tlb_write_random(void)
+{
+	__asm__ __volatile__(
+		".word 0x6483401\n\t");
+}
+
+enum invtlb_ops {
+	/* Invalidate all TLB entries */
+	INVTLB_ALL = 0x0,
+	/* Invalidate the current TLB */
+	INVTLB_CURRENT_ALL = 0x1,
+	/* Invalidate all global=1 lines in the current TLB */
+	INVTLB_CURRENT_GTRUE = 0x2,
+	/* Invalidate all global=0 lines in the current TLB */
+	INVTLB_CURRENT_GFALSE = 0x3,
+	/* Invalidate global=0 lines with matching ASID in the current TLB */
+	INVTLB_GFALSE_AND_ASID = 0x4,
+	/* Invalidate addr with global=0 and matching ASID in the current TLB */
+	INVTLB_ADDR_GFALSE_AND_ASID = 0x5,
+	/* Invalidate addr with global=1 or matching ASID in the current TLB */
+	INVTLB_ADDR_GTRUE_OR_ASID = 0x6,
+	/* Invalidate entries with matching GID in the guest TLB */
+	INVGTLB_GID = 0x9,
+	/* Invalidate global=1 entries with matching GID in the guest TLB */
+	INVGTLB_GID_GTRUE = 0xa,
+	/* Invalidate global=0 entries with matching GID in the guest TLB */
+	INVGTLB_GID_GFALSE = 0xb,
+	/* Invalidate global=0 entries with matching GID and ASID in the guest TLB */
+	INVGTLB_GID_GFALSE_ASID = 0xc,
+	/* Invalidate global=0 entries with matching GID, ASID and addr in the guest TLB */
+	INVGTLB_GID_GFALSE_ASID_ADDR = 0xd,
+	/* Invalidate global=1 entries with matching GID, ASID and addr in the guest TLB */
+	INVGTLB_GID_GTRUE_ASID_ADDR = 0xe,
+	/* Invalidate gva-->gpa guest TLB entries for all GIDs */
+	INVGTLB_ALLGID_GVA_TO_GPA = 0x10,
+	/* Invalidate gpa-->hpa TLB entries for all GIDs */
+	INVTLB_ALLGID_GPA_TO_HPA = 0x11,
+	/* Invalidate TLB entries for all GIDs, including gva-->gpa and gpa-->hpa */
+	INVTLB_ALLGID = 0x12,
+	/* Invalidate gva-->gpa guest TLB entries with matching GID */
+	INVGTLB_GID_GVA_TO_GPA = 0x13,
+	/* Invalidate gpa-->hpa TLB entries with matching GID */
+	INVTLB_GID_GPA_TO_HPA = 0x14,
+	/* Invalidate TLB entries with matching GID, including gva-->gpa and gpa-->hpa */
+	INVTLB_GID_ALL = 0x15,
+	/* Invalidate gpa-->hpa TLB entries with matching GID and addr */
+	INVTLB_GID_ADDR = 0x16,
+};
+
+/*
+ * invtlb op info addr
+ * (0x1 << 26) | (0x24 << 20) | (0x13 << 15) |
+ * (addr << 10) | (info << 5) | op
+ */
+static inline void invtlb(u32 op, u32 info, u64 addr)
+{
+	__asm__ __volatile__(
+		"parse_r addr,%0\n\t"
+		"parse_r info,%1\n\t"
+		".word ((0x6498000) | (addr << 10) | (info << 5) | %2)\n\t"
+		:
+		: "r"(addr), "r"(info), "i"(op)
+		:
+		);
+}
+
+static inline void invtlb_addr(u32 op, u32 info, u64 addr)
+{
+	__asm__ __volatile__(
+		"parse_r addr,%0\n\t"
+		".word ((0x6498000) | (addr << 10) | (0 << 5) | %1)\n\t"
+		:
+		: "r"(addr), "i"(op)
+		:
+		);
+}
+
+static inline void invtlb_info(u32 op, u32 info, u64 addr)
+{
+	__asm__ __volatile__(
+		"parse_r info,%0\n\t"
+		".word ((0x6498000) | (0 << 10) | (info << 5) | %1)\n\t"
+		:
+		: "r"(info), "i"(op)
+		:
+		);
+}
+
+static inline void invtlb_all(u32 op, u32 info, u64 addr)
+{
+	__asm__ __volatile__(
+		".word ((0x6498000) | (0 << 10) | (0 << 5) | %0)\n\t"
+		:
+		: "i"(op)
+		:
+		);
+}
+
+/*
+ * LoongArch doesn't need any special per-pte or per-vma handling, except
+ * we need to flush cache for area to be unmapped.
+ */
+#define tlb_start_vma(tlb, vma)					\
+	do {							\
+		if (!(tlb)->fullmm)				\
+			flush_cache_range(vma, vma->vm_start, vma->vm_end); \
+	}  while (0)
+#define tlb_end_vma(tlb, vma) do { } while (0)
+#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
+
+static void tlb_flush(struct mmu_gather *tlb);
+
+#define tlb_flush tlb_flush
+#include <asm-generic/tlb.h>
+
+static inline void tlb_flush(struct mmu_gather *tlb)
+{
+	struct vm_area_struct vma;
+
+	vma.vm_mm = tlb->mm;
+	vma.vm_flags = 0;
+	if (tlb->fullmm) {
+		flush_tlb_mm(tlb->mm);
+		return;
+	}
+
+	flush_tlb_range(&vma, tlb->start, tlb->end);
+}
+
+extern void handle_tlb_load(void);
+extern void handle_tlb_store(void);
+extern void handle_tlb_modify(void);
+extern void handle_tlb_refill(void);
+extern void handle_tlb_protect(void);
+
+extern void dump_tlb_all(void);
+extern void dump_tlb_regs(void);
+
+#endif /* __ASM_TLB_H */
diff --git a/arch/loongarch/include/asm/tlbflush.h b/arch/loongarch/include/asm/tlbflush.h
new file mode 100644
index 000000000000..36bd6d11dc2d
--- /dev/null
+++ b/arch/loongarch/include/asm/tlbflush.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_TLBFLUSH_H
+#define __ASM_TLBFLUSH_H
+
+#include <linux/mm.h>
+
+/*
+ * TLB flushing:
+ *
+ *  - flush_tlb_all() flushes all processes' TLB entries
+ *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
+ *  - flush_tlb_page(vma, vmaddr) flushes one page
+ *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
+ */
+extern void local_flush_tlb_all(void);
+extern void local_flush_tlb_user(void);
+extern void local_flush_tlb_kernel(void);
+extern void local_flush_tlb_mm(struct mm_struct *mm);
+extern void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void local_flush_tlb_kernel_range(unsigned long start, unsigned long end);
+extern void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
+extern void local_flush_tlb_one(unsigned long vaddr);
+
+#define flush_tlb_all()			local_flush_tlb_all()
+#define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
+#define flush_tlb_range(vma, vmaddr, end)	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_kernel_range(vmaddr, end)	local_flush_tlb_kernel_range(vmaddr, end)
+#define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
+#define flush_tlb_one(vaddr)		local_flush_tlb_one(vaddr)
+
+#endif /* __ASM_TLBFLUSH_H */
diff --git a/arch/loongarch/include/asm/vmalloc.h b/arch/loongarch/include/asm/vmalloc.h
new file mode 100644
index 000000000000..965a0d41ac2d
--- /dev/null
+++ b/arch/loongarch/include/asm/vmalloc.h
@@ -0,0 +1,4 @@
+#ifndef _ASM_LOONGARCH_VMALLOC_H
+#define _ASM_LOONGARCH_VMALLOC_H
+
+#endif /* _ASM_LOONGARCH_VMALLOC_H */
diff --git a/arch/loongarch/mm/cache.c b/arch/loongarch/mm/cache.c
new file mode 100644
index 000000000000..7f65f750c776
--- /dev/null
+++ b/arch/loongarch/mm/cache.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1994 - 2003, 06, 07 by Ralf Baechle (ralf@linux-mips.org)
+ * Copyright (C) 2007 MIPS Technologies, Inc.
+ */
+#include <linux/export.h>
+#include <linux/fcntl.h>
+#include <linux/fs.h>
+#include <linux/highmem.h>
+#include <linux/kernel.h>
+#include <linux/linkage.h>
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/syscalls.h>
+
+#include <asm/cacheflush.h>
+#include <asm/cpu.h>
+#include <asm/cpu-features.h>
+#include <asm/dma.h>
+#include <asm/loongarch.h>
+#include <asm/processor.h>
+#include <asm/setup.h>
+
+/*
+ * LoongArch maintains ICache/DCache coherency in hardware;
+ * we just need "ibar" to avoid instruction hazards here.
+ */
+void local_flush_icache_range(unsigned long start, unsigned long end)
+{
+	asm volatile ("\tibar 0\n"::);
+}
+
+void cache_error_setup(void)
+{
+	extern char __weak except_vec_cex;
+	set_merr_handler(0x0, &except_vec_cex, 0x80);
+}
+
+static unsigned long icache_size __read_mostly;
+static unsigned long dcache_size __read_mostly;
+static unsigned long vcache_size __read_mostly;
+static unsigned long scache_size __read_mostly;
+
+static char *way_string[] = { NULL, "direct mapped", "2-way",
+	"3-way", "4-way", "5-way", "6-way", "7-way", "8-way",
+	"9-way", "10-way", "11-way", "12-way",
+	"13-way", "14-way", "15-way", "16-way",
+};
+
+static void probe_pcache(void)
+{
+	struct cpuinfo_loongarch *c = &current_cpu_data;
+	unsigned int lsize, sets, ways;
+	unsigned int config;
+
+	config = read_cpucfg(LOONGARCH_CPUCFG17);
+	lsize = 1 << ((config & CPUCFG17_L1I_SIZE_M) >> CPUCFG17_L1I_SIZE);
+	sets  = 1 << ((config & CPUCFG17_L1I_SETS_M) >> CPUCFG17_L1I_SETS);
+	ways  = ((config & CPUCFG17_L1I_WAYS_M) >> CPUCFG17_L1I_WAYS) + 1;
+
+	c->icache.linesz = lsize;
+	c->icache.sets = sets;
+	c->icache.ways = ways;
+	icache_size = sets * ways * lsize;
+	c->icache.waysize = icache_size / c->icache.ways;
+
+	config = read_cpucfg(LOONGARCH_CPUCFG18);
+	lsize = 1 << ((config & CPUCFG18_L1D_SIZE_M) >> CPUCFG18_L1D_SIZE);
+	sets  = 1 << ((config & CPUCFG18_L1D_SETS_M) >> CPUCFG18_L1D_SETS);
+	ways  = ((config & CPUCFG18_L1D_WAYS_M) >> CPUCFG18_L1D_WAYS) + 1;
+
+	c->dcache.linesz = lsize;
+	c->dcache.sets = sets;
+	c->dcache.ways = ways;
+	dcache_size = sets * ways * lsize;
+	c->dcache.waysize = dcache_size / c->dcache.ways;
+
+	c->options |= LOONGARCH_CPU_PREFETCH;
+
+	pr_info("Primary instruction cache %ldkB, %s, %s, linesize %d bytes.\n",
+		icache_size >> 10, way_string[c->icache.ways], "VIPT", c->icache.linesz);
+
+	pr_info("Primary data cache %ldkB, %s, %s, %s, linesize %d bytes\n",
+		dcache_size >> 10, way_string[c->dcache.ways], "VIPT", "no aliases", c->dcache.linesz);
+}
+
+static void probe_vcache(void)
+{
+	struct cpuinfo_loongarch *c = &current_cpu_data;
+	unsigned int lsize, sets, ways;
+	unsigned int config;
+
+	config = read_cpucfg(LOONGARCH_CPUCFG19);
+	lsize = 1 << ((config & CPUCFG19_L2_SIZE_M) >> CPUCFG19_L2_SIZE);
+	sets  = 1 << ((config & CPUCFG19_L2_SETS_M) >> CPUCFG19_L2_SETS);
+	ways  = ((config & CPUCFG19_L2_WAYS_M) >> CPUCFG19_L2_WAYS) + 1;
+
+	c->vcache.linesz = lsize;
+	c->vcache.sets = sets;
+	c->vcache.ways = ways;
+	vcache_size = lsize * sets * ways;
+	c->vcache.waysize = vcache_size / c->vcache.ways;
+
+	pr_info("Unified victim cache %ldkB %s, linesize %d bytes.\n",
+		vcache_size >> 10, way_string[c->vcache.ways], c->vcache.linesz);
+}
+
+static void probe_scache(void)
+{
+	struct cpuinfo_loongarch *c = &current_cpu_data;
+	unsigned int lsize, sets, ways;
+	unsigned int config;
+
+	config = read_cpucfg(LOONGARCH_CPUCFG20);
+	lsize = 1 << ((config & CPUCFG20_L3_SIZE_M) >> CPUCFG20_L3_SIZE);
+	sets  = 1 << ((config & CPUCFG20_L3_SETS_M) >> CPUCFG20_L3_SETS);
+	ways  = ((config & CPUCFG20_L3_WAYS_M) >> CPUCFG20_L3_WAYS) + 1;
+
+	c->scache.linesz = lsize;
+	c->scache.sets = sets;
+	c->scache.ways = ways;
+	/* 4 cores. scaches are shared */
+	scache_size = lsize * sets * ways;
+	c->scache.waysize = scache_size / c->scache.ways;
+
+	pr_info("Unified secondary cache %ldkB %s, linesize %d bytes.\n",
+		scache_size >> 10, way_string[c->scache.ways], c->scache.linesz);
+}
+
+void cpu_cache_init(void)
+{
+	probe_pcache();
+	probe_vcache();
+	probe_scache();
+
+	shm_align_mask = PAGE_SIZE - 1;
+}
diff --git a/arch/loongarch/mm/extable.c b/arch/loongarch/mm/extable.c
new file mode 100644
index 000000000000..bc20988f2b87
--- /dev/null
+++ b/arch/loongarch/mm/extable.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/extable.h>
+#include <linux/spinlock.h>
+#include <asm/branch.h>
+#include <linux/uaccess.h>
+
+int fixup_exception(struct pt_regs *regs)
+{
+	const struct exception_table_entry *fixup;
+
+	fixup = search_exception_tables(exception_era(regs));
+	if (fixup) {
+		regs->csr_era = fixup->fixup;
+
+		return 1;
+	}
+
+	return 0;
+}
diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
new file mode 100644
index 000000000000..605579b19a00
--- /dev/null
+++ b/arch/loongarch/mm/fault.c
@@ -0,0 +1,261 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1995 - 2000 by Ralf Baechle
+ */
+#include <linux/context_tracking.h>
+#include <linux/signal.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/entry-common.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/ptrace.h>
+#include <linux/ratelimit.h>
+#include <linux/mman.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/kdebug.h>
+#include <linux/kprobes.h>
+#include <linux/perf_event.h>
+#include <linux/uaccess.h>
+
+#include <asm/branch.h>
+#include <asm/mmu_context.h>
+#include <asm/ptrace.h>
+
+int show_unhandled_signals = 1;
+
+static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
+{
+	const int field = sizeof(unsigned long) * 2;
+
+	/* Are we prepared to handle this kernel fault?	 */
+	if (fixup_exception(regs))
+		return;
+
+	/*
+	 * Oops. The kernel tried to access some bad page. We'll have to
+	 * terminate things with extreme prejudice.
+	 */
+	bust_spinlocks(1);
+
+	pr_alert("CPU %d Unable to handle kernel paging request at "
+	       "virtual address %0*lx, era == %0*lx, ra == %0*lx\n",
+	       raw_smp_processor_id(), field, address, field, regs->csr_era,
+	       field,  regs->regs[1]);
+	die("Oops", regs);
+}
+
+static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address)
+{
+	/*
+	 * We ran out of memory, call the OOM killer, and return to userspace
+	 * (which will retry the fault, or kill us if we got oom-killed).
+	 */
+	if (!user_mode(regs)) {
+		no_context(regs, address);
+		return;
+	}
+	pagefault_out_of_memory();
+}
+
+static void __kprobes do_sigbus(struct pt_regs *regs,
+		unsigned long write, unsigned long address, int si_code)
+{
+	/* Kernel mode? Handle exceptions or die */
+	if (!user_mode(regs)) {
+		no_context(regs, address);
+		return;
+	}
+
+	/*
+	 * Send a SIGBUS to the user-mode task; kernel-mode faults
+	 * were already handled above.
+	 */
+	current->thread.csr_badvaddr = address;
+	current->thread.trap_nr = read_csr_excode();
+	force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)address);
+}
+
+static void __kprobes do_sigsegv(struct pt_regs *regs,
+		unsigned long write, unsigned long address, int si_code)
+{
+	const int field = sizeof(unsigned long) * 2;
+	static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 10);
+
+	/* Kernel mode? Handle exceptions or die */
+	if (!user_mode(regs)) {
+		no_context(regs, address);
+		return;
+	}
+
+	/* User mode accesses just cause a SIGSEGV */
+	current->thread.csr_badvaddr = address;
+	if (!write)
+		current->thread.error_code = 1;
+	else
+		current->thread.error_code = 2;
+	current->thread.trap_nr = read_csr_excode();
+
+	if (show_unhandled_signals &&
+	    unhandled_signal(current, SIGSEGV) && __ratelimit(&ratelimit_state)) {
+		pr_info("do_page_fault(): sending SIGSEGV to %s for invalid %s %0*lx\n",
+			current->comm,
+			write ? "write access to" : "read access from",
+			field, address);
+		pr_info("era = %0*lx in", field,
+			(unsigned long) regs->csr_era);
+		print_vma_addr(KERN_CONT " ", regs->csr_era);
+		pr_cont("\n");
+		pr_info("ra  = %0*lx in", field,
+			(unsigned long) regs->regs[1]);
+		print_vma_addr(KERN_CONT " ", regs->regs[1]);
+		pr_cont("\n");
+	}
+	force_sig_fault(SIGSEGV, si_code, (void __user *)address);
+}
+
+/*
+ * This routine handles page faults.  It determines the address,
+ * and the problem, and then passes it off to one of the appropriate
+ * routines.
+ */
+static void __kprobes __do_page_fault(struct pt_regs *regs,
+			unsigned long write, unsigned long address)
+{
+	int si_code = SEGV_MAPERR;
+	unsigned int flags = FAULT_FLAG_DEFAULT;
+	struct task_struct *tsk = current;
+	struct mm_struct *mm = tsk->mm;
+	struct vm_area_struct *vma = NULL;
+	vm_fault_t fault;
+
+	/*
+	 * We fault-in kernel-space virtual memory on-demand. The
+	 * 'reference' page table is init_mm.pgd.
+	 *
+	 * NOTE! We MUST NOT take any locks for this case. We may
+	 * be in an interrupt or a critical region, and should
+	 * only copy the information from the master page table,
+	 * nothing more.
+	 */
+	if (address & __UA_LIMIT) {
+		if (!user_mode(regs))
+			no_context(regs, address);
+		else
+			do_sigsegv(regs, write, address, si_code);
+		return;
+	}
+
+	/*
+	 * If we're in an interrupt or have no user
+	 * context, we must not take the fault.
+	 */
+	if (faulthandler_disabled() || !mm) {
+		do_sigsegv(regs, write, address, si_code);
+		return;
+	}
+
+	if (user_mode(regs))
+		flags |= FAULT_FLAG_USER;
+
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+retry:
+	mmap_read_lock(mm);
+	vma = find_vma(mm, address);
+	if (!vma)
+		goto bad_area;
+	if (vma->vm_start <= address)
+		goto good_area;
+	if (!(vma->vm_flags & VM_GROWSDOWN))
+		goto bad_area;
+	if (!expand_stack(vma, address))
+		goto good_area;
+/*
+ * Something tried to access memory that isn't in our memory map..
+ * Fix it, but check if it's kernel or user first..
+ */
+bad_area:
+	mmap_read_unlock(mm);
+	do_sigsegv(regs, write, address, si_code);
+	return;
+
+/*
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+ */
+good_area:
+	si_code = SEGV_ACCERR;
+
+	if (write) {
+		flags |= FAULT_FLAG_WRITE;
+		if (!(vma->vm_flags & VM_WRITE))
+			goto bad_area;
+	} else {
+		if (!(vma->vm_flags & VM_READ) && address != exception_era(regs))
+			goto bad_area;
+		if (!(vma->vm_flags & VM_EXEC) && address == exception_era(regs))
+			goto bad_area;
+	}
+
+	/*
+	 * If for any reason at all we couldn't handle the fault,
+	 * make sure we exit gracefully rather than endlessly redo
+	 * the fault.
+	 */
+	fault = handle_mm_fault(vma, address, flags, regs);
+
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			no_context(regs, address);
+		return;
+	}
+
+	if (unlikely(fault & VM_FAULT_RETRY)) {
+		flags |= FAULT_FLAG_TRIED;
+
+		/*
+		 * No need to mmap_read_unlock(mm) as we would
+		 * have already released it in __folio_lock_or_retry
+		 * in mm/filemap.c.
+		 */
+		goto retry;
+	}
+	if (unlikely(fault & VM_FAULT_ERROR)) {
+		mmap_read_unlock(mm);
+		if (fault & VM_FAULT_OOM) {
+			do_out_of_memory(regs, address);
+			return;
+		} else if (fault & VM_FAULT_SIGSEGV) {
+			do_sigsegv(regs, write, address, si_code);
+			return;
+		} else if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) {
+			do_sigbus(regs, write, address, si_code);
+			return;
+		}
+		BUG();
+	}
+
+	mmap_read_unlock(mm);
+}
+
+asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
+			unsigned long write, unsigned long address)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	/* Enable interrupts if they were enabled in the parent context */
+	if (likely(regs->csr_prmd & CSR_PRMD_PIE))
+		local_irq_enable();
+
+	__do_page_fault(regs, write, address);
+
+	local_irq_disable();
+
+	irqentry_exit(regs, state);
+}
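
For illustration only (not part of the patch): the permission checks under
good_area in __do_page_fault() can be read as a small predicate. The sketch
below is a user-space restatement under stated assumptions: the DEMO_VM_*
values and the demo_* names are hypothetical, and "era" stands for the value
the patch obtains from exception_era(regs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the kernel's real VM_* constants differ in origin. */
#define DEMO_VM_READ  0x1u
#define DEMO_VM_WRITE 0x2u
#define DEMO_VM_EXEC  0x4u

/*
 * Restates the good_area checks: a write needs VM_WRITE; a non-write fault
 * whose address equals the exception return address is treated as an
 * instruction fetch and needs VM_EXEC; any other non-write fault needs
 * VM_READ.
 */
static bool demo_access_allowed(unsigned int vm_flags, bool write,
				uint64_t address, uint64_t era)
{
	if (write)
		return (vm_flags & DEMO_VM_WRITE) != 0;
	if (address == era)
		return (vm_flags & DEMO_VM_EXEC) != 0;
	return (vm_flags & DEMO_VM_READ) != 0;
}

/* Usage: a read-only mapping allows a data read but not a fetch or a write. */
static void demo_access_examples(void)
{
	assert(demo_access_allowed(DEMO_VM_READ, false, 0x1000, 0x2000));
	assert(!demo_access_allowed(DEMO_VM_READ, false, 0x2000, 0x2000));
	assert(!demo_access_allowed(DEMO_VM_READ, true, 0x1000, 0x2000));
}
```

When the predicate is false the patch jumps to bad_area with si_code already
set to SEGV_ACCERR, so the task sees an access error rather than a mapping
error.
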
diff --git a/arch/loongarch/mm/hugetlbpage.c b/arch/loongarch/mm/hugetlbpage.c
new file mode 100644
index 000000000000..ba138117b124
--- /dev/null
+++ b/arch/loongarch/mm/hugetlbpage.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>
+#include <linux/pagemap.h>
+#include <linux/err.h>
+#include <linux/sysctl.h>
+#include <asm/mman.h>
+#include <asm/tlb.h>
+#include <asm/tlbflush.h>
+
+pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
+		      unsigned long addr, unsigned long sz)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pte_t *pte = NULL;
+
+	pgd = pgd_offset(mm, addr);
+	p4d = p4d_alloc(mm, pgd, addr);
+	pud = pud_alloc(mm, p4d, addr);
+	if (pud)
+		pte = (pte_t *)pmd_alloc(mm, pud, addr);
+
+	return pte;
+}
+
+pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
+		       unsigned long sz)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd = NULL;
+
+	pgd = pgd_offset(mm, addr);
+	if (pgd_present(*pgd)) {
+		p4d = p4d_offset(pgd, addr);
+		if (p4d_present(*p4d)) {
+			pud = pud_offset(p4d, addr);
+			if (pud_present(*pud))
+				pmd = pmd_offset(pud, addr);
+		}
+	}
+	return (pte_t *) pmd;
+}
+
+/*
+ * This function checks for proper alignment of input addr and len parameters.
+ */
+int is_aligned_hugepage_range(unsigned long addr, unsigned long len)
+{
+	if (len & ~HPAGE_MASK)
+		return -EINVAL;
+	if (addr & ~HPAGE_MASK)
+		return -EINVAL;
+	return 0;
+}
+
+int pmd_huge(pmd_t pmd)
+{
+	return (pmd_val(pmd) & _PAGE_HUGE) != 0;
+}
+
+int pud_huge(pud_t pud)
+{
+	return (pud_val(pud) & _PAGE_HUGE) != 0;
+}
+
+uint64_t pmd_to_entrylo(unsigned long pmd_val)
+{
+	uint64_t val;
+	/* PMD as PTE. Must be huge page */
+	if (!pmd_huge(__pmd(pmd_val)))
+		panic("%s", __func__);
+
+	val = pmd_val ^ _PAGE_HUGE;
+	val |= ((val & _PAGE_HGLOBAL) >>
+		(_PAGE_HGLOBAL_SHIFT - _PAGE_GLOBAL_SHIFT));
+
+	return val;
+}
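
For illustration only (not part of the patch): pmd_to_entrylo() clears the
huge marker and copies the huge-entry global bit down to the ordinary global
position. The sketch below replays that arithmetic in user space. The bit
positions are assumptions chosen for the demo (huge and global sharing one
position, with the huge-entry global bit higher up); the authoritative values
are the _PAGE_* definitions elsewhere in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for demonstration only. */
#define DEMO_PAGE_HUGE_SHIFT    6
#define DEMO_PAGE_GLOBAL_SHIFT  6
#define DEMO_PAGE_HGLOBAL_SHIFT 12
#define DEMO_PAGE_HUGE    (UINT64_C(1) << DEMO_PAGE_HUGE_SHIFT)
#define DEMO_PAGE_HGLOBAL (UINT64_C(1) << DEMO_PAGE_HGLOBAL_SHIFT)

static uint64_t demo_pmd_to_entrylo(uint64_t pmd_val)
{
	uint64_t val;

	/* The patch calls panic() here; the demo just asserts. */
	assert(pmd_val & DEMO_PAGE_HUGE);

	/* Clear the huge marker (it is known to be set, so XOR clears it). */
	val = pmd_val ^ DEMO_PAGE_HUGE;

	/* Copy the huge-entry global bit down to the ordinary global bit. */
	val |= (val & DEMO_PAGE_HGLOBAL) >>
	       (DEMO_PAGE_HGLOBAL_SHIFT - DEMO_PAGE_GLOBAL_SHIFT);

	return val;
}
```

With these assumed positions, a huge entry without the huge-entry global bit
comes out with bit 6 clear, and one with it comes out with bit 6 set, now
meaning "global" rather than "huge".
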
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
new file mode 100644
index 000000000000..afd6634ce171
--- /dev/null
+++ b/arch/loongarch/mm/init.c
@@ -0,0 +1,165 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/export.h>
+#include <linux/signal.h>
+#include <linux/sched.h>
+#include <linux/smp.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/pagemap.h>
+#include <linux/memblock.h>
+#include <linux/memremap.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/highmem.h>
+#include <linux/swap.h>
+#include <linux/proc_fs.h>
+#include <linux/pfn.h>
+#include <linux/hardirq.h>
+#include <linux/gfp.h>
+#include <linux/initrd.h>
+#include <linux/mmzone.h>
+
+#include <asm/asm-offsets.h>
+#include <asm/bootinfo.h>
+#include <asm/cpu.h>
+#include <asm/dma.h>
+#include <asm/mmu_context.h>
+#include <asm/sections.h>
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+#include <asm/tlb.h>
+
+/*
+ * We have up to 8 empty zeroed pages so we can map one of the right colour
+ * when needed. Since the page is never written to after initialization, we
+ * don't have to care about aliases on other CPUs.
+ */
+unsigned long empty_zero_page, zero_page_mask;
+EXPORT_SYMBOL_GPL(empty_zero_page);
+EXPORT_SYMBOL(zero_page_mask);
+
+void setup_zero_pages(void)
+{
+	unsigned int order, i;
+	struct page *page;
+
+	order = 0;
+
+	empty_zero_page = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
+	if (!empty_zero_page)
+		panic("Oh boy, that early out of memory?");
+
+	page = virt_to_page((void *)empty_zero_page);
+	split_page(page, order);
+	for (i = 0; i < (1 << order); i++, page++)
+		mark_page_reserved(page);
+
+	zero_page_mask = ((PAGE_SIZE << order) - 1) & PAGE_MASK;
+}
+
+void copy_user_highpage(struct page *to, struct page *from,
+	unsigned long vaddr, struct vm_area_struct *vma)
+{
+	void *vfrom, *vto;
+
+	vto = kmap_atomic(to);
+	vfrom = kmap_atomic(from);
+	copy_page(vto, vfrom);
+	kunmap_atomic(vfrom);
+	kunmap_atomic(vto);
+	/* Make the copied page visible to other CPUs before it is used */
+	smp_wmb();
+}
+
+int __ref page_is_ram(unsigned long pfn)
+{
+	unsigned long addr = PFN_PHYS(pfn);
+
+	return memblock_is_memory(addr) && !memblock_is_reserved(addr);
+}
+
+void __init paging_init(void)
+{
+	unsigned long max_zone_pfns[MAX_NR_ZONES];
+
+#ifdef CONFIG_ZONE_DMA
+	max_zone_pfns[ZONE_DMA] = MAX_DMA_PFN;
+#endif
+#ifdef CONFIG_ZONE_DMA32
+	max_zone_pfns[ZONE_DMA32] = MAX_DMA32_PFN;
+#endif
+	max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
+
+	free_area_init(max_zone_pfns);
+}
+
+void __init mem_init(void)
+{
+	max_mapnr = max_low_pfn;
+	high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT);
+
+	memblock_free_all();
+	setup_zero_pages();	/* Setup zeroed pages.  */
+}
+
+void __ref free_initmem(void)
+{
+	free_initmem_default(POISON_FREE_INITMEM);
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *params)
+{
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long nr_pages = size >> PAGE_SHIFT;
+	int ret;
+
+	ret = __add_pages(nid, start_pfn, nr_pages, params);
+
+	if (ret)
+		pr_warn("%s: Problem encountered in __add_pages() as ret=%d\n",
+			__func__, ret);
+
+	return ret;
+}
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+{
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long nr_pages = size >> PAGE_SHIFT;
+	struct page *page = pfn_to_page(start_pfn);
+
+	/* With altmap the first mapped page is offset from @start */
+	if (altmap)
+		page += vmem_altmap_offset(altmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
+}
+#endif
+#endif
+
+/*
+ * Align swapper_pg_dir to 64K, which allows its address to be loaded
+ * with a single LUI instruction in the TLB handlers.  If we used
+ * __aligned(64K), its size would get rounded up to the alignment
+ * size, and waste space.  So we place it in its own section and align
+ * it in the linker script.
+ */
+pgd_t swapper_pg_dir[_PTRS_PER_PGD] __section(".bss..swapper_pg_dir");
+
+pgd_t invalid_pg_dir[_PTRS_PER_PGD] __page_aligned_bss;
+#ifndef __PAGETABLE_PUD_FOLDED
+pud_t invalid_pud_table[PTRS_PER_PUD] __page_aligned_bss;
+#endif
+#ifndef __PAGETABLE_PMD_FOLDED
+pmd_t invalid_pmd_table[PTRS_PER_PMD] __page_aligned_bss;
+EXPORT_SYMBOL_GPL(invalid_pmd_table);
+#endif
+pte_t invalid_pte_table[PTRS_PER_PTE] __page_aligned_bss;
+EXPORT_SYMBOL(invalid_pte_table);
diff --git a/arch/loongarch/mm/ioremap.c b/arch/loongarch/mm/ioremap.c
new file mode 100644
index 000000000000..abf66ffb78fa
--- /dev/null
+++ b/arch/loongarch/mm/ioremap.c
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/io.h>
+
+void __init __iomem *early_ioremap(u64 phys_addr, unsigned long size)
+{
+	return ((void __iomem *)TO_CAC(phys_addr));
+}
+
+void __init early_iounmap(void __iomem *addr, unsigned long size)
+{
+
+}
+
+void *early_memremap_ro(resource_size_t phys_addr, unsigned long size)
+{
+	return early_memremap(phys_addr, size);
+}
+
+void *early_memremap_prot(resource_size_t phys_addr, unsigned long size,
+		    unsigned long prot_val)
+{
+	return early_memremap(phys_addr, size);
+}
diff --git a/arch/loongarch/mm/maccess.c b/arch/loongarch/mm/maccess.c
new file mode 100644
index 000000000000..58173842c6be
--- /dev/null
+++ b/arch/loongarch/mm/maccess.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/uaccess.h>
+#include <linux/kernel.h>
+
+bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
+{
+	/* highest bit set means kernel space */
+	return (unsigned long)unsafe_src >> (BITS_PER_LONG - 1);
+}
diff --git a/arch/loongarch/mm/mmap.c b/arch/loongarch/mm/mmap.c
new file mode 100644
index 000000000000..52e40f0ba732
--- /dev/null
+++ b/arch/loongarch/mm/mmap.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/compiler.h>
+#include <linux/elf-randomize.h>
+#include <linux/errno.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/export.h>
+#include <linux/personality.h>
+#include <linux/random.h>
+#include <linux/sched/signal.h>
+#include <linux/sched/mm.h>
+
+unsigned long shm_align_mask = PAGE_SIZE - 1;	/* Sane caches */
+EXPORT_SYMBOL(shm_align_mask);
+
+#define COLOUR_ALIGN(addr, pgoff)				\
+	((((addr) + shm_align_mask) & ~shm_align_mask) +	\
+	 (((pgoff) << PAGE_SHIFT) & shm_align_mask))
+
+enum mmap_allocation_direction {UP, DOWN};
+
+static unsigned long arch_get_unmapped_area_common(struct file *filp,
+	unsigned long addr0, unsigned long len, unsigned long pgoff,
+	unsigned long flags, enum mmap_allocation_direction dir)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+	unsigned long addr = addr0;
+	int do_color_align;
+	struct vm_unmapped_area_info info;
+
+	if (unlikely(len > TASK_SIZE))
+		return -ENOMEM;
+
+	if (flags & MAP_FIXED) {
+		/* Even MAP_FIXED mappings must reside within TASK_SIZE */
+		if (TASK_SIZE - len < addr)
+			return -EINVAL;
+
+		/*
+		 * We do not accept a shared mapping if it would violate
+		 * cache aliasing constraints.
+		 */
+		if ((flags & MAP_SHARED) &&
+		    ((addr - (pgoff << PAGE_SHIFT)) & shm_align_mask))
+			return -EINVAL;
+		return addr;
+	}
+
+	do_color_align = 0;
+	if (filp || (flags & MAP_SHARED))
+		do_color_align = 1;
+
+	/* requesting a specific address */
+	if (addr) {
+		if (do_color_align)
+			addr = COLOUR_ALIGN(addr, pgoff);
+		else
+			addr = PAGE_ALIGN(addr);
+
+		vma = find_vma(mm, addr);
+		if (TASK_SIZE - len >= addr &&
+		    (!vma || addr + len <= vm_start_gap(vma)))
+			return addr;
+	}
+
+	info.length = len;
+	info.align_mask = do_color_align ? (PAGE_MASK & shm_align_mask) : 0;
+	info.align_offset = pgoff << PAGE_SHIFT;
+
+	if (dir == DOWN) {
+		info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+		info.low_limit = PAGE_SIZE;
+		info.high_limit = mm->mmap_base;
+		addr = vm_unmapped_area(&info);
+
+		if (!(addr & ~PAGE_MASK))
+			return addr;
+
+		/*
+		 * A failed mmap() very likely causes application failure,
+		 * so fall back to the bottom-up function here. This scenario
+		 * can happen with large stack limits and large mmap()
+		 * allocations.
+		 */
+	}
+
+	info.flags = 0;
+	info.low_limit = mm->mmap_base;
+	info.high_limit = TASK_SIZE;
+	return vm_unmapped_area(&info);
+}
+
+unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr0,
+	unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	return arch_get_unmapped_area_common(filp,
+			addr0, len, pgoff, flags, UP);
+}
+
+/*
+ * There is no need to export this but sched.h declares the function as
+ * extern so making it static here results in an error.
+ */
+unsigned long arch_get_unmapped_area_topdown(struct file *filp,
+	unsigned long addr0, unsigned long len, unsigned long pgoff,
+	unsigned long flags)
+{
+	return arch_get_unmapped_area_common(filp,
+			addr0, len, pgoff, flags, DOWN);
+}
+
+int __virt_addr_valid(volatile void *kaddr)
+{
+	unsigned long vaddr = (unsigned long)kaddr;
+
+	if ((vaddr < PAGE_OFFSET) || (vaddr >= vm_map_base))
+		return 0;
+
+	return pfn_valid(PFN_DOWN(virt_to_phys(kaddr)));
+}
+EXPORT_SYMBOL_GPL(__virt_addr_valid);
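
For illustration only (not part of the patch): COLOUR_ALIGN() rounds the
address up to the next (shm_align_mask + 1) boundary and then adds the colour
implied by the file offset. The sketch below restates the macro as a function
so the arithmetic can be checked in user space. DEMO_PAGE_SHIFT (16KB pages)
and the four-page mask used in the test are assumptions. With the patch's
shm_align_mask of PAGE_SIZE - 1, the second term is always zero and the macro
reduces to a plain page round-up.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 14	/* assumption: 16KB pages */

/*
 * Same expression as COLOUR_ALIGN(addr, pgoff), with the alignment mask
 * passed in explicitly instead of read from shm_align_mask.
 */
static uint64_t demo_colour_align(uint64_t addr, uint64_t pgoff, uint64_t mask)
{
	/* The mask must be of the form 2^n - 1 for the rounding to work. */
	assert(((mask + 1) & mask) == 0);

	return ((addr + mask) & ~mask) +
	       ((pgoff << DEMO_PAGE_SHIFT) & mask);
}
```

For example, with a single-page mask (0x3fff) the address 0x12345 rounds up
to 0x14000 regardless of pgoff. With a hypothetical four-page mask (0xffff)
and pgoff 3 it becomes 0x2c000, that is, the next 64KB boundary plus three
pages.
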
diff --git a/arch/loongarch/mm/page.S b/arch/loongarch/mm/page.S
new file mode 100644
index 000000000000..ddc78ab33c7b
--- /dev/null
+++ b/arch/loongarch/mm/page.S
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm/export.h>
+#include <asm/page.h>
+#include <asm/regdef.h>
+
+	.align 5
+SYM_FUNC_START(clear_page)
+	lu12i.w  t0, 1 << (PAGE_SHIFT - 12)
+	add.d    t0, t0, a0
+1:
+	st.d     zero, a0, 0
+	st.d     zero, a0, 8
+	st.d     zero, a0, 16
+	st.d     zero, a0, 24
+	st.d     zero, a0, 32
+	st.d     zero, a0, 40
+	st.d     zero, a0, 48
+	st.d     zero, a0, 56
+	addi.d   a0,   a0, 128
+	st.d     zero, a0, -64
+	st.d     zero, a0, -56
+	st.d     zero, a0, -48
+	st.d     zero, a0, -40
+	st.d     zero, a0, -32
+	st.d     zero, a0, -24
+	st.d     zero, a0, -16
+	st.d     zero, a0, -8
+	bne      t0,   a0, 1b
+
+	jirl     $r0, ra, 0
+SYM_FUNC_END(clear_page)
+EXPORT_SYMBOL(clear_page)
+
+	.align 5
+SYM_FUNC_START(copy_page)
+	lu12i.w  t8, 1 << (PAGE_SHIFT - 12)
+	add.d    t8, t8, a0
+1:
+	ld.d     t0, a1,  0
+	ld.d     t1, a1,  8
+	ld.d     t2, a1,  16
+	ld.d     t3, a1,  24
+	ld.d     t4, a1,  32
+	ld.d     t5, a1,  40
+	ld.d     t6, a1,  48
+	ld.d     t7, a1,  56
+
+	st.d     t0, a0,  0
+	st.d     t1, a0,  8
+	ld.d     t0, a1,  64
+	ld.d     t1, a1,  72
+	st.d     t2, a0,  16
+	st.d     t3, a0,  24
+	ld.d     t2, a1,  80
+	ld.d     t3, a1,  88
+	st.d     t4, a0,  32
+	st.d     t5, a0,  40
+	ld.d     t4, a1,  96
+	ld.d     t5, a1,  104
+	st.d     t6, a0,  48
+	st.d     t7, a0,  56
+	ld.d     t6, a1,  112
+	ld.d     t7, a1,  120
+	addi.d   a0, a0,  128
+	addi.d   a1, a1,  128
+
+	st.d     t0, a0,  -64
+	st.d     t1, a0,  -56
+	st.d     t2, a0,  -48
+	st.d     t3, a0,  -40
+	st.d     t4, a0,  -32
+	st.d     t5, a0,  -24
+	st.d     t6, a0,  -16
+	st.d     t7, a0,  -8
+
+	bne      t8, a0, 1b
+	jirl     $r0, ra, 0
+SYM_FUNC_END(copy_page)
+EXPORT_SYMBOL(copy_page)
diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c
new file mode 100644
index 000000000000..0569647152e9
--- /dev/null
+++ b/arch/loongarch/mm/pgtable.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/export.h>
+#include <linux/mm.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
+
+pgd_t *pgd_alloc(struct mm_struct *mm)
+{
+	pgd_t *ret, *init;
+
+	ret = (pgd_t *) __get_free_pages(GFP_KERNEL, PGD_ORDER);
+	if (ret) {
+		init = pgd_offset(&init_mm, 0UL);
+		pgd_init((unsigned long)ret);
+		memcpy(ret + USER_PTRS_PER_PGD, init + USER_PTRS_PER_PGD,
+		       (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pgd_alloc);
+
+void pgd_init(unsigned long page)
+{
+	unsigned long *p, *end;
+	unsigned long entry;
+
+#if !defined(__PAGETABLE_PUD_FOLDED)
+	entry = (unsigned long)invalid_pud_table;
+#elif !defined(__PAGETABLE_PMD_FOLDED)
+	entry = (unsigned long)invalid_pmd_table;
+#else
+	entry = (unsigned long)invalid_pte_table;
+#endif
+
+	p = (unsigned long *) page;
+	end = p + PTRS_PER_PGD;
+
+	do {
+		p[0] = entry;
+		p[1] = entry;
+		p[2] = entry;
+		p[3] = entry;
+		p[4] = entry;
+		p += 8;
+		p[-3] = entry;
+		p[-2] = entry;
+		p[-1] = entry;
+	} while (p != end);
+}
+EXPORT_SYMBOL_GPL(pgd_init);
+
+#ifndef __PAGETABLE_PMD_FOLDED
+void pmd_init(unsigned long addr, unsigned long pagetable)
+{
+	unsigned long *p, *end;
+
+	p = (unsigned long *) addr;
+	end = p + PTRS_PER_PMD;
+
+	do {
+		p[0] = pagetable;
+		p[1] = pagetable;
+		p[2] = pagetable;
+		p[3] = pagetable;
+		p[4] = pagetable;
+		p += 8;
+		p[-3] = pagetable;
+		p[-2] = pagetable;
+		p[-1] = pagetable;
+	} while (p != end);
+}
+EXPORT_SYMBOL_GPL(pmd_init);
+#endif
+
+#ifndef __PAGETABLE_PUD_FOLDED
+void pud_init(unsigned long addr, unsigned long pagetable)
+{
+	unsigned long *p, *end;
+
+	p = (unsigned long *)addr;
+	end = p + PTRS_PER_PUD;
+
+	do {
+		p[0] = pagetable;
+		p[1] = pagetable;
+		p[2] = pagetable;
+		p[3] = pagetable;
+		p[4] = pagetable;
+		p += 8;
+		p[-3] = pagetable;
+		p[-2] = pagetable;
+		p[-1] = pagetable;
+	} while (p != end);
+}
+#endif
+
+pmd_t mk_pmd(struct page *page, pgprot_t prot)
+{
+	pmd_t pmd;
+
+	pmd_val(pmd) = (page_to_pfn(page) << _PFN_SHIFT) | pgprot_val(prot);
+
+	return pmd;
+}
+
+void set_pmd_at(struct mm_struct *mm, unsigned long addr,
+		pmd_t *pmdp, pmd_t pmd)
+{
+	*pmdp = pmd;
+	flush_tlb_all();
+}
+
+void __init pagetable_init(void)
+{
+	/* Initialize the entire pgd.  */
+	pgd_init((unsigned long)swapper_pg_dir);
+	pgd_init((unsigned long)invalid_pg_dir);
+#ifndef __PAGETABLE_PUD_FOLDED
+	pud_init((unsigned long)invalid_pud_table, (unsigned long)invalid_pmd_table);
+#endif
+#ifndef __PAGETABLE_PMD_FOLDED
+	pmd_init((unsigned long)invalid_pmd_table, (unsigned long)invalid_pte_table);
+#endif
+}
diff --git a/arch/loongarch/mm/tlb.c b/arch/loongarch/mm/tlb.c
new file mode 100644
index 000000000000..5201a87937fe
--- /dev/null
+++ b/arch/loongarch/mm/tlb.c
@@ -0,0 +1,282 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/smp.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>
+#include <linux/export.h>
+
+#include <asm/cpu.h>
+#include <asm/bootinfo.h>
+#include <asm/mmu_context.h>
+#include <asm/pgtable.h>
+#include <asm/tlb.h>
+
+void local_flush_tlb_all(void)
+{
+	invtlb_all(INVTLB_CURRENT_ALL, 0, 0);
+}
+EXPORT_SYMBOL(local_flush_tlb_all);
+
+void local_flush_tlb_user(void)
+{
+	invtlb_all(INVTLB_CURRENT_GFALSE, 0, 0);
+}
+EXPORT_SYMBOL(local_flush_tlb_user);
+
+void local_flush_tlb_kernel(void)
+{
+	invtlb_all(INVTLB_CURRENT_GTRUE, 0, 0);
+}
+EXPORT_SYMBOL(local_flush_tlb_kernel);
+
+/*
+ * All entries common to a mm share an asid. To effectively flush
+ * these entries, we just bump the asid.
+ */
+void local_flush_tlb_mm(struct mm_struct *mm)
+{
+	int cpu;
+
+	preempt_disable();
+
+	cpu = smp_processor_id();
+
+	if (asid_valid(mm, cpu))
+		drop_mmu_context(mm, cpu);
+	else
+		cpumask_clear_cpu(cpu, mm_cpumask(mm));
+
+	preempt_enable();
+}
+
+void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+	unsigned long end)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	int cpu = smp_processor_id();
+
+	if (asid_valid(mm, cpu)) {
+		unsigned long size, flags;
+
+		local_irq_save(flags);
+		start = round_down(start, PAGE_SIZE << 1);
+		end = round_up(end, PAGE_SIZE << 1);
+		size = (end - start) >> (PAGE_SHIFT + 1);
+		if (size <= (current_cpu_data.tlbsizestlbsets ?
+			     current_cpu_data.tlbsize / 8 :
+			     current_cpu_data.tlbsize / 2)) {
+			int asid = cpu_asid(cpu, mm);
+
+			while (start < end) {
+				invtlb(INVTLB_ADDR_GFALSE_AND_ASID, asid, start);
+				start += (PAGE_SIZE << 1);
+			}
+		} else {
+			drop_mmu_context(mm, cpu);
+		}
+		local_irq_restore(flags);
+	} else {
+		cpumask_clear_cpu(cpu, mm_cpumask(mm));
+	}
+}
+
+void local_flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+	unsigned long size, flags;
+
+	local_irq_save(flags);
+	size = (end - start + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
+	size = (size + 1) >> 1;
+	if (size <= (current_cpu_data.tlbsizestlbsets ?
+		     current_cpu_data.tlbsize / 8 :
+		     current_cpu_data.tlbsize / 2)) {
+
+		start &= (PAGE_MASK << 1);
+		end += ((PAGE_SIZE << 1) - 1);
+		end &= (PAGE_MASK << 1);
+
+		while (start < end) {
+			invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, 0, start);
+			start += (PAGE_SIZE << 1);
+		}
+	} else {
+		local_flush_tlb_kernel();
+	}
+	local_irq_restore(flags);
+}
+
+void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long page)
+{
+	int cpu = smp_processor_id();
+
+	if (asid_valid(vma->vm_mm, cpu)) {
+		int newpid;
+
+		newpid = cpu_asid(cpu, vma->vm_mm);
+		page &= (PAGE_MASK << 1);
+		invtlb(INVTLB_ADDR_GFALSE_AND_ASID, newpid, page);
+	} else {
+		cpumask_clear_cpu(cpu, mm_cpumask(vma->vm_mm));
+	}
+}
+
+/*
+ * This one is only used for pages with the global bit set so we don't care
+ * much about the ASID.
+ */
+void local_flush_tlb_one(unsigned long page)
+{
+	page &= (PAGE_MASK << 1);
+	invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, 0, page);
+}
+
+static void __update_hugetlb(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
+{
+#ifdef CONFIG_HUGETLB_PAGE
+	int idx;
+	unsigned long lo;
+	unsigned long flags;
+
+	local_irq_save(flags);
+
+	address &= (PAGE_MASK << 1);
+	write_csr_entryhi(address);
+	tlb_probe();
+	idx = read_csr_tlbidx();
+	write_csr_pagesize(PS_HUGE_SIZE);
+	lo = pmd_to_entrylo(pte_val(*ptep));
+	write_csr_entrylo0(lo);
+	write_csr_entrylo1(lo + (HPAGE_SIZE >> 1));
+
+	if (idx < 0)
+		tlb_write_random();
+	else
+		tlb_write_indexed();
+	write_csr_pagesize(PS_DEFAULT_SIZE);
+
+	local_irq_restore(flags);
+#endif
+}
+
+void __update_tlb(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
+{
+	int idx;
+	unsigned long flags;
+
+	/*
+	 * Handle debugger faulting in for debuggee.
+	 */
+	if (current->active_mm != vma->vm_mm)
+		return;
+
+	if (pte_val(*ptep) & _PAGE_HUGE) {
+		__update_hugetlb(vma, address, ptep);
+		return;
+	}
+
+	local_irq_save(flags);
+
+	if ((unsigned long)ptep & sizeof(pte_t))
+		ptep--;
+
+	address &= (PAGE_MASK << 1);
+	write_csr_entryhi(address);
+	tlb_probe();
+	idx = read_csr_tlbidx();
+	write_csr_pagesize(PS_DEFAULT_SIZE);
+	write_csr_entrylo0(pte_val(*ptep++));
+	write_csr_entrylo1(pte_val(*ptep));
+	if (idx < 0)
+		tlb_write_random();
+	else
+		tlb_write_indexed();
+
+	local_irq_restore(flags);
+}
+
+static void setup_ptwalker(void)
+{
+	unsigned long pwctl0, pwctl1;
+	unsigned long pgd_i = 0, pgd_w = 0;
+	unsigned long pud_i = 0, pud_w = 0;
+	unsigned long pmd_i = 0, pmd_w = 0;
+	unsigned long pte_i = 0, pte_w = 0;
+
+	pgd_i = PGDIR_SHIFT;
+	pgd_w = PAGE_SHIFT - 3;
+#if CONFIG_PGTABLE_LEVELS > 3
+	pud_i = PUD_SHIFT;
+	pud_w = PAGE_SHIFT - 3;
+#endif
+#if CONFIG_PGTABLE_LEVELS > 2
+	pmd_i = PMD_SHIFT;
+	pmd_w = PAGE_SHIFT - 3;
+#endif
+	pte_i = PAGE_SHIFT;
+	pte_w = PAGE_SHIFT - 3;
+
+	pwctl0 = pte_i | pte_w << 5 | pmd_i << 10 | pmd_w << 15 | pud_i << 20 | pud_w << 25;
+	pwctl1 = pgd_i | pgd_w << 6;
+
+	csr_writeq(pwctl0, LOONGARCH_CSR_PWCTL0);
+	csr_writeq(pwctl1, LOONGARCH_CSR_PWCTL1);
+	csr_writeq((long)swapper_pg_dir, LOONGARCH_CSR_PGDH);
+	csr_writeq((long)invalid_pg_dir, LOONGARCH_CSR_PGDL);
+	csr_writeq((long)smp_processor_id(), LOONGARCH_CSR_TMID);
+}
+
+static void output_pgtable_bits_defines(void)
+{
+#define pr_define(fmt, ...)					\
+	pr_debug("#define " fmt, ##__VA_ARGS__)
+
+	pr_debug("#include <asm/asm.h>\n");
+	pr_debug("#include <asm/regdef.h>\n");
+	pr_debug("\n");
+
+	pr_define("_PAGE_VALID_SHIFT %d\n", _PAGE_VALID_SHIFT);
+	pr_define("_PAGE_DIRTY_SHIFT %d\n", _PAGE_DIRTY_SHIFT);
+	pr_define("_PAGE_HUGE_SHIFT %d\n", _PAGE_HUGE_SHIFT);
+	pr_define("_PAGE_GLOBAL_SHIFT %d\n", _PAGE_GLOBAL_SHIFT);
+	pr_define("_PAGE_PRESENT_SHIFT %d\n", _PAGE_PRESENT_SHIFT);
+	pr_define("_PAGE_WRITE_SHIFT %d\n", _PAGE_WRITE_SHIFT);
+	pr_define("_PAGE_NO_READ_SHIFT %d\n", _PAGE_NO_READ_SHIFT);
+	pr_define("_PAGE_NO_EXEC_SHIFT %d\n", _PAGE_NO_EXEC_SHIFT);
+	pr_define("_PFN_SHIFT %d\n", _PFN_SHIFT);
+	pr_debug("\n");
+}
+
+void setup_tlb_handler(void)
+{
+	static int run_once = 0;
+
+	setup_ptwalker();
+	output_pgtable_bits_defines();
+
+	/* The TLB handlers are installed only once */
+	if (!run_once) {
+		memcpy((void *)tlbrentry, handle_tlb_refill, 0x80);
+		local_flush_icache_range(tlbrentry, tlbrentry + 0x80);
+		set_handler(EXCCODE_TLBI * VECSIZE, handle_tlb_load, VECSIZE);
+		set_handler(EXCCODE_TLBL * VECSIZE, handle_tlb_load, VECSIZE);
+		set_handler(EXCCODE_TLBS * VECSIZE, handle_tlb_store, VECSIZE);
+		set_handler(EXCCODE_TLBM * VECSIZE, handle_tlb_modify, VECSIZE);
+		set_handler(EXCCODE_TLBNR * VECSIZE, handle_tlb_protect, VECSIZE);
+		set_handler(EXCCODE_TLBNX * VECSIZE, handle_tlb_protect, VECSIZE);
+		set_handler(EXCCODE_TLBPE * VECSIZE, handle_tlb_protect, VECSIZE);
+		run_once++;
+	}
+}
+
+void tlb_init(void)
+{
+	write_csr_pagesize(PS_DEFAULT_SIZE);
+	write_csr_stlbpgsize(PS_DEFAULT_SIZE);
+	write_csr_tlbrefill_pagesize(PS_DEFAULT_SIZE);
+	setup_tlb_handler();
+	local_flush_tlb_all();
+}
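
For illustration only (not part of the patch): setup_ptwalker() packs the
start bit and index width of each page-table level into the PWCTL0 and PWCTL1
values using fixed shifts. The sketch below reproduces only that packing,
with the shifts copied from the code above. The sample numbers used in the
example are assumptions (16KB pages, three levels, 8-byte entries, hence
PAGE_SHIFT - 3 = 11 index bits per level). The struct and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Start bit ("_i" in the patch) and index width ("_w") of one level. */
struct demo_level {
	uint64_t base;
	uint64_t width;
};

/* Same field layout as the pwctl0 expression in setup_ptwalker(). */
static uint64_t demo_pwctl0(struct demo_level pte, struct demo_level pmd,
			    struct demo_level pud)
{
	return pte.base | pte.width << 5 |
	       pmd.base << 10 | pmd.width << 15 |
	       pud.base << 20 | pud.width << 25;
}

/* Same field layout as the pwctl1 expression in setup_ptwalker(). */
static uint64_t demo_pwctl1(struct demo_level pgd)
{
	return pgd.base | pgd.width << 6;
}

/* Usage with the assumed three-level, 16KB-page layout. */
static void demo_pwctl_examples(void)
{
	struct demo_level pte = { 14, 11 };
	struct demo_level pmd = { 25, 11 };
	struct demo_level pud = { 0, 0 };	/* level folded away */
	struct demo_level pgd = { 36, 11 };

	assert(demo_pwctl0(pte, pmd, pud) == 0x5E56E);
	assert(demo_pwctl1(pgd) == 0x2E4);
}
```

A folded level simply contributes zero to both of its fields, which matches
the way the patch leaves pud_i and pud_w at 0 when CONFIG_PGTABLE_LEVELS is
not greater than 3.
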
diff --git a/arch/loongarch/mm/tlbex.S b/arch/loongarch/mm/tlbex.S
new file mode 100644
index 000000000000..a4ca4e507ee8
--- /dev/null
+++ b/arch/loongarch/mm/tlbex.S
@@ -0,0 +1,477 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <asm/asm.h>
+#include <asm/export.h>
+#include <asm/loongarch.h>
+#include <asm/page.h>
+#include <asm/pgtable.h>
+#include <asm/regdef.h>
+#include <asm/stackframe.h>
+
+	.macro tlb_do_page_fault, write
+	SYM_FUNC_START(tlb_do_page_fault_\write)
+	SAVE_ALL
+	csrrd	a2, LOONGARCH_CSR_BADV
+	move	a0, sp
+	REG_S	a2, sp, PT_BVADDR
+	li.w	a1, \write
+	la.abs	t0, do_page_fault
+	jirl    ra, t0, 0
+	RESTORE_ALL_AND_RET
+	SYM_FUNC_END(tlb_do_page_fault_\write)
+	.endm
+
+	tlb_do_page_fault 0
+	tlb_do_page_fault 1
+
+SYM_FUNC_START(handle_tlb_protect)
+	BACKUP_T0T1
+	SAVE_ALL
+	move	a0, sp
+	move	a1, zero
+	csrrd	a2, LOONGARCH_CSR_BADV
+	REG_S	a2, sp, PT_BVADDR
+	la.abs	t0, do_page_fault
+	jirl    ra, t0, 0
+	RESTORE_ALL_AND_RET
+SYM_FUNC_END(handle_tlb_protect)
+
+SYM_FUNC_START(handle_tlb_load)
+	csrwr	t0, EXCEPTION_KS0
+	csrwr	t1, EXCEPTION_KS1
+	csrwr	ra, EXCEPTION_KS2
+
+	/*
+	 * The vmalloc handling is not in the hotpath.
+	 */
+	csrrd	t0, LOONGARCH_CSR_BADV
+	blt	t0, $r0, vmalloc_load
+	csrrd	t1, LOONGARCH_CSR_PGDL
+
+vmalloc_done_load:
+	/* Get PGD offset in bytes */
+	srli.d	t0, t0, PGDIR_SHIFT
+	andi	t0, t0, (PTRS_PER_PGD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#if CONFIG_PGTABLE_LEVELS > 3
+	csrrd	t0, LOONGARCH_CSR_BADV
+	ld.d	t1, t1, 0
+	srli.d	t0, t0, PUD_SHIFT
+	andi	t0, t0, (PTRS_PER_PUD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#endif
+#if CONFIG_PGTABLE_LEVELS > 2
+	csrrd	t0, LOONGARCH_CSR_BADV
+	ld.d	t1, t1, 0
+	srli.d	t0, t0, PMD_SHIFT
+	andi	t0, t0, (PTRS_PER_PMD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#endif
+	ld.d	ra, t1, 0
+
+	/*
+	 * For huge tlb entries, pmde doesn't contain an address but
+	 * instead contains the tlb pte. Check the PAGE_HUGE bit and
+	 * see if we need to jump to huge tlb processing.
+	 */
+	andi	t0, ra, _PAGE_HUGE
+	bne	t0, $r0, tlb_huge_update_load
+
+	csrrd	t0, LOONGARCH_CSR_BADV
+	srli.d	t0, t0, (PAGE_SHIFT + PTE_ORDER)
+	andi	t0, t0, (PTRS_PER_PTE - 1)
+	slli.d	t0, t0, _PTE_T_LOG2
+	add.d	t1, ra, t0
+
+	ld.d	t0, t1, 0
+	tlbsrch
+
+	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
+	andi	ra, ra, 1
+	beq	ra, $r0, nopage_tlb_load
+
+	ori	t0, t0, _PAGE_VALID
+	st.d	t0, t1, 0
+	ori	t1, t1, 8
+	xori	t1, t1, 8
+	ld.d	t0, t1, 0
+	ld.d	t1, t1, 8
+	csrwr	t0, LOONGARCH_CSR_TLBELO0
+	csrwr	t1, LOONGARCH_CSR_TLBELO1
+	tlbwr
+leave_load:
+	csrrd	t0, EXCEPTION_KS0
+	csrrd	t1, EXCEPTION_KS1
+	csrrd	ra, EXCEPTION_KS2
+	ertn
+#ifdef CONFIG_64BIT
+vmalloc_load:
+	la.abs	t1, swapper_pg_dir
+	b	vmalloc_done_load
+#endif
+
+	/*
+	 * This is the entry point taken when the page-table walk above
+	 * finds a huge page.
+	 */
+tlb_huge_update_load:
+	ld.d	t0, t1, 0
+	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
+	andi	ra, ra, 1
+	beq	ra, $r0, nopage_tlb_load
+	tlbsrch
+
+	ori	t0, t0, _PAGE_VALID
+	st.d	t0, t1, 0
+	addu16i.d	t1, $r0, -(CSR_TLBIDX_EHINV >> 16)
+	addi.d	ra, t1, 0
+	csrxchg	ra, t1, LOONGARCH_CSR_TLBIDX
+	tlbwr
+
+	csrxchg	$r0, t1, LOONGARCH_CSR_TLBIDX
+
+	/*
+	 * A huge PTE describes an area the size of the
+	 * configured huge page size. This is twice the
+	 * size of the large TLB entry we intend to use.
+	 * A TLB entry half the size of the configured
+	 * huge page size is configured into entrylo0
+	 * and entrylo1 to cover the contiguous huge PTE
+	 * address space.
+	 */
+	/* Huge page: Move Global bit */
+	xori	t0, t0, _PAGE_HUGE
+	lu12i.w	t1, _PAGE_HGLOBAL >> 12
+	and	t1, t0, t1
+	srli.d	t1, t1, (_PAGE_HGLOBAL_SHIFT - _PAGE_GLOBAL_SHIFT)
+	or	t0, t0, t1
+
+	addi.d	ra, t0, 0
+	csrwr	t0, LOONGARCH_CSR_TLBELO0
+	addi.d	t0, ra, 0
+
+	/* Convert to entrylo1 */
+	addi.d	t1, $r0, 1
+	slli.d	t1, t1, (HPAGE_SHIFT - 1)
+	add.d	t0, t0, t1
+	csrwr	t0, LOONGARCH_CSR_TLBELO1
+
+	/* Set huge page tlb entry size */
+	addu16i.d	t0, $r0, (PS_MASK >> 16)
+	addu16i.d	t1, $r0, (PS_HUGE_SIZE << (PS_SHIFT - 16))
+	csrxchg		t1, t0, LOONGARCH_CSR_TLBIDX
+
+	tlbfill
+
+	addu16i.d	t0, $r0, (PS_MASK >> 16)
+	addu16i.d	t1, $r0, (PS_DEFAULT_SIZE << (PS_SHIFT - 16))
+	csrxchg		t1, t0, LOONGARCH_CSR_TLBIDX
+
+nopage_tlb_load:
+	csrrd	ra, EXCEPTION_KS2
+	la.abs	t0, tlb_do_page_fault_0
+	jirl	$r0, t0, 0
+SYM_FUNC_END(handle_tlb_load)
+
+SYM_FUNC_START(handle_tlb_store)
+	csrwr	t0, EXCEPTION_KS0
+	csrwr	t1, EXCEPTION_KS1
+	csrwr	ra, EXCEPTION_KS2
+
+	/*
+	 * The vmalloc handling is not in the hotpath.
+	 */
+	csrrd	t0, LOONGARCH_CSR_BADV
+	blt	t0, $r0, vmalloc_store
+	csrrd	t1, LOONGARCH_CSR_PGDL
+
+vmalloc_done_store:
+	/* Get PGD offset in bytes */
+	srli.d	t0, t0, PGDIR_SHIFT
+	andi	t0, t0, (PTRS_PER_PGD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+
+#if CONFIG_PGTABLE_LEVELS > 3
+	csrrd	t0, LOONGARCH_CSR_BADV
+	ld.d	t1, t1, 0
+	srli.d	t0, t0, PUD_SHIFT
+	andi	t0, t0, (PTRS_PER_PUD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#endif
+#if CONFIG_PGTABLE_LEVELS > 2
+	csrrd	t0, LOONGARCH_CSR_BADV
+	ld.d	t1, t1, 0
+	srli.d	t0, t0, PMD_SHIFT
+	andi	t0, t0, (PTRS_PER_PMD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#endif
+	ld.d	ra, t1, 0
+
+	/*
+	 * For huge tlb entries, pmde doesn't contain an address but
+	 * instead contains the tlb pte. Check the PAGE_HUGE bit and
+	 * see if we need to jump to huge tlb processing.
+	 */
+	andi	t0, ra, _PAGE_HUGE
+	bne	t0, $r0, tlb_huge_update_store
+
+	csrrd	t0, LOONGARCH_CSR_BADV
+	srli.d	t0, t0, (PAGE_SHIFT + PTE_ORDER)
+	andi	t0, t0, (PTRS_PER_PTE - 1)
+	slli.d	t0, t0, _PTE_T_LOG2
+	add.d	t1, ra, t0
+
+	ld.d	t0, t1, 0
+	tlbsrch
+
+	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
+	andi	ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
+	xori	ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
+	bne	ra, $r0, nopage_tlb_store
+
+	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
+	st.d	t0, t1, 0
+
+	ori	t1, t1, 8
+	xori	t1, t1, 8
+	ld.d	t0, t1, 0
+	ld.d	t1, t1, 8
+	csrwr	t0, LOONGARCH_CSR_TLBELO0
+	csrwr	t1, LOONGARCH_CSR_TLBELO1
+	tlbwr
+leave_store:
+	csrrd	t0, EXCEPTION_KS0
+	csrrd	t1, EXCEPTION_KS1
+	csrrd	ra, EXCEPTION_KS2
+	ertn
+#ifdef CONFIG_64BIT
+vmalloc_store:
+	la.abs	t1, swapper_pg_dir
+	b	vmalloc_done_store
+#endif
+
+	/*
+	 * This is the entry point when build_tlbchange_handler_head
+	 * spots a huge page.
+	 */
+tlb_huge_update_store:
+	ld.d	t0, t1, 0
+	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
+	andi	ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
+	xori	ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
+	bne	ra, $r0, nopage_tlb_store
+
+	tlbsrch
+	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
+
+	st.d	t0, t1, 0
+	addu16i.d	t1, $r0, -(CSR_TLBIDX_EHINV >> 16)
+	addi.d	ra, t1, 0
+	csrxchg	ra, t1, LOONGARCH_CSR_TLBIDX
+	tlbwr
+
+	csrxchg	$r0, t1, LOONGARCH_CSR_TLBIDX
+	/*
+	 * A huge PTE describes an area the size of the
+	 * configured huge page size. This is twice the
+	 * size of the large TLB entry we intend to use.
+	 * A TLB entry half the size of the configured
+	 * huge page size is configured into entrylo0
+	 * and entrylo1 to cover the contiguous huge PTE
+	 * address space.
+	 */
+	/* Huge page: Move Global bit */
+	xori	t0, t0, _PAGE_HUGE
+	lu12i.w	t1, _PAGE_HGLOBAL >> 12
+	and	t1, t0, t1
+	srli.d	t1, t1, (_PAGE_HGLOBAL_SHIFT - _PAGE_GLOBAL_SHIFT)
+	or	t0, t0, t1
+
+	addi.d	ra, t0, 0
+	csrwr	t0, LOONGARCH_CSR_TLBELO0
+	addi.d	t0, ra, 0
+
+	/* Convert to entrylo1 */
+	addi.d	t1, $r0, 1
+	slli.d	t1, t1, (HPAGE_SHIFT - 1)
+	add.d	t0, t0, t1
+	csrwr	t0, LOONGARCH_CSR_TLBELO1
+
+	/* Set huge page tlb entry size */
+	addu16i.d	t0, $r0, (PS_MASK >> 16)
+	addu16i.d	t1, $r0, (PS_HUGE_SIZE << (PS_SHIFT - 16))
+	csrxchg		t1, t0, LOONGARCH_CSR_TLBIDX
+
+	tlbfill
+
+	/* Reset default page size */
+	addu16i.d	t0, $r0, (PS_MASK >> 16)
+	addu16i.d	t1, $r0, (PS_DEFAULT_SIZE << (PS_SHIFT - 16))
+	csrxchg		t1, t0, LOONGARCH_CSR_TLBIDX
+
+nopage_tlb_store:
+	csrrd	ra, EXCEPTION_KS2
+	la.abs	t0, tlb_do_page_fault_1
+	jirl	$r0, t0, 0
+SYM_FUNC_END(handle_tlb_store)
+
+SYM_FUNC_START(handle_tlb_modify)
+	csrwr	t0, EXCEPTION_KS0
+	csrwr	t1, EXCEPTION_KS1
+	csrwr	ra, EXCEPTION_KS2
+
+	/*
+	 * The vmalloc handling is not in the hotpath.
+	 */
+	csrrd	t0, LOONGARCH_CSR_BADV
+	blt	t0, $r0, vmalloc_modify
+	csrrd	t1, LOONGARCH_CSR_PGDL
+
+vmalloc_done_modify:
+	/* Get PGD offset in bytes */
+	srli.d	t0, t0, PGDIR_SHIFT
+	andi	t0, t0, (PTRS_PER_PGD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#if CONFIG_PGTABLE_LEVELS > 3
+	csrrd	t0, LOONGARCH_CSR_BADV
+	ld.d	t1, t1, 0
+	srli.d	t0, t0, PUD_SHIFT
+	andi	t0, t0, (PTRS_PER_PUD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#endif
+#if CONFIG_PGTABLE_LEVELS > 2
+	csrrd	t0, LOONGARCH_CSR_BADV
+	ld.d	t1, t1, 0
+	srli.d	t0, t0, PMD_SHIFT
+	andi	t0, t0, (PTRS_PER_PMD - 1)
+	slli.d	t0, t0, 3
+	add.d	t1, t1, t0
+#endif
+	ld.d	ra, t1, 0
+
+	/*
+	 * For huge tlb entries, pmde doesn't contain an address but
+	 * instead contains the tlb pte. Check the PAGE_HUGE bit and
+	 * see if we need to jump to huge tlb processing.
+	 */
+	andi	t0, ra, _PAGE_HUGE
+	bne	t0, $r0, tlb_huge_update_modify
+
+	csrrd	t0, LOONGARCH_CSR_BADV
+	srli.d	t0, t0, (PAGE_SHIFT + PTE_ORDER)
+	andi	t0, t0, (PTRS_PER_PTE - 1)
+	slli.d	t0, t0, _PTE_T_LOG2
+	add.d	t1, ra, t0
+
+	ld.d	t0, t1, 0
+	tlbsrch
+
+	srli.d	ra, t0, _PAGE_WRITE_SHIFT
+	andi	ra, ra, 1
+	beq	ra, $r0, nopage_tlb_modify
+
+	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
+	st.d	t0, t1, 0
+	ori	t1, t1, 8
+	xori	t1, t1, 8
+	ld.d	t0, t1, 0
+	ld.d	t1, t1, 8
+	csrwr	t0, LOONGARCH_CSR_TLBELO0
+	csrwr	t1, LOONGARCH_CSR_TLBELO1
+	tlbwr
+leave_modify:
+	csrrd	t0, EXCEPTION_KS0
+	csrrd	t1, EXCEPTION_KS1
+	csrrd	ra, EXCEPTION_KS2
+	ertn
+#ifdef CONFIG_64BIT
+vmalloc_modify:
+	la.abs  t1, swapper_pg_dir
+	b	vmalloc_done_modify
+#endif
+
+	/*
+	 * This is the entry point when
+	 * build_tlbchange_handler_head spots a huge page.
+	 */
+tlb_huge_update_modify:
+	ld.d	t0, t1, 0
+
+	srli.d	ra, t0, _PAGE_WRITE_SHIFT
+	andi	ra, ra, 1
+	beq	ra, $r0, nopage_tlb_modify
+
+	tlbsrch
+	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
+
+	st.d	t0, t1, 0
+	/*
+	 * A huge PTE describes an area the size of the
+	 * configured huge page size. This is twice the
+	 * size of the large TLB entry we intend to use.
+	 * A TLB entry half the size of the configured
+	 * huge page size is configured into entrylo0
+	 * and entrylo1 to cover the contiguous huge PTE
+	 * address space.
+	 */
+	/* Huge page: Move Global bit */
+	xori	t0, t0, _PAGE_HUGE
+	lu12i.w	t1, _PAGE_HGLOBAL >> 12
+	and	t1, t0, t1
+	srli.d	t1, t1, (_PAGE_HGLOBAL_SHIFT - _PAGE_GLOBAL_SHIFT)
+	or	t0, t0, t1
+
+	addi.d	ra, t0, 0
+	csrwr	t0, LOONGARCH_CSR_TLBELO0
+	addi.d	t0, ra, 0
+
+	/* Convert to entrylo1 */
+	addi.d	t1, $r0, 1
+	slli.d	t1, t1, (HPAGE_SHIFT - 1)
+	add.d	t0, t0, t1
+	csrwr	t0, LOONGARCH_CSR_TLBELO1
+
+	/* Set huge page tlb entry size */
+	addu16i.d	t0, $r0, (PS_MASK >> 16)
+	addu16i.d	t1, $r0, (PS_HUGE_SIZE << (PS_SHIFT - 16))
+	csrxchg	t1, t0, LOONGARCH_CSR_TLBIDX
+
+	tlbwr
+
+	/* Reset default page size */
+	addu16i.d	t0, $r0, (PS_MASK >> 16)
+	addu16i.d	t1, $r0, (PS_DEFAULT_SIZE << (PS_SHIFT - 16))
+	csrxchg	t1, t0, LOONGARCH_CSR_TLBIDX
+
+nopage_tlb_modify:
+	csrrd	ra, EXCEPTION_KS2
+	la.abs	t0, tlb_do_page_fault_1
+	jirl	$r0, t0, 0
+SYM_FUNC_END(handle_tlb_modify)
+
+SYM_FUNC_START(handle_tlb_refill)
+	csrwr	t0, LOONGARCH_CSR_TLBRSAVE
+	csrrd	t0, LOONGARCH_CSR_PGD
+	lddir	t0, t0, 3
+#if CONFIG_PGTABLE_LEVELS > 3
+	lddir	t0, t0, 2
+#endif
+#if CONFIG_PGTABLE_LEVELS > 2
+	lddir	t0, t0, 1
+#endif
+	ldpte	t0, 0
+	ldpte	t0, 1
+	tlbfill
+	csrrd	t0, LOONGARCH_CSR_TLBRSAVE
+	ertn
+SYM_FUNC_END(handle_tlb_refill)
-- 
2.27.0
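
For readers following the software TLB handlers in the patch above, two idioms recur in handle_tlb_load/store/modify: the per-level table offset (srli.d, andi, slli.d) and the store permission check (srli.d, andi, xori, bne). The C model below is a minimal sketch and not part of the patch. The EX_* constants assume 16 KiB pages, 8-byte table entries and three levels, and the bit positions picked for the present and write flags are illustrative only; the real values come from the LoongArch headers. All ex_* names are hypothetical.

```c
#include <stdint.h>

/* Illustrative constants only (assumed: 16 KiB pages, 8-byte entries). */
#define EX_PAGE_SHIFT	14
#define EX_PTRS_SHIFT	(EX_PAGE_SHIFT - 3)	/* 2048 entries per table */
#define EX_PTRS_PER_TBL	(UINT64_C(1) << EX_PTRS_SHIFT)
#define EX_PMD_SHIFT	(EX_PAGE_SHIFT + EX_PTRS_SHIFT)
#define EX_PGDIR_SHIFT	(EX_PMD_SHIFT + EX_PTRS_SHIFT)

/* Assumed flag positions, chosen for illustration. */
#define EX_PAGE_PRESENT	(UINT64_C(1) << 7)
#define EX_PAGE_WRITE	(UINT64_C(1) << 8)

/*
 * Byte offset of the entry covering addr in a table whose index field
 * starts at bit `shift`; mirrors the srli.d / andi / slli.d sequence.
 */
static inline uint64_t ex_table_offset(uint64_t addr, unsigned int shift)
{
	return ((addr >> shift) & (EX_PTRS_PER_TBL - 1)) << 3;
}

/*
 * Nonzero when a store has to take the slow path. Masking with
 * PRESENT|WRITE and then XORing with the same mask leaves a nonzero
 * value whenever at least one of the two bits is missing.
 */
static inline uint64_t ex_store_needs_fault(uint64_t pte)
{
	const uint64_t need = EX_PAGE_PRESENT | EX_PAGE_WRITE;

	return (pte & need) ^ need;
}
```

With these assumed shifts, an address whose PGD, PMD and PTE indices are 5, 7 and 9 yields byte offsets 40, 56 and 72 at the three levels.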



* [PATCH V9 13/24] LoongArch: Add system call support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (11 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 12/24] LoongArch: Add memory management Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:44   ` Arnd Bergmann
  2022-04-30  9:05 ` [PATCH V9 14/24] LoongArch: Add signal handling support Huacai Chen
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds system call support and related uaccess.h for LoongArch.
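
As a rough illustration of the convention this patch implements (syscall number in a7, arguments in a0-a5, result returned in a0), here is a user-space model of the dispatch step in do_syscall(). It is a sketch under assumptions, not kernel code: struct ex_regs, ex_table and the ex_* functions are hypothetical stand-ins, and the entry/exit hooks and the restart marker in regs[0] are left out.

```c
#include <errno.h>

/* Hypothetical stand-in for the pt_regs fields do_syscall() touches. */
struct ex_regs {
	unsigned long regs[32];	/* regs[4..11] model a0..a7 */
	unsigned long orig_a0;
	unsigned long csr_era;
};

typedef long (*ex_sys_fn)(unsigned long, unsigned long, unsigned long,
			  unsigned long, unsigned long, unsigned long);

/* Placeholder for unimplemented slots, like sys_ni_syscall. */
static long ex_sys_ni(unsigned long a, unsigned long b, unsigned long c,
		      unsigned long d, unsigned long e, unsigned long f)
{
	(void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
	return -ENOSYS;
}

/* Toy handler used only for this sketch. */
static long ex_sys_add(unsigned long a, unsigned long b, unsigned long c,
		       unsigned long d, unsigned long e, unsigned long f)
{
	(void)c; (void)d; (void)e; (void)f;
	return (long)(a + b);
}

#define EX_NR_SYSCALLS 2
static const ex_sys_fn ex_table[EX_NR_SYSCALLS] = { ex_sys_ni, ex_sys_add };

static void ex_do_syscall(struct ex_regs *r)
{
	unsigned long nr = r->regs[11];		/* number arrives in a7 */

	r->csr_era += 4;			/* step past the syscall insn */
	r->orig_a0 = r->regs[4];		/* keep a0 for rollback */
	r->regs[4] = (unsigned long)-ENOSYS;	/* default result */

	if (nr < EX_NR_SYSCALLS)
		r->regs[4] = (unsigned long)ex_table[nr](r->orig_a0,
				r->regs[5], r->regs[6], r->regs[7],
				r->regs[8], r->regs[9]);
}
```

Saving a0 in orig_a0 before overwriting it with -ENOSYS is what lets syscall_rollback() and syscall_get_arguments() recover the first argument later.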

Q: Why keep the __ARCH_WANT_NEW_STAT definition when statx exists?
A: As of the latest glibc release (2.34), statx is only used on 32-bit
   platforms and on 64-bit platforms with 32-bit timestamps. That is,
   most 64-bit platforms still use newstat.

Q: Why keep the __ARCH_WANT_SYS_CLONE definition when clone3 exists?
A: The latest glibc release (2.34) has some basic support for clone3, but
   it isn't complete. E.g., pthread_create() and spawni() have been
   converted to use clone3, but fork() still uses clone. Moreover, some
   seccomp-related applications still do not work properly with clone3.
   E.g., the Chromium sandbox cannot work at all and there is no solution
   for it, which is worse than the fork() story [1].

[1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/include/asm/seccomp.h     |  20 ++
 arch/loongarch/include/asm/syscall.h     |  74 +++++++
 arch/loongarch/include/asm/uaccess.h     | 270 +++++++++++++++++++++++
 arch/loongarch/include/asm/unistd.h      |  11 +
 arch/loongarch/include/uapi/asm/unistd.h |   6 +
 arch/loongarch/kernel/entry.S            |  89 ++++++++
 arch/loongarch/kernel/syscall.c          |  63 ++++++
 7 files changed, 533 insertions(+)
 create mode 100644 arch/loongarch/include/asm/seccomp.h
 create mode 100644 arch/loongarch/include/asm/syscall.h
 create mode 100644 arch/loongarch/include/asm/uaccess.h
 create mode 100644 arch/loongarch/include/asm/unistd.h
 create mode 100644 arch/loongarch/include/uapi/asm/unistd.h
 create mode 100644 arch/loongarch/kernel/entry.S
 create mode 100644 arch/loongarch/kernel/syscall.c

diff --git a/arch/loongarch/include/asm/seccomp.h b/arch/loongarch/include/asm/seccomp.h
new file mode 100644
index 000000000000..31d6ab42e43e
--- /dev/null
+++ b/arch/loongarch/include/asm/seccomp.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_SECCOMP_H
+#define _ASM_SECCOMP_H
+
+#include <asm/unistd.h>
+
+#include <asm-generic/seccomp.h>
+
+#ifdef CONFIG_32BIT
+# define SECCOMP_ARCH_NATIVE		AUDIT_ARCH_LOONGARCH32
+# define SECCOMP_ARCH_NATIVE_NR		NR_syscalls
+# define SECCOMP_ARCH_NATIVE_NAME	"loongarch32"
+#else
+# define SECCOMP_ARCH_NATIVE		AUDIT_ARCH_LOONGARCH64
+# define SECCOMP_ARCH_NATIVE_NR		NR_syscalls
+# define SECCOMP_ARCH_NATIVE_NAME	"loongarch64"
+#endif
+
+#endif /* _ASM_SECCOMP_H */
diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
new file mode 100644
index 000000000000..e286dc58476e
--- /dev/null
+++ b/arch/loongarch/include/asm/syscall.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASM_LOONGARCH_SYSCALL_H
+#define __ASM_LOONGARCH_SYSCALL_H
+
+#include <linux/compiler.h>
+#include <uapi/linux/audit.h>
+#include <linux/elf-em.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/uaccess.h>
+#include <asm/ptrace.h>
+#include <asm/unistd.h>
+
+extern void *sys_call_table[];
+
+static inline long syscall_get_nr(struct task_struct *task,
+				  struct pt_regs *regs)
+{
+	return regs->regs[11];
+}
+
+static inline void syscall_rollback(struct task_struct *task,
+				    struct pt_regs *regs)
+{
+	regs->regs[4] = regs->orig_a0;
+}
+
+static inline long syscall_get_error(struct task_struct *task,
+				     struct pt_regs *regs)
+{
+	unsigned long error = regs->regs[4];
+
+	return IS_ERR_VALUE(error) ? error : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+					    struct pt_regs *regs)
+{
+	return regs->regs[4];
+}
+
+static inline void syscall_set_return_value(struct task_struct *task,
+					    struct pt_regs *regs,
+					    int error, long val)
+{
+	regs->regs[4] = (long) error ? error : val;
+}
+
+static inline void syscall_get_arguments(struct task_struct *task,
+					 struct pt_regs *regs,
+					 unsigned long *args)
+{
+	args[0] = regs->orig_a0;
+	memcpy(&args[1], &regs->regs[5], 5 * sizeof(long));
+}
+
+static inline int syscall_get_arch(struct task_struct *task)
+{
+	return AUDIT_ARCH_LOONGARCH64;
+}
+
+static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
+{
+	return false;
+}
+
+#endif	/* __ASM_LOONGARCH_SYSCALL_H */
diff --git a/arch/loongarch/include/asm/uaccess.h b/arch/loongarch/include/asm/uaccess.h
new file mode 100644
index 000000000000..d61c8ecfbbce
--- /dev/null
+++ b/arch/loongarch/include/asm/uaccess.h
@@ -0,0 +1,270 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1996, 1997, 1998, 1999, 2000, 03, 04 by Ralf Baechle
+ * Copyright (C) 1999, 2000 Silicon Graphics, Inc.
+ * Copyright (C) 2007  Maciej W. Rozycki
+ * Copyright (C) 2014, Imagination Technologies Ltd.
+ */
+#ifndef _ASM_UACCESS_H
+#define _ASM_UACCESS_H
+
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/extable.h>
+#include <asm/pgtable.h>
+#include <asm-generic/extable.h>
+#include <asm-generic/access_ok.h>
+
+extern u64 __ua_limit;
+
+#define __UA_ADDR	".dword"
+#define __UA_ADDU	"add.d"
+#define __UA_LA		"la.abs"
+#define __UA_LIMIT	__ua_limit
+
+/*
+ * get_user: - Get a simple variable from user space.
+ * @x:	 Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space.  It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+#define get_user(x, ptr) \
+({									\
+	const __typeof__(*(ptr)) __user *__p = (ptr);			\
+									\
+	might_fault();							\
+	access_ok(__p, sizeof(*__p)) ? __get_user((x), __p) :		\
+				       ((x) = 0, -EFAULT);		\
+})
+
+/*
+ * put_user: - Write a simple value into user space.
+ * @x:	 Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space.  It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+#define put_user(x, ptr) \
+({									\
+	__typeof__(*(ptr)) __user *__p = (ptr);				\
+									\
+	might_fault();							\
+	access_ok(__p, sizeof(*__p)) ? __put_user((x), __p) : -EFAULT;	\
+})
+
+/*
+ * __get_user: - Get a simple variable from user space, with less checking.
+ * @x:	 Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space.  It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+#define __get_user(x, ptr) \
+({									\
+	int __gu_err = 0;						\
+									\
+	__chk_user_ptr(ptr);						\
+	__get_user_common((x), sizeof(*(ptr)), ptr);			\
+	__gu_err;							\
+})
+
+/*
+ * __put_user: - Write a simple value into user space, with less checking.
+ * @x:	 Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space.  It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+#define __put_user(x, ptr) \
+({									\
+	int __pu_err = 0;						\
+	__typeof__(*(ptr)) __pu_val;					\
+									\
+	__pu_val = (x);							\
+	__chk_user_ptr(ptr);						\
+	__put_user_common(ptr, sizeof(*(ptr)));				\
+	__pu_err;							\
+})
+
+struct __large_struct { unsigned long buf[100]; };
+#define __m(x) (*(struct __large_struct __user *)(x))
+
+#define __get_user_common(val, size, ptr)				\
+do {									\
+	switch (size) {							\
+	case 1: __get_data_asm(val, "ld.b", ptr); break;		\
+	case 2: __get_data_asm(val, "ld.h", ptr); break;		\
+	case 4: __get_data_asm(val, "ld.w", ptr); break;		\
+	case 8: __get_data_asm(val, "ld.d", ptr); break;		\
+	default: BUILD_BUG(); break;					\
+	}								\
+} while (0)
+
+#define __get_kernel_common(val, size, ptr) __get_user_common(val, size, ptr)
+
+#define __get_data_asm(val, insn, ptr)					\
+{									\
+	long __gu_tmp;							\
+									\
+	__asm__ __volatile__(						\
+	"1:	" insn "	%1, %2				\n"	\
+	"2:							\n"	\
+	"	.section .fixup,\"ax\"				\n"	\
+	"3:	li.w	%0, %3					\n"	\
+	"	or	%1, $r0, $r0				\n"	\
+	"	b	2b					\n"	\
+	"	.previous					\n"	\
+	"	.section __ex_table,\"a\"			\n"	\
+	"	"__UA_ADDR "\t1b, 3b				\n"	\
+	"	.previous					\n"	\
+	: "+r" (__gu_err), "=r" (__gu_tmp)				\
+	: "m" (__m(ptr)), "i" (-EFAULT));				\
+									\
+	(val) = (__typeof__(*(ptr))) __gu_tmp;				\
+}
+
+#define __put_user_common(ptr, size)					\
+do {									\
+	switch (size) {							\
+	case 1: __put_data_asm("st.b", ptr); break;			\
+	case 2: __put_data_asm("st.h", ptr); break;			\
+	case 4: __put_data_asm("st.w", ptr); break;			\
+	case 8: __put_data_asm("st.d", ptr); break;			\
+	default: BUILD_BUG(); break;					\
+	}								\
+} while (0)
+
+#define __put_kernel_common(ptr, size) __put_user_common(ptr, size)
+
+#define __put_data_asm(insn, ptr)					\
+{									\
+	__asm__ __volatile__(						\
+	"1:	" insn "	%z2, %1		# __put_user_asm\n"	\
+	"2:							\n"	\
+	"	.section	.fixup,\"ax\"			\n"	\
+	"3:	li.w	%0, %3					\n"	\
+	"	b	2b					\n"	\
+	"	.previous					\n"	\
+	"	.section	__ex_table,\"a\"		\n"	\
+	"	" __UA_ADDR "	1b, 3b				\n"	\
+	"	.previous					\n"	\
+	: "+r" (__pu_err), "=m" (__m(ptr))				\
+	: "Jr" (__pu_val), "i" (-EFAULT));				\
+}
+
+#define __get_kernel_nofault(dst, src, type, err_label)			\
+do {									\
+	int __gu_err = 0;						\
+									\
+	__get_kernel_common(*((type *)(dst)), sizeof(type),		\
+			    (__force type *)(src));			\
+	if (unlikely(__gu_err))						\
+		goto err_label;						\
+} while (0)
+
+#define __put_kernel_nofault(dst, src, type, err_label)			\
+do {									\
+	type __pu_val;							\
+	int __pu_err = 0;						\
+									\
+	__pu_val = *(__force type *)(src);				\
+	__put_kernel_common(((type *)(dst)), sizeof(type));		\
+	if (unlikely(__pu_err))						\
+		goto err_label;						\
+} while (0)
+
+extern unsigned long __copy_user(void *to, const void *from, __kernel_size_t n);
+
+static inline unsigned long __must_check
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	return __copy_user(to, from, n);
+}
+
+static inline unsigned long __must_check
+raw_copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	return __copy_user(to, from, n);
+}
+
+#define INLINE_COPY_FROM_USER
+#define INLINE_COPY_TO_USER
+
+/*
+ * __clear_user: - Zero a block of memory in user space, with less checking.
+ * @addr: Destination address, in user space.
+ * @size: Number of bytes to zero.
+ *
+ * Zero a block of memory in user space.  Caller must check
+ * the specified block with access_ok() before calling this function.
+ *
+ * Returns number of bytes that could not be cleared.
+ * On success, this will be zero.
+ */
+extern unsigned long __clear_user(void __user *addr, __kernel_size_t size);
+
+#define clear_user(addr, n)						\
+({									\
+	void __user *__cl_addr = (addr);				\
+	unsigned long __cl_size = (n);					\
+	if (__cl_size && access_ok(__cl_addr, __cl_size))		\
+		__cl_size = __clear_user(__cl_addr, __cl_size);		\
+	__cl_size;							\
+})
+
+extern long strncpy_from_user(char *to, const char __user *from, long n);
+extern long strnlen_user(const char __user *str, long n);
+
+#endif /* _ASM_UACCESS_H */
diff --git a/arch/loongarch/include/asm/unistd.h b/arch/loongarch/include/asm/unistd.h
new file mode 100644
index 000000000000..cfddb0116a8c
--- /dev/null
+++ b/arch/loongarch/include/asm/unistd.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <uapi/asm/unistd.h>
+
+#define NR_syscalls (__NR_syscalls)
diff --git a/arch/loongarch/include/uapi/asm/unistd.h b/arch/loongarch/include/uapi/asm/unistd.h
new file mode 100644
index 000000000000..b344b1f91715
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/unistd.h
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#define __ARCH_WANT_NEW_STAT
+#define __ARCH_WANT_SYS_CLONE
+#define __ARCH_WANT_SYS_CLONE3
+
+#include <asm-generic/unistd.h>
diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
new file mode 100644
index 000000000000..d5b3dbcf5425
--- /dev/null
+++ b/arch/loongarch/kernel/entry.S
@@ -0,0 +1,89 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1994 - 2000, 2001, 2003 Ralf Baechle
+ * Copyright (C) 1999, 2000 Silicon Graphics, Inc.
+ * Copyright (C) 2001 MIPS Technologies, Inc.
+ */
+
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/loongarch.h>
+#include <asm/regdef.h>
+#include <asm/stackframe.h>
+#include <asm/thread_info.h>
+
+	.text
+	.cfi_sections	.debug_frame
+	.align	5
+SYM_FUNC_START(handle_syscall)
+	csrrd	t0, PERCPU_BASE_KS
+	la.abs	t1, kernelsp
+	add.d	t1, t1, t0
+	move	t2, sp
+	ld.d	sp, t1, 0
+
+	addi.d	sp, sp, -PT_SIZE
+	cfi_st	t2, PT_R3
+	cfi_rel_offset  sp, PT_R3
+	st.d	zero, sp, PT_R0
+	csrrd	t2, LOONGARCH_CSR_PRMD
+	st.d	t2, sp, PT_PRMD
+	csrrd	t2, LOONGARCH_CSR_CRMD
+	st.d	t2, sp, PT_CRMD
+	csrrd	t2, LOONGARCH_CSR_EUEN
+	st.d	t2, sp, PT_EUEN
+	csrrd	t2, LOONGARCH_CSR_ECFG
+	st.d	t2, sp, PT_ECFG
+	csrrd	t2, LOONGARCH_CSR_ESTAT
+	st.d	t2, sp, PT_ESTAT
+	cfi_st	ra, PT_R1
+	cfi_st	a0, PT_R4
+	cfi_st	a1, PT_R5
+	cfi_st	a2, PT_R6
+	cfi_st	a3, PT_R7
+	cfi_st	a4, PT_R8
+	cfi_st	a5, PT_R9
+	cfi_st	a6, PT_R10
+	cfi_st	a7, PT_R11
+	csrrd	ra, LOONGARCH_CSR_ERA
+	st.d	ra, sp, PT_ERA
+	cfi_rel_offset ra, PT_ERA
+
+	cfi_st	tp, PT_R2
+	cfi_st	u0, PT_R21
+	cfi_st	fp, PT_R22
+
+	SAVE_STATIC
+
+	move	u0, t0
+	li.d	tp, ~_THREAD_MASK
+	and	tp, tp, sp
+
+	move	a0, sp
+	bl	do_syscall
+
+	RESTORE_ALL_AND_RET
+SYM_FUNC_END(handle_syscall)
+
+SYM_CODE_START(ret_from_fork)
+	bl	schedule_tail		# a0 = struct task_struct *prev
+	move	a0, sp
+	bl	syscall_exit_to_user_mode
+	RESTORE_STATIC
+	RESTORE_SOME
+	RESTORE_SP_AND_RET
+SYM_CODE_END(ret_from_fork)
+
+SYM_CODE_START(ret_from_kernel_thread)
+	bl	schedule_tail		# a0 = struct task_struct *prev
+	move	a0, s1
+	jirl	ra, s0, 0
+	move	a0, sp
+	bl	syscall_exit_to_user_mode
+	RESTORE_STATIC
+	RESTORE_SOME
+	RESTORE_SP_AND_RET
+SYM_CODE_END(ret_from_kernel_thread)
diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
new file mode 100644
index 000000000000..3fc4211db989
--- /dev/null
+++ b/arch/loongarch/kernel/syscall.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/capability.h>
+#include <linux/entry-common.h>
+#include <linux/errno.h>
+#include <linux/linkage.h>
+#include <linux/syscalls.h>
+#include <linux/unistd.h>
+
+#include <asm/asm.h>
+#include <asm/signal.h>
+#include <asm/switch_to.h>
+#include <asm-generic/syscalls.h>
+
+#undef __SYSCALL
+#define __SYSCALL(nr, call)	[nr] = (call),
+
+SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len, unsigned long,
+		prot, unsigned long, flags, unsigned long, fd, off_t, offset)
+{
+	if (offset & ~PAGE_MASK)
+		return -EINVAL;
+
+	return ksys_mmap_pgoff(addr, len, prot, flags, fd, offset >> PAGE_SHIFT);
+}
+
+void *sys_call_table[__NR_syscalls] = {
+	[0 ... __NR_syscalls - 1] = sys_ni_syscall,
+#include <asm/unistd.h>
+};
+
+typedef long (*sys_call_fn)(unsigned long, unsigned long,
+	unsigned long, unsigned long, unsigned long, unsigned long);
+
+void noinstr do_syscall(struct pt_regs *regs)
+{
+	unsigned long nr;
+	sys_call_fn syscall_fn;
+
+	nr = regs->regs[11];
+	/* Set for syscall restarting */
+	if (nr < NR_syscalls)
+		regs->regs[0] = nr + 1;
+
+	regs->csr_era += 4;
+	regs->orig_a0 = regs->regs[4];
+	regs->regs[4] = -ENOSYS;
+
+	nr = syscall_enter_from_user_mode(regs, nr);
+
+	if (nr < NR_syscalls) {
+		syscall_fn = sys_call_table[nr];
+		regs->regs[4] = syscall_fn(regs->orig_a0, regs->regs[5], regs->regs[6],
+					   regs->regs[7], regs->regs[8], regs->regs[9]);
+	}
+
+	syscall_exit_to_user_mode(regs);
+}
-- 
2.27.0



* [PATCH V9 14/24] LoongArch: Add signal handling support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (12 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 13/24] LoongArch: Add system call support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 15/24] LoongArch: Add elf and module support Huacai Chen
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen,
	Eric Biederman, Al Viro

This patch adds signal handling support for LoongArch.
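
To make the extended-context layout easier to picture: each block hanging off sc_extcontext starts with a struct _ctxinfo header (magic, size, padding) followed by its payload, as get_ctx_through_ctxinfo() shows. Below is a minimal user-space sketch of walking such a chain. It assumes that `size` counts header plus payload and that a record with magic 0 ends the chain; neither detail is visible in the code quoted here, so treat both as assumptions. The ex_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_FPU_CTX_MAGIC 0x46505501u	/* value of FPU_CTX_MAGIC in the patch */

/* Same field layout as struct _ctxinfo in the patch (16 bytes). */
struct ex_ctxinfo {
	uint32_t magic;
	uint32_t size;
	uint64_t padding;
};

/*
 * Return a pointer to the payload of the first record carrying the
 * wanted magic, or NULL if none is found. Assumptions of this sketch:
 * `size` covers header plus payload, and magic 0 terminates the chain.
 */
static const void *ex_find_ctx(const unsigned char *buf, size_t len,
			       uint32_t magic)
{
	size_t off = 0;

	while (off + sizeof(struct ex_ctxinfo) <= len) {
		struct ex_ctxinfo info;

		/* Copy the header out to avoid unaligned reads. */
		memcpy(&info, buf + off, sizeof(info));
		if (info.magic == 0)
			return NULL;	/* assumed end marker */
		if (info.size < sizeof(info) || info.size > len - off)
			return NULL;	/* malformed record */
		if (info.magic == magic)
			return buf + off + sizeof(info);
		off += info.size;
	}
	return NULL;
}
```

The payload pointer is computed the same way as in get_ctx_through_ctxinfo(): the header address plus sizeof(struct _ctxinfo).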

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/include/uapi/asm/sigcontext.h |  63 ++
 arch/loongarch/include/uapi/asm/signal.h     |  13 +
 arch/loongarch/include/uapi/asm/ucontext.h   |  35 ++
 arch/loongarch/kernel/signal.c               | 568 +++++++++++++++++++
 4 files changed, 679 insertions(+)
 create mode 100644 arch/loongarch/include/uapi/asm/sigcontext.h
 create mode 100644 arch/loongarch/include/uapi/asm/signal.h
 create mode 100644 arch/loongarch/include/uapi/asm/ucontext.h
 create mode 100644 arch/loongarch/kernel/signal.c

diff --git a/arch/loongarch/include/uapi/asm/sigcontext.h b/arch/loongarch/include/uapi/asm/sigcontext.h
new file mode 100644
index 000000000000..efeb8b3f8236
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/sigcontext.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _UAPI_ASM_SIGCONTEXT_H
+#define _UAPI_ASM_SIGCONTEXT_H
+
+#include <linux/types.h>
+#include <linux/posix_types.h>
+
+/* FP context was used */
+#define USED_FP			(1 << 0)
+/* Load/Store access flags for address error */
+#define ADRERR_RD		(1 << 30)
+#define ADRERR_WR		(1 << 31)
+
+struct sigcontext {
+	__u64	sc_pc;
+	__u64	sc_regs[32];
+	__u32	sc_flags;
+	__u64	sc_extcontext[0] __attribute__((__aligned__(16)));
+};
+
+#define CONTEXT_INFO_ALIGN	16
+struct _ctxinfo {
+	__u32	magic;
+	__u32	size;
+	__u64	padding;	/* padding to 16 bytes */
+};
+
+/* FPU context */
+#define FPU_CTX_MAGIC		0x46505501
+#define FPU_CTX_ALIGN		8
+struct fpu_context {
+	__u64	regs[32];
+	__u64	fcc;
+	__u32	fcsr;
+};
+
+/* LSX context */
+#define LSX_CTX_MAGIC		0x53580001
+#define LSX_CTX_ALIGN		16
+struct lsx_context {
+	__u64	regs[2*32];
+	__u64	fcc;
+	__u32	fcsr;
+	__u32	vcsr;
+};
+
+/* LASX context */
+#define LASX_CTX_MAGIC		0x41535801
+#define LASX_CTX_ALIGN		32
+struct lasx_context {
+	__u64	regs[4*32];
+	__u64	fcc;
+	__u32	fcsr;
+	__u32	vcsr;
+};
+
+#endif /* _UAPI_ASM_SIGCONTEXT_H */
diff --git a/arch/loongarch/include/uapi/asm/signal.h b/arch/loongarch/include/uapi/asm/signal.h
new file mode 100644
index 000000000000..992d965aa13f
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/signal.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _UAPI_ASM_SIGNAL_H
+#define _UAPI_ASM_SIGNAL_H
+
+#define MINSIGSTKSZ 4096
+#define SIGSTKSZ    16384
+
+#include <asm-generic/signal.h>
+
+#endif
diff --git a/arch/loongarch/include/uapi/asm/ucontext.h b/arch/loongarch/include/uapi/asm/ucontext.h
new file mode 100644
index 000000000000..12577e22b1c7
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/ucontext.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef __LOONGARCH_UAPI_ASM_UCONTEXT_H
+#define __LOONGARCH_UAPI_ASM_UCONTEXT_H
+
+/**
+ * struct ucontext - user context structure
+ * @uc_flags:
+ * @uc_link:
+ * @uc_stack:
+ * @uc_mcontext:	holds basic processor state
+ * @uc_sigmask:
+ * @uc_extcontext:	holds extended processor state
+ */
+struct ucontext {
+	unsigned long		uc_flags;
+	struct ucontext		*uc_link;
+	stack_t			uc_stack;
+	sigset_t		uc_sigmask;
+	/* There's some padding here to allow sigset_t to be expanded in the
+	 * future.  Though this is unlikely, other architectures put uc_sigmask
+	 * at the end of this structure and explicitly state it can be
+	 * expanded, so we didn't want to box ourselves in here. */
+	__u8		  __unused[1024 / 8 - sizeof(sigset_t)];
+	/* We can't put uc_sigmask at the end of this structure because we need
+	 * to be able to expand sigcontext in the future.  For example, the
+	 * vector ISA extension will almost certainly add ISA state.  We want
+	 * to ensure all user-visible ISA state can be saved and restored via a
+	 * ucontext, so we're putting this at the end in order to allow for
+	 * infinite extensibility.  Since we know this will be extended and we
+	 * assume sigset_t won't be extended an extreme amount, we're
+	 * prioritizing this. */
+	struct sigcontext	uc_mcontext;
+};
+
+#endif /* __LOONGARCH_UAPI_ASM_UCONTEXT_H */
diff --git a/arch/loongarch/kernel/signal.c b/arch/loongarch/kernel/signal.c
new file mode 100644
index 000000000000..489e169e1d13
--- /dev/null
+++ b/arch/loongarch/kernel/signal.c
@@ -0,0 +1,568 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1991, 1992  Linus Torvalds
+ * Copyright (C) 1994 - 2000  Ralf Baechle
+ * Copyright (C) 1999, 2000 Silicon Graphics, Inc.
+ * Copyright (C) 2014, Imagination Technologies Ltd.
+ */
+#include <linux/audit.h>
+#include <linux/cache.h>
+#include <linux/context_tracking.h>
+#include <linux/irqflags.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/personality.h>
+#include <linux/smp.h>
+#include <linux/kernel.h>
+#include <linux/signal.h>
+#include <linux/errno.h>
+#include <linux/wait.h>
+#include <linux/ptrace.h>
+#include <linux/unistd.h>
+#include <linux/compiler.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>
+
+#include <asm/asm.h>
+#include <asm/cacheflush.h>
+#include <asm/cpu-features.h>
+#include <asm/fpu.h>
+#include <asm/ucontext.h>
+#include <asm/vdso.h>
+
+#ifdef DEBUG_SIG
+#  define DEBUGP(fmt, args...) printk("%s: " fmt, __func__, ##args)
+#else
+#  define DEBUGP(fmt, args...)
+#endif
+
+/* Make sure we will not lose FPU ownership */
+#define lock_fpu_owner()	({ preempt_disable(); pagefault_disable(); })
+#define unlock_fpu_owner()	({ pagefault_enable(); preempt_enable(); })
+
+/* Assembly functions to move context to/from the FPU */
+extern asmlinkage int
+_save_fp_context(void __user *fpregs, void __user *fcc, void __user *csr);
+extern asmlinkage int
+_restore_fp_context(void __user *fpregs, void __user *fcc, void __user *csr);
+
+struct rt_sigframe {
+	struct siginfo rs_info;
+	struct ucontext rs_uctx;
+};
+
+struct _ctx_layout {
+	struct _ctxinfo *addr;
+	unsigned int size;
+};
+
+struct extctx_layout {
+	unsigned long size;
+	unsigned int flags;
+	struct _ctx_layout fpu;
+	struct _ctx_layout lsx;
+	struct _ctx_layout lasx;
+	struct _ctx_layout end;
+};
+
+static void __user *get_ctx_through_ctxinfo(struct _ctxinfo *info)
+{
+	return (void __user *)((char *)info + sizeof(struct _ctxinfo));
+}
+
+/*
+ * Copy the thread's saved FPU context to/from a signal context that is presumed
+ * to be on the user stack, and is therefore accessed via the uaccess.h macros.
+ */
+static int copy_fpu_to_sigcontext(struct fpu_context __user *ctx)
+{
+	int i;
+	int err = 0;
+	uint64_t __user *regs	= (uint64_t *)&ctx->regs;
+	uint64_t __user *fcc	= &ctx->fcc;
+	uint32_t __user *fcsr	= &ctx->fcsr;
+
+	for (i = 0; i < NUM_FPU_REGS; i++) {
+		err |=
+		    __put_user(get_fpr64(&current->thread.fpu.fpr[i], 0),
+			       &regs[i]);
+	}
+	err |= __put_user(current->thread.fpu.fcc, fcc);
+	err |= __put_user(current->thread.fpu.fcsr, fcsr);
+
+	return err;
+}
+
+static int copy_fpu_from_sigcontext(struct fpu_context __user *ctx)
+{
+	int i;
+	int err = 0;
+	u64 fpr_val;
+	uint64_t __user *regs	= (uint64_t *)&ctx->regs;
+	uint64_t __user *fcc	= &ctx->fcc;
+	uint32_t __user *fcsr	= &ctx->fcsr;
+
+	for (i = 0; i < NUM_FPU_REGS; i++) {
+		err |= __get_user(fpr_val, &regs[i]);
+		set_fpr64(&current->thread.fpu.fpr[i], 0, fpr_val);
+	}
+	err |= __get_user(current->thread.fpu.fcc, fcc);
+	err |= __get_user(current->thread.fpu.fcsr, fcsr);
+
+	return err;
+}
+
+/*
+ * Wrappers for the assembly _{save,restore}_fp_context functions.
+ */
+static int save_hw_fpu_context(struct fpu_context __user *ctx)
+{
+	uint64_t __user *regs	= (uint64_t *)&ctx->regs;
+	uint64_t __user *fcc	= &ctx->fcc;
+	uint32_t __user *fcsr	= &ctx->fcsr;
+
+	return _save_fp_context(regs, fcc, fcsr);
+}
+
+static int restore_hw_fpu_context(struct fpu_context __user *ctx)
+{
+	uint64_t __user *regs	= (uint64_t *)&ctx->regs;
+	uint64_t __user *fcc	= &ctx->fcc;
+	uint32_t __user *fcsr	= &ctx->fcsr;
+
+	return _restore_fp_context(regs, fcc, fcsr);
+}
+
+int fpcsr_pending(unsigned int __user *fpcsr)
+{
+	int err, sig = 0;
+	unsigned int csr, enabled;
+
+	err = __get_user(csr, fpcsr);
+	enabled = ((csr & FPU_CSR_ALL_E) << 24);
+	/*
+	 * If the signal handler set some FPU exceptions, clear them and
+	 * send SIGFPE.
+	 */
+	if (csr & enabled) {
+		csr &= ~enabled;
+		err |= __put_user(csr, fpcsr);
+		sig = SIGFPE;
+	}
+	return err ?: sig;
+}
+
+/*
+ * Helper routines
+ */
+static int protected_save_fpu_context(struct extctx_layout *extctx)
+{
+	int err = 0;
+	struct _ctxinfo __user *info = extctx->fpu.addr;
+	struct fpu_context __user *fpu_ctx = (struct fpu_context *)get_ctx_through_ctxinfo(info);
+	uint64_t __user *regs	= (uint64_t *)&fpu_ctx->regs;
+	uint64_t __user *fcc	= &fpu_ctx->fcc;
+	uint32_t __user *fcsr	= &fpu_ctx->fcsr;
+
+	while (1) {
+		lock_fpu_owner();
+		if (is_fpu_owner())
+			err = save_hw_fpu_context(fpu_ctx);
+		else
+			err = copy_fpu_to_sigcontext(fpu_ctx);
+		unlock_fpu_owner();
+
+		err |= __put_user(FPU_CTX_MAGIC, &info->magic);
+		err |= __put_user(extctx->fpu.size, &info->size);
+
+		if (likely(!err))
+			break;
+		/* Touch the FPU context and try again */
+		err = __put_user(0, &regs[0]) |
+			__put_user(0, &regs[31]) |
+			__put_user(0, fcc) |
+			__put_user(0, fcsr);
+		if (err)
+			return err;	/* really bad sigcontext */
+	}
+
+	return err;
+}
+
+static int protected_restore_fpu_context(struct extctx_layout *extctx)
+{
+	int err = 0, sig = 0, tmp __maybe_unused;
+	struct _ctxinfo __user *info = extctx->fpu.addr;
+	struct fpu_context __user *fpu_ctx = (struct fpu_context *)get_ctx_through_ctxinfo(info);
+	uint64_t __user *regs	= (uint64_t *)&fpu_ctx->regs;
+	uint64_t __user *fcc	= &fpu_ctx->fcc;
+	uint32_t __user *fcsr	= &fpu_ctx->fcsr;
+
+	err = sig = fpcsr_pending(fcsr);
+	if (err < 0)
+		return err;
+
+	while (1) {
+		lock_fpu_owner();
+		if (is_fpu_owner())
+			err = restore_hw_fpu_context(fpu_ctx);
+		else
+			err = copy_fpu_from_sigcontext(fpu_ctx);
+		unlock_fpu_owner();
+
+		if (likely(!err))
+			break;
+		/* Touch the FPU context and try again */
+		err = __get_user(tmp, &regs[0]) |
+			__get_user(tmp, &regs[31]) |
+			__get_user(tmp, fcc) |
+			__get_user(tmp, fcsr);
+		if (err)
+			break;	/* really bad sigcontext */
+	}
+
+	return err ?: sig;
+}
+
+static int setup_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
+			    struct extctx_layout *extctx)
+{
+	int i, err = 0;
+	struct _ctxinfo __user *info;
+
+	err |= __put_user(regs->csr_era, &sc->sc_pc);
+	err |= __put_user(extctx->flags, &sc->sc_flags);
+
+	err |= __put_user(0, &sc->sc_regs[0]);
+	for (i = 1; i < 32; i++)
+		err |= __put_user(regs->regs[i], &sc->sc_regs[i]);
+
+	if (extctx->fpu.addr)
+		err |= protected_save_fpu_context(extctx);
+
+	/* Set the "end" magic */
+	info = (struct _ctxinfo *)extctx->end.addr;
+	err |= __put_user(0, &info->magic);
+	err |= __put_user(0, &info->size);
+
+	return err;
+}
+
+static int parse_extcontext(struct sigcontext __user *sc, struct extctx_layout *extctx)
+{
+	int err = 0;
+	unsigned int magic, size;
+	struct _ctxinfo __user *info = (struct _ctxinfo __user *)&sc->sc_extcontext;
+
+	while (1) {
+		err |= __get_user(magic, &info->magic);
+		err |= __get_user(size, &info->size);
+		if (err)
+			return err;
+
+		switch (magic) {
+		case 0: /* END */
+			goto done;
+
+		case FPU_CTX_MAGIC:
+			if (size < (sizeof(struct _ctxinfo) +
+				    sizeof(struct fpu_context)))
+				goto invalid;
+			extctx->fpu.addr = info;
+			break;
+
+		default:
+			goto invalid;
+		}
+
+		info = (struct _ctxinfo *)((char *)info + size);
+	}
+
+done:
+	return 0;
+
+invalid:
+	return -EINVAL;
+}
+
+static int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
+{
+	int i, err = 0;
+	struct extctx_layout extctx;
+
+	memset(&extctx, 0, sizeof(struct extctx_layout));
+
+	err = __get_user(extctx.flags, &sc->sc_flags);
+	if (err)
+		goto bad;
+
+	err = parse_extcontext(sc, &extctx);
+	if (err)
+		goto bad;
+
+	conditional_used_math(extctx.flags & USED_FP);
+
+	/*
+	 * The signal handler may have used FPU; give it up if the program
+	 * doesn't want it following sigreturn.
+	 */
+	if (!(extctx.flags & USED_FP))
+		lose_fpu(0);
+
+	/* Always make any pending restarted system calls return -EINTR */
+	current->restart_block.fn = do_no_restart_syscall;
+
+	err |= __get_user(regs->csr_era, &sc->sc_pc);
+	for (i = 1; i < 32; i++)
+		err |= __get_user(regs->regs[i], &sc->sc_regs[i]);
+
+	if (extctx.fpu.addr)
+		err |= protected_restore_fpu_context(&extctx);
+
+bad:
+	return err;
+}
+
+static unsigned int handle_flags(void)
+{
+	unsigned int flags = 0;
+
+	flags |= used_math() ? USED_FP : 0;
+
+	switch (current->thread.error_code) {
+	case 1:
+		flags |= ADRERR_RD;
+		break;
+	case 2:
+		flags |= ADRERR_WR;
+		break;
+	}
+
+	return flags;
+}
+
+static unsigned long extframe_alloc(struct extctx_layout *extctx,
+				    struct _ctx_layout *layout,
+				    size_t size, unsigned int align, unsigned long base)
+{
+	unsigned long new_base = base - size;
+
+	new_base = round_down(new_base, (align < 16 ? 16 : align));
+	new_base -= sizeof(struct _ctxinfo);
+
+	layout->addr = (void *)new_base;
+	layout->size = (unsigned int)(base - new_base);
+	extctx->size += layout->size;
+
+	return new_base;
+}
+
+static unsigned long setup_extcontext(struct extctx_layout *extctx, unsigned long sp)
+{
+	unsigned long new_sp = sp;
+
+	memset(extctx, 0, sizeof(struct extctx_layout));
+
+	extctx->flags = handle_flags();
+
+	/* Grow down, alloc "end" context info first. */
+	new_sp -= sizeof(struct _ctxinfo);
+	extctx->end.addr = (void *)new_sp;
+	extctx->end.size = (unsigned int)sizeof(struct _ctxinfo);
+	extctx->size += extctx->end.size;
+
+	if (extctx->flags & USED_FP) {
+		if (cpu_has_fpu)
+			new_sp = extframe_alloc(extctx, &extctx->fpu,
+			  sizeof(struct fpu_context), FPU_CTX_ALIGN, new_sp);
+	}
+
+	return new_sp;
+}
+
+void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
+			  struct extctx_layout *extctx)
+{
+	unsigned long sp;
+
+	/* Default to using normal stack */
+	sp = regs->regs[3];
+
+	/*
+	 * If we are on the alternate signal stack and would overflow it, don't.
+	 * Return an always-bogus address instead so we will die with SIGSEGV.
+	 */
+	if (on_sig_stack(sp) &&
+	    unlikely(!on_sig_stack(sp - sizeof(struct rt_sigframe))))
+		return (void __user __force *)(-1UL);
+
+	sp = sigsp(sp, ksig);
+	sp = round_down(sp, 16);
+	sp = setup_extcontext(extctx, sp);
+	sp -= sizeof(struct rt_sigframe);
+
+	if (!IS_ALIGNED(sp, 16))
+		BUG();
+
+	return (void __user *)sp;
+}
+
+/*
+ * Restore the context saved in the rt_sigframe and return from the handler.
+ */
+
+asmlinkage long sys_rt_sigreturn(void)
+{
+	int sig;
+	sigset_t set;
+	struct pt_regs *regs;
+	struct rt_sigframe __user *frame;
+
+	regs = current_pt_regs();
+	frame = (struct rt_sigframe __user *)regs->regs[3];
+	if (!access_ok(frame, sizeof(*frame)))
+		goto badframe;
+	if (__copy_from_user(&set, &frame->rs_uctx.uc_sigmask, sizeof(set)))
+		goto badframe;
+
+	set_current_blocked(&set);
+
+	sig = restore_sigcontext(regs, &frame->rs_uctx.uc_mcontext);
+	if (sig < 0)
+		goto badframe;
+	else if (sig)
+		force_sig(sig);
+
+	regs->regs[0] = 0; /* No syscall restarting */
+	if (restore_altstack(&frame->rs_uctx.uc_stack))
+		goto badframe;
+
+	return regs->regs[4];
+
+badframe:
+	force_sig(SIGSEGV);
+	return 0;
+}
+
+static int setup_rt_frame(void *sig_return, struct ksignal *ksig,
+			  struct pt_regs *regs, sigset_t *set)
+{
+	int err = 0;
+	struct extctx_layout extctx;
+	struct rt_sigframe __user *frame;
+
+	frame = get_sigframe(ksig, regs, &extctx);
+	if (!access_ok(frame, sizeof(*frame) + extctx.size))
+		return -EFAULT;
+
+	/* Create siginfo.  */
+	err |= copy_siginfo_to_user(&frame->rs_info, &ksig->info);
+
+	/* Create the ucontext.	 */
+	err |= __put_user(0, &frame->rs_uctx.uc_flags);
+	err |= __put_user(NULL, &frame->rs_uctx.uc_link);
+	err |= __save_altstack(&frame->rs_uctx.uc_stack, regs->regs[3]);
+	err |= setup_sigcontext(regs, &frame->rs_uctx.uc_mcontext, &extctx);
+	err |= __copy_to_user(&frame->rs_uctx.uc_sigmask, set, sizeof(*set));
+
+	if (err)
+		return -EFAULT;
+
+	/*
+	 * Arguments to signal handler:
+	 *
+	 *   a0 = signal number
+	 *   a1 = pointer to siginfo
+	 *   a2 = pointer to ucontext
+	 *
+	 * csr_era points to the signal handler, $r3 (sp) points to
+	 * the struct rt_sigframe.
+	 */
+	regs->regs[4] = ksig->sig;
+	regs->regs[5] = (unsigned long) &frame->rs_info;
+	regs->regs[6] = (unsigned long) &frame->rs_uctx;
+	regs->regs[3] = (unsigned long) frame;
+	regs->regs[1] = (unsigned long) sig_return;
+	regs->csr_era = (unsigned long) ksig->ka.sa.sa_handler;
+
+	DEBUGP("SIG deliver (%s:%d): sp=0x%p pc=0x%lx ra=0x%lx\n",
+	       current->comm, current->pid,
+	       frame, regs->csr_era, regs->regs[1]);
+
+	return 0;
+}
+
+static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
+{
+	int ret;
+	sigset_t *oldset = sigmask_to_save();
+	void *vdso = current->mm->context.vdso;
+
+	/* Are we from a system call? */
+	if (regs->regs[0]) {
+		switch (regs->regs[4]) {
+		case -ERESTART_RESTARTBLOCK:
+		case -ERESTARTNOHAND:
+			regs->regs[4] = -EINTR;
+			break;
+		case -ERESTARTSYS:
+			if (!(ksig->ka.sa.sa_flags & SA_RESTART)) {
+				regs->regs[4] = -EINTR;
+				break;
+			}
+			fallthrough;
+		case -ERESTARTNOINTR:
+			regs->regs[4] = regs->orig_a0;
+			regs->csr_era -= 4;
+		}
+
+		regs->regs[0] = 0;	/* Don't deal with this again.	*/
+	}
+
+	rseq_signal_deliver(ksig, regs);
+
+	ret = setup_rt_frame(vdso + current->thread.vdso->offset_sigreturn, ksig, regs, oldset);
+
+	signal_setup_done(ret, ksig, 0);
+}
+
+void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
+{
+	struct ksignal ksig;
+
+	if (has_signal && get_signal(&ksig)) {
+		/* Whee!  Actually deliver the signal.	*/
+		handle_signal(&ksig, regs);
+		return;
+	}
+
+	/* Are we from a system call? */
+	if (regs->regs[0]) {
+		switch (regs->regs[4]) {
+		case -ERESTARTNOHAND:
+		case -ERESTARTSYS:
+		case -ERESTARTNOINTR:
+			regs->regs[4] = regs->orig_a0;
+			regs->csr_era -= 4;
+			break;
+
+		case -ERESTART_RESTARTBLOCK:
+			regs->regs[4] = regs->orig_a0;
+			regs->regs[11] = __NR_restart_syscall;
+			regs->csr_era -= 4;
+			break;
+		}
+		regs->regs[0] = 0;	/* Don't deal with this again.	*/
+	}
+
+	/*
+	 * If there's no signal to deliver, we just put the saved sigmask
+	 * back
+	 */
+	restore_saved_sigmask();
+}
-- 
2.27.0



* [PATCH V9 15/24] LoongArch: Add elf and module support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (13 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 14/24] LoongArch: Add signal handling support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 16/24] LoongArch: Add misc common routines Huacai Chen
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen,
	Jessica Yu, Luis Chamberlain

This patch adds ELF definitions and module relocation code.

Cc: Jessica Yu <jeyu@kernel.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
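Note for reviewers (illustration only, not part of the patch):
emit_plt_entry() in asm/module.h spreads a 64-bit target address over
lu12i.w, lu32i.d, lu52i.d and jirl. The sketch below models that split in
plain user-space C. ADDR_IMM() is not defined in this patch, so the bit
ranges used here (51:32 for LU32ID, 63:52 for LU52ID) are an assumption
taken from the instruction semantics, and struct plt_imm, plt_split() and
plt_join() are hypothetical names that do not exist in the kernel.

```c
#include <stdint.h>

/* Hypothetical container for the four immediates of one PLT entry. */
struct plt_imm {
	uint32_t hi20;	/* lu12i.w: bits 31:12 of the target */
	uint32_t mid20;	/* lu32i.d: bits 51:32 (assumed ADDR_IMM(val, LU32ID)) */
	uint32_t top12;	/* lu52i.d: bits 63:52 (assumed ADDR_IMM(val, LU52ID)) */
	uint32_t off16;	/* jirl: bits 11:2, the hardware scales the offset by 4 */
};

/* Split a target address the way emit_plt_entry() appears to. */
static struct plt_imm plt_split(uint64_t val)
{
	struct plt_imm imm;

	imm.hi20  = (val >> 12) & 0xfffff;
	imm.mid20 = (val >> 32) & 0xfffff;
	imm.top12 = (val >> 52) & 0xfff;
	imm.off16 = (val & 0xfff) >> 2;

	return imm;
}

/*
 * Model the address the four instructions jump to. The sign extension
 * done by lu12i.w and lu32i.d is omitted because the following
 * instruction overwrites the extended bits in each case.
 */
static uint64_t plt_join(struct plt_imm imm)
{
	uint64_t t1;

	t1  = (uint64_t)imm.hi20 << 12;		/* lu12i.w $t1, hi20 */
	t1 |= (uint64_t)imm.mid20 << 32;	/* lu32i.d $t1, mid20 */
	t1 |= (uint64_t)imm.top12 << 52;	/* lu52i.d $t1, $t1, top12 */

	return t1 + ((uint64_t)imm.off16 << 2);	/* jirl $zero, $t1, off16 << 2 */
}
```

For any 4-byte-aligned val, plt_join(plt_split(val)) should give back val.
The low two bits are dropped because jirl scales its offset by 4, which is
harmless for instruction addresses.
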
 arch/loongarch/include/asm/cpufeature.h  |  24 ++
 arch/loongarch/include/asm/elf.h         | 299 ++++++++++++++++++
 arch/loongarch/include/asm/exec.h        |  10 +
 arch/loongarch/include/asm/module.h      |  80 +++++
 arch/loongarch/include/asm/module.lds.h  |   7 +
 arch/loongarch/include/asm/vermagic.h    |  19 ++
 arch/loongarch/include/uapi/asm/auxvec.h |  17 +
 arch/loongarch/include/uapi/asm/hwcap.h  |  20 ++
 arch/loongarch/kernel/elf.c              |  30 ++
 arch/loongarch/kernel/inst.c             |  40 +++
 arch/loongarch/kernel/module-sections.c  | 121 +++++++
 arch/loongarch/kernel/module.c           | 384 +++++++++++++++++++++++
 12 files changed, 1051 insertions(+)
 create mode 100644 arch/loongarch/include/asm/cpufeature.h
 create mode 100644 arch/loongarch/include/asm/elf.h
 create mode 100644 arch/loongarch/include/asm/exec.h
 create mode 100644 arch/loongarch/include/asm/module.h
 create mode 100644 arch/loongarch/include/asm/module.lds.h
 create mode 100644 arch/loongarch/include/asm/vermagic.h
 create mode 100644 arch/loongarch/include/uapi/asm/auxvec.h
 create mode 100644 arch/loongarch/include/uapi/asm/hwcap.h
 create mode 100644 arch/loongarch/kernel/elf.c
 create mode 100644 arch/loongarch/kernel/inst.c
 create mode 100644 arch/loongarch/kernel/module-sections.c
 create mode 100644 arch/loongarch/kernel/module.c

diff --git a/arch/loongarch/include/asm/cpufeature.h b/arch/loongarch/include/asm/cpufeature.h
new file mode 100644
index 000000000000..4da22a8e63de
--- /dev/null
+++ b/arch/loongarch/include/asm/cpufeature.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * CPU feature definitions for module loading, used by
+ * module_cpu_feature_match(), see uapi/asm/hwcap.h for LoongArch CPU features.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASM_CPUFEATURE_H
+#define __ASM_CPUFEATURE_H
+
+#include <uapi/asm/hwcap.h>
+#include <asm/elf.h>
+
+#define MAX_CPU_FEATURES (8 * sizeof(elf_hwcap))
+
+#define cpu_feature(x)		ilog2(HWCAP_ ## x)
+
+static inline bool cpu_have_feature(unsigned int num)
+{
+	return elf_hwcap & (1UL << num);
+}
+
+#endif /* __ASM_CPUFEATURE_H */
diff --git a/arch/loongarch/include/asm/elf.h b/arch/loongarch/include/asm/elf.h
new file mode 100644
index 000000000000..52734d705545
--- /dev/null
+++ b/arch/loongarch/include/asm/elf.h
@@ -0,0 +1,299 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_ELF_H
+#define _ASM_ELF_H
+
+#include <linux/auxvec.h>
+#include <linux/fs.h>
+#include <uapi/linux/elf.h>
+
+#include <asm/current.h>
+#include <asm/vdso.h>
+
+/* ELF header e_flags defines. */
+
+/* The ABI of a file. */
+#define EF_LARCH_ABI_LP32		0x00000001	/* LP32 ABI.  */
+#define EF_LARCH_ABI_LP64		0x00000003	/* LP64 ABI  */
+#define EF_LARCH_ABI			0x00000003
+
+/* LoongArch relocation types used by the dynamic linker */
+#define R_LARCH_NONE				0
+#define R_LARCH_32				1
+#define R_LARCH_64				2
+#define R_LARCH_RELATIVE			3
+#define R_LARCH_COPY				4
+#define R_LARCH_JUMP_SLOT			5
+#define R_LARCH_TLS_DTPMOD32			6
+#define R_LARCH_TLS_DTPMOD64			7
+#define R_LARCH_TLS_DTPREL32			8
+#define R_LARCH_TLS_DTPREL64			9
+#define R_LARCH_TLS_TPREL32			10
+#define R_LARCH_TLS_TPREL64			11
+#define R_LARCH_IRELATIVE			12
+#define R_LARCH_MARK_LA				20
+#define R_LARCH_MARK_PCREL			21
+#define R_LARCH_SOP_PUSH_PCREL			22
+#define R_LARCH_SOP_PUSH_ABSOLUTE		23
+#define R_LARCH_SOP_PUSH_DUP			24
+#define R_LARCH_SOP_PUSH_GPREL			25
+#define R_LARCH_SOP_PUSH_TLS_TPREL		26
+#define R_LARCH_SOP_PUSH_TLS_GOT		27
+#define R_LARCH_SOP_PUSH_TLS_GD			28
+#define R_LARCH_SOP_PUSH_PLT_PCREL		29
+#define R_LARCH_SOP_ASSERT			30
+#define R_LARCH_SOP_NOT				31
+#define R_LARCH_SOP_SUB				32
+#define R_LARCH_SOP_SL				33
+#define R_LARCH_SOP_SR				34
+#define R_LARCH_SOP_ADD				35
+#define R_LARCH_SOP_AND				36
+#define R_LARCH_SOP_IF_ELSE			37
+#define R_LARCH_SOP_POP_32_S_10_5		38
+#define R_LARCH_SOP_POP_32_U_10_12		39
+#define R_LARCH_SOP_POP_32_S_10_12		40
+#define R_LARCH_SOP_POP_32_S_10_16		41
+#define R_LARCH_SOP_POP_32_S_10_16_S2		42
+#define R_LARCH_SOP_POP_32_S_5_20		43
+#define R_LARCH_SOP_POP_32_S_0_5_10_16_S2	44
+#define R_LARCH_SOP_POP_32_S_0_10_10_16_S2	45
+#define R_LARCH_SOP_POP_32_U			46
+#define R_LARCH_ADD8				47
+#define R_LARCH_ADD16				48
+#define R_LARCH_ADD24				49
+#define R_LARCH_ADD32				50
+#define R_LARCH_ADD64				51
+#define R_LARCH_SUB8				52
+#define R_LARCH_SUB16				53
+#define R_LARCH_SUB24				54
+#define R_LARCH_SUB32				55
+#define R_LARCH_SUB64				56
+#define R_LARCH_GNU_VTINHERIT			57
+#define R_LARCH_GNU_VTENTRY			58
+
+#ifndef ELF_ARCH
+
+/* ELF register definitions */
+
+/*
+ * The general-purpose register set consists of:
+ *	Register	Number
+ *	GPRs		32
+ *	ORIG_A0		1
+ *	ERA		1
+ *	BADVADDR	1
+ *	CRMD		1
+ *	PRMD		1
+ *	EUEN		1
+ *	ECFG		1
+ *	ESTAT		1
+ *	Reserved	5
+ */
+#define ELF_NGREG	45
+
+/*
+ * The floating-point register set consists of:
+ *	Register	Number
+ *	FPR		32
+ *	FCC		1
+ *	FCSR		1
+ */
+#define ELF_NFPREG	34
+
+typedef unsigned long elf_greg_t;
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+
+typedef double elf_fpreg_t;
+typedef elf_fpreg_t elf_fpregset_t[ELF_NFPREG];
+
+void loongarch_dump_regs64(u64 *uregs, const struct pt_regs *regs);
+
+#ifdef CONFIG_32BIT
+/*
+ * This is used to ensure we don't load something for the wrong architecture.
+ */
+#define elf_check_arch elf32_check_arch
+
+/*
+ * These are used to set parameters in the core dumps.
+ */
+#define ELF_CLASS	ELFCLASS32
+
+#define ELF_CORE_COPY_REGS(dest, regs) \
+	loongarch_dump_regs32((u32 *)&(dest), (regs));
+
+#endif /* CONFIG_32BIT */
+
+#ifdef CONFIG_64BIT
+/*
+ * This is used to ensure we don't load something for the wrong architecture.
+ */
+#define elf_check_arch elf64_check_arch
+
+/*
+ * These are used to set parameters in the core dumps.
+ */
+#define ELF_CLASS	ELFCLASS64
+
+#define ELF_CORE_COPY_REGS(dest, regs) \
+	loongarch_dump_regs64((u64 *)&(dest), (regs));
+
+#endif /* CONFIG_64BIT */
+
+/*
+ * These are used to set parameters in the core dumps.
+ */
+#define ELF_DATA	ELFDATA2LSB
+#define ELF_ARCH	EM_LOONGARCH
+
+#endif /* !defined(ELF_ARCH) */
+
+#define loongarch_elf_check_machine(x) ((x)->e_machine == EM_LOONGARCH)
+
+#define vmcore_elf32_check_arch loongarch_elf_check_machine
+#define vmcore_elf64_check_arch loongarch_elf_check_machine
+
+/*
+ * Return non-zero if HDR identifies a 32-bit ELF binary.
+ */
+#define elf32_check_arch(hdr)						\
+({									\
+	int __res = 1;							\
+	struct elfhdr *__h = (hdr);					\
+									\
+	if (!loongarch_elf_check_machine(__h))				\
+		__res = 0;						\
+	if (__h->e_ident[EI_CLASS] != ELFCLASS32)			\
+		__res = 0;						\
+									\
+	__res;								\
+})
+
+/*
+ * Return non-zero if HDR identifies a 64-bit ELF binary.
+ */
+#define elf64_check_arch(hdr)						\
+({									\
+	int __res = 1;							\
+	struct elfhdr *__h = (hdr);					\
+									\
+	if (!loongarch_elf_check_machine(__h))				\
+		__res = 0;						\
+	if (__h->e_ident[EI_CLASS] != ELFCLASS64)			\
+		__res = 0;						\
+									\
+	__res;								\
+})
+
+#ifdef CONFIG_32BIT
+
+#define SET_PERSONALITY2(ex, state)					\
+do {									\
+	current->thread.vdso = &vdso_info;				\
+									\
+	loongarch_set_personality_fcsr(state);				\
+									\
+	if (personality(current->personality) != PER_LINUX)		\
+		set_personality(PER_LINUX);				\
+} while (0)
+
+#endif /* CONFIG_32BIT */
+
+#ifdef CONFIG_64BIT
+
+#define SET_PERSONALITY2(ex, state)					\
+do {									\
+	unsigned int p;							\
+									\
+	clear_thread_flag(TIF_32BIT_REGS);				\
+	clear_thread_flag(TIF_32BIT_ADDR);				\
+									\
+	current->thread.vdso = &vdso_info;				\
+	loongarch_set_personality_fcsr(state);				\
+									\
+	p = personality(current->personality);				\
+	if (p != PER_LINUX32 && p != PER_LINUX)				\
+		set_personality(PER_LINUX);				\
+} while (0)
+
+#endif /* CONFIG_64BIT */
+
+#define CORE_DUMP_USE_REGSET
+#define ELF_EXEC_PAGESIZE	PAGE_SIZE
+
+/*
+ * This yields a mask that user programs can use to figure out what
+ * instruction set this cpu supports. This could be done in userspace,
+ * but it's not easy, and we've already done it here.
+ */
+
+#define ELF_HWCAP	(elf_hwcap)
+extern unsigned int elf_hwcap;
+#include <asm/hwcap.h>
+
+/*
+ * This yields a string that ld.so will use to load implementation
+ * specific libraries for optimization.	 This is more specific in
+ * intent than poking at uname or /proc/cpuinfo.
+ */
+
+#define ELF_PLATFORM  __elf_platform
+extern const char *__elf_platform;
+
+#define ELF_PLAT_INIT(_r, load_addr)	do { \
+	_r->regs[1] = _r->regs[2] = _r->regs[3] = _r->regs[4] = 0;	\
+	_r->regs[5] = _r->regs[6] = _r->regs[7] = _r->regs[8] = 0;	\
+	_r->regs[9] = _r->regs[10] = _r->regs[11] = _r->regs[12] = 0;	\
+	_r->regs[13] = _r->regs[14] = _r->regs[15] = _r->regs[16] = 0;	\
+	_r->regs[17] = _r->regs[18] = _r->regs[19] = _r->regs[20] = 0;	\
+	_r->regs[21] = _r->regs[22] = _r->regs[23] = _r->regs[24] = 0;	\
+	_r->regs[25] = _r->regs[26] = _r->regs[27] = _r->regs[28] = 0;	\
+	_r->regs[29] = _r->regs[30] = _r->regs[31] = 0;			\
+} while (0)
+
+/*
+ * This is the location that an ET_DYN program is loaded if exec'ed. Typical
+ * use of this is to invoke "./ld.so someprog" to test out a new version of
+ * the loader. We need to make sure that it is out of the way of the program
+ * that it will "exec", and that there is sufficient room for the brk.
+ */
+
+#define ELF_ET_DYN_BASE		(TASK_SIZE / 3 * 2)
+
+/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
+#define ARCH_DLINFO							\
+do {									\
+	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
+		    (unsigned long)current->mm->context.vdso);		\
+} while (0)
+
+#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
+struct linux_binprm;
+extern int arch_setup_additional_pages(struct linux_binprm *bprm,
+				       int uses_interp);
+
+struct arch_elf_state {
+	int fp_abi;
+	int interp_fp_abi;
+};
+
+#define LOONGARCH_ABI_FP_ANY	(0)
+
+#define INIT_ARCH_ELF_STATE {			\
+	.fp_abi = LOONGARCH_ABI_FP_ANY,		\
+	.interp_fp_abi = LOONGARCH_ABI_FP_ANY,	\
+}
+
+#define elf_read_implies_exec(ex, exec_stk) (exec_stk == EXSTACK_DEFAULT)
+
+extern int arch_elf_pt_proc(void *ehdr, void *phdr, struct file *elf,
+			    bool is_interp, struct arch_elf_state *state);
+
+extern int arch_check_elf(void *ehdr, bool has_interpreter, void *interp_ehdr,
+			  struct arch_elf_state *state);
+
+extern void loongarch_set_personality_fcsr(struct arch_elf_state *state);
+
+#endif /* _ASM_ELF_H */
diff --git a/arch/loongarch/include/asm/exec.h b/arch/loongarch/include/asm/exec.h
new file mode 100644
index 000000000000..ba0220812ebb
--- /dev/null
+++ b/arch/loongarch/include/asm/exec.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_EXEC_H
+#define _ASM_EXEC_H
+
+extern unsigned long arch_align_stack(unsigned long sp);
+
+#endif /* _ASM_EXEC_H */
diff --git a/arch/loongarch/include/asm/module.h b/arch/loongarch/include/asm/module.h
new file mode 100644
index 000000000000..5d6aa32a17ef
--- /dev/null
+++ b/arch/loongarch/include/asm/module.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_MODULE_H
+#define _ASM_MODULE_H
+
+#include <asm/inst.h>
+#include <asm-generic/module.h>
+
+#define RELA_STACK_DEPTH 16
+
+struct mod_section {
+	Elf_Shdr *shdr;
+	int num_entries;
+	int max_entries;
+};
+
+struct mod_arch_specific {
+	struct mod_section plt;
+	struct mod_section plt_idx;
+};
+
+struct plt_entry {
+	u32 inst_lu12iw;
+	u32 inst_lu32id;
+	u32 inst_lu52id;
+	u32 inst_jirl;
+};
+
+struct plt_idx_entry {
+	unsigned long symbol_addr;
+};
+
+Elf_Addr module_emit_plt_entry(struct module *mod, u64 val);
+
+static inline struct plt_entry emit_plt_entry(u64 val, u64 plt, u64 plt_idx)
+{
+	u32 lu12iw, lu32id, lu52id, jirl;
+
+	lu12iw = (lu12iw_op << 25 | (((val >> 12) & 0xfffff) << 5) | LOONGARCH_GPR_T1);
+	lu32id = larch_insn_gen_lu32id(LOONGARCH_GPR_T1, ADDR_IMM(val, LU32ID));
+	lu52id = larch_insn_gen_lu52id(LOONGARCH_GPR_T1, LOONGARCH_GPR_T1, ADDR_IMM(val, LU52ID));
+	jirl = larch_insn_gen_jirl(0, LOONGARCH_GPR_T1, 0, (val & 0xfff));
+
+	return (struct plt_entry) { lu12iw, lu32id, lu52id, jirl };
+}
+
+static inline struct plt_idx_entry emit_plt_idx_entry(u64 val)
+{
+	return (struct plt_idx_entry) { val };
+}
+
+static inline int get_plt_idx(u64 val, const struct mod_section *sec)
+{
+	int i;
+	struct plt_idx_entry *plt_idx = (struct plt_idx_entry *)sec->shdr->sh_addr;
+
+	for (i = 0; i < sec->num_entries; i++) {
+		if (plt_idx[i].symbol_addr == val)
+			return i;
+	}
+
+	return -1;
+}
+
+static inline struct plt_entry *get_plt_entry(u64 val,
+				      const struct mod_section *sec_plt,
+				      const struct mod_section *sec_plt_idx)
+{
+	int plt_idx = get_plt_idx(val, sec_plt_idx);
+	struct plt_entry *plt = (struct plt_entry *)sec_plt->shdr->sh_addr;
+
+	if (plt_idx < 0)
+		return NULL;
+
+	return plt + plt_idx;
+}
+
+#endif /* _ASM_MODULE_H */
diff --git a/arch/loongarch/include/asm/module.lds.h b/arch/loongarch/include/asm/module.lds.h
new file mode 100644
index 000000000000..31c1c0db11a3
--- /dev/null
+++ b/arch/loongarch/include/asm/module.lds.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2020-2022 Loongson Technology Corporation Limited */
+SECTIONS {
+	. = ALIGN(4);
+	.plt : { BYTE(0) }
+	.plt.idx : { BYTE(0) }
+}
diff --git a/arch/loongarch/include/asm/vermagic.h b/arch/loongarch/include/asm/vermagic.h
new file mode 100644
index 000000000000..8b47ccfe3aad
--- /dev/null
+++ b/arch/loongarch/include/asm/vermagic.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_VERMAGIC_H
+#define _ASM_VERMAGIC_H
+
+#define MODULE_PROC_FAMILY "LOONGARCH "
+
+#ifdef CONFIG_32BIT
+#define MODULE_KERNEL_TYPE "32BIT "
+#elif defined CONFIG_64BIT
+#define MODULE_KERNEL_TYPE "64BIT "
+#endif
+
+#define MODULE_ARCH_VERMAGIC \
+	MODULE_PROC_FAMILY MODULE_KERNEL_TYPE
+
+#endif /* _ASM_VERMAGIC_H */
diff --git a/arch/loongarch/include/uapi/asm/auxvec.h b/arch/loongarch/include/uapi/asm/auxvec.h
new file mode 100644
index 000000000000..922d9e6b5058
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/auxvec.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASM_AUXVEC_H
+#define __ASM_AUXVEC_H
+
+/* Location of VDSO image. */
+#define AT_SYSINFO_EHDR		33
+
+#define AT_VECTOR_SIZE_ARCH 1 /* entries in ARCH_DLINFO */
+
+#endif /* __ASM_AUXVEC_H */
diff --git a/arch/loongarch/include/uapi/asm/hwcap.h b/arch/loongarch/include/uapi/asm/hwcap.h
new file mode 100644
index 000000000000..8840b72fa8e8
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/hwcap.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_ASM_HWCAP_H
+#define _UAPI_ASM_HWCAP_H
+
+/* HWCAP flags */
+#define HWCAP_LOONGARCH_CPUCFG		(1 << 0)
+#define HWCAP_LOONGARCH_LAM		(1 << 1)
+#define HWCAP_LOONGARCH_UAL		(1 << 2)
+#define HWCAP_LOONGARCH_FPU		(1 << 3)
+#define HWCAP_LOONGARCH_LSX		(1 << 4)
+#define HWCAP_LOONGARCH_LASX		(1 << 5)
+#define HWCAP_LOONGARCH_CRC32		(1 << 6)
+#define HWCAP_LOONGARCH_COMPLEX		(1 << 7)
+#define HWCAP_LOONGARCH_CRYPTO		(1 << 8)
+#define HWCAP_LOONGARCH_LVZ		(1 << 9)
+#define HWCAP_LOONGARCH_LBT_X86		(1 << 10)
+#define HWCAP_LOONGARCH_LBT_ARM		(1 << 11)
+#define HWCAP_LOONGARCH_LBT_MIPS	(1 << 12)
+
+#endif /* _UAPI_ASM_HWCAP_H */
diff --git a/arch/loongarch/kernel/elf.c b/arch/loongarch/kernel/elf.c
new file mode 100644
index 000000000000..183e94fc9c69
--- /dev/null
+++ b/arch/loongarch/kernel/elf.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/binfmts.h>
+#include <linux/elf.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+
+#include <asm/cpu-features.h>
+#include <asm/cpu-info.h>
+
+int arch_elf_pt_proc(void *_ehdr, void *_phdr, struct file *elf,
+		     bool is_interp, struct arch_elf_state *state)
+{
+	return 0;
+}
+
+int arch_check_elf(void *_ehdr, bool has_interpreter, void *_interp_ehdr,
+		   struct arch_elf_state *state)
+{
+	return 0;
+}
+
+void loongarch_set_personality_fcsr(struct arch_elf_state *state)
+{
+	current->thread.fpu.fcsr = boot_cpu_data.fpu_csr0;
+}
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
new file mode 100644
index 000000000000..ceb969a597b6
--- /dev/null
+++ b/arch/loongarch/kernel/inst.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <asm/inst.h>
+
+u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm)
+{
+	union loongarch_instruction insn;
+
+	insn.reg1i20_format.opcode = lu32id_op;
+	insn.reg1i20_format.rd = rd;
+	insn.reg1i20_format.simmediate = imm;
+
+	return insn.word;
+}
+
+u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
+{
+	union loongarch_instruction insn;
+
+	insn.reg2i12_format.opcode = lu52id_op;
+	insn.reg2i12_format.rd = rd;
+	insn.reg2i12_format.rj = rj;
+	insn.reg2i12_format.simmediate = imm;
+
+	return insn.word;
+}
+
+u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, unsigned long pc, unsigned long dest)
+{
+	union loongarch_instruction insn;
+
+	insn.reg2i16_format.opcode = jirl_op;
+	insn.reg2i16_format.rd = rd;
+	insn.reg2i16_format.rj = rj;
+	insn.reg2i16_format.simmediate = (dest - pc) >> 2;
+
+	return insn.word;
+}
diff --git a/arch/loongarch/kernel/module-sections.c b/arch/loongarch/kernel/module-sections.c
new file mode 100644
index 000000000000..d0a752291639
--- /dev/null
+++ b/arch/loongarch/kernel/module-sections.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+Elf_Addr module_emit_plt_entry(struct module *mod, u64 val)
+{
+	int nr;
+	struct mod_section *plt_sec = &mod->arch.plt;
+	struct mod_section *plt_idx_sec = &mod->arch.plt_idx;
+	struct plt_entry *plt = get_plt_entry(val, plt_sec, plt_idx_sec);
+	struct plt_idx_entry *plt_idx;
+
+	if (plt)
+		return (Elf_Addr)plt;
+
+	nr = plt_sec->num_entries;
+
+	/* There is no duplicate entry, create a new one */
+	plt = (struct plt_entry *)plt_sec->shdr->sh_addr;
+	plt_idx = (struct plt_idx_entry *)plt_idx_sec->shdr->sh_addr;
+	plt[nr] = emit_plt_entry(val, (u64)&plt[nr], (u64)&plt_idx[nr]);
+	plt_idx[nr] = emit_plt_idx_entry(val);
+
+	plt_sec->num_entries++;
+	plt_idx_sec->num_entries++;
+	BUG_ON(plt_sec->num_entries > plt_sec->max_entries);
+
+	return (Elf_Addr)&plt[nr];
+}
+
+static int is_rela_equal(const Elf_Rela *x, const Elf_Rela *y)
+{
+	return x->r_info == y->r_info && x->r_addend == y->r_addend;
+}
+
+static bool duplicate_rela(const Elf_Rela *rela, int idx)
+{
+	int i;
+
+	for (i = 0; i < idx; i++) {
+		if (is_rela_equal(&rela[i], &rela[idx]))
+			return true;
+	}
+
+	return false;
+}
+
+static void count_max_entries(Elf_Rela *relas, int num, unsigned int *plts)
+{
+	unsigned int i, type;
+
+	for (i = 0; i < num; i++) {
+		type = ELF_R_TYPE(relas[i].r_info);
+		if (type == R_LARCH_SOP_PUSH_PLT_PCREL) {
+			if (!duplicate_rela(relas, i))
+				(*plts)++;
+		}
+	}
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+			      char *secstrings, struct module *mod)
+{
+	unsigned int i, num_plts = 0;
+
+	/*
+	 * Find the empty .plt sections.
+	 */
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
+			mod->arch.plt.shdr = sechdrs + i;
+		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt.idx"))
+			mod->arch.plt_idx.shdr = sechdrs + i;
+	}
+
+	if (!mod->arch.plt.shdr) {
+		pr_err("%s: module PLT section(s) missing\n", mod->name);
+		return -ENOEXEC;
+	}
+	if (!mod->arch.plt_idx.shdr) {
+		pr_err("%s: module PLT.IDX section(s) missing\n", mod->name);
+		return -ENOEXEC;
+	}
+
+	/* Calculate the maximum number of entries */
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		int num_rela = sechdrs[i].sh_size / sizeof(Elf_Rela);
+		Elf_Rela *relas = (void *)ehdr + sechdrs[i].sh_offset;
+		Elf_Shdr *dst_sec = sechdrs + sechdrs[i].sh_info;
+
+		if (sechdrs[i].sh_type != SHT_RELA)
+			continue;
+
+		/* ignore relocations that operate on non-exec sections */
+		if (!(dst_sec->sh_flags & SHF_EXECINSTR))
+			continue;
+
+		count_max_entries(relas, num_rela, &num_plts);
+	}
+
+	mod->arch.plt.shdr->sh_type = SHT_NOBITS;
+	mod->arch.plt.shdr->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.plt.shdr->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.plt.shdr->sh_size = (num_plts + 1) * sizeof(struct plt_entry);
+	mod->arch.plt.num_entries = 0;
+	mod->arch.plt.max_entries = num_plts;
+
+	mod->arch.plt_idx.shdr->sh_type = SHT_NOBITS;
+	mod->arch.plt_idx.shdr->sh_flags = SHF_ALLOC;
+	mod->arch.plt_idx.shdr->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.plt_idx.shdr->sh_size = (num_plts + 1) * sizeof(struct plt_idx_entry);
+	mod->arch.plt_idx.num_entries = 0;
+	mod->arch.plt_idx.max_entries = num_plts;
+
+	return 0;
+}
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
new file mode 100644
index 000000000000..d2004bcfedcc
--- /dev/null
+++ b/arch/loongarch/kernel/module.c
@@ -0,0 +1,384 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Author: Hanlu Li <lihanlu@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#define pr_fmt(fmt) "kmod: " fmt
+
+#include <linux/moduleloader.h>
+#include <linux/elf.h>
+#include <linux/mm.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/string.h>
+#include <linux/kernel.h>
+
+static int rela_stack_push(s64 stack_value, s64 *rela_stack, size_t *rela_stack_top)
+{
+	if (*rela_stack_top >= RELA_STACK_DEPTH)
+		return -ENOEXEC;
+
+	rela_stack[(*rela_stack_top)++] = stack_value;
+	pr_debug("%s stack_value = 0x%llx\n", __func__, stack_value);
+
+	return 0;
+}
+
+static int rela_stack_pop(s64 *stack_value, s64 *rela_stack, size_t *rela_stack_top)
+{
+	if (*rela_stack_top == 0)
+		return -ENOEXEC;
+
+	*stack_value = rela_stack[--(*rela_stack_top)];
+	pr_debug("%s stack_value = 0x%llx\n", __func__, *stack_value);
+
+	return 0;
+}
+
+static int apply_r_larch_none(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	return 0;
+}
+
+static int apply_r_larch_32(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	*location = v;
+	return 0;
+}
+
+static int apply_r_larch_64(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	*(Elf_Addr *)location = v;
+	return 0;
+}
+
+static int apply_r_larch_sop_push_pcrel(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	return rela_stack_push(v - (u64)location, rela_stack, rela_stack_top);
+}
+
+static int apply_r_larch_sop_push_absolute(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	return rela_stack_push(v, rela_stack, rela_stack_top);
+}
+
+static int apply_r_larch_sop_push_dup(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	int err = 0;
+	s64 opr1;
+
+	err = rela_stack_pop(&opr1, rela_stack, rela_stack_top);
+	if (err)
+		return err;
+	err = rela_stack_push(opr1, rela_stack, rela_stack_top);
+	if (err)
+		return err;
+	err = rela_stack_push(opr1, rela_stack, rela_stack_top);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int apply_r_larch_sop_push_plt_pcrel(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	ptrdiff_t offset = (void *)v - (void *)location;
+
+	/*
+	 * Targets outside the +/-128M direct branch range go through a PLT entry.
+	 */
+	if (offset >= SZ_128M || offset < -SZ_128M)
+		v = module_emit_plt_entry(mod, v);
+
+	return apply_r_larch_sop_push_pcrel(mod, location, v, rela_stack, rela_stack_top, type);
+}
+
+static int apply_r_larch_sop(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	int err = 0;
+	s64 opr1, opr2, opr3;
+
+	if (type == R_LARCH_SOP_IF_ELSE) {
+		err = rela_stack_pop(&opr3, rela_stack, rela_stack_top);
+		if (err)
+			return err;
+	}
+
+	err = rela_stack_pop(&opr2, rela_stack, rela_stack_top);
+	if (err)
+		return err;
+	err = rela_stack_pop(&opr1, rela_stack, rela_stack_top);
+	if (err)
+		return err;
+
+	switch (type) {
+	case R_LARCH_SOP_AND:
+		err = rela_stack_push(opr1 & opr2, rela_stack, rela_stack_top);
+		break;
+	case R_LARCH_SOP_ADD:
+		err = rela_stack_push(opr1 + opr2, rela_stack, rela_stack_top);
+		break;
+	case R_LARCH_SOP_SUB:
+		err = rela_stack_push(opr1 - opr2, rela_stack, rela_stack_top);
+		break;
+	case R_LARCH_SOP_SL:
+		err = rela_stack_push(opr1 << opr2, rela_stack, rela_stack_top);
+		break;
+	case R_LARCH_SOP_SR:
+		err = rela_stack_push(opr1 >> opr2, rela_stack, rela_stack_top);
+		break;
+	case R_LARCH_SOP_IF_ELSE:
+		err = rela_stack_push(opr1 ? opr2 : opr3, rela_stack, rela_stack_top);
+		break;
+	default:
+		pr_err("%s: Unsupported relocation type %u handler\n", mod->name, type);
+		return -EINVAL;
+	}
+
+	return err;
+}
+
+static int apply_r_larch_sop_imm_field(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	int err = 0;
+	s64 opr1;
+
+	err = rela_stack_pop(&opr1, rela_stack, rela_stack_top);
+	if (err)
+		return err;
+
+	switch (type) {
+	case R_LARCH_SOP_POP_32_S_10_5:
+		/* check 5-bit signed */
+		if ((opr1 & ~(u64)0xf) && (opr1 & ~(u64)0xf) != ~(u64)0xf)
+			goto overflow;
+
+		/* (*(uint32_t *) PC) [14 ... 10] = opr [4 ... 0] */
+		*location = (*location & (~(u32)0x7c00)) | ((opr1 & 0x1f) << 10);
+		return 0;
+	case R_LARCH_SOP_POP_32_U_10_12:
+		/* check 12-bit unsigned */
+		if (opr1 & ~(u64)0xfff)
+			goto overflow;
+
+		/* (*(uint32_t *) PC) [21 ... 10] = opr [11 ... 0] */
+		*location = (*location & (~(u32)0x3ffc00)) | ((opr1 & 0xfff) << 10);
+		return 0;
+	case R_LARCH_SOP_POP_32_S_10_12:
+		/* check 12-bit signed */
+		if ((opr1 & ~(u64)0x7ff) && (opr1 & ~(u64)0x7ff) != ~(u64)0x7ff)
+			goto overflow;
+
+		/* (*(uint32_t *) PC) [21 ... 10] = opr [11 ... 0] */
+		*location = (*location & (~(u32)0x3ffc00)) | ((opr1 & 0xfff) << 10);
+		return 0;
+	case R_LARCH_SOP_POP_32_S_10_16:
+		/* check 16-bit signed */
+		if ((opr1 & ~(u64)0x7fff) && (opr1 & ~(u64)0x7fff) != ~(u64)0x7fff)
+			goto overflow;
+
+		/* (*(uint32_t *) PC) [25 ... 10] = opr [15 ... 0] */
+		*location = (*location & 0xfc0003ff) | ((opr1 & 0xffff) << 10);
+		return 0;
+	case R_LARCH_SOP_POP_32_S_10_16_S2:
+		if (opr1 % 4)
+			goto unaligned;
+
+		opr1 >>= 2;
+		/* check 18-bit signed */
+		if ((opr1 & ~(u64)0x7fff) && (opr1 & ~(u64)0x7fff) != ~(u64)0x7fff)
+			goto overflow;
+
+		/* (*(uint32_t *) PC) [25 ... 10] = opr [17 ... 2] */
+		*location = (*location & 0xfc0003ff) | ((opr1 & 0xffff) << 10);
+		return 0;
+	case R_LARCH_SOP_POP_32_S_5_20:
+		/* check 20-bit signed */
+		if ((opr1 & ~(u64)0x7ffff) && (opr1 & ~(u64)0x7ffff) != ~(u64)0x7ffff)
+			goto overflow;
+
+		/* (*(uint32_t *) PC) [24 ... 5] = opr [19 ... 0] */
+		*location = (*location & (~(u32)0x1ffffe0)) | ((opr1 & 0xfffff) << 5);
+		return 0;
+	case R_LARCH_SOP_POP_32_S_0_5_10_16_S2:
+		/* check 4-aligned */
+		if (opr1 % 4)
+			goto unaligned;
+
+		opr1 >>= 2;
+		/* check 23-bit signed */
+		if ((opr1 & ~(u64)0xfffff) && (opr1 & ~(u64)0xfffff) != ~(u64)0xfffff)
+			goto overflow;
+
+		/*
+		 * (*(uint32_t *) PC) [4 ... 0] = opr [22 ... 18]
+		 * (*(uint32_t *) PC) [25 ... 10] = opr [17 ... 2]
+		 */
+		*location = (*location & 0xfc0003e0)
+			| ((opr1 & 0x1f0000) >> 16) | ((opr1 & 0xffff) << 10);
+		return 0;
+	case R_LARCH_SOP_POP_32_S_0_10_10_16_S2:
+		/* check 4-aligned */
+		if (opr1 % 4)
+			goto unaligned;
+
+		opr1 >>= 2;
+		/* check 28-bit signed */
+		if ((opr1 & ~(u64)0x1ffffff) && (opr1 & ~(u64)0x1ffffff) != ~(u64)0x1ffffff)
+			goto overflow;
+
+		/*
+		 * (*(uint32_t *) PC) [9 ... 0] = opr [27 ... 18]
+		 * (*(uint32_t *) PC) [25 ... 10] = opr [17 ... 2]
+		 */
+		*location = (*location & 0xfc000000)
+			| ((opr1 & 0x3ff0000) >> 16) | ((opr1 & 0xffff) << 10);
+		return 0;
+	case R_LARCH_SOP_POP_32_U:
+		/* check 32-bit unsigned */
+		if (opr1 & ~(u64)0xffffffff)
+			goto overflow;
+
+		/* (*(uint32_t *) PC) = opr */
+		*location = (u32)opr1;
+		return 0;
+	default:
+		pr_err("%s: Unsupported relocation type %u handler\n", mod->name, type);
+		return -EINVAL;
+	}
+
+overflow:
+	pr_err("module %s: opr1 = 0x%llx overflow! dangerous %s relocation\n",
+		mod->name, opr1, __func__);
+	return -ENOEXEC;
+
+unaligned:
+	pr_err("module %s: opr1 = 0x%llx unaligned! dangerous %s relocation\n",
+		mod->name, opr1, __func__);
+	return -ENOEXEC;
+}
+
+static int apply_r_larch_add_sub(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+	switch (type) {
+	case R_LARCH_ADD32:
+		*(s32 *)location += v;
+		return 0;
+	case R_LARCH_ADD64:
+		*(s64 *)location += v;
+		return 0;
+	case R_LARCH_SUB32:
+		*(s32 *)location -= v;
+		return 0;
+	case R_LARCH_SUB64:
+		*(s64 *)location -= v;
+		return 0;
+	default:
+		pr_err("%s: Unsupported relocation type %u handler\n", mod->name, type);
+		return -EINVAL;
+	}
+}
+
+/*
+ * reloc_handlers_rela() - Apply a particular relocation to a module
+ * @mod: the module to apply the reloc to
+ * @location: the address at which the reloc is to be applied
+ * @v: the value of the reloc, with addend for RELA-style
+ * @rela_stack: the stack used to store relocation info, LOCAL to THIS module
+ * @rela_stack_top: where the stack operation (pop/push) applies to
+ *
+ * Return: 0 upon success, else -ERRNO
+ */
+typedef int (*reloc_rela_handler)(struct module *mod, u32 *location, Elf_Addr v,
+			s64 *rela_stack, size_t *rela_stack_top, unsigned int type);
+
+/* The handlers for known reloc types */
+static reloc_rela_handler reloc_rela_handlers[] = {
+	[R_LARCH_NONE ... R_LARCH_SUB64]		     = apply_r_larch_none,
+
+	[R_LARCH_32]					     = apply_r_larch_32,
+	[R_LARCH_64]					     = apply_r_larch_64,
+	[R_LARCH_SOP_PUSH_PCREL]			     = apply_r_larch_sop_push_pcrel,
+	[R_LARCH_SOP_PUSH_ABSOLUTE]			     = apply_r_larch_sop_push_absolute,
+	[R_LARCH_SOP_PUSH_DUP]				     = apply_r_larch_sop_push_dup,
+	[R_LARCH_SOP_PUSH_PLT_PCREL]			     = apply_r_larch_sop_push_plt_pcrel,
+	[R_LARCH_SOP_SUB ... R_LARCH_SOP_IF_ELSE]	     = apply_r_larch_sop,
+	[R_LARCH_SOP_POP_32_S_10_5 ... R_LARCH_SOP_POP_32_U] = apply_r_larch_sop_imm_field,
+	[R_LARCH_ADD32 ... R_LARCH_SUB64]		     = apply_r_larch_add_sub,
+};
+
+int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
+		       unsigned int symindex, unsigned int relsec,
+		       struct module *mod)
+{
+	int i, err;
+	unsigned int type;
+	s64 rela_stack[RELA_STACK_DEPTH];
+	size_t rela_stack_top = 0;
+	reloc_rela_handler handler;
+	void *location;
+	Elf_Addr v;
+	Elf_Sym *sym;
+	Elf_Rela *rel = (void *) sechdrs[relsec].sh_addr;
+
+	pr_debug("%s: Applying relocate section %u to %u\n", __func__, relsec,
+	       sechdrs[relsec].sh_info);
+
+	rela_stack_top = 0;
+	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
+		/* This is where to make the change */
+		location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr + rel[i].r_offset;
+		/* This is the symbol it is referring to */
+		sym = (Elf_Sym *)sechdrs[symindex].sh_addr + ELF_R_SYM(rel[i].r_info);
+		if (IS_ERR_VALUE(sym->st_value)) {
+			/* Ignore unresolved weak symbol */
+			if (ELF_ST_BIND(sym->st_info) == STB_WEAK)
+				continue;
+			pr_warn("%s: Unknown symbol %s\n", mod->name, strtab + sym->st_name);
+			return -ENOENT;
+		}
+
+		type = ELF_R_TYPE(rel[i].r_info);
+
+		if (type < ARRAY_SIZE(reloc_rela_handlers))
+			handler = reloc_rela_handlers[type];
+		else
+			handler = NULL;
+
+		if (!handler) {
+			pr_err("%s: Unknown relocation type %u\n", mod->name, type);
+			return -EINVAL;
+		}
+
+		pr_debug("type %d st_value %llx r_addend %llx loc %llx\n",
+		       (int)ELF64_R_TYPE(rel[i].r_info),
+		       sym->st_value, rel[i].r_addend, (u64)location);
+
+		v = sym->st_value + rel[i].r_addend;
+		err = handler(mod, location, v, rela_stack, &rela_stack_top, type);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+void *module_alloc(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
+}
-- 
2.27.0
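
Note for readers of apply_r_larch_sop_imm_field() above: the signed
range checks there use a mask idiom rather than a comparison. The
user-space sketch below is not part of the patch, and all function
names in it are made up for illustration. It shows the idiom for an
N-bit signed field, cross-checks it against a plain range comparison,
and shows how a 12-bit value is placed into bits [21:10] of an
instruction word, assuming the field layout given in the comments of
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mask-style test, in the spirit of the patch: every bit above the
 * magnitude bits must be all zero (non-negative value) or all one
 * (negative value). Valid for 1 < bits < 64.
 */
bool fits_signed_mask(int64_t v, unsigned int bits)
{
	uint64_t low = (UINT64_C(1) << (bits - 1)) - 1;	/* 0x7ff for 12 bits */
	uint64_t high = (uint64_t)v & ~low;

	return high == 0 || high == ~low;
}

/* Reference implementation: an ordinary range comparison. */
bool fits_signed_range(int64_t v, unsigned int bits)
{
	int64_t lim = INT64_C(1) << (bits - 1);

	return v >= -lim && v < lim;
}

/* Place the low 12 bits of v into bits [21:10] of an instruction word. */
uint32_t put_imm12_at_10(uint32_t insn, int64_t v)
{
	return (insn & ~UINT32_C(0x3ffc00)) | (((uint32_t)v & 0xfff) << 10);
}

/* Both tests must agree on every value around the 12-bit boundaries. */
void imm_field_selfcheck(void)
{
	for (int64_t v = -5000; v <= 5000; v++)
		assert(fits_signed_mask(v, 12) == fits_signed_range(v, 12));
}
```

Calling imm_field_selfcheck() from a small main() is enough to try it
out. The mask form needs no signed comparison, which is presumably why
this style is used when the operand has already been popped from the
relocation stack as a raw 64-bit value.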



* [PATCH V9 16/24] LoongArch: Add misc common routines
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (14 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 15/24] LoongArch: Add elf and module support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:50   ` Arnd Bergmann
  2022-04-30  9:05 ` [PATCH V9 17/24] LoongArch: Add some library functions Huacai Chen
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

Add some miscellaneous common routines for LoongArch, including:
asm-offsets routines, cmpxchg and futex functions, I/O memory access
functions, RTC functions, frame-buffer functions, procfs information
display, etc.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
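Note on the futex helpers in this patch: arch_futex_atomic_op_inuser()
computes a new value from the old one inside an ll.w/sc.w retry loop
and hands the old value back to the caller. The sketch below is a rough
user-space model of that behaviour only. It assumes C11 atomics with a
compare-exchange loop as a stand-in for LL/SC, the enum and function
names are hypothetical, and it has no fault handling (the kernel code
returns -EFAULT via the exception table and -ENOSYS for an unknown op,
where this model simply returns -1).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical op codes for this model; not the kernel's FUTEX_OP_* macros. */
enum model_op {
	MODEL_OP_SET,
	MODEL_OP_ADD,
	MODEL_OP_OR,
	MODEL_OP_ANDN,
	MODEL_OP_XOR,
};

/*
 * Atomically replace *uaddr with "old <op> oparg" and report the old
 * value through *oval. Returns 0 on success, -1 for an unknown op
 * (in which case *uaddr and *oval are left untouched).
 */
int model_futex_op(enum model_op op, uint32_t oparg, uint32_t *oval,
		   _Atomic uint32_t *uaddr)
{
	uint32_t old = atomic_load(uaddr);
	uint32_t newval;

	do {
		switch (op) {
		case MODEL_OP_SET:
			newval = oparg;
			break;
		case MODEL_OP_ADD:
			newval = old + oparg;
			break;
		case MODEL_OP_OR:
			newval = old | oparg;
			break;
		case MODEL_OP_ANDN:
			newval = old & ~oparg;
			break;
		case MODEL_OP_XOR:
			newval = old ^ oparg;
			break;
		default:
			return -1;
		}
		/* On failure, 'old' is refreshed and the op is recomputed. */
	} while (!atomic_compare_exchange_weak(uaddr, &old, newval));

	*oval = old;
	return 0;
}

/* Small self-check: OR-ing 0x0f into 0xf0 yields 0xff, old value 0xf0. */
void model_futex_selfcheck(void)
{
	_Atomic uint32_t word = 0xf0;
	uint32_t old = 0;

	assert(model_futex_op(MODEL_OP_OR, 0x0f, &old, &word) == 0);
	assert(old == 0xf0);
	assert(atomic_load(&word) == 0xff);
}
```

The retry on a failed compare-exchange plays the role of the
"beq $t0, $zero, 1b" branch after sc.w in the inline assembly.
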
 arch/loongarch/include/asm/asm-offsets.h |   5 +
 arch/loongarch/include/asm/fb.h          |  23 ++
 arch/loongarch/include/asm/futex.h       | 107 ++++++++++
 arch/loongarch/include/asm/io.h          | 129 ++++++++++++
 arch/loongarch/include/uapi/asm/swab.h   |  52 +++++
 arch/loongarch/kernel/asm-offsets.c      | 254 +++++++++++++++++++++++
 arch/loongarch/kernel/cmpxchg.c          | 102 +++++++++
 arch/loongarch/kernel/io.c               |  94 +++++++++
 arch/loongarch/kernel/proc.c             | 122 +++++++++++
 arch/loongarch/kernel/rtc.c              |  36 ++++
 10 files changed, 924 insertions(+)
 create mode 100644 arch/loongarch/include/asm/asm-offsets.h
 create mode 100644 arch/loongarch/include/asm/fb.h
 create mode 100644 arch/loongarch/include/asm/futex.h
 create mode 100644 arch/loongarch/include/asm/io.h
 create mode 100644 arch/loongarch/include/uapi/asm/swab.h
 create mode 100644 arch/loongarch/kernel/asm-offsets.c
 create mode 100644 arch/loongarch/kernel/cmpxchg.c
 create mode 100644 arch/loongarch/kernel/io.c
 create mode 100644 arch/loongarch/kernel/proc.c
 create mode 100644 arch/loongarch/kernel/rtc.c
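
Note on uapi/asm/swab.h in this patch: the byte swaps are built from
two instructions each (revb.2h + rotri.w for 32 bits, revb.4h + revh.d
for 64 bits). The portable C model below is only a sketch for checking
that reasoning on any host. The function names are invented for
illustration, and each helper models what the corresponding instruction
is understood to do on the relevant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Model of revb.2h: swap the two bytes inside each 16-bit halfword. */
uint32_t swap_bytes_in_halfwords32(uint32_t x)
{
	return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}

/* Model of rotri.w: rotate a 32-bit value right by n bits, 0 < n < 32. */
uint32_t rotr32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/* 32-bit byte swap as two steps: bytes within halfwords, then halfwords. */
uint32_t swab32_two_step(uint32_t x)
{
	return rotr32(swap_bytes_in_halfwords32(x), 16);
}

/* Model of revb.4h: swap the two bytes inside each of four halfwords. */
uint64_t swap_bytes_in_halfwords64(uint64_t x)
{
	return ((x & 0x00ff00ff00ff00ffull) << 8) |
	       ((x & 0xff00ff00ff00ff00ull) >> 8);
}

/* Model of revh.d: reverse the order of the four halfwords. */
uint64_t reverse_halfwords64(uint64_t x)
{
	return (x << 48) | ((x & 0xffff0000ull) << 16) |
	       ((x >> 16) & 0xffff0000ull) | (x >> 48);
}

/* 64-bit byte swap as two steps, mirroring the 32-bit case. */
uint64_t swab64_two_step(uint64_t x)
{
	return reverse_halfwords64(swap_bytes_in_halfwords64(x));
}

/* Swapping twice must give back the original value. */
void swab_selfcheck(void)
{
	assert(swab32_two_step(swab32_two_step(0x11223344u)) == 0x11223344u);
	assert(swab64_two_step(swab64_two_step(0x1122334455667788ull)) ==
	       0x1122334455667788ull);
}
```

For example, 0x11223344 first becomes 0x22114433 (bytes swapped inside
each halfword) and then 0x44332211 after the 16-bit rotate.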

diff --git a/arch/loongarch/include/asm/asm-offsets.h b/arch/loongarch/include/asm/asm-offsets.h
new file mode 100644
index 000000000000..d9ad88d293e7
--- /dev/null
+++ b/arch/loongarch/include/asm/asm-offsets.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <generated/asm-offsets.h>
diff --git a/arch/loongarch/include/asm/fb.h b/arch/loongarch/include/asm/fb.h
new file mode 100644
index 000000000000..3116bde8772d
--- /dev/null
+++ b/arch/loongarch/include/asm/fb.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_FB_H_
+#define _ASM_FB_H_
+
+#include <linux/fb.h>
+#include <linux/fs.h>
+#include <asm/page.h>
+
+static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
+				unsigned long off)
+{
+	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+}
+
+static inline int fb_is_primary_device(struct fb_info *info)
+{
+	return 0;
+}
+
+#endif /* _ASM_FB_H_ */
diff --git a/arch/loongarch/include/asm/futex.h b/arch/loongarch/include/asm/futex.h
new file mode 100644
index 000000000000..b27d55f92db7
--- /dev/null
+++ b/arch/loongarch/include/asm/futex.h
@@ -0,0 +1,107 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_FUTEX_H
+#define _ASM_FUTEX_H
+
+#include <linux/futex.h>
+#include <linux/uaccess.h>
+#include <asm/barrier.h>
+#include <asm/compiler.h>
+#include <asm/errno.h>
+
+#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)		\
+{									\
+	__asm__ __volatile__(						\
+	"1:	ll.w	%1, %4 # __futex_atomic_op\n"		\
+	"	" insn	"				\n"	\
+	"2:	sc.w	$t0, %2				\n"	\
+	"	beq	$t0, $zero, 1b			\n"	\
+	"3:						\n"	\
+	"	.section .fixup,\"ax\"			\n"	\
+	"4:	li.w	%0, %6				\n"	\
+	"	b	3b				\n"	\
+	"	.previous				\n"	\
+	"	.section __ex_table,\"a\"		\n"	\
+	"	"__UA_ADDR "\t1b, 4b			\n"	\
+	"	"__UA_ADDR "\t2b, 4b			\n"	\
+	"	.previous				\n"	\
+	: "=r" (ret), "=&r" (oldval),				\
+	  "=ZC" (*uaddr)					\
+	: "0" (0), "ZC" (*uaddr), "Jr" (oparg),			\
+	  "i" (-EFAULT)						\
+	: "memory", "t0");					\
+}
+
+static inline int
+arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
+{
+	int oldval = 0, ret = 0;
+
+	pagefault_disable();
+
+	switch (op) {
+	case FUTEX_OP_SET:
+		__futex_atomic_op("move $t0, %z5", ret, oldval, uaddr, oparg);
+		break;
+	case FUTEX_OP_ADD:
+		__futex_atomic_op("add.w $t0, %1, %z5", ret, oldval, uaddr, oparg);
+		break;
+	case FUTEX_OP_OR:
+		__futex_atomic_op("or	$t0, %1, %z5", ret, oldval, uaddr, oparg);
+		break;
+	case FUTEX_OP_ANDN:
+		__futex_atomic_op("and	$t0, %1, %z5", ret, oldval, uaddr, ~oparg);
+		break;
+	case FUTEX_OP_XOR:
+		__futex_atomic_op("xor	$t0, %1, %z5", ret, oldval, uaddr, oparg);
+		break;
+	default:
+		ret = -ENOSYS;
+	}
+
+	pagefault_enable();
+
+	if (!ret)
+		*oval = oldval;
+
+	return ret;
+}
+
+static inline int
+futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newval)
+{
+	int ret = 0;
+	u32 val = 0;
+
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
+
+	__asm__ __volatile__(
+	"# futex_atomic_cmpxchg_inatomic			\n"
+	"1:	ll.w	%1, %3					\n"
+	"	bne	%1, %z4, 3f				\n"
+	"	or	$t0, %z5, $zero				\n"
+	"2:	sc.w	$t0, %2					\n"
+	"	beq	$zero, $t0, 1b				\n"
+	"3:							\n"
+	"	.section .fixup,\"ax\"				\n"
+	"4:	li.d	%0, %6					\n"
+	"	b	3b					\n"
+	"	.previous					\n"
+	"	.section __ex_table,\"a\"			\n"
+	"	"__UA_ADDR "\t1b, 4b				\n"
+	"	"__UA_ADDR "\t2b, 4b				\n"
+	"	.previous					\n"
+	: "+r" (ret), "=&r" (val), "=" GCC_OFF_SMALL_ASM() (*uaddr)
+	: GCC_OFF_SMALL_ASM() (*uaddr), "Jr" (oldval), "Jr" (newval),
+	  "i" (-EFAULT)
+	: "memory", "t0");
+
+	*uval = val;
+
+	return ret;
+}
+
+#endif /* _ASM_FUTEX_H */
diff --git a/arch/loongarch/include/asm/io.h b/arch/loongarch/include/asm/io.h
new file mode 100644
index 000000000000..78df61109f90
--- /dev/null
+++ b/arch/loongarch/include/asm/io.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_IO_H
+#define _ASM_IO_H
+
+#define ARCH_HAS_IOREMAP_WC
+
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+
+#include <asm/addrspace.h>
+#include <asm/bug.h>
+#include <asm/byteorder.h>
+#include <asm/cpu.h>
+#include <asm/page.h>
+#include <asm/pgtable-bits.h>
+#include <asm/string.h>
+
+/*
+ * On LoongArch, the I/O port mapping is as follows:
+ *
+ *		|	  ....		|
+ *		|-----------------------|
+ *		| pci io ports(64K~32M)	|
+ *		|-----------------------|
+ *		| isa io ports(0  ~16K)	|
+ * PCI_IOBASE ->|-----------------------|
+ *		|	  ....		|
+ */
+#define PCI_IOBASE	((void __iomem *)(vm_map_base + (2 * PAGE_SIZE)))
+#define PCI_IOSIZE	SZ_32M
+#define ISA_IOSIZE	SZ_16K
+#define IO_SPACE_LIMIT	(PCI_IOSIZE - 1)
+
+/*
+ * Change "struct page" to physical address.
+ */
+#define page_to_phys(page)	((phys_addr_t)page_to_pfn(page) << PAGE_SHIFT)
+
+extern void __init __iomem *early_ioremap(u64 phys_addr, unsigned long size);
+extern void __init early_iounmap(void __iomem *addr, unsigned long size);
+
+#define early_memremap early_ioremap
+#define early_memunmap early_iounmap
+
+static inline void __iomem *ioremap_prot(phys_addr_t offset, unsigned long size,
+					 unsigned long prot_val)
+{
+	if (prot_val == _CACHE_CC)
+		return (void __iomem *)(unsigned long)(CAC_BASE + offset);
+	else
+		return (void __iomem *)(unsigned long)(UNCAC_BASE + offset);
+}
+
+/*
+ * ioremap -   map bus memory into CPU space
+ * @offset:    bus address of the memory
+ * @size:      size of the resource to map
+ *
+ * ioremap performs a platform specific sequence of operations to
+ * make bus memory CPU accessible via the readb/readw/readl/writeb/
+ * writew/writel functions and the other mmio helpers. The returned
+ * address is not guaranteed to be usable directly as a virtual
+ * address.
+ */
+#define ioremap(offset, size)					\
+	ioremap_prot((offset), (size), _CACHE_SUC)
+
+/*
+ * ioremap_wc - map bus memory into CPU space
+ * @offset:     bus address of the memory
+ * @size:       size of the resource to map
+ *
+ * ioremap_wc performs a platform specific sequence of operations to
+ * make bus memory CPU accessible via the readb/readw/readl/writeb/
+ * writew/writel functions and the other mmio helpers. The returned
+ * address is not guaranteed to be usable directly as a virtual
+ * address.
+ *
+ * This version of ioremap ensures that the memory is marked uncacheable
+ * but accelerated by means of the write-combining feature. It is
+ * especially useful for PCIe prefetchable windows, where it may vastly
+ * improve communication performance. If it is determined at boot that
+ * the CPU's CCA does not support WUC, the method falls back to the
+ * _CACHE_SUC option (see the cpu_probe() method).
+ */
+#define ioremap_wc(offset, size)				\
+	ioremap_prot((offset), (size), _CACHE_WUC)
+
+/*
+ * ioremap_cache -  map bus memory into CPU space
+ * @offset:	    bus address of the memory
+ * @size:	    size of the resource to map
+ *
+ * ioremap_cache performs a platform specific sequence of operations to
+ * make bus memory CPU accessible via the readb/readw/readl/writeb/
+ * writew/writel functions and the other mmio helpers. The returned
+ * address is not guaranteed to be usable directly as a virtual
+ * address.
+ *
+ * This version of ioremap ensures that the memory is marked cacheable by
+ * the CPU. It also enables full write-combining, which is useful for some
+ * memory-like regions on I/O buses.
+ */
+#define ioremap_cache(offset, size)				\
+	ioremap_prot((offset), (size), _CACHE_CC)
+
+static inline void iounmap(const volatile void __iomem *addr)
+{
+}
+
+#define mmiowb() asm volatile ("dbar 0" ::: "memory")
+
+/*
+ * String version of I/O memory access operations.
+ */
+extern void __memset_io(volatile void __iomem *dst, int c, size_t count);
+extern void __memcpy_toio(volatile void __iomem *to, const void *from, size_t count);
+extern void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count);
+#define memset_io(c, v, l)     __memset_io((c), (v), (l))
+#define memcpy_fromio(a, c, l) __memcpy_fromio((a), (c), (l))
+#define memcpy_toio(c, a, l)   __memcpy_toio((c), (a), (l))
+
+#include <asm-generic/io.h>
+
+#endif /* _ASM_IO_H */
diff --git a/arch/loongarch/include/uapi/asm/swab.h b/arch/loongarch/include/uapi/asm/swab.h
new file mode 100644
index 000000000000..95e02676b6fa
--- /dev/null
+++ b/arch/loongarch/include/uapi/asm/swab.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Authors: Jun Yi <yijun@loongson.cn>
+ *          Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_SWAB_H
+#define _ASM_SWAB_H
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+#define __SWAB_64_THRU_32__
+
+static inline __attribute_const__ __u16 __arch_swab16(__u16 x)
+{
+	__asm__(
+	"	revb.2h	%0, %1			\n"
+	: "=r" (x)
+	: "r" (x));
+
+	return x;
+}
+#define __arch_swab16 __arch_swab16
+
+static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
+{
+	__asm__(
+	"	revb.2h	%0, %1			\n"
+	"	rotri.w	%0, %0, 16		\n"
+	: "=r" (x)
+	: "r" (x));
+
+	return x;
+}
+#define __arch_swab32 __arch_swab32
+
+#ifdef __loongarch64
+static inline __attribute_const__ __u64 __arch_swab64(__u64 x)
+{
+	__asm__(
+	"	revb.4h	%0, %1			\n"
+	"	revh.d	%0, %0			\n"
+	: "=r" (x)
+	: "r" (x));
+
+	return x;
+}
+#define __arch_swab64 __arch_swab64
+#endif /* __loongarch64 */
+#endif /* _ASM_SWAB_H */
diff --git a/arch/loongarch/kernel/asm-offsets.c b/arch/loongarch/kernel/asm-offsets.c
new file mode 100644
index 000000000000..3531e3c60a6e
--- /dev/null
+++ b/arch/loongarch/kernel/asm-offsets.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * asm-offsets.c: Calculate pt_regs and task_struct offsets.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/types.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/kbuild.h>
+#include <linux/suspend.h>
+#include <asm/cpu-info.h>
+#include <asm/ptrace.h>
+#include <asm/processor.h>
+
+void output_ptreg_defines(void)
+{
+	COMMENT("LoongArch pt_regs offsets.");
+	OFFSET(PT_R0, pt_regs, regs[0]);
+	OFFSET(PT_R1, pt_regs, regs[1]);
+	OFFSET(PT_R2, pt_regs, regs[2]);
+	OFFSET(PT_R3, pt_regs, regs[3]);
+	OFFSET(PT_R4, pt_regs, regs[4]);
+	OFFSET(PT_R5, pt_regs, regs[5]);
+	OFFSET(PT_R6, pt_regs, regs[6]);
+	OFFSET(PT_R7, pt_regs, regs[7]);
+	OFFSET(PT_R8, pt_regs, regs[8]);
+	OFFSET(PT_R9, pt_regs, regs[9]);
+	OFFSET(PT_R10, pt_regs, regs[10]);
+	OFFSET(PT_R11, pt_regs, regs[11]);
+	OFFSET(PT_R12, pt_regs, regs[12]);
+	OFFSET(PT_R13, pt_regs, regs[13]);
+	OFFSET(PT_R14, pt_regs, regs[14]);
+	OFFSET(PT_R15, pt_regs, regs[15]);
+	OFFSET(PT_R16, pt_regs, regs[16]);
+	OFFSET(PT_R17, pt_regs, regs[17]);
+	OFFSET(PT_R18, pt_regs, regs[18]);
+	OFFSET(PT_R19, pt_regs, regs[19]);
+	OFFSET(PT_R20, pt_regs, regs[20]);
+	OFFSET(PT_R21, pt_regs, regs[21]);
+	OFFSET(PT_R22, pt_regs, regs[22]);
+	OFFSET(PT_R23, pt_regs, regs[23]);
+	OFFSET(PT_R24, pt_regs, regs[24]);
+	OFFSET(PT_R25, pt_regs, regs[25]);
+	OFFSET(PT_R26, pt_regs, regs[26]);
+	OFFSET(PT_R27, pt_regs, regs[27]);
+	OFFSET(PT_R28, pt_regs, regs[28]);
+	OFFSET(PT_R29, pt_regs, regs[29]);
+	OFFSET(PT_R30, pt_regs, regs[30]);
+	OFFSET(PT_R31, pt_regs, regs[31]);
+	OFFSET(PT_CRMD, pt_regs, csr_crmd);
+	OFFSET(PT_PRMD, pt_regs, csr_prmd);
+	OFFSET(PT_EUEN, pt_regs, csr_euen);
+	OFFSET(PT_ECFG, pt_regs, csr_ecfg);
+	OFFSET(PT_ESTAT, pt_regs, csr_estat);
+	OFFSET(PT_ERA, pt_regs, csr_era);
+	OFFSET(PT_BVADDR, pt_regs, csr_badvaddr);
+	OFFSET(PT_ORIG_A0, pt_regs, orig_a0);
+	DEFINE(PT_SIZE, sizeof(struct pt_regs));
+	BLANK();
+}
+
+void output_task_defines(void)
+{
+	COMMENT("LoongArch task_struct offsets.");
+	OFFSET(TASK_STATE, task_struct, __state);
+	OFFSET(TASK_THREAD_INFO, task_struct, stack);
+	OFFSET(TASK_FLAGS, task_struct, flags);
+	OFFSET(TASK_MM, task_struct, mm);
+	OFFSET(TASK_PID, task_struct, pid);
+	DEFINE(TASK_STRUCT_SIZE, sizeof(struct task_struct));
+	BLANK();
+}
+
+void output_thread_info_defines(void)
+{
+	COMMENT("LoongArch thread_info offsets.");
+	OFFSET(TI_TASK, thread_info, task);
+	OFFSET(TI_FLAGS, thread_info, flags);
+	OFFSET(TI_TP_VALUE, thread_info, tp_value);
+	OFFSET(TI_CPU, thread_info, cpu);
+	OFFSET(TI_PRE_COUNT, thread_info, preempt_count);
+	OFFSET(TI_REGS, thread_info, regs);
+	DEFINE(_THREAD_SIZE, THREAD_SIZE);
+	DEFINE(_THREAD_MASK, THREAD_MASK);
+	DEFINE(_IRQ_STACK_SIZE, IRQ_STACK_SIZE);
+	DEFINE(_IRQ_STACK_START, IRQ_STACK_START);
+	BLANK();
+}
+
+void output_thread_defines(void)
+{
+	COMMENT("LoongArch specific thread_struct offsets.");
+	OFFSET(THREAD_REG01, task_struct, thread.reg01);
+	OFFSET(THREAD_REG03, task_struct, thread.reg03);
+	OFFSET(THREAD_REG22, task_struct, thread.reg22);
+	OFFSET(THREAD_REG23, task_struct, thread.reg23);
+	OFFSET(THREAD_REG24, task_struct, thread.reg24);
+	OFFSET(THREAD_REG25, task_struct, thread.reg25);
+	OFFSET(THREAD_REG26, task_struct, thread.reg26);
+	OFFSET(THREAD_REG27, task_struct, thread.reg27);
+	OFFSET(THREAD_REG28, task_struct, thread.reg28);
+	OFFSET(THREAD_REG29, task_struct, thread.reg29);
+	OFFSET(THREAD_REG30, task_struct, thread.reg30);
+	OFFSET(THREAD_REG31, task_struct, thread.reg31);
+	OFFSET(THREAD_CSRCRMD, task_struct,
+	       thread.csr_crmd);
+	OFFSET(THREAD_CSRPRMD, task_struct,
+	       thread.csr_prmd);
+	OFFSET(THREAD_CSREUEN, task_struct,
+	       thread.csr_euen);
+	OFFSET(THREAD_CSRECFG, task_struct,
+	       thread.csr_ecfg);
+
+	OFFSET(THREAD_SCR0, task_struct, thread.scr0);
+	OFFSET(THREAD_SCR1, task_struct, thread.scr1);
+	OFFSET(THREAD_SCR2, task_struct, thread.scr2);
+	OFFSET(THREAD_SCR3, task_struct, thread.scr3);
+
+	OFFSET(THREAD_EFLAGS, task_struct, thread.eflags);
+
+	OFFSET(THREAD_FPU, task_struct, thread.fpu);
+
+	OFFSET(THREAD_BVADDR, task_struct,
+	       thread.csr_badvaddr);
+	OFFSET(THREAD_ECODE, task_struct,
+	       thread.error_code);
+	OFFSET(THREAD_TRAPNO, task_struct, thread.trap_nr);
+	BLANK();
+}
+
+void output_thread_fpu_defines(void)
+{
+	OFFSET(THREAD_FPR0, loongarch_fpu, fpr[0]);
+	OFFSET(THREAD_FPR1, loongarch_fpu, fpr[1]);
+	OFFSET(THREAD_FPR2, loongarch_fpu, fpr[2]);
+	OFFSET(THREAD_FPR3, loongarch_fpu, fpr[3]);
+	OFFSET(THREAD_FPR4, loongarch_fpu, fpr[4]);
+	OFFSET(THREAD_FPR5, loongarch_fpu, fpr[5]);
+	OFFSET(THREAD_FPR6, loongarch_fpu, fpr[6]);
+	OFFSET(THREAD_FPR7, loongarch_fpu, fpr[7]);
+	OFFSET(THREAD_FPR8, loongarch_fpu, fpr[8]);
+	OFFSET(THREAD_FPR9, loongarch_fpu, fpr[9]);
+	OFFSET(THREAD_FPR10, loongarch_fpu, fpr[10]);
+	OFFSET(THREAD_FPR11, loongarch_fpu, fpr[11]);
+	OFFSET(THREAD_FPR12, loongarch_fpu, fpr[12]);
+	OFFSET(THREAD_FPR13, loongarch_fpu, fpr[13]);
+	OFFSET(THREAD_FPR14, loongarch_fpu, fpr[14]);
+	OFFSET(THREAD_FPR15, loongarch_fpu, fpr[15]);
+	OFFSET(THREAD_FPR16, loongarch_fpu, fpr[16]);
+	OFFSET(THREAD_FPR17, loongarch_fpu, fpr[17]);
+	OFFSET(THREAD_FPR18, loongarch_fpu, fpr[18]);
+	OFFSET(THREAD_FPR19, loongarch_fpu, fpr[19]);
+	OFFSET(THREAD_FPR20, loongarch_fpu, fpr[20]);
+	OFFSET(THREAD_FPR21, loongarch_fpu, fpr[21]);
+	OFFSET(THREAD_FPR22, loongarch_fpu, fpr[22]);
+	OFFSET(THREAD_FPR23, loongarch_fpu, fpr[23]);
+	OFFSET(THREAD_FPR24, loongarch_fpu, fpr[24]);
+	OFFSET(THREAD_FPR25, loongarch_fpu, fpr[25]);
+	OFFSET(THREAD_FPR26, loongarch_fpu, fpr[26]);
+	OFFSET(THREAD_FPR27, loongarch_fpu, fpr[27]);
+	OFFSET(THREAD_FPR28, loongarch_fpu, fpr[28]);
+	OFFSET(THREAD_FPR29, loongarch_fpu, fpr[29]);
+	OFFSET(THREAD_FPR30, loongarch_fpu, fpr[30]);
+	OFFSET(THREAD_FPR31, loongarch_fpu, fpr[31]);
+
+	OFFSET(THREAD_FCSR, loongarch_fpu, fcsr);
+	OFFSET(THREAD_FCC,  loongarch_fpu, fcc);
+	OFFSET(THREAD_VCSR, loongarch_fpu, vcsr);
+	BLANK();
+}
+
+void output_mm_defines(void)
+{
+	COMMENT("Size of struct page");
+	DEFINE(STRUCT_PAGE_SIZE, sizeof(struct page));
+	BLANK();
+	COMMENT("Linux mm_struct offsets.");
+	OFFSET(MM_USERS, mm_struct, mm_users);
+	OFFSET(MM_PGD, mm_struct, pgd);
+	OFFSET(MM_CONTEXT, mm_struct, context);
+	BLANK();
+	DEFINE(_PGD_T_SIZE, sizeof(pgd_t));
+	DEFINE(_PMD_T_SIZE, sizeof(pmd_t));
+	DEFINE(_PTE_T_SIZE, sizeof(pte_t));
+	BLANK();
+	DEFINE(_PGD_T_LOG2, PGD_T_LOG2);
+#ifndef __PAGETABLE_PMD_FOLDED
+	DEFINE(_PMD_T_LOG2, PMD_T_LOG2);
+#endif
+	DEFINE(_PTE_T_LOG2, PTE_T_LOG2);
+	BLANK();
+	DEFINE(_PGD_ORDER, PGD_ORDER);
+#ifndef __PAGETABLE_PMD_FOLDED
+	DEFINE(_PMD_ORDER, PMD_ORDER);
+#endif
+	DEFINE(_PTE_ORDER, PTE_ORDER);
+	BLANK();
+	DEFINE(_PMD_SHIFT, PMD_SHIFT);
+	DEFINE(_PGDIR_SHIFT, PGDIR_SHIFT);
+	BLANK();
+	DEFINE(_PTRS_PER_PGD, PTRS_PER_PGD);
+	DEFINE(_PTRS_PER_PMD, PTRS_PER_PMD);
+	DEFINE(_PTRS_PER_PTE, PTRS_PER_PTE);
+	BLANK();
+	DEFINE(_PAGE_SHIFT, PAGE_SHIFT);
+	DEFINE(_PAGE_SIZE, PAGE_SIZE);
+	BLANK();
+}
+
+void output_sc_defines(void)
+{
+	COMMENT("Linux sigcontext offsets.");
+	OFFSET(SC_REGS, sigcontext, sc_regs);
+	OFFSET(SC_PC, sigcontext, sc_pc);
+	BLANK();
+}
+
+void output_signal_defines(void)
+{
+	COMMENT("Linux signal numbers.");
+	DEFINE(_SIGHUP, SIGHUP);
+	DEFINE(_SIGINT, SIGINT);
+	DEFINE(_SIGQUIT, SIGQUIT);
+	DEFINE(_SIGILL, SIGILL);
+	DEFINE(_SIGTRAP, SIGTRAP);
+	DEFINE(_SIGIOT, SIGIOT);
+	DEFINE(_SIGABRT, SIGABRT);
+	DEFINE(_SIGFPE, SIGFPE);
+	DEFINE(_SIGKILL, SIGKILL);
+	DEFINE(_SIGBUS, SIGBUS);
+	DEFINE(_SIGSEGV, SIGSEGV);
+	DEFINE(_SIGSYS, SIGSYS);
+	DEFINE(_SIGPIPE, SIGPIPE);
+	DEFINE(_SIGALRM, SIGALRM);
+	DEFINE(_SIGTERM, SIGTERM);
+	DEFINE(_SIGUSR1, SIGUSR1);
+	DEFINE(_SIGUSR2, SIGUSR2);
+	DEFINE(_SIGCHLD, SIGCHLD);
+	DEFINE(_SIGPWR, SIGPWR);
+	DEFINE(_SIGWINCH, SIGWINCH);
+	DEFINE(_SIGURG, SIGURG);
+	DEFINE(_SIGIO, SIGIO);
+	DEFINE(_SIGSTOP, SIGSTOP);
+	DEFINE(_SIGTSTP, SIGTSTP);
+	DEFINE(_SIGCONT, SIGCONT);
+	DEFINE(_SIGTTIN, SIGTTIN);
+	DEFINE(_SIGTTOU, SIGTTOU);
+	DEFINE(_SIGVTALRM, SIGVTALRM);
+	DEFINE(_SIGPROF, SIGPROF);
+	DEFINE(_SIGXCPU, SIGXCPU);
+	DEFINE(_SIGXFSZ, SIGXFSZ);
+	BLANK();
+}
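
For readers unfamiliar with asm-offsets.c: the OFFSET() and DEFINE() lines above exist so that assembly code can use structure offsets and constants by name. In the kernel build these macros emit marker lines into the compiler's assembly output, which kbuild then post-processes into a generated header. The sketch below only imitates the end result in user space with offsetof() and fprintf(); struct demo_fpu, EMIT_OFFSET() and emit_demo_offsets() are hypothetical names that do not appear in the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel structure such as loongarch_fpu. */
struct demo_fpu {
	unsigned long long fpr[32];
	unsigned int fcsr;
	unsigned int vcsr;
	unsigned long long fcc;
};

/*
 * Print one "#define NAME value" line, roughly the shape of an entry in
 * the generated offsets header. This is an illustration, not kbuild's
 * actual mechanism.
 */
#define EMIT_OFFSET(fp, sym, type, member) \
	fprintf(fp, "#define %s %zu\n", #sym, offsetof(struct type, member))

static void emit_demo_offsets(FILE *fp)
{
	EMIT_OFFSET(fp, DEMO_FPR0, demo_fpu, fpr[0]);
	EMIT_OFFSET(fp, DEMO_FPR1, demo_fpu, fpr[1]);
	EMIT_OFFSET(fp, DEMO_FCSR, demo_fpu, fcsr);
}
```

Calling emit_demo_offsets(stdout) prints one #define per member, so an assembly file could refer to DEMO_FCSR rather than hard-coding the byte offset.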
diff --git a/arch/loongarch/kernel/cmpxchg.c b/arch/loongarch/kernel/cmpxchg.c
new file mode 100644
index 000000000000..7994489adc79
--- /dev/null
+++ b/arch/loongarch/kernel/cmpxchg.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 2017 Imagination Technologies
+ * Author: Paul Burton <paul.burton@mips.com>
+ */
+
+#include <linux/bug.h>
+#include <asm/barrier.h>
+#include <asm/cmpxchg.h>
+
+unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int size)
+{
+	u32 old32, mask, temp;
+	volatile u32 *ptr32;
+	unsigned int shift;
+
+	/* Check that ptr is naturally aligned */
+	WARN_ON((unsigned long)ptr & (size - 1));
+
+	/* Mask value to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	val &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * exchange within the naturally aligned 4 byte integer that includes
+	 * it.
+	 */
+	shift = (unsigned long)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile u32 *)((unsigned long)ptr & ~0x3);
+
+	asm volatile (
+	"1:	ll.w		%0, %3		\n"
+	"	andn		%1, %0, %4	\n"
+	"	or		%1, %1, %5	\n"
+	"	sc.w		%1, %2		\n"
+	"	beqz		%1, 1b		\n"
+	: "=&r" (old32), "=&r" (temp), "=" GCC_OFF_SMALL_ASM() (*ptr32)
+	: GCC_OFF_SMALL_ASM() (*ptr32), "Jr" (mask), "Jr" (val << shift)
+	: "memory");
+
+	return (old32 & mask) >> shift;
+}
+
+unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
+			      unsigned long new, unsigned int size)
+{
+	u32 old32, mask, temp;
+	volatile u32 *ptr32;
+	unsigned int shift;
+
+	/* Check that ptr is naturally aligned */
+	WARN_ON((unsigned long)ptr & (size - 1));
+
+	/* Mask inputs to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	old &= mask;
+	new &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * compare & exchange within the naturally aligned 4 byte integer
+	 * that includes it.
+	 */
+	shift = (unsigned long)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	old <<= shift;
+	new <<= shift;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile u32 *)((unsigned long)ptr & ~0x3);
+
+	asm volatile (
+	"1:	ll.w		%0, %3		\n"
+	"	and		%1, %0, %4	\n"
+	"	bne		%1, %5, 2f	\n"
+	"	andn		%1, %0, %4	\n"
+	"	or		%1, %1, %6	\n"
+	"	sc.w		%1, %2		\n"
+	"	beqz		%1, 1b		\n"
+	"2:					\n"
+	: "=&r" (old32), "=&r" (temp), "=" GCC_OFF_SMALL_ASM() (*ptr32)
+	: GCC_OFF_SMALL_ASM() (*ptr32), "Jr" (mask), "Jr" (old), "Jr" (new)
+	: "memory");
+
+	return (old32 & mask) >> shift;
+}
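
The shift-and-mask arithmetic in __xchg_small() and __cmpxchg_small() can be tried out in user space. The sketch below is not the kernel code: it replaces the ll.w/sc.w loop with C11 atomic_compare_exchange_weak(), assumes little-endian byte numbering, and takes the byte offset as an explicit argument instead of deriving it from the pointer. The name cmpxchg_small_sketch() is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical user-space analogue of __cmpxchg_small(): compare-and-exchange
 * a 1- or 2-byte field at byte offset byte_off (0..3) inside an aligned
 * 32-bit word. Returns the previous value of the field.
 */
static uint32_t cmpxchg_small_sketch(_Atomic uint32_t *word,
				     unsigned int byte_off, unsigned int size,
				     uint32_t old, uint32_t new_val)
{
	unsigned int shift = byte_off * 8;
	uint32_t mask = ((size == 1) ? 0xffu : 0xffffu) << shift;
	uint32_t cur = atomic_load(word);

	old = (old << shift) & mask;
	new_val = (new_val << shift) & mask;

	for (;;) {
		/* Field differs from 'old': give up, like "bne %1, %5, 2f". */
		if ((cur & mask) != old)
			break;
		/* Keep the neighbouring bytes, replace only the masked field. */
		if (atomic_compare_exchange_weak(word, &cur,
						 (cur & ~mask) | new_val))
			break;
		/* The CAS failed and reloaded 'cur'; retry, like "beqz %1, 1b". */
	}

	return (cur & mask) >> shift;
}
```

As in the patch, the caller learns whether the exchange happened by comparing the return value with the expected old value.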
diff --git a/arch/loongarch/kernel/io.c b/arch/loongarch/kernel/io.c
new file mode 100644
index 000000000000..cb85bda5a6ad
--- /dev/null
+++ b/arch/loongarch/kernel/io.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/export.h>
+#include <linux/types.h>
+#include <linux/io.h>
+
+/*
+ * Copy data from IO memory space to "real" memory space.
+ */
+void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count)
+{
+	while (count && !IS_ALIGNED((unsigned long)from, 8)) {
+		*(u8 *)to = __raw_readb(from);
+		from++;
+		to++;
+		count--;
+	}
+
+	while (count >= 8) {
+		*(u64 *)to = __raw_readq(from);
+		from += 8;
+		to += 8;
+		count -= 8;
+	}
+
+	while (count) {
+		*(u8 *)to = __raw_readb(from);
+		from++;
+		to++;
+		count--;
+	}
+}
+EXPORT_SYMBOL(__memcpy_fromio);
+
+/*
+ * Copy data from "real" memory space to IO memory space.
+ */
+void __memcpy_toio(volatile void __iomem *to, const void *from, size_t count)
+{
+	while (count && !IS_ALIGNED((unsigned long)to, 8)) {
+		__raw_writeb(*(u8 *)from, to);
+		from++;
+		to++;
+		count--;
+	}
+
+	while (count >= 8) {
+		__raw_writeq(*(u64 *)from, to);
+		from += 8;
+		to += 8;
+		count -= 8;
+	}
+
+	while (count) {
+		__raw_writeb(*(u8 *)from, to);
+		from++;
+		to++;
+		count--;
+	}
+}
+EXPORT_SYMBOL(__memcpy_toio);
+
+/*
+ * "memset" on IO memory space.
+ */
+void __memset_io(volatile void __iomem *dst, int c, size_t count)
+{
+	u64 qc = (u8)c;
+
+	qc |= qc << 8;
+	qc |= qc << 16;
+	qc |= qc << 32;
+
+	while (count && !IS_ALIGNED((unsigned long)dst, 8)) {
+		__raw_writeb(c, dst);
+		dst++;
+		count--;
+	}
+
+	while (count >= 8) {
+		__raw_writeq(qc, dst);
+		dst += 8;
+		count -= 8;
+	}
+
+	while (count) {
+		__raw_writeb(c, dst);
+		dst++;
+		count--;
+	}
+}
+EXPORT_SYMBOL(__memset_io);
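
The three helpers in io.c share one shape: byte accesses until the I/O pointer is 8-byte aligned, 8-byte accesses for the bulk, then byte accesses for the tail. __memset_io() additionally replicates the fill byte into all eight lanes of a u64. The sketch below shows both ideas on ordinary memory, assuming memcpy() as a stand-in for __raw_writeq(); replicate_byte() and memset_sketch() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all eight lanes of a 64-bit value. */
static uint64_t replicate_byte(int c)
{
	uint64_t qc = (uint8_t)c;

	qc |= qc << 8;	/* 2 copies */
	qc |= qc << 16;	/* 4 copies */
	qc |= qc << 32;	/* 8 copies */
	return qc;
}

/*
 * Hypothetical plain-memory analogue of __memset_io(): head bytes until
 * the destination is 8-byte aligned, 8-byte stores for the bulk, then
 * tail bytes.
 */
static void memset_sketch(void *dst, int c, size_t count)
{
	unsigned char *p = dst;
	uint64_t qc = replicate_byte(c);

	while (count && ((uintptr_t)p & 7)) {
		*p++ = (unsigned char)c;
		count--;
	}

	while (count >= 8) {
		memcpy(p, &qc, 8);	/* stands in for __raw_writeq() */
		p += 8;
		count -= 8;
	}

	while (count) {
		*p++ = (unsigned char)c;
		count--;
	}
}
```

Only the device-side pointer is aligned in the patch; the "real" memory side of __memcpy_fromio() and __memcpy_toio() is accessed at whatever alignment results.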
diff --git a/arch/loongarch/kernel/proc.c b/arch/loongarch/kernel/proc.c
new file mode 100644
index 000000000000..1f2ee5b15b50
--- /dev/null
+++ b/arch/loongarch/kernel/proc.c
@@ -0,0 +1,122 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/delay.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <asm/bootinfo.h>
+#include <asm/cpu.h>
+#include <asm/cpu-features.h>
+#include <asm/idle.h>
+#include <asm/processor.h>
+#include <asm/time.h>
+
+/*
+ * No lock; only written during early bootup by CPU 0.
+ */
+static RAW_NOTIFIER_HEAD(proc_cpuinfo_chain);
+
+int __ref register_proc_cpuinfo_notifier(struct notifier_block *nb)
+{
+	return raw_notifier_chain_register(&proc_cpuinfo_chain, nb);
+}
+
+int proc_cpuinfo_notifier_call_chain(unsigned long val, void *v)
+{
+	return raw_notifier_call_chain(&proc_cpuinfo_chain, val, v);
+}
+
+static int show_cpuinfo(struct seq_file *m, void *v)
+{
+	unsigned long n = (unsigned long) v - 1;
+	unsigned int version = cpu_data[n].processor_id & 0xff;
+	unsigned int fp_version = cpu_data[n].fpu_vers;
+	struct proc_cpuinfo_notifier_args proc_cpuinfo_notifier_args;
+
+	/*
+	 * For the first processor also print the system type
+	 */
+	if (n == 0)
+		seq_printf(m, "system type\t\t: %s\n", get_system_type());
+
+	seq_printf(m, "processor\t\t: %ld\n", n);
+	seq_printf(m, "package\t\t\t: %d\n", cpu_data[n].package);
+	seq_printf(m, "core\t\t\t: %d\n", cpu_core(&cpu_data[n]));
+	seq_printf(m, "CPU Family\t\t: %s\n", __cpu_family[n]);
+	seq_printf(m, "Model Name\t\t: %s\n", __cpu_full_name[n]);
+	seq_printf(m, "CPU Revision\t\t: 0x%02x\n", version);
+	seq_printf(m, "FPU Revision\t\t: 0x%02x\n", fp_version);
+	seq_printf(m, "CPU MHz\t\t\t: %llu.%02llu\n",
+		      cpu_clock_freq / 1000000, (cpu_clock_freq / 10000) % 100);
+	seq_printf(m, "BogoMIPS\t\t: %llu.%02llu\n",
+		      (lpj_fine * cpu_clock_freq / const_clock_freq) / (500000/HZ),
+		      ((lpj_fine * cpu_clock_freq / const_clock_freq) / (5000/HZ)) % 100);
+	seq_printf(m, "TLB Entries\t\t: %d\n", cpu_data[n].tlbsize);
+	seq_printf(m, "Address Sizes\t\t: %d bits physical, %d bits virtual\n",
+		      cpu_pabits + 1, cpu_vabits + 1);
+
+	seq_printf(m, "ISA\t\t\t:");
+	if (cpu_has_loongarch32)
+		seq_printf(m, "%s", " loongarch32");
+	if (cpu_has_loongarch64)
+		seq_printf(m, "%s", " loongarch64");
+	seq_printf(m, "\n");
+
+	seq_printf(m, "Features\t\t:");
+	if (cpu_has_cpucfg)	seq_printf(m, "%s", " cpucfg");
+	if (cpu_has_lam)	seq_printf(m, "%s", " lam");
+	if (cpu_has_ual)	seq_printf(m, "%s", " ual");
+	if (cpu_has_fpu)	seq_printf(m, "%s", " fpu");
+	if (cpu_has_lsx)	seq_printf(m, "%s", " lsx");
+	if (cpu_has_lasx)	seq_printf(m, "%s", " lasx");
+	if (cpu_has_complex)	seq_printf(m, "%s", " complex");
+	if (cpu_has_crypto)	seq_printf(m, "%s", " crypto");
+	if (cpu_has_lvz)	seq_printf(m, "%s", " lvz");
+	if (cpu_has_lbt_x86)	seq_printf(m, "%s", " lbt_x86");
+	if (cpu_has_lbt_arm)	seq_printf(m, "%s", " lbt_arm");
+	if (cpu_has_lbt_mips)	seq_printf(m, "%s", " lbt_mips");
+	seq_printf(m, "\n");
+
+	seq_printf(m, "Hardware Watchpoint\t: %s",
+		      cpu_has_watch ? "yes, " : "no\n");
+	if (cpu_has_watch) {
+		seq_printf(m, "iwatch count: %d, dwatch count: %d\n",
+		      cpu_data[n].watch_ireg_count, cpu_data[n].watch_dreg_count);
+	}
+
+	proc_cpuinfo_notifier_args.m = m;
+	proc_cpuinfo_notifier_args.n = n;
+
+	raw_notifier_call_chain(&proc_cpuinfo_chain, 0,
+				&proc_cpuinfo_notifier_args);
+
+	seq_printf(m, "\n");
+
+	return 0;
+}
+
+static void *c_start(struct seq_file *m, loff_t *pos)
+{
+	unsigned long i = *pos;
+
+	return i < NR_CPUS ? (void *)(i + 1) : NULL;
+}
+
+static void *c_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	++*pos;
+	return c_start(m, pos);
+}
+
+static void c_stop(struct seq_file *m, void *v)
+{
+}
+
+const struct seq_operations cpuinfo_op = {
+	.start	= c_start,
+	.next	= c_next,
+	.stop	= c_stop,
+	.show	= show_cpuinfo,
+};
diff --git a/arch/loongarch/kernel/rtc.c b/arch/loongarch/kernel/rtc.c
new file mode 100644
index 000000000000..d7568385219f
--- /dev/null
+++ b/arch/loongarch/kernel/rtc.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/platform_device.h>
+#include <asm/loongson.h>
+
+#define RTC_TOYREAD0    0x2C
+#define RTC_YEAR        0x30
+
+unsigned long loongson_get_rtc_time(void)
+{
+	unsigned int year, mon, day, hour, min, sec;
+	unsigned int value;
+
+	value = ls7a_readl(LS7A_RTC_REG_BASE + RTC_TOYREAD0);
+	sec = (value >> 4) & 0x3f;
+	min = (value >> 10) & 0x3f;
+	hour = (value >> 16) & 0x1f;
+	day = (value >> 21) & 0x1f;
+	mon = (value >> 26) & 0x3f;
+	year = ls7a_readl(LS7A_RTC_REG_BASE + RTC_YEAR);
+
+	year = 1900 + year;
+
+	return mktime64(year, mon, day, hour, min, sec);
+}
+
+void read_persistent_clock64(struct timespec64 *ts)
+{
+	ts->tv_sec = loongson_get_rtc_time();
+	ts->tv_nsec = 0;
+}
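
The field extraction in loongson_get_rtc_time() can be checked in isolation. The sketch below copies the shifts and masks from the patch and adds an inverse used only to build test values; it makes no claim about the LS7A register layout beyond what those shifts imply. struct toy_time, toy_decode() and toy_encode() are hypothetical names.

```c
#include <stdint.h>

struct toy_time {
	unsigned int mon, day, hour, min, sec;
};

/* Shifts and masks copied from loongson_get_rtc_time() in this patch. */
static struct toy_time toy_decode(uint32_t value)
{
	struct toy_time t = {
		.sec  = (value >> 4)  & 0x3f,
		.min  = (value >> 10) & 0x3f,
		.hour = (value >> 16) & 0x1f,
		.day  = (value >> 21) & 0x1f,
		.mon  = (value >> 26) & 0x3f,
	};

	return t;
}

/* Hypothetical inverse of toy_decode(), for building test values only. */
static uint32_t toy_encode(struct toy_time t)
{
	return (t.mon << 26) | (t.day << 21) | (t.hour << 16) |
	       (t.min << 10) | (t.sec << 4);
}
```

The low four bits of the register are skipped by the decode, so they never affect the seconds field.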
-- 
2.27.0



* [PATCH V9 17/24] LoongArch: Add some library functions
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (15 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 16/24] LoongArch: Add misc common routines Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-05-01 10:55   ` Guo Ren
  2022-04-30  9:05 ` [PATCH V9 18/24] LoongArch: Add PCI controller support Huacai Chen
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds some library functions for LoongArch, including: delay,
memset, memcpy, memmove, clear_user, copy_user and TLB dump functions.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/include/asm/delay.h  |  26 +++++++
 arch/loongarch/include/asm/string.h |  17 +++++
 arch/loongarch/lib/clear_user.S     |  43 +++++++++++
 arch/loongarch/lib/copy_user.S      |  47 ++++++++++++
 arch/loongarch/lib/delay.c          |  43 +++++++++++
 arch/loongarch/lib/dump_tlb.c       | 111 ++++++++++++++++++++++++++++
 arch/loongarch/lib/memcpy.S         |  32 ++++++++
 arch/loongarch/lib/memmove.S        |  45 +++++++++++
 arch/loongarch/lib/memset.S         |  30 ++++++++
 9 files changed, 394 insertions(+)
 create mode 100644 arch/loongarch/include/asm/delay.h
 create mode 100644 arch/loongarch/include/asm/string.h
 create mode 100644 arch/loongarch/lib/clear_user.S
 create mode 100644 arch/loongarch/lib/copy_user.S
 create mode 100644 arch/loongarch/lib/delay.c
 create mode 100644 arch/loongarch/lib/dump_tlb.c
 create mode 100644 arch/loongarch/lib/memcpy.S
 create mode 100644 arch/loongarch/lib/memmove.S
 create mode 100644 arch/loongarch/lib/memset.S

diff --git a/arch/loongarch/include/asm/delay.h b/arch/loongarch/include/asm/delay.h
new file mode 100644
index 000000000000..016b3aca65cb
--- /dev/null
+++ b/arch/loongarch/include/asm/delay.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_DELAY_H
+#define _ASM_DELAY_H
+
+#include <linux/param.h>
+
+extern void __delay(unsigned long loops);
+extern void __ndelay(unsigned long ns);
+extern void __udelay(unsigned long us);
+
+#define ndelay(ns) __ndelay(ns)
+#define udelay(us) __udelay(us)
+
+/* Make sure "usecs *= ..." in udelay does not overflow. */
+#if HZ >= 1000
+#define MAX_UDELAY_MS	1
+#elif HZ <= 200
+#define MAX_UDELAY_MS	5
+#else
+#define MAX_UDELAY_MS	(1000 / HZ)
+#endif
+
+#endif /* _ASM_DELAY_H */
diff --git a/arch/loongarch/include/asm/string.h b/arch/loongarch/include/asm/string.h
new file mode 100644
index 000000000000..7b29cc9c70aa
--- /dev/null
+++ b/arch/loongarch/include/asm/string.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_STRING_H
+#define _ASM_STRING_H
+
+#define __HAVE_ARCH_MEMSET
+extern void *memset(void *__s, int __c, size_t __count);
+
+#define __HAVE_ARCH_MEMCPY
+extern void *memcpy(void *__to, __const__ void *__from, size_t __n);
+
+#define __HAVE_ARCH_MEMMOVE
+extern void *memmove(void *__dest, __const__ void *__src, size_t __n);
+
+#endif /* _ASM_STRING_H */
diff --git a/arch/loongarch/lib/clear_user.S b/arch/loongarch/lib/clear_user.S
new file mode 100644
index 000000000000..b8168d22ac80
--- /dev/null
+++ b/arch/loongarch/lib/clear_user.S
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/export.h>
+#include <asm/regdef.h>
+
+.macro fixup_ex from, to, offset, fix
+.if \fix
+	.section .fixup, "ax"
+\to:	addi.d	v0, a1, \offset
+	jr	ra
+	.previous
+.endif
+	.section __ex_table, "a"
+	PTR	\from\()b, \to\()b
+	.previous
+.endm
+
+/*
+ * unsigned long __clear_user(void *addr, size_t size)
+ *
+ * a0: addr
+ * a1: size
+ */
+SYM_FUNC_START(__clear_user)
+	beqz	a1, 2f
+
+1:	st.b	zero, a0, 0
+	addi.d	a0, a0, 1
+	addi.d	a1, a1, -1
+	bgt	a1, zero, 1b
+
+2:	move	v0, a1
+	jr	ra
+
+	fixup_ex 1, 3, 0, 1
+SYM_FUNC_END(__clear_user)
+
+EXPORT_SYMBOL(__clear_user)
diff --git a/arch/loongarch/lib/copy_user.S b/arch/loongarch/lib/copy_user.S
new file mode 100644
index 000000000000..43ed26304954
--- /dev/null
+++ b/arch/loongarch/lib/copy_user.S
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/asm.h>
+#include <asm/asmmacro.h>
+#include <asm/export.h>
+#include <asm/regdef.h>
+
+.macro fixup_ex from, to, offset, fix
+.if \fix
+	.section .fixup, "ax"
+\to:	addi.d	v0, a2, \offset
+	jr	ra
+	.previous
+.endif
+	.section __ex_table, "a"
+	PTR	\from\()b, \to\()b
+	.previous
+.endm
+
+/*
+ * unsigned long __copy_user(void *to, const void *from, size_t n)
+ *
+ * a0: to
+ * a1: from
+ * a2: n
+ */
+SYM_FUNC_START(__copy_user)
+	beqz	a2, 3f
+
+1:	ld.b	t0, a1, 0
+2:	st.b	t0, a0, 0
+	addi.d	a0, a0, 1
+	addi.d	a1, a1, 1
+	addi.d	a2, a2, -1
+	bgt	a2, zero, 1b
+
+3:	move	v0, a2
+	jr	ra
+
+	fixup_ex 1, 4, 0, 1
+	fixup_ex 2, 4, 0, 0
+SYM_FUNC_END(__copy_user)
+
+EXPORT_SYMBOL(__copy_user)
diff --git a/arch/loongarch/lib/delay.c b/arch/loongarch/lib/delay.c
new file mode 100644
index 000000000000..5d856694fcfe
--- /dev/null
+++ b/arch/loongarch/lib/delay.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/smp.h>
+#include <linux/timex.h>
+
+#include <asm/compiler.h>
+#include <asm/processor.h>
+
+void __delay(unsigned long cycles)
+{
+	u64 t0 = get_cycles();
+
+	while ((unsigned long)(get_cycles() - t0) < cycles)
+		cpu_relax();
+}
+EXPORT_SYMBOL(__delay);
+
+/*
+ * Division by multiplication: you don't have to worry about
+ * loss of precision.
+ *
+ * Use only for very small delays ( < 1 msec).	Should probably use a
+ * lookup table, really, as the multiplications take much too long with
+ * short delays.  This is a "reasonable" implementation, though (and the
+ * first constant multiplication gets optimized away if the delay is
+ * a constant)
+ */
+
+void __udelay(unsigned long us)
+{
+	__delay((us * 0x000010c7ull * HZ * lpj_fine) >> 32);
+}
+EXPORT_SYMBOL(__udelay);
+
+void __ndelay(unsigned long ns)
+{
+	__delay((ns * 0x00000005ull * HZ * lpj_fine) >> 32);
+}
+EXPORT_SYMBOL(__ndelay);
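
The constant in __udelay() is a fixed-point reciprocal: 0x10c7 is 4295, which is 2^32 / 10^6 rounded up, so multiplying by it and shifting right by 32 approximates a division by one million. The sketch below reproduces that arithmetic in user space. It assumes HZ * lpj_fine equals the timer frequency in cycles per second; usecs_to_cycles() is a hypothetical name, and the HZ and lpj values in the note that follows are made up for illustration.

```c
#include <stdint.h>

/* 2^32 / 10^6 rounded up; multiply-and-shift replaces a division by 10^6. */
#define USEC_SCALE 0x000010c7ull

/*
 * Same expression as __udelay() in the patch, with HZ and lpj_fine passed
 * in as parameters so that it can run outside the kernel.
 */
static uint64_t usecs_to_cycles(uint64_t us, uint64_t hz, uint64_t lpj)
{
	return (us * USEC_SCALE * hz * lpj) >> 32;
}
```

With a hypothetical HZ of 250 and 400000 cycles per jiffy (a 100 MHz counter), usecs_to_cycles(10, 250, 400000) gives 1000 cycles. __ndelay() uses 5 in place of 2^32 / 10^9 (about 4.29), so it errs on the long side.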
diff --git a/arch/loongarch/lib/dump_tlb.c b/arch/loongarch/lib/dump_tlb.c
new file mode 100644
index 000000000000..cda2c6bc7f09
--- /dev/null
+++ b/arch/loongarch/lib/dump_tlb.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 1994, 1995 by Waldorf Electronics, written by Ralf Baechle.
+ * Copyright (C) 1999 by Silicon Graphics, Inc.
+ */
+#include <linux/kernel.h>
+#include <linux/mm.h>
+
+#include <asm/loongarch.h>
+#include <asm/page.h>
+#include <asm/pgtable.h>
+#include <asm/tlb.h>
+
+void dump_tlb_regs(void)
+{
+	const int field = 2 * sizeof(unsigned long);
+
+	pr_info("Index    : %0x\n", read_csr_tlbidx());
+	pr_info("PageSize : %0x\n", read_csr_pagesize());
+	pr_info("EntryHi  : %0*llx\n", field, read_csr_entryhi());
+	pr_info("EntryLo0 : %0*llx\n", field, read_csr_entrylo0());
+	pr_info("EntryLo1 : %0*llx\n", field, read_csr_entrylo1());
+}
+
+static void dump_tlb(int first, int last)
+{
+	unsigned long s_entryhi, entryhi, asid;
+	unsigned long long entrylo0, entrylo1, pa;
+	unsigned int index;
+	unsigned int s_index, s_asid;
+	unsigned int pagesize, c0, c1, i;
+	unsigned long asidmask = cpu_asid_mask(&current_cpu_data);
+	int pwidth = 11;
+	int vwidth = 11;
+	int asidwidth = DIV_ROUND_UP(ilog2(asidmask) + 1, 4);
+
+	s_entryhi = read_csr_entryhi();
+	s_index = read_csr_tlbidx();
+	s_asid = read_csr_asid();
+
+	for (i = first; i <= last; i++) {
+		write_csr_index(i);
+		tlb_read();
+		pagesize = read_csr_pagesize();
+		entryhi	 = read_csr_entryhi();
+		entrylo0 = read_csr_entrylo0();
+		entrylo1 = read_csr_entrylo1();
+		index = read_csr_tlbidx();
+		asid = read_csr_asid();
+
+		/* EHINV bit marks entire entry as invalid */
+		if (index & CSR_TLBIDX_EHINV)
+			continue;
+		/*
+		 * ASID takes effect in absence of G (global) bit.
+		 */
+		if (!((entrylo0 | entrylo1) & ENTRYLO_G) &&
+		    asid != s_asid)
+			continue;
+
+		/*
+		 * Only print entries in use
+		 */
+		pr_info("Index: %2d pgsize=%x ", i, (1 << pagesize));
+
+		c0 = (entrylo0 & ENTRYLO_C) >> ENTRYLO_C_SHIFT;
+		c1 = (entrylo1 & ENTRYLO_C) >> ENTRYLO_C_SHIFT;
+
+		pr_cont("va=%0*lx asid=%0*lx",
+			vwidth, (entryhi & ~0x1fffUL), asidwidth, asid & asidmask);
+
+		/* NR/NX are in awkward places, so mask them off separately */
+		pa = entrylo0 & ~(ENTRYLO_NR | ENTRYLO_NX);
+		pa = pa & PAGE_MASK;
+		pr_cont("\n\t[");
+		pr_cont("ri=%d xi=%d ",
+			(entrylo0 & ENTRYLO_NR) ? 1 : 0,
+			(entrylo0 & ENTRYLO_NX) ? 1 : 0);
+		pr_cont("pa=%0*llx c=%d d=%d v=%d g=%d plv=%lld] [",
+			pwidth, pa, c0,
+			(entrylo0 & ENTRYLO_D) ? 1 : 0,
+			(entrylo0 & ENTRYLO_V) ? 1 : 0,
+			(entrylo0 & ENTRYLO_G) ? 1 : 0,
+			(entrylo0 & ENTRYLO_PLV) >> ENTRYLO_PLV_SHIFT);
+		/* NR/NX are in awkward places, so mask them off separately */
+		pa = entrylo1 & ~(ENTRYLO_NR | ENTRYLO_NX);
+		pa = pa & PAGE_MASK;
+		pr_cont("ri=%d xi=%d ",
+			(entrylo1 & ENTRYLO_NR) ? 1 : 0,
+			(entrylo1 & ENTRYLO_NX) ? 1 : 0);
+		pr_cont("pa=%0*llx c=%d d=%d v=%d g=%d plv=%lld]\n",
+			pwidth, pa, c1,
+			(entrylo1 & ENTRYLO_D) ? 1 : 0,
+			(entrylo1 & ENTRYLO_V) ? 1 : 0,
+			(entrylo1 & ENTRYLO_G) ? 1 : 0,
+			(entrylo1 & ENTRYLO_PLV) >> ENTRYLO_PLV_SHIFT);
+	}
+	pr_info("\n");
+
+	write_csr_entryhi(s_entryhi);
+	write_csr_tlbidx(s_index);
+	write_csr_asid(s_asid);
+}
+
+void dump_tlb_all(void)
+{
+	dump_tlb(0, current_cpu_data.tlbsize - 1);
+}
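
The asidwidth computation in dump_tlb() picks the number of hex digits needed to print any ASID that fits in asidmask: ilog2(asidmask) + 1 is the bit width, and DIV_ROUND_UP(..., 4) converts bits to hex digits. A user-space sketch of the same calculation, using a plain loop instead of the kernel's ilog2() and a hypothetical function name:

```c
/*
 * Number of hex digits needed to print any value that fits in 'mask'.
 * Equivalent to DIV_ROUND_UP(ilog2(mask) + 1, 4) for a non-zero mask.
 */
static int hex_digits_for_mask(unsigned long mask)
{
	int bits = 0;

	while (mask) {		/* bits ends up as ilog2(mask) + 1 */
		bits++;
		mask >>= 1;
	}

	return (bits + 3) / 4;	/* round up to whole hex digits */
}
```

For example, a 10-bit mask such as 0x3ff needs three hex digits.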
diff --git a/arch/loongarch/lib/memcpy.S b/arch/loongarch/lib/memcpy.S
new file mode 100644
index 000000000000..d53f1148d26b
--- /dev/null
+++ b/arch/loongarch/lib/memcpy.S
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/asmmacro.h>
+#include <asm/export.h>
+#include <asm/regdef.h>
+
+/*
+ * void *memcpy(void *dst, const void *src, size_t n)
+ *
+ * a0: dst
+ * a1: src
+ * a2: n
+ */
+SYM_FUNC_START(memcpy)
+	move	a3, a0
+	beqz	a2, 2f
+
+1:	ld.b	t0, a1, 0
+	st.b	t0, a0, 0
+	addi.d	a0, a0, 1
+	addi.d	a1, a1, 1
+	addi.d	a2, a2, -1
+	bgt	a2, zero, 1b
+
+2:	move	v0, a3
+	jr	ra
+SYM_FUNC_END(memcpy)
+
+EXPORT_SYMBOL(memcpy)
diff --git a/arch/loongarch/lib/memmove.S b/arch/loongarch/lib/memmove.S
new file mode 100644
index 000000000000..18907d83a83b
--- /dev/null
+++ b/arch/loongarch/lib/memmove.S
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/asmmacro.h>
+#include <asm/export.h>
+#include <asm/regdef.h>
+
+/*
+ * void *rmemcpy(void *dst, const void *src, size_t n)
+ *
+ * a0: dst
+ * a1: src
+ * a2: n
+ */
+SYM_FUNC_START(rmemcpy)
+	move	a3, a0
+	beqz	a2, 2f
+
+	add.d	a0, a0, a2
+	add.d	a1, a1, a2
+
+1:	ld.b	t0, a1, -1
+	st.b	t0, a0, -1
+	addi.d	a0, a0, -1
+	addi.d	a1, a1, -1
+	addi.d	a2, a2, -1
+	bgt	a2, zero, 1b
+
+2:	move	v0, a3
+	jr	ra
+SYM_FUNC_END(rmemcpy)
+
+SYM_FUNC_START(memmove)
+	blt	a0, a1, 1f	/* dst < src, memcpy */
+	blt	a1, a0, 2f	/* src < dst, rmemcpy */
+	jr	ra		/* dst == src, return */
+
+1:	b	memcpy
+
+2:	b	rmemcpy
+SYM_FUNC_END(memmove)
+
+EXPORT_SYMBOL(memmove)
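
The memmove entry point above only chooses a copy direction: forward (memcpy) when dst is below src, backward (rmemcpy) when dst is above src, and nothing when they are equal. This keeps overlapping copies correct because each source byte is read before it can be overwritten. A byte-wise C sketch of the same dispatch, with the hypothetical name memmove_sketch():

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of the memcpy/rmemcpy dispatch in memmove.S. */
static void *memmove_sketch(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if ((uintptr_t)d == (uintptr_t)s)
		return dst;			/* dst == src, nothing to do */

	if ((uintptr_t)d < (uintptr_t)s) {
		while (n--)			/* forward copy, like memcpy */
			*d++ = *s++;
	} else {
		d += n;
		s += n;
		while (n--)			/* backward copy, like rmemcpy */
			*--d = *--s;
	}

	return dst;
}
```

Like the assembly, the sketch returns the original dst pointer in both directions.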
diff --git a/arch/loongarch/lib/memset.S b/arch/loongarch/lib/memset.S
new file mode 100644
index 000000000000..3fc3e7da5263
--- /dev/null
+++ b/arch/loongarch/lib/memset.S
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/asmmacro.h>
+#include <asm/export.h>
+#include <asm/regdef.h>
+
+/*
+ * void *memset(void *s, int c, size_t n)
+ *
+ * a0: s
+ * a1: c
+ * a2: n
+ */
+SYM_FUNC_START(memset)
+	move	a3, a0
+	beqz	a2, 2f
+
+1:	st.b	a1, a0, 0
+	addi.d	a0, a0, 1
+	addi.d	a2, a2, -1
+	bgt	a2, zero, 1b
+
+2:	move	v0, a3
+	jr	ra
+SYM_FUNC_END(memset)
+
+EXPORT_SYMBOL(memset)
-- 
2.27.0



* [PATCH V9 18/24] LoongArch: Add PCI controller support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (16 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 17/24] LoongArch: Add some library functions Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 19/24] LoongArch: Add VDSO and VSYSCALL support Huacai Chen
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen,
	Jianmin Lv

Loongson64-based systems are PC-like systems that use PCI/PCIe as their
I/O bus. This patch adds the PCI host controller support for LoongArch.

Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/include/asm/dma.h |  13 +++
 arch/loongarch/include/asm/pci.h |  40 +++++++
 arch/loongarch/pci/acpi.c        | 172 +++++++++++++++++++++++++++++++
 arch/loongarch/pci/pci.c         |  98 ++++++++++++++++++
 4 files changed, 323 insertions(+)
 create mode 100644 arch/loongarch/include/asm/dma.h
 create mode 100644 arch/loongarch/include/asm/pci.h
 create mode 100644 arch/loongarch/pci/acpi.c
 create mode 100644 arch/loongarch/pci/pci.c

diff --git a/arch/loongarch/include/asm/dma.h b/arch/loongarch/include/asm/dma.h
new file mode 100644
index 000000000000..c61fc72483ff
--- /dev/null
+++ b/arch/loongarch/include/asm/dma.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_DMA_H
+#define __ASM_DMA_H
+
+#define MAX_DMA_ADDRESS	PAGE_OFFSET
+#define MAX_DMA32_PFN	(1UL << (32 - PAGE_SHIFT))
+
+extern int isa_dma_bridge_buggy;
+
+#endif
diff --git a/arch/loongarch/include/asm/pci.h b/arch/loongarch/include/asm/pci.h
new file mode 100644
index 000000000000..5616ad2678ba
--- /dev/null
+++ b/arch/loongarch/include/asm/pci.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_PCI_H
+#define _ASM_PCI_H
+
+#include <linux/ioport.h>
+#include <linux/list.h>
+#include <linux/types.h>
+#include <asm/io.h>
+
+#define PCIBIOS_MIN_IO		0x4000
+#define PCIBIOS_MIN_MEM		0x20000000
+#define PCIBIOS_MIN_CARDBUS_IO	0x4000
+
+#define HAVE_PCI_MMAP
+#define ARCH_GENERIC_PCI_MMAP_RESOURCE
+
+extern phys_addr_t mcfg_addr_init(int node);
+
+static inline int pci_proc_domain(struct pci_bus *bus)
+{
+	return 1; /* always show the domain in /proc */
+}
+
+/*
+ * Can be used to override the logic in pci_scan_bus for skipping
+ * already-configured bus numbers - to be used for buggy BIOSes
+ * or architectures with incomplete PCI setup by the loader
+ */
+static inline unsigned int pcibios_assign_all_busses(void)
+{
+	return 0;
+}
+
+/* generic pci stuff */
+#include <asm-generic/pci.h>
+
+#endif /* _ASM_PCI_H */
diff --git a/arch/loongarch/pci/acpi.c b/arch/loongarch/pci/acpi.c
new file mode 100644
index 000000000000..7cabb8f37218
--- /dev/null
+++ b/arch/loongarch/pci/acpi.c
@@ -0,0 +1,172 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/pci.h>
+#include <linux/acpi.h>
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/slab.h>
+#include <linux/pci-acpi.h>
+#include <linux/pci-ecam.h>
+
+#include <asm/pci.h>
+#include <asm/loongson.h>
+
+struct pci_root_info {
+	struct acpi_pci_root_info common;
+	struct pci_config_window *cfg;
+};
+
+void pcibios_add_bus(struct pci_bus *bus)
+{
+	acpi_pci_add_bus(bus);
+}
+
+int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+	struct pci_config_window *cfg = bridge->bus->sysdata;
+	struct acpi_device *adev = to_acpi_device(cfg->parent);
+
+	ACPI_COMPANION_SET(&bridge->dev, adev);
+
+	return 0;
+}
+
+int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
+{
+	struct pci_config_window *cfg = bus->sysdata;
+	struct acpi_device *adev = to_acpi_device(cfg->parent);
+	struct acpi_pci_root *root = acpi_driver_data(adev);
+
+	return root->segment;
+}
+
+static void acpi_release_root_info(struct acpi_pci_root_info *ci)
+{
+	struct pci_root_info *info;
+
+	info = container_of(ci, struct pci_root_info, common);
+	pci_ecam_free(info->cfg);
+	kfree(ci->ops);
+	kfree(info);
+}
+
+static int acpi_prepare_root_resources(struct acpi_pci_root_info *ci)
+{
+	int status;
+	struct resource_entry *entry, *tmp;
+	struct acpi_device *device = ci->bridge;
+
+	status = acpi_pci_probe_root_resources(ci);
+	if (status > 0) {
+		resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
+			if (entry->res->flags & IORESOURCE_MEM) {
+				entry->offset = ci->root->mcfg_addr & GENMASK_ULL(63, 40);
+				entry->res->start |= entry->offset;
+				entry->res->end   |= entry->offset;
+			}
+		}
+		return status;
+	}
+
+	resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
+		dev_dbg(&device->dev,
+			   "host bridge window %pR (ignored)\n", entry->res);
+		resource_list_destroy_entry(entry);
+	}
+
+	return 0;
+}
+
+/*
+ * Lookup the bus range for the domain in MCFG, and set up config space
+ * mapping.
+ */
+static struct pci_config_window *
+pci_acpi_setup_ecam_mapping(struct acpi_pci_root *root)
+{
+	int ret, bus_shift;
+	u16 seg = root->segment;
+	struct device *dev = &root->device->dev;
+	struct resource cfgres;
+	struct resource *bus_res = &root->secondary;
+	struct pci_config_window *cfg;
+	const struct pci_ecam_ops *ecam_ops;
+
+	ret = pci_mcfg_lookup(root, &cfgres, &ecam_ops);
+	if (ret < 0) {
+		dev_err(dev, "%04x:%pR ECAM region not found, use default value\n", seg, bus_res);
+		ecam_ops = &loongson_pci_ecam_ops;
+		root->mcfg_addr = mcfg_addr_init(0);
+	}
+
+	bus_shift = ecam_ops->bus_shift ? : 20;
+
+	cfgres.start = root->mcfg_addr + (bus_res->start << bus_shift);
+	cfgres.end = cfgres.start + (resource_size(bus_res) << bus_shift) - 1;
+	cfgres.flags = IORESOURCE_MEM;
+
+	cfg = pci_ecam_create(dev, &cfgres, bus_res, ecam_ops);
+	if (IS_ERR(cfg)) {
+		dev_err(dev, "%04x:%pR error %ld mapping ECAM\n", seg, bus_res, PTR_ERR(cfg));
+		return NULL;
+	}
+
+	return cfg;
+}
+
+struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
+{
+	struct pci_bus *bus;
+	struct pci_root_info *info;
+	struct acpi_pci_root_ops *root_ops;
+	int domain = root->segment;
+	int busnum = root->secondary.start;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info) {
+		pr_warn("pci_bus %04x:%02x: ignored (out of memory)\n", domain, busnum);
+		return NULL;
+	}
+
+	root_ops = kzalloc(sizeof(*root_ops), GFP_KERNEL);
+	if (!root_ops) {
+		kfree(info);
+		return NULL;
+	}
+
+	info->cfg = pci_acpi_setup_ecam_mapping(root);
+	if (!info->cfg) {
+		kfree(info);
+		kfree(root_ops);
+		return NULL;
+	}
+
+	root_ops->release_info = acpi_release_root_info;
+	root_ops->prepare_resources = acpi_prepare_root_resources;
+	root_ops->pci_ops = (struct pci_ops *)&info->cfg->ops->pci_ops;
+
+	bus = pci_find_bus(domain, busnum);
+	if (bus) {
+		memcpy(bus->sysdata, info->cfg, sizeof(struct pci_config_window));
+		kfree(info);
+	} else {
+		struct pci_bus *child;
+
+		bus = acpi_pci_root_create(root, root_ops,
+					   &info->common, info->cfg);
+		if (!bus) {
+			kfree(info);
+			kfree(root_ops);
+			return NULL;
+		}
+
+		pci_bus_size_bridges(bus);
+		pci_bus_assign_resources(bus);
+		list_for_each_entry(child, &bus->children, node)
+			pcie_bus_configure_settings(child);
+	}
+
+	return bus;
+}
diff --git a/arch/loongarch/pci/pci.c b/arch/loongarch/pci/pci.c
new file mode 100644
index 000000000000..56dd7d982abf
--- /dev/null
+++ b/arch/loongarch/pci/pci.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/init.h>
+#include <linux/acpi.h>
+#include <linux/types.h>
+#include <linux/pci.h>
+#include <linux/vgaarb.h>
+#include <asm/loongson.h>
+
+#define PCI_DEVICE_ID_LOONGSON_HOST     0x7a00
+#define PCI_DEVICE_ID_LOONGSON_DC1      0x7a06
+#define PCI_DEVICE_ID_LOONGSON_DC2      0x7a36
+
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val)
+{
+	struct pci_bus *bus_tmp = pci_find_bus(domain, bus);
+
+	if (bus_tmp)
+		return bus_tmp->ops->read(bus_tmp, devfn, reg, len, val);
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val)
+{
+	struct pci_bus *bus_tmp = pci_find_bus(domain, bus);
+
+	if (bus_tmp)
+		return bus_tmp->ops->write(bus_tmp, devfn, reg, len, val);
+	return -EINVAL;
+}
+
+phys_addr_t mcfg_addr_init(int node)
+{
+	return (((u64)node << 44) | MCFG_EXT_PCICFG_BASE);
+}
+
+static int __init pcibios_init(void)
+{
+	unsigned int lsize;
+
+	/*
+	 * Set PCI cacheline size to that of the highest level in the
+	 * cache hierarchy.
+	 */
+	lsize = cpu_dcache_line_size();
+	lsize = cpu_vcache_line_size() ? : lsize;
+	lsize = cpu_scache_line_size() ? : lsize;
+
+	BUG_ON(!lsize);
+
+	pci_dfl_cache_line_size = lsize >> 2;
+
+	pr_debug("PCI: pci_cache_line_size set to %u bytes\n", lsize);
+
+	return 0;
+}
+
+subsys_initcall(pcibios_init);
+
+int pcibios_device_add(struct pci_dev *dev)
+{
+	int id = pci_domain_nr(dev->bus);
+
+	dev_set_msi_domain(&dev->dev, pch_msi_domain[id]);
+
+	return 0;
+}
+
+int pcibios_alloc_irq(struct pci_dev *dev)
+{
+	if (acpi_disabled)
+		return 0;
+	if (pci_dev_msi_enabled(dev))
+		return 0;
+	return acpi_pci_irq_enable(dev);
+}
+
+static void pci_fixup_vgadev(struct pci_dev *pdev)
+{
+	struct pci_dev *devp = NULL;
+
+	while ((devp = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, devp))) {
+		if (devp->vendor != PCI_VENDOR_ID_LOONGSON) {
+			vga_set_default_device(devp);
+			dev_info(&pdev->dev,
+				"Overriding boot device as %X:%X\n",
+				devp->vendor, devp->device);
+		}
+	}
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC1, pci_fixup_vgadev);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC2, pci_fixup_vgadev);
-- 
2.27.0
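
For readers following pci_acpi_setup_ecam_mapping() and mcfg_addr_init()
in the patch above, the address arithmetic can be sketched in plain
userspace C. This is only an illustration under stated assumptions: the
struct and function names below are hypothetical, and the config base is
passed in as a parameter because the value of MCFG_EXT_PCICFG_BASE is not
shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive bus range, mirroring the start/end fields of struct resource. */
struct bus_range {
	uint64_t start;
	uint64_t end;
};

/* Inclusive physical address window covering the config space. */
struct ecam_window {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical stand-in for mcfg_addr_init(): the node number is placed
 * at bit 44 and OR-ed with the config base (an assumed parameter here).
 */
static uint64_t ecam_base_for_node(unsigned int node, uint64_t cfg_base)
{
	return ((uint64_t)node << 44) | cfg_base;
}

/*
 * Compute the config-space window for a bus range.
 * base:      ECAM base address (root->mcfg_addr in the patch)
 * bus_shift: log2 of the per-bus stride; 0 selects the default of 20,
 *            matching "ecam_ops->bus_shift ? : 20" in the patch
 */
static struct ecam_window ecam_window_for(uint64_t base, struct bus_range bus,
					  unsigned int bus_shift)
{
	struct ecam_window win;
	uint64_t nr_buses;

	assert(bus.end >= bus.start);

	if (!bus_shift)
		bus_shift = 20;	/* 1 MiB of config space per bus */

	nr_buses = bus.end - bus.start + 1;	/* same as resource_size() */
	win.start = base + (bus.start << bus_shift);
	win.end = win.start + (nr_buses << bus_shift) - 1;

	return win;
}
```

With a made-up base of 0x20000000 and buses 0x00-0xff, the window spans
256 MiB, which is why cfgres.end is computed as an inclusive address
(the "- 1") before being handed to pci_ecam_create().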



* [PATCH V9 19/24] LoongArch: Add VDSO and VSYSCALL support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (17 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 18/24] LoongArch: Add PCI controller support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 20/24] LoongArch: Add efistub booting support Huacai Chen
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds VDSO and VSYSCALL support (gettimeofday and its friends)
for LoongArch.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/Makefile                       |   2 +
 arch/loongarch/include/asm/vdso.h             |  38 +++++
 arch/loongarch/include/asm/vdso/clocksource.h |   8 +
 .../loongarch/include/asm/vdso/gettimeofday.h |  99 +++++++++++++
 arch/loongarch/include/asm/vdso/processor.h   |  14 ++
 arch/loongarch/include/asm/vdso/vdso.h        |  30 ++++
 arch/loongarch/include/asm/vdso/vsyscall.h    |  27 ++++
 arch/loongarch/kernel/vdso.c                  | 138 ++++++++++++++++++
 arch/loongarch/vdso/Makefile                  |  96 ++++++++++++
 arch/loongarch/vdso/elf.S                     |  15 ++
 arch/loongarch/vdso/gen_vdso_offsets.sh       |  13 ++
 arch/loongarch/vdso/sigreturn.S               |  24 +++
 arch/loongarch/vdso/vdso.S                    |  22 +++
 arch/loongarch/vdso/vdso.lds.S                |  72 +++++++++
 arch/loongarch/vdso/vgettimeofday.c           |  25 ++++
 15 files changed, 623 insertions(+)
 create mode 100644 arch/loongarch/include/asm/vdso.h
 create mode 100644 arch/loongarch/include/asm/vdso/clocksource.h
 create mode 100644 arch/loongarch/include/asm/vdso/gettimeofday.h
 create mode 100644 arch/loongarch/include/asm/vdso/processor.h
 create mode 100644 arch/loongarch/include/asm/vdso/vdso.h
 create mode 100644 arch/loongarch/include/asm/vdso/vsyscall.h
 create mode 100644 arch/loongarch/kernel/vdso.c
 create mode 100644 arch/loongarch/vdso/Makefile
 create mode 100644 arch/loongarch/vdso/elf.S
 create mode 100755 arch/loongarch/vdso/gen_vdso_offsets.sh
 create mode 100644 arch/loongarch/vdso/sigreturn.S
 create mode 100644 arch/loongarch/vdso/vdso.S
 create mode 100644 arch/loongarch/vdso/vdso.lds.S
 create mode 100644 arch/loongarch/vdso/vgettimeofday.c

diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index 0a40e79b3265..c4b3f53cd276 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -76,9 +76,11 @@ head-y := arch/loongarch/kernel/head.o
 
 libs-y += arch/loongarch/lib/
 
+ifeq ($(KBUILD_EXTMOD),)
 prepare: vdso_prepare
 vdso_prepare: prepare0
 	$(Q)$(MAKE) $(build)=arch/loongarch/vdso include/generated/vdso-offsets.h
+endif
 
 PHONY += vdso_install
 vdso_install:
diff --git a/arch/loongarch/include/asm/vdso.h b/arch/loongarch/include/asm/vdso.h
new file mode 100644
index 000000000000..996bddae12dc
--- /dev/null
+++ b/arch/loongarch/include/asm/vdso.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASM_VDSO_H
+#define __ASM_VDSO_H
+
+#include <linux/mm_types.h>
+#include <vdso/datapage.h>
+
+#include <asm/barrier.h>
+
+/**
+ * struct loongarch_vdso_info - Details of a VDSO image.
+ * @vdso: Pointer to VDSO image (page-aligned).
+ * @size: Size of the VDSO image (page-aligned).
+ * @offset_sigreturn: Offset of the rt_sigreturn() trampoline.
+ * @code_mapping: Special mapping structure for vdso code.
+ * @data_mapping: Special mapping structure for vdso data.
+ *
+ * This structure contains details of a VDSO image, including the image data
+ * and offsets of certain symbols required by the kernel. It is generated as
+ * part of the VDSO build process, aside from the mapping page array, which is
+ * populated at runtime.
+ */
+struct loongarch_vdso_info {
+	void *vdso;
+	unsigned long size;
+	unsigned long offset_sigreturn;
+	struct vm_special_mapping code_mapping;
+	struct vm_special_mapping data_mapping;
+};
+
+extern struct loongarch_vdso_info vdso_info;
+
+#endif /* __ASM_VDSO_H */
diff --git a/arch/loongarch/include/asm/vdso/clocksource.h b/arch/loongarch/include/asm/vdso/clocksource.h
new file mode 100644
index 000000000000..13cd580d406d
--- /dev/null
+++ b/arch/loongarch/include/asm/vdso/clocksource.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef __ASM_VDSOCLOCKSOURCE_H
+#define __ASM_VDSOCLOCKSOURCE_H
+
+#define VDSO_ARCH_CLOCKMODES	\
+	VDSO_CLOCKMODE_CPU
+
+#endif /* __ASM_VDSOCLOCKSOURCE_H */
diff --git a/arch/loongarch/include/asm/vdso/gettimeofday.h b/arch/loongarch/include/asm/vdso/gettimeofday.h
new file mode 100644
index 000000000000..5fc5a746b1c4
--- /dev/null
+++ b/arch/loongarch/include/asm/vdso/gettimeofday.h
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_VDSO_GETTIMEOFDAY_H
+#define __ASM_VDSO_GETTIMEOFDAY_H
+
+#ifndef __ASSEMBLY__
+
+#include <asm/unistd.h>
+#include <asm/vdso/vdso.h>
+
+#define VDSO_HAS_CLOCK_GETRES		1
+
+static __always_inline long gettimeofday_fallback(
+				struct __kernel_old_timeval *_tv,
+				struct timezone *_tz)
+{
+	register struct __kernel_old_timeval *tv asm("a0") = _tv;
+	register struct timezone *tz asm("a1") = _tz;
+	register long nr asm("a7") = __NR_gettimeofday;
+	register long ret asm("v0");
+
+	asm volatile(
+	"       syscall 0\n"
+	: "=r" (ret)
+	: "r" (nr), "r" (tv), "r" (tz)
+	: "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
+	  "$t8", "memory");
+
+	return ret;
+}
+
+static __always_inline long clock_gettime_fallback(
+					clockid_t _clkid,
+					struct __kernel_timespec *_ts)
+{
+	register clockid_t clkid asm("a0") = _clkid;
+	register struct __kernel_timespec *ts asm("a1") = _ts;
+	register long nr asm("a7") = __NR_clock_gettime;
+	register long ret asm("v0");
+
+	asm volatile(
+	"       syscall 0\n"
+	: "=r" (ret)
+	: "r" (nr), "r" (clkid), "r" (ts)
+	: "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
+	  "$t8", "memory");
+
+	return ret;
+}
+
+static __always_inline int clock_getres_fallback(
+					clockid_t _clkid,
+					struct __kernel_timespec *_ts)
+{
+	register clockid_t clkid asm("a0") = _clkid;
+	register struct __kernel_timespec *ts asm("a1") = _ts;
+	register long nr asm("a7") = __NR_clock_getres;
+	register long ret asm("v0");
+
+	asm volatile(
+	"       syscall 0\n"
+	: "=r" (ret)
+	: "r" (nr), "r" (clkid), "r" (ts)
+	: "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
+	  "$t8", "memory");
+
+	return ret;
+}
+
+static __always_inline u64 __arch_get_hw_counter(s32 clock_mode,
+						 const struct vdso_data *vd)
+{
+	u64 count;
+
+	__asm__ __volatile__(
+	"	rdtime.d %0, $zero\n"
+	: "=r" (count));
+
+	return count;
+}
+
+static inline bool loongarch_vdso_hres_capable(void)
+{
+	return true;
+}
+#define __arch_vdso_hres_capable loongarch_vdso_hres_capable
+
+static __always_inline const struct vdso_data *__arch_get_vdso_data(void)
+{
+	return get_vdso_data();
+}
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __ASM_VDSO_GETTIMEOFDAY_H */
diff --git a/arch/loongarch/include/asm/vdso/processor.h b/arch/loongarch/include/asm/vdso/processor.h
new file mode 100644
index 000000000000..ef5770b343a0
--- /dev/null
+++ b/arch/loongarch/include/asm/vdso/processor.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_VDSO_PROCESSOR_H
+#define __ASM_VDSO_PROCESSOR_H
+
+#ifndef __ASSEMBLY__
+
+#define cpu_relax()	barrier()
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ASM_VDSO_PROCESSOR_H */
diff --git a/arch/loongarch/include/asm/vdso/vdso.h b/arch/loongarch/include/asm/vdso/vdso.h
new file mode 100644
index 000000000000..5a01643a65b3
--- /dev/null
+++ b/arch/loongarch/include/asm/vdso/vdso.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef __ASSEMBLY__
+
+#include <asm/asm.h>
+#include <asm/page.h>
+
+static inline unsigned long get_vdso_base(void)
+{
+	unsigned long addr;
+
+	__asm__(
+	" la.pcrel %0, _start\n"
+	: "=r" (addr)
+	:
+	:);
+
+	return addr;
+}
+
+static inline const struct vdso_data *get_vdso_data(void)
+{
+	return (const struct vdso_data *)(get_vdso_base() - PAGE_SIZE);
+}
+
+#endif /* __ASSEMBLY__ */
diff --git a/arch/loongarch/include/asm/vdso/vsyscall.h b/arch/loongarch/include/asm/vdso/vsyscall.h
new file mode 100644
index 000000000000..5de615383a22
--- /dev/null
+++ b/arch/loongarch/include/asm/vdso/vsyscall.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_VDSO_VSYSCALL_H
+#define __ASM_VDSO_VSYSCALL_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/timekeeper_internal.h>
+#include <vdso/datapage.h>
+
+extern struct vdso_data *vdso_data;
+
+/*
+ * Return the kernel's vDSO data page, updated by generic timekeeping code.
+ */
+static __always_inline
+struct vdso_data *__loongarch_get_k_vdso_data(void)
+{
+	return vdso_data;
+}
+#define __arch_get_k_vdso_data __loongarch_get_k_vdso_data
+
+/* The asm-generic header needs to be included after the definitions above */
+#include <asm-generic/vdso/vsyscall.h>
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __ASM_VDSO_VSYSCALL_H */
diff --git a/arch/loongarch/kernel/vdso.c b/arch/loongarch/kernel/vdso.c
new file mode 100644
index 000000000000..e20c8ca87473
--- /dev/null
+++ b/arch/loongarch/kernel/vdso.c
@@ -0,0 +1,138 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/binfmts.h>
+#include <linux/elf.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/ioport.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/random.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/timekeeper_internal.h>
+
+#include <asm/page.h>
+#include <asm/vdso.h>
+#include <vdso/helpers.h>
+#include <vdso/vsyscall.h>
+#include <generated/vdso-offsets.h>
+
+extern char vdso_start[], vdso_end[];
+
+/* Kernel-provided data used by the VDSO. */
+static union loongarch_vdso_data {
+	u8 page[PAGE_SIZE];
+	struct vdso_data data[CS_BASES];
+} loongarch_vdso_data __page_aligned_data;
+struct vdso_data *vdso_data = loongarch_vdso_data.data;
+static struct page *vdso_pages[] = { NULL };
+
+static int vdso_mremap(const struct vm_special_mapping *sm, struct vm_area_struct *new_vma)
+{
+	current->mm->context.vdso = (void *)(new_vma->vm_start);
+
+	return 0;
+}
+
+struct loongarch_vdso_info vdso_info = {
+	.vdso = vdso_start,
+	.size = PAGE_SIZE,
+	.code_mapping = {
+		.name = "[vdso]",
+		.pages = vdso_pages,
+		.mremap = vdso_mremap,
+	},
+	.data_mapping = {
+		.name = "[vvar]",
+	},
+	.offset_sigreturn = vdso_offset_sigreturn,
+};
+
+static int __init init_vdso(void)
+{
+	unsigned long i, pfn;
+
+	BUG_ON(!PAGE_ALIGNED(vdso_info.vdso));
+	BUG_ON(!PAGE_ALIGNED(vdso_info.size));
+
+	pfn = __phys_to_pfn(__pa_symbol(vdso_info.vdso));
+	for (i = 0; i < vdso_info.size / PAGE_SIZE; i++)
+		vdso_info.code_mapping.pages[i] = pfn_to_page(pfn + i);
+
+	return 0;
+}
+subsys_initcall(init_vdso);
+
+static unsigned long vdso_base(void)
+{
+	unsigned long base = STACK_TOP;
+
+	if (current->flags & PF_RANDOMIZE) {
+		base += get_random_int() & (VDSO_RANDOMIZE_SIZE - 1);
+		base = PAGE_ALIGN(base);
+	}
+
+	return base;
+}
+
+int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+{
+	int ret;
+	unsigned long vvar_size, size, data_addr, vdso_addr;
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+	struct loongarch_vdso_info *info = current->thread.vdso;
+
+	if (mmap_write_lock_killable(mm))
+		return -EINTR;
+
+	/*
+	 * Determine total area size. This includes the VDSO image itself
+	 * and the data page.
+	 */
+	vvar_size = PAGE_SIZE;
+	size = vvar_size + info->size;
+
+	data_addr = get_unmapped_area(NULL, vdso_base(), size, 0, 0);
+	if (IS_ERR_VALUE(data_addr)) {
+		ret = data_addr;
+		goto out;
+	}
+	vdso_addr = data_addr + PAGE_SIZE;
+
+	vma = _install_special_mapping(mm, data_addr, vvar_size,
+				       VM_READ | VM_MAYREAD,
+				       &info->data_mapping);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto out;
+	}
+
+	/* Map VDSO data page. */
+	ret = remap_pfn_range(vma, data_addr,
+			      virt_to_phys(vdso_data) >> PAGE_SHIFT,
+			      PAGE_SIZE, PAGE_READONLY);
+	if (ret)
+		goto out;
+
+	/* Map VDSO code page. */
+	vma = _install_special_mapping(mm, vdso_addr, info->size,
+				       VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
+				       &info->code_mapping);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto out;
+	}
+
+	mm->context.vdso = (void *)vdso_addr;
+	ret = 0;
+
+out:
+	mmap_write_unlock(mm);
+	return ret;
+}
diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile
new file mode 100644
index 000000000000..6b6e16732c60
--- /dev/null
+++ b/arch/loongarch/vdso/Makefile
@@ -0,0 +1,96 @@
+# SPDX-License-Identifier: GPL-2.0
+# Objects to go into the VDSO.
+
+# Absolute relocation type $(ARCH_REL_TYPE_ABS) needs to be defined before
+# the inclusion of generic Makefile.
+ARCH_REL_TYPE_ABS := R_LARCH_32|R_LARCH_64|R_LARCH_MARK_LA|R_LARCH_JUMP_SLOT
+include $(srctree)/lib/vdso/Makefile
+
+obj-vdso-y := elf.o vgettimeofday.o sigreturn.o
+
+# Common compiler flags between ABIs.
+ccflags-vdso := \
+	$(filter -I%,$(KBUILD_CFLAGS)) \
+	$(filter -E%,$(KBUILD_CFLAGS)) \
+	$(filter -march=%,$(KBUILD_CFLAGS)) \
+	$(filter -m%-float,$(KBUILD_CFLAGS)) \
+	-D__VDSO__
+
+ifeq ($(cc-name),clang)
+ccflags-vdso += $(filter --target=%,$(KBUILD_CFLAGS))
+endif
+
+cflags-vdso := $(ccflags-vdso) \
+	$(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
+	-O2 -g -fno-strict-aliasing -fno-common -fno-builtin -G0 \
+	-fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
+	$(call cc-option, -fno-asynchronous-unwind-tables) \
+	$(call cc-option, -fno-stack-protector)
+aflags-vdso := $(ccflags-vdso) \
+	-D__ASSEMBLY__ -Wa,-gdwarf-2
+
+ifneq ($(c-gettimeofday-y),)
+  CFLAGS_vgettimeofday.o += -include $(c-gettimeofday-y)
+endif
+
+# VDSO linker flags.
+ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
+	$(filter -E%,$(KBUILD_CFLAGS)) -nostdlib -shared \
+	--hash-style=sysv --build-id -T
+
+GCOV_PROFILE := n
+
+#
+# Shared build commands.
+#
+
+quiet_cmd_vdsold_and_vdso_check = LD      $@
+      cmd_vdsold_and_vdso_check = $(cmd_ld); $(cmd_vdso_check)
+
+quiet_cmd_vdsoas_o_S = AS       $@
+      cmd_vdsoas_o_S = $(CC) $(a_flags) -c -o $@ $<
+
+# Generate VDSO offsets using helper script
+gen-vdsosym := $(srctree)/$(src)/gen_vdso_offsets.sh
+quiet_cmd_vdsosym = VDSOSYM $@
+      cmd_vdsosym = $(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@
+
+include/generated/vdso-offsets.h: $(obj)/vdso.so.dbg FORCE
+	$(call if_changed,vdsosym)
+
+#
+# Build native VDSO.
+#
+
+native-abi := $(filter -mabi=%,$(KBUILD_CFLAGS))
+
+targets += $(obj-vdso-y)
+targets += vdso.lds vdso.so.dbg vdso.so
+
+obj-vdso := $(obj-vdso-y:%.o=$(obj)/%.o)
+
+$(obj-vdso): KBUILD_CFLAGS := $(cflags-vdso) $(native-abi)
+$(obj-vdso): KBUILD_AFLAGS := $(aflags-vdso) $(native-abi)
+
+$(obj)/vdso.lds: KBUILD_CPPFLAGS := $(ccflags-vdso) $(native-abi)
+
+$(obj)/vdso.so.dbg: $(obj)/vdso.lds $(obj-vdso) FORCE
+	$(call if_changed,vdsold_and_vdso_check)
+
+$(obj)/vdso.so: OBJCOPYFLAGS := -S
+$(obj)/vdso.so: $(obj)/vdso.so.dbg FORCE
+	$(call if_changed,objcopy)
+
+obj-y += vdso.o
+
+$(obj)/vdso.o : $(obj)/vdso.so
+
+# install commands for the unstripped file
+quiet_cmd_vdso_install = INSTALL $@
+      cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@
+
+vdso.so: $(obj)/vdso.so.dbg
+	@mkdir -p $(MODLIB)/vdso
+	$(call cmd,vdso_install)
+
+vdso_install: vdso.so
diff --git a/arch/loongarch/vdso/elf.S b/arch/loongarch/vdso/elf.S
new file mode 100644
index 000000000000..9bb21b9f9583
--- /dev/null
+++ b/arch/loongarch/vdso/elf.S
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/vdso/vdso.h>
+
+#include <linux/elfnote.h>
+#include <linux/version.h>
+
+ELFNOTE_START(Linux, 0, "a")
+	.long LINUX_VERSION_CODE
+ELFNOTE_END
diff --git a/arch/loongarch/vdso/gen_vdso_offsets.sh b/arch/loongarch/vdso/gen_vdso_offsets.sh
new file mode 100755
index 000000000000..1bb4e12642ff
--- /dev/null
+++ b/arch/loongarch/vdso/gen_vdso_offsets.sh
@@ -0,0 +1,13 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+#
+# Derived from RISC-V and ARM64:
+# Author: Will Deacon <will.deacon@arm.com>
+#
+# Match symbols in the DSO that look like VDSO_*; produce a header file
+# of constant offsets into the shared object.
+#
+
+LC_ALL=C sed -n -e 's/^00*/0/' -e \
+'s/^\([0-9a-fA-F]*\) . VDSO_\([a-zA-Z0-9_]*\)$/\#define vdso_offset_\2\t0x\1/p'
diff --git a/arch/loongarch/vdso/sigreturn.S b/arch/loongarch/vdso/sigreturn.S
new file mode 100644
index 000000000000..9cb3c58fad03
--- /dev/null
+++ b/arch/loongarch/vdso/sigreturn.S
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/vdso/vdso.h>
+
+#include <linux/linkage.h>
+#include <uapi/asm/unistd.h>
+
+#include <asm/regdef.h>
+#include <asm/asm.h>
+
+	.section	.text
+	.cfi_sections	.debug_frame
+
+SYM_FUNC_START(__vdso_rt_sigreturn)
+
+	li.w	a7, __NR_rt_sigreturn
+	syscall	0
+
+SYM_FUNC_END(__vdso_rt_sigreturn)
diff --git a/arch/loongarch/vdso/vdso.S b/arch/loongarch/vdso/vdso.S
new file mode 100644
index 000000000000..46789bade6ff
--- /dev/null
+++ b/arch/loongarch/vdso/vdso.S
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from RISC-V:
+ * Copyright (C) 2014 Regents of the University of California
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <asm/page.h>
+
+	__PAGE_ALIGNED_DATA
+
+	.globl vdso_start, vdso_end
+	.balign PAGE_SIZE
+vdso_start:
+	.incbin "arch/loongarch/vdso/vdso.so"
+	.balign PAGE_SIZE
+vdso_end:
+
+	.previous
diff --git a/arch/loongarch/vdso/vdso.lds.S b/arch/loongarch/vdso/vdso.lds.S
new file mode 100644
index 000000000000..955f02de4a2d
--- /dev/null
+++ b/arch/loongarch/vdso/vdso.lds.S
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+OUTPUT_FORMAT("elf64-loongarch", "elf64-loongarch", "elf64-loongarch")
+
+OUTPUT_ARCH(loongarch)
+
+SECTIONS
+{
+	PROVIDE(_start = .);
+	. = SIZEOF_HEADERS;
+
+	.hash		: { *(.hash) }			:text
+	.gnu.hash	: { *(.gnu.hash) }
+	.dynsym		: { *(.dynsym) }
+	.dynstr		: { *(.dynstr) }
+	.gnu.version	: { *(.gnu.version) }
+	.gnu.version_d	: { *(.gnu.version_d) }
+	.gnu.version_r	: { *(.gnu.version_r) }
+
+	.note		: { *(.note.*) }		:text :note
+
+	.text		: { *(.text*) }			:text
+	PROVIDE (__etext = .);
+	PROVIDE (_etext = .);
+	PROVIDE (etext = .);
+
+	.eh_frame_hdr	: { *(.eh_frame_hdr) }		:text :eh_frame_hdr
+	.eh_frame	: { KEEP (*(.eh_frame)) }	:text
+
+	.dynamic	: { *(.dynamic) }		:text :dynamic
+
+	.rodata		: { *(.rodata*) }		:text
+
+	_end = .;
+	PROVIDE(end = .);
+
+	/DISCARD/	: {
+		*(.gnu.attributes)
+		*(.note.GNU-stack)
+		*(.data .data.* .gnu.linkonce.d.* .sdata*)
+		*(.bss .sbss .dynbss .dynsbss)
+	}
+}
+
+PHDRS
+{
+	text		PT_LOAD		FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
+	note		PT_NOTE		FLAGS(4);		/* PF_R */
+	eh_frame_hdr	PT_GNU_EH_FRAME;
+}
+
+VERSION
+{
+	LINUX_5.10 {
+	global:
+		__vdso_clock_getres;
+		__vdso_clock_gettime;
+		__vdso_gettimeofday;
+		__vdso_rt_sigreturn;
+	local: *;
+	};
+}
+
+/*
+ * Make the sigreturn code visible to the kernel.
+ */
+VDSO_sigreturn		= __vdso_rt_sigreturn;
diff --git a/arch/loongarch/vdso/vgettimeofday.c b/arch/loongarch/vdso/vgettimeofday.c
new file mode 100644
index 000000000000..b1f4548dae92
--- /dev/null
+++ b/arch/loongarch/vdso/vgettimeofday.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * LoongArch userspace implementations of gettimeofday() and similar.
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/types.h>
+
+int __vdso_clock_gettime(clockid_t clock,
+			 struct __kernel_timespec *ts)
+{
+	return __cvdso_clock_gettime(clock, ts);
+}
+
+int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
+			struct timezone *tz)
+{
+	return __cvdso_gettimeofday(tv, tz);
+}
+
+int __vdso_clock_getres(clockid_t clock_id,
+			struct __kernel_timespec *res)
+{
+	return __cvdso_clock_getres(clock_id, res);
+}
-- 
2.27.0
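
As a rough illustration of the address layout that
arch_setup_additional_pages() and get_vdso_data() in the patch above rely
on, here is a small userspace C sketch. It is not kernel code: all names
are hypothetical, and the page size and randomization window are plain
parameters, since the values of PAGE_SIZE and VDSO_RANDOMIZE_SIZE depend
on the configuration and are not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Where the two special mappings end up inside the reserved area. */
struct vdso_layout {
	uint64_t data_addr;	/* start of the [vvar] data page */
	uint64_t code_addr;	/* start of the [vdso] image */
	uint64_t total_size;	/* data page plus image */
};

/* Round addr up to the next multiple of page_size (a power of two). */
static uint64_t page_align_up(uint64_t addr, uint64_t page_size)
{
	return (addr + page_size - 1) & ~(page_size - 1);
}

/*
 * Mirror of vdso_base(): start from a base address and, if randomization
 * is wanted, add a random offset within rand_size (a power of two), then
 * page-align the result.
 */
static uint64_t vdso_base_hint(uint64_t base, uint32_t rnd,
			       uint64_t rand_size, uint64_t page_size)
{
	base += rnd & (rand_size - 1);
	return page_align_up(base, page_size);
}

/* The data page comes first, the code image directly after it. */
static struct vdso_layout vdso_layout_at(uint64_t area_start,
					 uint64_t image_size,
					 uint64_t page_size)
{
	struct vdso_layout l;

	assert(page_size && (page_size & (page_size - 1)) == 0);
	assert(area_start % page_size == 0);
	assert(image_size % page_size == 0);

	l.data_addr = area_start;
	l.code_addr = area_start + page_size;
	l.total_size = page_size + image_size;

	return l;
}

/* Inverse step used from inside the vDSO: data page precedes the image. */
static uint64_t vdso_data_from_code(uint64_t code_addr, uint64_t page_size)
{
	return code_addr - page_size;
}
```

Because the data page always sits exactly one page below the image, the
vDSO code only needs its own PC-relative start address to find the data,
with no relocation or extra pointer passed in from the kernel.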



* [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (18 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 19/24] LoongArch: Add VDSO and VSYSCALL support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:56   ` Arnd Bergmann
  2022-04-30  9:05 ` [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support Huacai Chen
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds EFI stub (efistub) booting support, which is the standard
UEFI boot protocol for LoongArch.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/Kbuild                         |   3 +
 arch/loongarch/Kconfig                        |   8 +
 arch/loongarch/Makefile                       |  18 +-
 arch/loongarch/boot/Makefile                  |  23 +
 arch/loongarch/kernel/efi-header.S            | 100 +++++
 arch/loongarch/kernel/head.S                  |  44 +-
 arch/loongarch/kernel/image-vars.h            |  30 ++
 arch/loongarch/kernel/vmlinux.lds.S           |  23 +-
 drivers/firmware/efi/Kconfig                  |   4 +-
 drivers/firmware/efi/libstub/Makefile         |  14 +-
 drivers/firmware/efi/libstub/loongarch-stub.c | 425 ++++++++++++++++++
 include/linux/pe.h                            |   1 +
 12 files changed, 680 insertions(+), 13 deletions(-)
 create mode 100644 arch/loongarch/boot/Makefile
 create mode 100644 arch/loongarch/kernel/efi-header.S
 create mode 100644 arch/loongarch/kernel/image-vars.h
 create mode 100644 drivers/firmware/efi/libstub/loongarch-stub.c

diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
index 1ad35aabdd16..ab5373d0a24f 100644
--- a/arch/loongarch/Kbuild
+++ b/arch/loongarch/Kbuild
@@ -1,3 +1,6 @@
 obj-y += kernel/
 obj-y += mm/
 obj-y += vdso/
+
+# for cleaning
+subdir- += boot
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 44b763046893..55225ee5f868 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -265,6 +265,14 @@ config EFI
 	  resultant kernel should continue to boot on existing non-EFI
 	  platforms.
 
+config EFI_STUB
+	bool "EFI boot stub support"
+	default y
+	depends on EFI
+	help
+	  This kernel feature allows the kernel to be loaded directly by
+	  EFI firmware without the use of a bootloader.
+
 config FORCE_MAX_ZONEORDER
 	int "Maximum zone order"
 	range 14 64 if PAGE_SIZE_64KB
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index c4b3f53cd276..d88a792dafbe 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -3,6 +3,14 @@
 # Author: Huacai Chen <chenhuacai@loongson.cn>
 # Copyright (C) 2020-2022 Loongson Technology Corporation Limited
 
+boot	:= arch/loongarch/boot
+
+ifndef CONFIG_EFI_STUB
+KBUILD_IMAGE	= $(boot)/vmlinux
+else
+KBUILD_IMAGE	= $(boot)/vmlinux.efi
+endif
+
 #
 # Select the object file format to substitute into the linker script.
 #
@@ -30,8 +38,6 @@ ld-emul			= $(64bit-emul)
 cflags-y		+= -mabi=lp64s
 endif
 
-all-y			:= vmlinux
-
 #
 # GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
 # code since it only slows down the whole thing.  At some point we might make
@@ -75,6 +81,7 @@ endif
 head-y := arch/loongarch/kernel/head.o
 
 libs-y += arch/loongarch/lib/
+libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
 
 ifeq ($(KBUILD_EXTMOD),)
 prepare: vdso_prepare
@@ -86,12 +93,13 @@ PHONY += vdso_install
 vdso_install:
 	$(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
 
-all:	$(all-y)
+all:	$(KBUILD_IMAGE)
 
-CLEAN_FILES += vmlinux
+$(KBUILD_IMAGE): vmlinux
+	$(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
 
 install:
-	$(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
+	$(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
 	$(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
 	$(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
 
diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
new file mode 100644
index 000000000000..66f2293c34b2
--- /dev/null
+++ b/arch/loongarch/boot/Makefile
@@ -0,0 +1,23 @@
+#
+# arch/loongarch/boot/Makefile
+#
+# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+#
+
+drop-sections := .comment .note .options .note.gnu.build-id
+strip-flags   := $(addprefix --remove-section=,$(drop-sections)) -S
+
+targets := vmlinux
+quiet_cmd_strip = STRIP	  $@
+      cmd_strip = $(STRIP) -s $@
+
+$(obj)/vmlinux: vmlinux FORCE
+	$(call if_changed,copy)
+	$(call if_changed,strip)
+
+targets += vmlinux.efi
+quiet_cmd_eficopy = OBJCOPY $@
+      cmd_eficopy = $(OBJCOPY) -O binary $(strip-flags) $< $@
+
+$(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
+	$(call if_changed,eficopy)
diff --git a/arch/loongarch/kernel/efi-header.S b/arch/loongarch/kernel/efi-header.S
new file mode 100644
index 000000000000..ceb44524944a
--- /dev/null
+++ b/arch/loongarch/kernel/efi-header.S
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/pe.h>
+#include <linux/sizes.h>
+
+	.macro	__EFI_PE_HEADER
+	.long	PE_MAGIC
+coff_header:
+	.short	IMAGE_FILE_MACHINE_LOONGARCH		/* Machine */
+	.short	section_count				/* NumberOfSections */
+	.long	0 					/* TimeDateStamp */
+	.long	0					/* PointerToSymbolTable */
+	.long	0					/* NumberOfSymbols */
+	.short	section_table - optional_header		/* SizeOfOptionalHeader */
+	.short	IMAGE_FILE_DEBUG_STRIPPED | \
+		IMAGE_FILE_EXECUTABLE_IMAGE | \
+		IMAGE_FILE_LINE_NUMS_STRIPPED		/* Characteristics */
+
+optional_header:
+	.short	PE_OPT_MAGIC_PE32PLUS			/* PE32+ format */
+	.byte	0x02					/* MajorLinkerVersion */
+	.byte	0x14					/* MinorLinkerVersion */
+	.long	__inittext_end - efi_header_end		/* SizeOfCode */
+	.long	_end - __initdata_begin			/* SizeOfInitializedData */
+	.long	0					/* SizeOfUninitializedData */
+	.long	__efistub_efi_pe_entry - _head		/* AddressOfEntryPoint */
+	.long	efi_header_end - _head			/* BaseOfCode */
+
+extra_header_fields:
+	.quad	0					/* ImageBase */
+	.long	PECOFF_SEGMENT_ALIGN			/* SectionAlignment */
+	.long	PECOFF_FILE_ALIGN			/* FileAlignment */
+	.short	0					/* MajorOperatingSystemVersion */
+	.short	0					/* MinorOperatingSystemVersion */
+	.short	0					/* MajorImageVersion */
+	.short	0					/* MinorImageVersion */
+	.short	0					/* MajorSubsystemVersion */
+	.short	0					/* MinorSubsystemVersion */
+	.long	0					/* Win32VersionValue */
+
+	.long	_end - _head				/* SizeOfImage */
+
+	/* Everything before the kernel image is considered part of the header */
+	.long	efi_header_end - _head			/* SizeOfHeaders */
+	.long	0					/* CheckSum */
+	.short	IMAGE_SUBSYSTEM_EFI_APPLICATION		/* Subsystem */
+	.short	0					/* DllCharacteristics */
+	.quad	0					/* SizeOfStackReserve */
+	.quad	0					/* SizeOfStackCommit */
+	.quad	0					/* SizeOfHeapReserve */
+	.quad	0					/* SizeOfHeapCommit */
+	.long	0					/* LoaderFlags */
+	.long	(section_table - .) / 8			/* NumberOfRvaAndSizes */
+
+	.quad	0					/* ExportTable */
+	.quad	0					/* ImportTable */
+	.quad	0					/* ResourceTable */
+	.quad	0					/* ExceptionTable */
+	.quad	0					/* CertificationTable */
+	.quad	0					/* BaseRelocationTable */
+
+	/* Section table */
+section_table:
+	.ascii	".text\0\0\0"
+	.long	__inittext_end - efi_header_end		/* VirtualSize */
+	.long	efi_header_end - _head			/* VirtualAddress */
+	.long	__inittext_end - efi_header_end		/* SizeOfRawData */
+	.long	efi_header_end - _head			/* PointerToRawData */
+
+	.long	0					/* PointerToRelocations */
+	.long	0					/* PointerToLineNumbers */
+	.short	0					/* NumberOfRelocations */
+	.short	0					/* NumberOfLineNumbers */
+	.long	IMAGE_SCN_CNT_CODE | \
+		IMAGE_SCN_MEM_READ | \
+		IMAGE_SCN_MEM_EXECUTE			/* Characteristics */
+
+	.ascii	".data\0\0\0"
+	.long	_end - __initdata_begin			/* VirtualSize */
+	.long	__initdata_begin - _head		/* VirtualAddress */
+	.long	_edata - __initdata_begin		/* SizeOfRawData */
+	.long	__initdata_begin - _head		/* PointerToRawData */
+
+	.long	0					/* PointerToRelocations */
+	.long	0					/* PointerToLineNumbers */
+	.short	0					/* NumberOfRelocations */
+	.short	0					/* NumberOfLineNumbers */
+	.long	IMAGE_SCN_CNT_INITIALIZED_DATA | \
+		IMAGE_SCN_MEM_READ | \
+		IMAGE_SCN_MEM_WRITE			/* Characteristics */
+
+	.set	section_count, (. - section_table) / 40
+
+	.org 0x20e
+	.word kernel_version - 512 -  _head
+efi_header_end:
+	.endm
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
index b4a0b28da3e7..361b72e8bfc5 100644
--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -11,11 +11,53 @@
 #include <asm/regdef.h>
 #include <asm/loongarch.h>
 #include <asm/stackframe.h>
+#include <generated/compile.h>
+#include <generated/utsrelease.h>
 
-SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
+#ifdef CONFIG_EFI_STUB
+
+#include "efi-header.S"
+
+	__HEAD
+
+_head:
+	/* "MZ", MS-DOS header */
+	.word	MZ_MAGIC
+	.org	0x28
+	.ascii	"Loongson\0"
+	.org	0x3c
+	/* Offset to the PE header */
+	.long	pe_header - _head
+
+pe_header:
+	__EFI_PE_HEADER
+
+kernel_asize:
+	.long _end - _text
+
+kernel_fsize:
+	.long _edata - _text
+
+kernel_vaddr:
+	.quad VMLINUX_LOAD_ADDRESS
+
+kernel_offset:
+	.long kernel_offset - _text
+
+kernel_version:
+	.ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
+
+SYM_L_GLOBAL(kernel_asize)
+SYM_L_GLOBAL(kernel_fsize)
+SYM_L_GLOBAL(kernel_vaddr)
+SYM_L_GLOBAL(kernel_offset)
+
+#endif
 
 	__REF
 
+SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
+
 SYM_CODE_START(kernel_entry)			# kernel entry point
 
 	/* Config direct window and set PG */
diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
new file mode 100644
index 000000000000..0162402b6212
--- /dev/null
+++ b/arch/loongarch/kernel/image-vars.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __LOONGARCH_KERNEL_IMAGE_VARS_H
+#define __LOONGARCH_KERNEL_IMAGE_VARS_H
+
+#ifdef CONFIG_EFI_STUB
+
+__efistub_memcmp		= memcmp;
+__efistub_memcpy		= memcpy;
+__efistub_memmove		= memmove;
+__efistub_memset		= memset;
+__efistub_strcat		= strcat;
+__efistub_strcmp		= strcmp;
+__efistub_strlen		= strlen;
+__efistub_strncat		= strncat;
+__efistub_strnstr		= strnstr;
+__efistub_strnlen		= strnlen;
+__efistub_strpbrk		= strpbrk;
+__efistub_strsep		= strsep;
+__efistub_kernel_entry		= kernel_entry;
+__efistub_kernel_asize		= kernel_asize;
+__efistub_kernel_fsize		= kernel_fsize;
+__efistub_kernel_vaddr		= kernel_vaddr;
+__efistub_kernel_offset		= kernel_offset;
+
+#endif
+
+#endif /* __LOONGARCH_KERNEL_IMAGE_VARS_H */
diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
index 02abfaaa4892..7da4c4d7c50d 100644
--- a/arch/loongarch/kernel/vmlinux.lds.S
+++ b/arch/loongarch/kernel/vmlinux.lds.S
@@ -12,6 +12,14 @@
 #define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
 
 #include <asm-generic/vmlinux.lds.h>
+#include "image-vars.h"
+
+/*
+ * The maximum available page size is 64K, so we set the SectionAlignment
+ * field of the EFI application to 64K.
+ */
+PECOFF_FILE_ALIGN = 0x200;
+PECOFF_SEGMENT_ALIGN = 0x10000;
 
 OUTPUT_ARCH(loongarch)
 ENTRY(kernel_entry)
@@ -27,6 +35,9 @@ SECTIONS
 	. = VMLINUX_LOAD_ADDRESS;
 
 	_text = .;
+	HEAD_TEXT_SECTION
+
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
 	.text : {
 		TEXT_TEXT
 		SCHED_TEXT
@@ -38,11 +49,12 @@ SECTIONS
 		*(.fixup)
 		*(.gnu.warning)
 	} :text = 0
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
 	_etext = .;
 
 	EXCEPTION_TABLE(16)
 
-	. = ALIGN(PAGE_SIZE);
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
 	__init_begin = .;
 	__inittext_begin = .;
 
@@ -51,6 +63,7 @@ SECTIONS
 		EXIT_TEXT
 	}
 
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
 	__inittext_end = .;
 
 	__initdata_begin = .;
@@ -60,6 +73,10 @@ SECTIONS
 		EXIT_DATA
 	}
 
+	.init.bss : {
+		*(.init.bss)
+	}
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
 	__initdata_end = .;
 
 	__init_end = .;
@@ -71,11 +88,11 @@ SECTIONS
 	.sdata : {
 		*(.sdata)
 	}
-
-	. = ALIGN(SZ_64K);
+	.edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGN); }
 	_edata =  .;
 
 	BSS_SECTION(0, SZ_64K, 8)
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
 
 	_end = .;
 
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 2c3dac5ecb36..ecb4e0b1295a 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -121,9 +121,9 @@ config EFI_ARMSTUB_DTB_LOADER
 
 config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
 	bool "Enable the command line initrd loader" if !X86
-	depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
-	default y if X86
 	depends on !RISCV
+	depends on EFI_STUB && (EFI_GENERIC_STUB || X86 || LOONGARCH)
+	default y if (X86 || LOONGARCH)
 	help
 	  Select this config option to add support for the initrd= command
 	  line parameter, allowing an initrd that resides on the same volume
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index d0537573501e..663e9d317299 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -26,6 +26,8 @@ cflags-$(CONFIG_ARM)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 				   $(call cc-option,-mno-single-pic-base)
 cflags-$(CONFIG_RISCV)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 				   -fpic
+cflags-$(CONFIG_LOONGARCH)	:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
+				   -fpic
 
 cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
 
@@ -55,7 +57,7 @@ KCOV_INSTRUMENT			:= n
 lib-y				:= efi-stub-helper.o gop.o secureboot.o tpm.o \
 				   file.o mem.o random.o randomalloc.o pci.o \
 				   skip_spaces.o lib-cmdline.o lib-ctype.o \
-				   alignedmem.o relocate.o vsprintf.o
+				   alignedmem.o relocate.o string.o vsprintf.o
 
 # include the stub's generic dependencies from lib/ when building for ARM/arm64
 efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
@@ -63,13 +65,15 @@ efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
 $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
 	$(call if_changed_rule,cc_o_c)
 
-lib-$(CONFIG_EFI_GENERIC_STUB)	+= efi-stub.o fdt.o string.o \
+lib-$(CONFIG_EFI_GENERIC_STUB)	+= efi-stub.o fdt.o \
 				   $(patsubst %.c,lib-%.o,$(efi-deps-y))
 
 lib-$(CONFIG_ARM)		+= arm32-stub.o
 lib-$(CONFIG_ARM64)		+= arm64-stub.o
 lib-$(CONFIG_X86)		+= x86-stub.o
 lib-$(CONFIG_RISCV)		+= riscv-stub.o
+lib-$(CONFIG_LOONGARCH)		+= loongarch-stub.o
+
 CFLAGS_arm32-stub.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 
 # Even when -mbranch-protection=none is set, Clang will generate a
@@ -125,6 +129,12 @@ STUBCOPY_FLAGS-$(CONFIG_RISCV)	+= --prefix-alloc-sections=.init \
 				   --prefix-symbols=__efistub_
 STUBCOPY_RELOC-$(CONFIG_RISCV)	:= R_RISCV_HI20
 
+# For LoongArch, keep all the symbols in the .init section and make sure that
+# no absolute symbol references exist.
+STUBCOPY_FLAGS-$(CONFIG_LOONGARCH)	+= --prefix-alloc-sections=.init \
+					   --prefix-symbols=__efistub_
+STUBCOPY_RELOC-$(CONFIG_LOONGARCH)	:= R_LARCH_MARK_LA
+
 $(obj)/%.stub.o: $(obj)/%.o FORCE
 	$(call if_changed,stubcopy)
 
diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
new file mode 100644
index 000000000000..399641a0b0cb
--- /dev/null
+++ b/drivers/firmware/efi/libstub/loongarch-stub.c
@@ -0,0 +1,425 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Yun Liu <liuyun@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/efi.h>
+#include <linux/sort.h>
+#include <asm/efi.h>
+#include <asm/addrspace.h>
+#include <asm/boot_param.h>
+#include "efistub.h"
+
+#define MAX_ARG_COUNT		128
+#define CMDLINE_MAX_SIZE	0x200
+
+static int argc;
+static char **argv;
+const efi_system_table_t *efi_system_table;
+static efi_guid_t screen_info_guid = LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID;
+static unsigned int map_entry[LOONGSON3_BOOT_MEM_MAP_MAX];
+static struct efi_mmap mmap_array[EFI_MAX_MEMORY_TYPE][LOONGSON3_BOOT_MEM_MAP_MAX];
+
+struct exit_boot_struct {
+	struct boot_params *bp;
+	unsigned int *runtime_entry_count;
+};
+
+typedef void (*kernel_entry_t)(int argc, char *argv[], struct boot_params *boot_p);
+
+extern int kernel_asize;
+extern int kernel_fsize;
+extern int kernel_offset;
+extern unsigned long kernel_vaddr;
+extern kernel_entry_t kernel_entry;
+
+unsigned char efi_crc8(char *buff, int size)
+{
+	int sum, cnt;
+
+	for (sum = 0, cnt = 0; cnt < size; cnt++)
+		sum = (char) (sum + *(buff + cnt));
+
+	return (char)(0x100 - sum);
+}
+
+struct screen_info *alloc_screen_info(void)
+{
+	efi_status_t status;
+	struct screen_info *si;
+
+	status = efi_bs_call(allocate_pool,
+			EFI_RUNTIME_SERVICES_DATA, sizeof(*si), (void **)&si);
+	if (status != EFI_SUCCESS)
+		return NULL;
+
+	status = efi_bs_call(install_configuration_table, &screen_info_guid, si);
+	if (status == EFI_SUCCESS)
+		return si;
+
+	efi_bs_call(free_pool, si);
+
+	return NULL;
+}
+
+static void setup_graphics(void)
+{
+	unsigned long size;
+	efi_status_t status;
+	efi_guid_t gop_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
+	void **gop_handle = NULL;
+	struct screen_info *si = NULL;
+
+	size = 0;
+	status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL,
+				&gop_proto, NULL, &size, gop_handle);
+	if (status == EFI_BUFFER_TOO_SMALL) {
+		si = alloc_screen_info();
+		efi_setup_gop(si, &gop_proto, size);
+	}
+}
+
+struct boot_params *bootparams_init(efi_system_table_t *sys_table)
+{
+	efi_status_t status;
+	struct boot_params *p;
+	unsigned char sig[8] = {'B', 'P', 'I', '0', '1', '0', '0', '2'};
+
+	status = efi_bs_call(allocate_pool, EFI_RUNTIME_SERVICES_DATA, SZ_64K, (void **)&p);
+	if (status != EFI_SUCCESS)
+		return NULL;
+
+	memset(p, 0, SZ_64K);
+	memcpy(&p->signature, sig, sizeof(long));
+
+	return p;
+}
+
+static unsigned long convert_priv_cmdline(char *cmdline_ptr,
+		unsigned long rd_addr, unsigned long rd_size)
+{
+	unsigned int rdprev_size;
+	unsigned int cmdline_size;
+	efi_status_t status;
+	char *pstr, *substr;
+	char *initrd_ptr = NULL;
+	char convert_str[CMDLINE_MAX_SIZE];
+	static char cmdline_array[CMDLINE_MAX_SIZE];
+
+	cmdline_size = strlen(cmdline_ptr);
+	snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel ");
+
+	initrd_ptr = strstr(cmdline_ptr, "initrd=");
+	if (!initrd_ptr) {
+		snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel %s", cmdline_ptr);
+		goto completed;
+	}
+	snprintf(convert_str, CMDLINE_MAX_SIZE, " initrd=0x%lx,0x%lx", rd_addr, rd_size);
+	rdprev_size = cmdline_size - strlen(initrd_ptr);
+	strncat(cmdline_array, cmdline_ptr, rdprev_size);
+
+	cmdline_ptr = strnstr(initrd_ptr, " ", CMDLINE_MAX_SIZE);
+	strcat(cmdline_array, convert_str);
+	if (!cmdline_ptr)
+		goto completed;
+
+	strcat(cmdline_array, cmdline_ptr);
+
+completed:
+	status = efi_allocate_pages((MAX_ARG_COUNT + 1) * (sizeof(char *)),
+					(unsigned long *)&argv, ULONG_MAX);
+	if (status != EFI_SUCCESS) {
+		efi_err("Alloc argv mmap_array error\n");
+		return status;
+	}
+
+	argc = 0;
+	pstr = cmdline_array;
+
+	substr = strsep(&pstr, " \t");
+	while (substr != NULL) {
+		if (strlen(substr)) {
+			argv[argc++] = substr;
+			if (argc == MAX_ARG_COUNT) {
+				efi_err("Argv mmap_array full!\n");
+				break;
+			}
+		}
+		substr = strsep(&pstr, " \t");
+	}
+
+	return EFI_SUCCESS;
+}
+
+unsigned int efi_memmap_sort(struct loongsonlist_mem_map *memmap,
+			unsigned int index, unsigned int mem_type)
+{
+	unsigned int i, t;
+	unsigned long msize;
+
+	for (i = 0; i < map_entry[mem_type]; i = t) {
+		msize = mmap_array[mem_type][i].mem_size;
+		for (t = i + 1; t < map_entry[mem_type]; t++) {
+			if (mmap_array[mem_type][i].mem_start + msize <
+					mmap_array[mem_type][t].mem_start)
+				break;
+
+			msize += mmap_array[mem_type][t].mem_size;
+		}
+		memmap->map[index].mem_type = mem_type;
+		memmap->map[index].mem_start = mmap_array[mem_type][i].mem_start;
+		memmap->map[index].mem_size = msize;
+		memmap->map[index].attribute = mmap_array[mem_type][i].attribute;
+		index++;
+	}
+
+	return index;
+}
+
+static efi_status_t mk_mmap(struct efi_boot_memmap *map, struct boot_params *p)
+{
+	char checksum;
+	unsigned int i;
+	unsigned int nr_desc;
+	unsigned int mem_type;
+	unsigned long count;
+	efi_memory_desc_t *mem_desc;
+	struct loongsonlist_mem_map *mhp = NULL;
+
+	memset(map_entry, 0, sizeof(map_entry));
+	memset(mmap_array, 0, sizeof(mmap_array));
+
+	if (!strncmp((char *)p, "BPI", 3)) {
+		p->flags |= BPI_FLAGS_UEFI_SUPPORTED;
+		p->systemtable = (efi_system_table_t *)efi_system_table;
+		p->extlist_offset = sizeof(*p) + sizeof(unsigned long);
+		mhp = (struct loongsonlist_mem_map *)((char *)p + p->extlist_offset);
+
+		memcpy(&mhp->header.signature, "MEM", sizeof(unsigned long));
+		mhp->header.length = sizeof(*mhp);
+		mhp->desc_version = *map->desc_ver;
+		mhp->map_count = 0;
+	}
+	if (!(*(map->map_size)) || !(*(map->desc_size)) || !mhp) {
+		efi_err("get memory info error\n");
+		return EFI_INVALID_PARAMETER;
+	}
+	nr_desc = *(map->map_size) / *(map->desc_size);
+
+	/*
+	 * Per the UEFI spec, the memory map retrieved here is accurate, so
+	 * we can now fill the platform-specific structure (mmap_array).
+	 */
+	for (i = 0; i < nr_desc; i++) {
+		mem_desc = (efi_memory_desc_t *)((void *)(*map->map) + (i * (*(map->desc_size))));
+		switch (mem_desc->type) {
+		case EFI_RESERVED_TYPE:
+		case EFI_RUNTIME_SERVICES_CODE:
+		case EFI_RUNTIME_SERVICES_DATA:
+		case EFI_MEMORY_MAPPED_IO:
+		case EFI_MEMORY_MAPPED_IO_PORT_SPACE:
+		case EFI_UNUSABLE_MEMORY:
+		case EFI_PAL_CODE:
+			mem_type = ADDRESS_TYPE_RESERVED;
+			break;
+
+		case EFI_ACPI_MEMORY_NVS:
+			mem_type = ADDRESS_TYPE_NVS;
+			break;
+
+		case EFI_ACPI_RECLAIM_MEMORY:
+			mem_type = ADDRESS_TYPE_ACPI;
+			break;
+
+		case EFI_LOADER_CODE:
+		case EFI_LOADER_DATA:
+		case EFI_PERSISTENT_MEMORY:
+		case EFI_BOOT_SERVICES_CODE:
+		case EFI_BOOT_SERVICES_DATA:
+		case EFI_CONVENTIONAL_MEMORY:
+			mem_type = ADDRESS_TYPE_SYSRAM;
+			break;
+
+		default:
+			continue;
+		}
+
+		mmap_array[mem_type][map_entry[mem_type]].mem_type = mem_type;
+		mmap_array[mem_type][map_entry[mem_type]].mem_start =
+						mem_desc->phys_addr & TO_PHYS_MASK;
+		mmap_array[mem_type][map_entry[mem_type]].mem_size =
+						mem_desc->num_pages << EFI_PAGE_SHIFT;
+		mmap_array[mem_type][map_entry[mem_type]].attribute =
+						mem_desc->attribute;
+		map_entry[mem_type]++;
+	}
+
+	count = mhp->map_count;
+	/* Sort EFI memmap and add to BPI for kernel */
+	for (i = 0; i < LOONGSON3_BOOT_MEM_MAP_MAX; i++) {
+		if (!map_entry[i])
+			continue;
+		count = efi_memmap_sort(mhp, count, i);
+	}
+
+	mhp->map_count = count;
+	mhp->header.checksum = 0;
+
+	checksum = efi_crc8((char *)mhp, mhp->header.length);
+	mhp->header.checksum = checksum;
+
+	return EFI_SUCCESS;
+}
+
+static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
+{
+	efi_status_t status;
+	struct exit_boot_struct *p = priv;
+
+	status = mk_mmap(map, p->bp);
+	if (status != EFI_SUCCESS) {
+		efi_err("Make kernel memory map failed!\n");
+		return status;
+	}
+
+	return EFI_SUCCESS;
+}
+
+static efi_status_t exit_boot_services(struct boot_params *boot_params, void *handle)
+{
+	unsigned int desc_version;
+	unsigned int runtime_entry_count = 0;
+	unsigned long map_size, key, desc_size, buff_size;
+	efi_status_t status;
+	efi_memory_desc_t *mem_map;
+	struct efi_boot_memmap map;
+	struct exit_boot_struct priv;
+
+	map.map			= &mem_map;
+	map.map_size		= &map_size;
+	map.desc_size		= &desc_size;
+	map.desc_ver		= &desc_version;
+	map.key_ptr		= &key;
+	map.buff_size		= &buff_size;
+	status = efi_get_memory_map(&map);
+	if (status != EFI_SUCCESS) {
+		efi_err("Unable to retrieve UEFI memory map.\n");
+		return status;
+	}
+
+	priv.bp = boot_params;
+	priv.runtime_entry_count = &runtime_entry_count;
+
+	/* Might as well exit boot services now */
+	status = efi_exit_boot_services(handle, &map, &priv, exit_boot_func);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	return EFI_SUCCESS;
+}
+
+/*
+ * EFI entry point for the LoongArch EFI stub.
+ */
+efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table)
+{
+	unsigned int cmdline_size = 0;
+	unsigned long kernel_addr = 0;
+	unsigned long initrd_addr = 0;
+	unsigned long initrd_size = 0;
+	enum efi_secureboot_mode secure_boot;
+	char *cmdline_ptr = NULL;
+	struct boot_params *boot_p;
+	efi_status_t status;
+	efi_loaded_image_t *image;
+	efi_guid_t loaded_image_proto;
+	kernel_entry_t real_kernel_entry;
+
+	/* Config Direct Mapping */
+	csr_writeq(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
+	csr_writeq(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
+
+	efi_system_table = sys_table;
+	loaded_image_proto = LOADED_IMAGE_PROTOCOL_GUID;
+	kernel_addr = (unsigned long)&kernel_offset - kernel_offset;
+	real_kernel_entry = (kernel_entry_t)
+		((unsigned long)&kernel_entry - kernel_addr + kernel_vaddr);
+
+	/* Check if we were booted by the EFI firmware */
+	if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
+		goto fail;
+
+	/*
+	 * Get a handle to the loaded image protocol.  This is used to get
+	 * information about the running image, such as size and the command
+	 * line.
+	 */
+	status = sys_table->boottime->handle_protocol(handle,
+					&loaded_image_proto, (void *)&image);
+	if (status != EFI_SUCCESS) {
+		efi_err("Failed to get loaded image protocol\n");
+		goto fail;
+	}
+
+	/* Get the command line from EFI, using the LOADED_IMAGE protocol. */
+	cmdline_ptr = efi_convert_cmdline(image, &cmdline_size);
+	if (!cmdline_ptr) {
+		efi_err("Getting command line failed!\n");
+		goto fail_free_cmdline;
+	}
+
+#ifdef CONFIG_CMDLINE_BOOL
+	if (cmdline_size == 0)
+		efi_parse_options(CONFIG_CMDLINE);
+#endif
+	if (!IS_ENABLED(CONFIG_CMDLINE_OVERRIDE) && cmdline_size > 0)
+		efi_parse_options(cmdline_ptr);
+
+	efi_info("Booting Linux Kernel...\n");
+
+	efi_relocate_kernel(&kernel_addr, kernel_fsize, kernel_asize,
+			    PHYSADDR(kernel_vaddr), SZ_2M, PHYSADDR(kernel_vaddr));
+
+	setup_graphics();
+	secure_boot = efi_get_secureboot();
+	efi_enable_reset_attack_mitigation();
+
+	status = efi_load_initrd(image, &initrd_addr, &initrd_size, SZ_4G, ULONG_MAX);
+	if (status != EFI_SUCCESS) {
+		efi_err("Failed to get initrd addr!\n");
+		goto fail_free;
+	}
+
+	status = convert_priv_cmdline(cmdline_ptr, initrd_addr, initrd_size);
+	if (status != EFI_SUCCESS) {
+		efi_err("Convert cmdline failed!\n");
+		goto fail_free;
+	}
+
+	boot_p = bootparams_init(sys_table);
+	if (!boot_p) {
+		efi_err("Create BPI struct error!\n");
+		goto fail;
+	}
+
+	status = exit_boot_services(boot_p, handle);
+	if (status != EFI_SUCCESS) {
+		efi_err("exit_boot services failed!\n");
+		goto fail_free;
+	}
+
+	real_kernel_entry(argc, argv, boot_p);
+
+	return EFI_SUCCESS;
+
+fail_free:
+	efi_free(initrd_size, initrd_addr);
+
+fail_free_cmdline:
+	efi_free(cmdline_size, (unsigned long)cmdline_ptr);
+
+fail:
+	return status;
+}
diff --git a/include/linux/pe.h b/include/linux/pe.h
index daf09ffffe38..f4bb0b6a416d 100644
--- a/include/linux/pe.h
+++ b/include/linux/pe.h
@@ -65,6 +65,7 @@
 #define	IMAGE_FILE_MACHINE_SH5		0x01a8
 #define	IMAGE_FILE_MACHINE_THUMB	0x01c2
 #define	IMAGE_FILE_MACHINE_WCEMIPSV2	0x0169
+#define	IMAGE_FILE_MACHINE_LOONGARCH	0x6264
 
 /* flags */
 #define IMAGE_FILE_RELOCS_STRIPPED           0x0001
-- 
2.27.0



* [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (19 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 20/24] LoongArch: Add efistub booting support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30 10:07     ` Arnd Bergmann
  2022-04-30  9:05 ` [PATCH V9 22/24] LoongArch: Add multi-processor (SMP) support Huacai Chen
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds zboot (self-extracting compressed kernel) support. All
existing in-kernel compression algorithms and efistub are supported.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/Kbuild                         |   2 +-
 arch/loongarch/Kconfig                        |  11 ++
 arch/loongarch/Makefile                       |  26 ++-
 arch/loongarch/boot/Makefile                  |  55 ++++++
 arch/loongarch/boot/boot.lds.S                |  64 +++++++
 arch/loongarch/boot/decompress.c              |  98 +++++++++++
 arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
 arch/loongarch/boot/zheader.S                 | 100 +++++++++++
 arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
 arch/loongarch/tools/Makefile                 |  15 ++
 arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
 arch/loongarch/tools/elf-entry.c              |  66 +++++++
 12 files changed, 749 insertions(+), 4 deletions(-)
 create mode 100644 arch/loongarch/boot/boot.lds.S
 create mode 100644 arch/loongarch/boot/decompress.c
 create mode 100644 arch/loongarch/boot/string.c
 create mode 100644 arch/loongarch/boot/zheader.S
 create mode 100644 arch/loongarch/boot/zkernel.S
 create mode 100644 arch/loongarch/tools/Makefile
 create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
 create mode 100644 arch/loongarch/tools/elf-entry.c
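
Note on zload-y in arch/loongarch/boot/Makefile: the value comes from the
calc_vmlinuz_load_addr host tool, which is given vmlinux.bin and
VMLINUX_LOAD_ADDRESS. The sketch below is not a copy of that tool. It only
illustrates one plausible placement rule, assuming the decompressor is
linked at the first 64K-aligned address above the end of the uncompressed
image so that the two never overlap. The function name and the 64K
alignment are assumptions made for illustration.

```c
#include <stdint.h>

#define SZ_64K 0x10000ULL

/*
 * Hypothetical helper: return the first 64K-aligned address at or above
 * the end of an image of vmlinux_size bytes loaded at vmlinux_load_addr.
 */
static uint64_t vmlinuz_load_addr(uint64_t vmlinux_load_addr,
				  uint64_t vmlinux_size)
{
	uint64_t end = vmlinux_load_addr + vmlinux_size;

	/* Round up to the next 64K boundary (no-op if already aligned). */
	return (end + SZ_64K - 1) & ~(SZ_64K - 1);
}
```

With load-y = 0x9000000000200000 and a made-up image size of 0x1234567
bytes, this yields 0x9000000001440000.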

diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
index ab5373d0a24f..d907fdd7ca08 100644
--- a/arch/loongarch/Kbuild
+++ b/arch/loongarch/Kbuild
@@ -3,4 +3,4 @@ obj-y += mm/
 obj-y += vdso/
 
 # for cleaning
-subdir- += boot
+subdir- += boot tools
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 55225ee5f868..6c1042746b2d 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -107,6 +107,7 @@ config LOONGARCH
 	select PERF_USE_VMALLOC
 	select RTC_LIB
 	select SPARSE_IRQ
+	select SYS_SUPPORTS_ZBOOT
 	select SYSCTL_EXCEPTION_TRACE
 	select SWIOTLB
 	select TRACE_IRQFLAGS_SUPPORT
@@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
 	bool
 	default y
 
+config SYS_SUPPORTS_ZBOOT
+	bool
+	select HAVE_KERNEL_GZIP
+	select HAVE_KERNEL_BZIP2
+	select HAVE_KERNEL_LZ4
+	select HAVE_KERNEL_LZMA
+	select HAVE_KERNEL_LZO
+	select HAVE_KERNEL_XZ
+	select HAVE_KERNEL_ZSTD
+
 config MACH_LOONGSON32
 	def_bool 32BIT
 
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index d88a792dafbe..1ed5b8466565 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -5,12 +5,31 @@
 
 boot	:= arch/loongarch/boot
 
+ifndef CONFIG_SYS_SUPPORTS_ZBOOT
+
 ifndef CONFIG_EFI_STUB
 KBUILD_IMAGE	= $(boot)/vmlinux
 else
 KBUILD_IMAGE	= $(boot)/vmlinux.efi
 endif
 
+else
+
+ifndef CONFIG_EFI_STUB
+KBUILD_IMAGE	= $(boot)/vmlinuz
+else
+KBUILD_IMAGE	= $(boot)/vmlinuz.efi
+endif
+
+endif
+
+load-y		= 0x9000000000200000
+bootvars-y	= VMLINUX_LOAD_ADDRESS=$(load-y)
+
+archscripts: scripts_basic
+	$(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
+	$(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
+
 #
 # Select the object file format to substitute into the linker script.
 #
@@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE		+= -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
 cflags-y += -ffreestanding
 cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
 
-load-y		= 0x9000000000200000
-bootvars-y	= VMLINUX_LOAD_ADDRESS=$(load-y)
-
 drivers-$(CONFIG_PCI)		+= arch/loongarch/pci/
 
 KBUILD_AFLAGS	+= $(cflags-y)
@@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
 	$(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
 
 install:
+ifndef CONFIG_SYS_SUPPORTS_ZBOOT
 	$(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
+else
+	$(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
+endif
 	$(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
 	$(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
 
diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
index 66f2293c34b2..c26a36004ae2 100644
--- a/arch/loongarch/boot/Makefile
+++ b/arch/loongarch/boot/Makefile
@@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
 
 $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
 	$(call if_changed,eficopy)
+
+# zboot
+extra-y	+= boot.lds
+$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
+CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
+
+entry-y	= $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
+zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
+				$(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
+
+BOOT_HEAP_SIZE	:= 0x400000
+BOOT_STACK_SIZE	:= 0x002000
+
+KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
+	-DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
+	-DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
+
+KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
+	-DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
+	-DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
+
+targets += vmlinux.bin
+OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
+$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
+	$(call if_changed,objcopy)
+
+tool_$(CONFIG_KERNEL_GZIP)    = gzip
+tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
+tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
+tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
+tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
+tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
+tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
+
+targets += vmlinux.bin.z
+$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
+	$(call if_changed,$(tool_y))
+
+vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
+targets += $(notdir $(vmlinuzobjs-y))
+vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
+$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
+AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
+
+quiet_cmd_zld = LD      $@
+      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
+
+targets += vmlinuz
+$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
+	$(call if_changed,zld)
+	$(call if_changed,strip)
+
+targets += vmlinuz.efi
+$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
+	$(call if_changed,eficopy)
diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
new file mode 100644
index 000000000000..23e698782afd
--- /dev/null
+++ b/arch/loongarch/boot/boot.lds.S
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * ld.script for compressed kernel support of LoongArch
+ *
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include "../kernel/image-vars.h"
+
+/*
+ * The maximum available page size is 64K, so we set the SectionAlignment
+ * field of the EFI application to 64K.
+ */
+PECOFF_FILE_ALIGN = 0x200;
+PECOFF_SEGMENT_ALIGN = 0x10000;
+
+OUTPUT_ARCH(loongarch)
+ENTRY(kernel_entry)
+PHDRS {
+	text PT_LOAD FLAGS(7); /* RWX */
+}
+SECTIONS
+{
+	. = VMLINUZ_LOAD_ADDRESS;
+
+	_text = .;
+	.head.text : {
+		*(.head.text)
+	}
+
+	.text : {
+		*(.text)
+		*(.init.text)
+		*(.rodata)
+	}: text
+
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
+	_data = .;
+	.data : {
+		*(.data)
+		*(.init.data)
+		/* Put the compressed image here */
+		__image_begin = .;
+		*(.image)
+		__image_end = .;
+		CONSTRUCTORS
+		. = ALIGN(PECOFF_FILE_ALIGN);
+	}
+	_edata = .;
+
+	.bss : {
+		*(.bss)
+		*(.init.bss)
+	}
+	. = ALIGN(PECOFF_SEGMENT_ALIGN);
+	_end = .;
+
+	/DISCARD/ : {
+		*(.options)
+		*(.comment)
+		*(.note)
+	}
+}
diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
new file mode 100644
index 000000000000..8f55fcd8f285
--- /dev/null
+++ b/arch/loongarch/boot/decompress.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/libfdt.h>
+
+#include <asm/addrspace.h>
+
+/*
+ * These two variables specify the free mem region
+ * that can be used for temporary malloc area
+ */
+unsigned long free_mem_ptr;
+unsigned long free_mem_end_ptr;
+
+/* The linker tells us where the image is. */
+extern unsigned char __image_begin, __image_end;
+
+#define puts(s) do {} while (0)
+#define puthex(val) do {} while (0)
+
+void error(char *x)
+{
+	puts("\n\n");
+	puts(x);
+	puts("\n\n -- System halted");
+
+	while (1)
+		;	/* Halt */
+}
+
+/* activate the code for pre-boot environment */
+#define STATIC static
+
+#include "../../../../lib/ashldi3.c"
+
+#ifdef CONFIG_KERNEL_GZIP
+#include "../../../../lib/decompress_inflate.c"
+#endif
+
+#ifdef CONFIG_KERNEL_BZIP2
+#include "../../../../lib/decompress_bunzip2.c"
+#endif
+
+#ifdef CONFIG_KERNEL_LZ4
+#include "../../../../lib/decompress_unlz4.c"
+#endif
+
+#ifdef CONFIG_KERNEL_LZMA
+#include "../../../../lib/decompress_unlzma.c"
+#endif
+
+#ifdef CONFIG_KERNEL_LZO
+#include "../../../../lib/decompress_unlzo.c"
+#endif
+
+#ifdef CONFIG_KERNEL_XZ
+#include "../../../../lib/decompress_unxz.c"
+#endif
+
+#ifdef CONFIG_KERNEL_ZSTD
+#include "../../../../lib/decompress_unzstd.c"
+#endif
+
+void decompress_kernel(unsigned long boot_heap_start)
+{
+	unsigned long zimage_start, zimage_size;
+
+	zimage_start = (unsigned long)(&__image_begin);
+	zimage_size = (unsigned long)(&__image_end) -
+	    (unsigned long)(&__image_begin);
+
+	puts("zimage at:     ");
+	puthex(zimage_start);
+	puts(" ");
+	puthex(zimage_size + zimage_start);
+	puts("\n");
+
+	/* This area is reserved for malloc during decompression */
+	free_mem_ptr = boot_heap_start;
+	free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
+
+	/* Display standard Linux/LoongArch boot prompt */
+	puts("Uncompressing Linux at load address ");
+	puthex(VMLINUX_LOAD_ADDRESS);
+	puts("\n");
+
+	/* Decompress the kernel with the configured algorithm */
+	__decompress((char *)zimage_start, zimage_size, 0, 0,
+		   (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
+
+	puts("Now, booting the kernel...\n");
+}
diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
new file mode 100644
index 000000000000..3f746e7c2bb5
--- /dev/null
+++ b/arch/loongarch/boot/string.c
@@ -0,0 +1,166 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arch/loongarch/boot/string.c
+ *
+ * Very small subset of simple string routines
+ */
+
+#include <linux/types.h>
+
+void __weak *memset(void *s, int c, size_t n)
+{
+	size_t i;
+	char *ss = s;
+
+	for (i = 0; i < n; i++)
+		ss[i] = c;
+	return s;
+}
+
+void __weak *memcpy(void *dest, const void *src, size_t n)
+{
+	size_t i;
+	const char *s = src;
+	char *d = dest;
+
+	for (i = 0; i < n; i++)
+		d[i] = s[i];
+	return dest;
+}
+
+void __weak *memmove(void *dest, const void *src, size_t n)
+{
+	size_t i;
+	const char *s = src;
+	char *d = dest;
+
+	if (d < s) {
+		for (i = 0; i < n; i++)
+			d[i] = s[i];
+	} else if (d > s) {
+		for (i = n; i-- > 0; )
+			d[i] = s[i];
+	}
+
+	return dest;
+}
+
+int __weak memcmp(const void *cs, const void *ct, size_t count)
+{
+	int res = 0;
+	const unsigned char *su1, *su2;
+
+	for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
+		res = *su1 - *su2;
+		if (res != 0)
+			break;
+	}
+	return res;
+}
+
+int __weak strcmp(const char *str1, const char *str2)
+{
+	int delta = 0;
+	const unsigned char *s1 = (const unsigned char *)str1;
+	const unsigned char *s2 = (const unsigned char *)str2;
+
+	while (*s1 || *s2) {
+		delta = *s1 - *s2;
+		if (delta)
+			return delta;
+		s1++;
+		s2++;
+	}
+	return 0;
+}
+
+size_t __weak strlen(const char *s)
+{
+	const char *sc;
+
+	for (sc = s; *sc != '\0'; ++sc)
+		/* nothing */;
+	return sc - s;
+}
+
+size_t __weak strnlen(const char *s, size_t count)
+{
+	const char *sc;
+
+	for (sc = s; count-- && *sc != '\0'; ++sc)
+		/* nothing */;
+	return sc - s;
+}
+
+char * __weak strnstr(const char *s1, const char *s2, size_t len)
+{
+	size_t l2;
+
+	l2 = strlen(s2);
+	if (!l2)
+		return (char *)s1;
+	while (len >= l2) {
+		len--;
+		if (!memcmp(s1, s2, l2))
+			return (char *)s1;
+		s1++;
+	}
+	return NULL;
+}
+
+#undef strcat
+char * __weak strcat(char *dest, const char *src)
+{
+	char *tmp = dest;
+
+	while (*dest)
+		dest++;
+	while ((*dest++ = *src++) != '\0')
+		;
+	return tmp;
+}
+
+char * __weak strncat(char *dest, const char *src, size_t count)
+{
+	char *tmp = dest;
+
+	if (count) {
+		while (*dest)
+			dest++;
+		while ((*dest++ = *src++) != 0) {
+			if (--count == 0) {
+				*dest = '\0';
+				break;
+			}
+		}
+	}
+	return tmp;
+}
+
+char * __weak strpbrk(const char *cs, const char *ct)
+{
+	const char *sc1, *sc2;
+
+	for (sc1 = cs; *sc1 != '\0'; ++sc1) {
+		for (sc2 = ct; *sc2 != '\0'; ++sc2) {
+			if (*sc1 == *sc2)
+				return (char *)sc1;
+		}
+	}
+	return NULL;
+}
+
+char * __weak strsep(char **s, const char *ct)
+{
+	char *sbegin = *s;
+	char *end;
+
+	if (sbegin == NULL)
+		return NULL;
+
+	end = strpbrk(sbegin, ct);
+	if (end)
+		*end++ = '\0';
+	*s = end;
+	return sbegin;
+}
diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
new file mode 100644
index 000000000000..4bc50d953ec7
--- /dev/null
+++ b/arch/loongarch/boot/zheader.S
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/pe.h>
+#include <linux/sizes.h>
+
+	.macro	__EFI_PE_HEADER
+	.long	PE_MAGIC
+coff_header:
+	.short	IMAGE_FILE_MACHINE_LOONGARCH		/* Machine */
+	.short	section_count				/* NumberOfSections */
+	.long	0 					/* TimeDateStamp */
+	.long	0					/* PointerToSymbolTable */
+	.long	0					/* NumberOfSymbols */
+	.short	section_table - optional_header		/* SizeOfOptionalHeader */
+	.short	IMAGE_FILE_DEBUG_STRIPPED | \
+		IMAGE_FILE_EXECUTABLE_IMAGE | \
+		IMAGE_FILE_LINE_NUMS_STRIPPED		/* Characteristics */
+
+optional_header:
+	.short	PE_OPT_MAGIC_PE32PLUS			/* PE32+ format */
+	.byte	0x02					/* MajorLinkerVersion */
+	.byte	0x14					/* MinorLinkerVersion */
+	.long	_data - efi_header_end			/* SizeOfCode */
+	.long	_end - _data				/* SizeOfInitializedData */
+	.long	0					/* SizeOfUninitializedData */
+	.long	__efistub_efi_pe_entry - _head		/* AddressOfEntryPoint */
+	.long	efi_header_end - _head			/* BaseOfCode */
+
+extra_header_fields:
+	.quad	0					/* ImageBase */
+	.long	PECOFF_SEGMENT_ALIGN			/* SectionAlignment */
+	.long	PECOFF_FILE_ALIGN			/* FileAlignment */
+	.short	0					/* MajorOperatingSystemVersion */
+	.short	0					/* MinorOperatingSystemVersion */
+	.short	0					/* MajorImageVersion */
+	.short	0					/* MinorImageVersion */
+	.short	0					/* MajorSubsystemVersion */
+	.short	0					/* MinorSubsystemVersion */
+	.long	0					/* Win32VersionValue */
+
+	.long	_end - _head				/* SizeOfImage */
+
+	/* Everything before the kernel image is considered part of the header */
+	.long	efi_header_end - _head			/* SizeOfHeaders */
+	.long	0					/* CheckSum */
+	.short	IMAGE_SUBSYSTEM_EFI_APPLICATION		/* Subsystem */
+	.short	0					/* DllCharacteristics */
+	.quad	0					/* SizeOfStackReserve */
+	.quad	0					/* SizeOfStackCommit */
+	.quad	0					/* SizeOfHeapReserve */
+	.quad	0					/* SizeOfHeapCommit */
+	.long	0					/* LoaderFlags */
+	.long	(section_table - .) / 8			/* NumberOfRvaAndSizes */
+
+	.quad	0					/* ExportTable */
+	.quad	0					/* ImportTable */
+	.quad	0					/* ResourceTable */
+	.quad	0					/* ExceptionTable */
+	.quad	0					/* CertificationTable */
+	.quad	0					/* BaseRelocationTable */
+
+	/* Section table */
+section_table:
+	.ascii	".text\0\0\0"
+	.long	_data - efi_header_end			/* VirtualSize */
+	.long	efi_header_end - _head			/* VirtualAddress */
+	.long	_data - efi_header_end			/* SizeOfRawData */
+	.long	efi_header_end - _head			/* PointerToRawData */
+
+	.long	0					/* PointerToRelocations */
+	.long	0					/* PointerToLineNumbers */
+	.short	0					/* NumberOfRelocations */
+	.short	0					/* NumberOfLineNumbers */
+	.long	IMAGE_SCN_CNT_CODE | \
+		IMAGE_SCN_MEM_READ | \
+		IMAGE_SCN_MEM_EXECUTE			/* Characteristics */
+
+	.ascii	".data\0\0\0"
+	.long	_end - _data				/* VirtualSize */
+	.long	_data - _head				/* VirtualAddress */
+	.long	_edata - _data				/* SizeOfRawData */
+	.long	_data - _head				/* PointerToRawData */
+
+	.long	0					/* PointerToRelocations */
+	.long	0					/* PointerToLineNumbers */
+	.short	0					/* NumberOfRelocations */
+	.short	0					/* NumberOfLineNumbers */
+	.long	IMAGE_SCN_CNT_INITIALIZED_DATA | \
+		IMAGE_SCN_MEM_READ | \
+		IMAGE_SCN_MEM_WRITE			/* Characteristics */
+
+	.org 0x20e
+	.word kernel_version - 512 -  _head
+
+	.set	section_count, (. - section_table) / 40
+efi_header_end:
+	.endm
diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
new file mode 100644
index 000000000000..13a8a14a2328
--- /dev/null
+++ b/arch/loongarch/boot/zkernel.S
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <asm/addrspace.h>
+#include <asm/asm.h>
+#include <asm/loongarch.h>
+#include <asm/regdef.h>
+#include <generated/compile.h>
+#include <generated/utsrelease.h>
+
+#ifdef CONFIG_EFI_STUB
+
+#include "zheader.S"
+
+	__HEAD
+
+_head:
+	/* "MZ", MS-DOS header */
+	.word	MZ_MAGIC
+	.org	0x28
+	.ascii	"Loongson\0"
+	.org	0x3c
+	/* Offset to the PE header */
+	.long	pe_header - _head
+
+pe_header:
+	__EFI_PE_HEADER
+
+kernel_asize:
+	.long _end - _text
+
+kernel_fsize:
+	.long _edata - _text
+
+kernel_vaddr:
+	.quad VMLINUZ_LOAD_ADDRESS
+
+kernel_offset:
+	.long kernel_offset - _text
+
+kernel_version:
+	.ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
+
+SYM_L_GLOBAL(kernel_asize)
+SYM_L_GLOBAL(kernel_fsize)
+SYM_L_GLOBAL(kernel_vaddr)
+SYM_L_GLOBAL(kernel_offset)
+
+#endif
+
+	__INIT
+
+SYM_CODE_START(kernel_entry)
+	/* Save boot rom start args */
+	move	s0, a0
+	move	s1, a1
+	move	s2, a2
+	move	s3, a3
+
+	/* Config Direct Mapping */
+	li.d	t0, CSR_DMW0_INIT
+	csrwr	t0, LOONGARCH_CSR_DMWIN0
+	li.d	t0, CSR_DMW1_INIT
+	csrwr	t0, LOONGARCH_CSR_DMWIN1
+
+	/* Clear BSS */
+	la.abs	a0, _edata
+	la.abs	a2, _end
+1:	st.d	zero, a0, 0
+	addi.d	a0, a0, 8
+	bne	a2, a0, 1b
+
+	la.abs	a0, .heap	   /* heap address */
+	la.abs	sp, .stack + 8192  /* stack address */
+
+	la	ra, 2f
+	la	t4, decompress_kernel
+	jirl	zero, t4, 0
+2:
+	move	a0, s0
+	move	a1, s1
+	move	a2, s2
+	move	a3, s3
+	PTR_LI	t4, KERNEL_ENTRY
+	jirl	zero, t4, 0
+3:
+	b	3b
+SYM_CODE_END(kernel_entry)
+
+	.comm .heap, BOOT_HEAP_SIZE, 4
+	.comm .stack, BOOT_STACK_SIZE, 4
+
+	.align 4
+	.section .image, "a", %progbits
+	.incbin "arch/loongarch/boot/vmlinux.bin.z"
diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
new file mode 100644
index 000000000000..8a6181c82a91
--- /dev/null
+++ b/arch/loongarch/tools/Makefile
@@ -0,0 +1,15 @@
+#
+# arch/loongarch/tools/Makefile
+#
+# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+#
+
+hostprogs := elf-entry
+PHONY += elf-entry
+elf-entry: $(obj)/elf-entry
+	@:
+
+hostprogs += calc_vmlinuz_load_addr
+PHONY += calc_vmlinuz_load_addr
+calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
+	@:
diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
new file mode 100644
index 000000000000..5e2ca6b4dff6
--- /dev/null
+++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#include <errno.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/stat.h>
+
+int main(int argc, char *argv[])
+{
+	unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
+	struct stat sb;
+
+	if (argc != 3) {
+		fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
+		return EXIT_FAILURE;
+	}
+
+	if (stat(argv[1], &sb) == -1) {
+		perror("stat");
+		return EXIT_FAILURE;
+	}
+
+	/* Parse the load address as a hexadecimal number */
+	errno = 0;
+	if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
+		if (errno != 0)
+			perror("sscanf");
+		else
+			fprintf(stderr, "No matching characters\n");
+
+		return EXIT_FAILURE;
+	}
+
+	vmlinux_size = (uint64_t)sb.st_size;
+	vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
+
+	/*
+	 * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
+	 * which may be as large as 64KB depending on the kernel configuration.
+	 */
+
+	vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
+
+	printf("0x%llx\n", vmlinuz_load_addr);
+
+	return EXIT_SUCCESS;
+}
diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
new file mode 100644
index 000000000000..c80721e0dee1
--- /dev/null
+++ b/arch/loongarch/tools/elf-entry.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <elf.h>
+#include <inttypes.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+__attribute__((noreturn))
+static void die(const char *msg)
+{
+	fputs(msg, stderr);
+	exit(EXIT_FAILURE);
+}
+
+int main(int argc, const char *argv[])
+{
+	uint64_t entry;
+	size_t nread;
+	FILE *file;
+	union {
+		Elf32_Ehdr ehdr32;
+		Elf64_Ehdr ehdr64;
+	} hdr;
+
+	if (argc != 2)
+		die("Usage: elf-entry <elf-file>\n");
+
+	file = fopen(argv[1], "r");
+	if (!file) {
+		perror("Unable to open input file");
+		return EXIT_FAILURE;
+	}
+
+	nread = fread(&hdr, 1, sizeof(hdr), file);
+	if (nread != sizeof(hdr)) {
+		perror("Unable to read input file");
+		fclose(file);
+		return EXIT_FAILURE;
+	}
+
+	if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
+		fclose(file);
+		die("Input is not an ELF\n");
+	}
+
+	switch (hdr.ehdr32.e_ident[EI_CLASS]) {
+	case ELFCLASS32:
+		/* Sign extend to form a canonical address */
+		entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
+		break;
+
+	case ELFCLASS64:
+		entry = hdr.ehdr64.e_entry;
+		break;
+
+	default:
+		fclose(file);
+		die("Invalid ELF class\n");
+	}
+
+	fclose(file);
+	printf("0x%016" PRIx64 "\n", entry);
+
+	return EXIT_SUCCESS;
+}
-- 
2.27.0
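
Note on the 64KB alignment step in calc_vmlinuz_load_addr.c above: the adjustment is derived from vmlinux_size alone, so the result is 64KB-aligned only when vmlinux_load_addr is itself 64KB-aligned, and an end address that is already aligned is still pushed up by a full 64KB. The sketch below compares that expression with a conventional round-up of the end address. The helper names align_up_64k() and patch_align() are hypothetical and exist only for this illustration; they are not part of the patch.

```c
#include <stdint.h>

#define ALIGN_64K 0x10000ULL

/*
 * Hypothetical helper, not part of the patch: round addr up to the next
 * 64KB boundary and leave already aligned addresses unchanged.
 */
static inline uint64_t align_up_64k(uint64_t addr)
{
	return (addr + ALIGN_64K - 1) & ~(ALIGN_64K - 1);
}

/*
 * The expression used by calc_vmlinuz_load_addr.c, wrapped in a function
 * for comparison: the padding depends on size only, not on load_addr.
 */
static inline uint64_t patch_align(uint64_t load_addr, uint64_t size)
{
	uint64_t end = load_addr + size;

	return end + (ALIGN_64K - size % ALIGN_64K);
}
```

With a load address of 0x1000 and a size of 0x10000, patch_align() yields 0x21000, while align_up_64k(0x11000) yields 0x20000. With a 64KB-aligned load address and an unaligned size the two agree.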



* [PATCH V9 22/24] LoongArch: Add multi-processor (SMP) support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (20 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 23/24] LoongArch: Add Non-Uniform Memory Access (NUMA) support Huacai Chen
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds multi-processor (SMP) support for LoongArch.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
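
Not part of the commit: for readers less familiar with the ordering contract behind __smp_load_acquire() and __smp_store_release() in barrier.h, here is a user-space analogue written with C11 atomics. It is only a sketch of the semantics (a release store publishes earlier writes, and an acquire load that sees the store also sees those writes). It does not use the amor_db/amswap_db instructions from the patch, and struct mailbox, mailbox_publish() and mailbox_consume() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example type: one payload slot guarded by a ready flag. */
struct mailbox {
	int payload;		/* plain data, ordered by the flag below */
	atomic_int ready;	/* 0 = empty, 1 = payload is valid */
};

/* Producer side: write the payload, then publish it with a release store. */
static inline void mailbox_publish(struct mailbox *mb, int value)
{
	mb->payload = value;
	atomic_store_explicit(&mb->ready, 1, memory_order_release);
}

/*
 * Consumer side: an acquire load of the flag; if it reads 1, the payload
 * written before the matching release store is guaranteed to be visible.
 */
static inline bool mailbox_consume(struct mailbox *mb, int *out)
{
	if (!atomic_load_explicit(&mb->ready, memory_order_acquire))
		return false;
	*out = mb->payload;
	return true;
}
```

In kernel code the same pairing would be written with smp_store_release() on the producer side and smp_load_acquire() on the consumer side.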
 arch/loongarch/Kconfig                  |  44 +-
 arch/loongarch/include/asm/atomic.h     |   4 +
 arch/loongarch/include/asm/barrier.h    | 108 ++++
 arch/loongarch/include/asm/cmpxchg.h    |   1 +
 arch/loongarch/include/asm/futex.h      |   1 +
 arch/loongarch/include/asm/hardirq.h    |   2 +
 arch/loongarch/include/asm/irq.h        |   3 +
 arch/loongarch/include/asm/percpu.h     | 202 +++++++
 arch/loongarch/include/asm/pgtable.h    |  21 +
 arch/loongarch/include/asm/smp.h        | 124 ++++
 arch/loongarch/include/asm/stackframe.h |  17 +-
 arch/loongarch/include/asm/tlbflush.h   |  13 +
 arch/loongarch/include/asm/topology.h   |   7 +-
 arch/loongarch/kernel/Makefile          |   2 +
 arch/loongarch/kernel/acpi.c            |  70 ++-
 arch/loongarch/kernel/asm-offsets.c     |   8 +
 arch/loongarch/kernel/cmpxchg.c         |   3 +
 arch/loongarch/kernel/head.S            |  30 +
 arch/loongarch/kernel/irq.c             |  11 +-
 arch/loongarch/kernel/proc.c            |   5 +
 arch/loongarch/kernel/process.c         |   7 +
 arch/loongarch/kernel/reset.c           |  12 +
 arch/loongarch/kernel/setup.c           |  26 +
 arch/loongarch/kernel/smp.c             | 746 ++++++++++++++++++++++++
 arch/loongarch/kernel/topology.c        |  43 +-
 arch/loongarch/kernel/vmlinux.lds.S     |   4 +
 arch/loongarch/mm/tlbex.S               |  69 +++
 include/linux/cpuhotplug.h              |   1 +
 28 files changed, 1573 insertions(+), 11 deletions(-)
 create mode 100644 arch/loongarch/include/asm/smp.h
 create mode 100644 arch/loongarch/kernel/smp.c
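
Also not part of the commit: the atomic.h hunks below place __WEAK_LLSC_MB after the "2:" label that ends the LL/SC loop, so the barrier is reached on every exit from the loop, including (as I read the surrounding code) the early exit that skips the sc. A rough C11 analogue of arch_atomic_sub_if_positive() is sketched here, with a compare-and-swap loop standing in for LL/SC and a sequentially consistent fence standing in for the trailing dbar. sub_if_positive() is a hypothetical user-space helper, and mapping the fence onto "dbar 0" is an assumption made for illustration only.

```c
#include <stdatomic.h>

/*
 * Hypothetical user-space analogue: subtract i from *v only if the result
 * stays non-negative. Returns the computed result (old value minus i) in
 * both cases, so a negative return means nothing was stored.
 */
static inline int sub_if_positive(atomic_int *v, int i)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);
	int result;

	do {
		result = old - i;
		if (result < 0)
			break;	/* early exit: no store is performed */
		/* On failure, old is refreshed and the loop retries. */
	} while (!atomic_compare_exchange_weak_explicit(v, &old, result,
							 memory_order_seq_cst,
							 memory_order_relaxed));

	/* Stand-in for the barrier that both exit paths run through. */
	atomic_thread_fence(memory_order_seq_cst);
	return result;
}
```

The fence after the loop is what keeps the early-exit path ordered in this sketch; without it, only the successful compare-and-swap would carry any ordering.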

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 6c1042746b2d..8479d2d43472 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -66,6 +66,7 @@ config LOONGARCH
 	select GENERIC_LIB_UCMPDI2
 	select GENERIC_PCI_IOMAP
 	select GENERIC_SCHED_CLOCK
+	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
 	select GPIOLIB
 	select HAVE_ARCH_AUDITSYSCALL
@@ -96,7 +97,7 @@ config LOONGARCH
 	select HAVE_RSEQ
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_TIF_NOHZ
-	select HAVE_VIRT_CPU_ACCOUNTING_GEN
+	select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
 	select IRQ_FORCED_THREADING
 	select IRQ_LOONGARCH_CPU
 	select MODULES_USE_ELF_RELA if MODULES
@@ -284,6 +285,47 @@ config EFI_STUB
 	  This kernel feature allows the kernel to be loaded directly by
 	  EFI firmware without the use of a bootloader.
 
+config SMP
+	bool "Multi-Processing support"
+	help
+	  This enables support for systems with more than one CPU. If you have
+	  a system with only one CPU, say N. If you have a system with more
+	  than one CPU, say Y.
+
+	  If you say N here, the kernel will run on uni- and multiprocessor
+	  machines, but will use only one CPU of a multiprocessor machine. If
+	  you say Y here, the kernel will run on many, but not all,
+	  uniprocessor machines. On a uniprocessor machine, the kernel
+	  will run faster if you say N here.
+
+	  People using multiprocessor machines who say Y here should also say
+	  Y to "Enhanced Real Time Clock Support", below.
+
+	  See also the SMP-HOWTO available at
+	  <http://www.tldp.org/docs.html#howto>.
+
+	  If you don't know what to do here, say N.
+
+config HOTPLUG_CPU
+	bool "Support for hot-pluggable CPUs"
+	depends on SMP
+	select GENERIC_IRQ_MIGRATION
+	help
+	  Say Y here to allow turning CPUs off and on. CPUs can be
+	  controlled through /sys/devices/system/cpu.
+	  (Note: power management support will enable this option
+	    automatically on SMP systems.)
+	  Say N if you want to disable CPU hotplug.
+
+config NR_CPUS
+	int "Maximum number of CPUs (2-256)"
+	range 2 256
+	depends on SMP
+	default "16"
+	help
+	  This allows you to specify the maximum number of CPUs which this
+	  kernel will support.
+
 config FORCE_MAX_ZONEORDER
 	int "Maximum zone order"
 	range 14 64 if PAGE_SIZE_64KB
diff --git a/arch/loongarch/include/asm/atomic.h b/arch/loongarch/include/asm/atomic.h
index f0ed7f9c08c9..3efd24e38917 100644
--- a/arch/loongarch/include/asm/atomic.h
+++ b/arch/loongarch/include/asm/atomic.h
@@ -162,6 +162,7 @@ static inline int arch_atomic_sub_if_positive(int i, atomic_t *v)
 		"	sc.w	%1, %2					\n"
 		"	beq	$zero, %1, 1b				\n"
 		"2:							\n"
+		__WEAK_LLSC_MB
 		: "=&r" (result), "=&r" (temp),
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)
 		: "I" (-i));
@@ -174,6 +175,7 @@ static inline int arch_atomic_sub_if_positive(int i, atomic_t *v)
 		"	sc.w	%1, %2					\n"
 		"	beq	$zero, %1, 1b				\n"
 		"2:							\n"
+		__WEAK_LLSC_MB
 		: "=&r" (result), "=&r" (temp),
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)
 		: "r" (i));
@@ -323,6 +325,7 @@ static inline long arch_atomic64_sub_if_positive(long i, atomic64_t *v)
 		"	sc.d	%1, %2					\n"
 		"	beq	%1, $zero, 1b				\n"
 		"2:							\n"
+		__WEAK_LLSC_MB
 		: "=&r" (result), "=&r" (temp),
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)
 		: "I" (-i));
@@ -335,6 +338,7 @@ static inline long arch_atomic64_sub_if_positive(long i, atomic64_t *v)
 		"	sc.d	%1, %2					\n"
 		"	beq	%1, $zero, 1b				\n"
 		"2:							\n"
+		__WEAK_LLSC_MB
 		: "=&r" (result), "=&r" (temp),
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)
 		: "r" (i));
diff --git a/arch/loongarch/include/asm/barrier.h b/arch/loongarch/include/asm/barrier.h
index cc6c7e3f5ce6..6c567c750d04 100644
--- a/arch/loongarch/include/asm/barrier.h
+++ b/arch/loongarch/include/asm/barrier.h
@@ -18,6 +18,19 @@
 #define mb()		fast_mb()
 #define iob()		fast_iob()
 
+#define __smp_mb()	__asm__ __volatile__("dbar 0" : : : "memory")
+#define __smp_rmb()	__asm__ __volatile__("dbar 0" : : : "memory")
+#define __smp_wmb()	__asm__ __volatile__("dbar 0" : : : "memory")
+
+#ifdef CONFIG_SMP
+#define __WEAK_LLSC_MB		"	dbar 0  \n"
+#else
+#define __WEAK_LLSC_MB		"		\n"
+#endif
+
+#define __smp_mb__before_atomic()	barrier()
+#define __smp_mb__after_atomic()	barrier()
+
 /**
  * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
  * @index: array element index
@@ -46,6 +59,101 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 	return mask;
 }
 
+#define __smp_load_acquire(p)							\
+({										\
+	union { typeof(*p) __val; char __c[1]; } __u;				\
+	unsigned long __tmp = 0;							\
+	compiletime_assert_atomic_type(*p);					\
+	switch (sizeof(*p)) {							\
+	case 1:									\
+		*(__u8 *)__u.__c = *(volatile __u8 *)p;				\
+		__smp_mb();							\
+		break;								\
+	case 2:									\
+		*(__u16 *)__u.__c = *(volatile __u16 *)p;			\
+		__smp_mb();							\
+		break;								\
+	case 4:									\
+		__asm__ __volatile__(						\
+		"amor_db.w %[val], %[tmp], %[mem]	\n"				\
+		: [val] "=&r" (*(__u32 *)__u.__c)				\
+		: [mem] "ZB" (*(u32 *) p), [tmp] "r" (__tmp)			\
+		: "memory");							\
+		break;								\
+	case 8:									\
+		__asm__ __volatile__(						\
+		"amor_db.d %[val], %[tmp], %[mem]	\n"				\
+		: [val] "=&r" (*(__u64 *)__u.__c)				\
+		: [mem] "ZB" (*(u64 *) p), [tmp] "r" (__tmp)			\
+		: "memory");							\
+		break;								\
+	}									\
+	(typeof(*p))__u.__val;								\
+})
+
+#define __smp_store_release(p, v)						\
+do {										\
+	union { typeof(*p) __val; char __c[1]; } __u =				\
+		{ .__val = (__force typeof(*p)) (v) };				\
+	unsigned long __tmp;							\
+	compiletime_assert_atomic_type(*p);					\
+	switch (sizeof(*p)) {							\
+	case 1:									\
+		__smp_mb();							\
+		*(volatile __u8 *)p = *(__u8 *)__u.__c;				\
+		break;								\
+	case 2:									\
+		__smp_mb();							\
+		*(volatile __u16 *)p = *(__u16 *)__u.__c;			\
+		break;								\
+	case 4:									\
+		__asm__ __volatile__(						\
+		"amswap_db.w %[tmp], %[val], %[mem]	\n"			\
+		: [mem] "+ZB" (*(u32 *)p), [tmp] "=&r" (__tmp)			\
+		: [val] "r" (*(__u32 *)__u.__c)					\
+		: );								\
+		break;								\
+	case 8:									\
+		__asm__ __volatile__(						\
+		"amswap_db.d %[tmp], %[val], %[mem]	\n"			\
+		: [mem] "+ZB" (*(u64 *)p), [tmp] "=&r" (__tmp)			\
+		: [val] "r" (*(__u64 *)__u.__c)					\
+		: );								\
+		break;								\
+	}									\
+} while (0)
+
+#define __smp_store_mb(p, v)							\
+do {										\
+	union { typeof(p) __val; char __c[1]; } __u =				\
+		{ .__val = (__force typeof(p)) (v) };				\
+	unsigned long __tmp;							\
+	switch (sizeof(p)) {							\
+	case 1:									\
+		*(volatile __u8 *)&p = *(__u8 *)__u.__c;			\
+		__smp_mb();							\
+		break;								\
+	case 2:									\
+		*(volatile __u16 *)&p = *(__u16 *)__u.__c;			\
+		__smp_mb();							\
+		break;								\
+	case 4:									\
+		__asm__ __volatile__(						\
+		"amswap_db.w %[tmp], %[val], %[mem]	\n"			\
+		: [mem] "+ZB" (*(u32 *)&p), [tmp] "=&r" (__tmp)			\
+		: [val] "r" (*(__u32 *)__u.__c)					\
+		: );								\
+		break;								\
+	case 8:									\
+		__asm__ __volatile__(						\
+		"amswap_db.d %[tmp], %[val], %[mem]	\n"			\
+		: [mem] "+ZB" (*(u64 *)&p), [tmp] "=&r" (__tmp)			\
+		: [val] "r" (*(__u64 *)__u.__c)					\
+		: );								\
+		break;								\
+	}									\
+} while (0)
+
 #include <asm-generic/barrier.h>
 
 #endif /* __ASM_BARRIER_H */
diff --git a/arch/loongarch/include/asm/cmpxchg.h b/arch/loongarch/include/asm/cmpxchg.h
index 69c3e2b7827d..d636e81269b3 100644
--- a/arch/loongarch/include/asm/cmpxchg.h
+++ b/arch/loongarch/include/asm/cmpxchg.h
@@ -65,6 +65,7 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
 	"	" st "	$t0, %1				\n"		\
 	"	beq	$zero, $t0, 1b			\n"		\
 	"2:						\n"		\
+	__WEAK_LLSC_MB							\
 	: "=&r" (__ret), "=ZB"(*m)					\
 	: "ZB"(*m), "Jr" (old), "Jr" (new)				\
 	: "t0", "memory");						\
diff --git a/arch/loongarch/include/asm/futex.h b/arch/loongarch/include/asm/futex.h
index b27d55f92db7..9de8231694ec 100644
--- a/arch/loongarch/include/asm/futex.h
+++ b/arch/loongarch/include/asm/futex.h
@@ -86,6 +86,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newv
 	"2:	sc.w	$t0, %2					\n"
 	"	beq	$zero, $t0, 1b				\n"
 	"3:							\n"
+	__WEAK_LLSC_MB
 	"	.section .fixup,\"ax\"				\n"
 	"4:	li.d	%0, %6					\n"
 	"	b	3b					\n"
diff --git a/arch/loongarch/include/asm/hardirq.h b/arch/loongarch/include/asm/hardirq.h
index d32f83938880..befe8184aa08 100644
--- a/arch/loongarch/include/asm/hardirq.h
+++ b/arch/loongarch/include/asm/hardirq.h
@@ -21,4 +21,6 @@ typedef struct {
 
 DECLARE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat);
 
+#define __ARCH_IRQ_STAT
+
 #endif /* _ASM_HARDIRQ_H */
diff --git a/arch/loongarch/include/asm/irq.h b/arch/loongarch/include/asm/irq.h
index cd95d0d4e10f..ca3a68767b54 100644
--- a/arch/loongarch/include/asm/irq.h
+++ b/arch/loongarch/include/asm/irq.h
@@ -125,6 +125,9 @@ extern struct irq_domain *pch_lpc_domain;
 extern struct irq_domain *pch_msi_domain[MAX_IO_PICS];
 extern struct irq_domain *pch_pic_domain[MAX_IO_PICS];
 
+extern void fixup_irqs(void);
+extern irqreturn_t loongson3_ipi_interrupt(int irq, void *dev);
+
 #include <asm-generic/irq.h>
 
 #endif /* _ASM_IRQ_H */
diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
index 7d5b22ebd834..28ab7771aefd 100644
--- a/arch/loongarch/include/asm/percpu.h
+++ b/arch/loongarch/include/asm/percpu.h
@@ -5,6 +5,8 @@
 #ifndef __ASM_PERCPU_H
 #define __ASM_PERCPU_H
 
+#include <asm/cmpxchg.h>
+
 /* Use r21 for fast access */
 register unsigned long __my_cpu_offset __asm__("$r21");
 
@@ -15,6 +17,206 @@ static inline void set_my_cpu_offset(unsigned long off)
 }
 #define __my_cpu_offset __my_cpu_offset
 
+#define PERCPU_OP(op, asm_op, c_op)					\
+static inline unsigned long __percpu_##op(void *ptr,			\
+			unsigned long val, int size)			\
+{									\
+	unsigned long ret;						\
+									\
+	switch (size) {							\
+	case 4:								\
+		__asm__ __volatile__(					\
+		"am"#asm_op".w"	" %[ret], %[val], %[ptr]	\n"		\
+		: [ret] "=&r" (ret), [ptr] "+ZB"(*(u32 *)ptr)		\
+		: [val] "r" (val));					\
+		break;							\
+	case 8:								\
+		__asm__ __volatile__(					\
+		"am"#asm_op".d" " %[ret], %[val], %[ptr]	\n"		\
+		: [ret] "=&r" (ret), [ptr] "+ZB"(*(u64 *)ptr)		\
+		: [val] "r" (val));					\
+		break;							\
+	default:							\
+		ret = 0;						\
+		BUILD_BUG();						\
+	}								\
+									\
+	return ret c_op val;						\
+}
+
+PERCPU_OP(add, add, +)
+PERCPU_OP(and, and, &)
+PERCPU_OP(or, or, |)
+#undef PERCPU_OP
+
+static inline unsigned long __percpu_read(void *ptr, int size)
+{
+	unsigned long ret;
+
+	switch (size) {
+	case 1:
+		__asm__ __volatile__ ("ldx.b %[ret], $r21, %[ptr]	\n"
+		: [ret] "=&r"(ret)
+		: [ptr] "r"(ptr)
+		: "memory");
+		break;
+	case 2:
+		__asm__ __volatile__ ("ldx.h %[ret], $r21, %[ptr]	\n"
+		: [ret] "=&r"(ret)
+		: [ptr] "r"(ptr)
+		: "memory");
+		break;
+	case 4:
+		__asm__ __volatile__ ("ldx.w %[ret], $r21, %[ptr]	\n"
+		: [ret] "=&r"(ret)
+		: [ptr] "r"(ptr)
+		: "memory");
+		break;
+	case 8:
+		__asm__ __volatile__ ("ldx.d %[ret], $r21, %[ptr]	\n"
+		: [ret] "=&r"(ret)
+		: [ptr] "r"(ptr)
+		: "memory");
+		break;
+	default:
+		ret = 0;
+		BUILD_BUG();
+	}
+
+	return ret;
+}
+
+static inline void __percpu_write(void *ptr, unsigned long val, int size)
+{
+	switch (size) {
+	case 1:
+		__asm__ __volatile__("stx.b %[val], $r21, %[ptr]	\n"
+		:
+		: [val] "r" (val), [ptr] "r" (ptr)
+		: "memory");
+		break;
+	case 2:
+		__asm__ __volatile__("stx.h %[val], $r21, %[ptr]	\n"
+		:
+		: [val] "r" (val), [ptr] "r" (ptr)
+		: "memory");
+		break;
+	case 4:
+		__asm__ __volatile__("stx.w %[val], $r21, %[ptr]	\n"
+		:
+		: [val] "r" (val), [ptr] "r" (ptr)
+		: "memory");
+		break;
+	case 8:
+		__asm__ __volatile__("stx.d %[val], $r21, %[ptr]	\n"
+		:
+		: [val] "r" (val), [ptr] "r" (ptr)
+		: "memory");
+		break;
+	default:
+		BUILD_BUG();
+	}
+}
+
+static inline unsigned long __percpu_xchg(void *ptr, unsigned long val,
+						int size)
+{
+	switch (size) {
+	case 1:
+	case 2:
+		return __xchg_small((volatile void *)ptr, val, size);
+
+	case 4:
+		return __xchg_asm("amswap.w", (volatile u32 *)ptr, (u32)val);
+
+	case 8:
+		return __xchg_asm("amswap.d", (volatile u64 *)ptr, (u64)val);
+
+	default:
+		BUILD_BUG();
+	}
+
+	return 0;
+}
+
+/* this_cpu_cmpxchg */
+#define _protect_cmpxchg_local(pcp, o, n)			\
+({								\
+	typeof(*raw_cpu_ptr(&(pcp))) __ret;			\
+	preempt_disable_notrace();				\
+	__ret = cmpxchg_local(raw_cpu_ptr(&(pcp)), o, n);	\
+	preempt_enable_notrace();				\
+	__ret;							\
+})
+
+#define _percpu_read(pcp)						\
+({									\
+	typeof(pcp) __retval;						\
+	__retval = (typeof(pcp))__percpu_read(&(pcp), sizeof(pcp));	\
+	__retval;							\
+})
+
+#define _percpu_write(pcp, val)						\
+do {									\
+	__percpu_write(&(pcp), (unsigned long)(val), sizeof(pcp));	\
+} while (0)
+
+#define _pcp_protect(operation, pcp, val)			\
+({								\
+	typeof(pcp) __retval;					\
+	preempt_disable_notrace();				\
+	__retval = (typeof(pcp))operation(raw_cpu_ptr(&(pcp)),	\
+					  (val), sizeof(pcp));	\
+	preempt_enable_notrace();				\
+	__retval;						\
+})
+
+#define _percpu_add(pcp, val) \
+	_pcp_protect(__percpu_add, pcp, val)
+
+#define _percpu_add_return(pcp, val) _percpu_add(pcp, val)
+
+#define _percpu_and(pcp, val) \
+	_pcp_protect(__percpu_and, pcp, val)
+
+#define _percpu_or(pcp, val) \
+	_pcp_protect(__percpu_or, pcp, val)
+
+#define _percpu_xchg(pcp, val) ((typeof(pcp)) \
+	_pcp_protect(__percpu_xchg, pcp, (unsigned long)(val)))
+
+#define this_cpu_add_4(pcp, val) _percpu_add(pcp, val)
+#define this_cpu_add_8(pcp, val) _percpu_add(pcp, val)
+
+#define this_cpu_add_return_4(pcp, val) _percpu_add_return(pcp, val)
+#define this_cpu_add_return_8(pcp, val) _percpu_add_return(pcp, val)
+
+#define this_cpu_and_4(pcp, val) _percpu_and(pcp, val)
+#define this_cpu_and_8(pcp, val) _percpu_and(pcp, val)
+
+#define this_cpu_or_4(pcp, val) _percpu_or(pcp, val)
+#define this_cpu_or_8(pcp, val) _percpu_or(pcp, val)
+
+#define this_cpu_read_1(pcp) _percpu_read(pcp)
+#define this_cpu_read_2(pcp) _percpu_read(pcp)
+#define this_cpu_read_4(pcp) _percpu_read(pcp)
+#define this_cpu_read_8(pcp) _percpu_read(pcp)
+
+#define this_cpu_write_1(pcp, val) _percpu_write(pcp, val)
+#define this_cpu_write_2(pcp, val) _percpu_write(pcp, val)
+#define this_cpu_write_4(pcp, val) _percpu_write(pcp, val)
+#define this_cpu_write_8(pcp, val) _percpu_write(pcp, val)
+
+#define this_cpu_xchg_1(pcp, val) _percpu_xchg(pcp, val)
+#define this_cpu_xchg_2(pcp, val) _percpu_xchg(pcp, val)
+#define this_cpu_xchg_4(pcp, val) _percpu_xchg(pcp, val)
+#define this_cpu_xchg_8(pcp, val) _percpu_xchg(pcp, val)
+
+#define this_cpu_cmpxchg_1(ptr, o, n) _protect_cmpxchg_local(ptr, o, n)
+#define this_cpu_cmpxchg_2(ptr, o, n) _protect_cmpxchg_local(ptr, o, n)
+#define this_cpu_cmpxchg_4(ptr, o, n) _protect_cmpxchg_local(ptr, o, n)
+#define this_cpu_cmpxchg_8(ptr, o, n) _protect_cmpxchg_local(ptr, o, n)
+
 #include <asm-generic/percpu.h>
 
 #endif /* __ASM_PERCPU_H */
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index ae8f3ef61091..dc3134a4a13c 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -286,8 +286,29 @@ static inline void set_pte(pte_t *ptep, pte_t pteval)
 		 * Make sure the buddy is global too (if it's !none,
 		 * it better already be global)
 		 */
+#ifdef CONFIG_SMP
+		/*
+		 * For SMP, multiple CPUs can race, so we need to do
+		 * this atomically.
+		 */
+		unsigned long page_global = _PAGE_GLOBAL;
+		unsigned long tmp;
+
+		__asm__ __volatile__ (
+		"1:"	__LL	"%[tmp], %[buddy]		\n"
+		"	bnez	%[tmp], 2f			\n"
+		"	 or	%[tmp], %[tmp], %[global]	\n"
+			__SC	"%[tmp], %[buddy]		\n"
+		"	beqz	%[tmp], 1b			\n"
+		"	nop					\n"
+		"2:						\n"
+		__WEAK_LLSC_MB
+		: [buddy] "+m" (buddy->pte), [tmp] "=&r" (tmp)
+		: [global] "r" (page_global));
+#else /* !CONFIG_SMP */
 		if (pte_none(*buddy))
 			pte_val(*buddy) = pte_val(*buddy) | _PAGE_GLOBAL;
+#endif /* CONFIG_SMP */
 	}
 }
 
diff --git a/arch/loongarch/include/asm/smp.h b/arch/loongarch/include/asm/smp.h
new file mode 100644
index 000000000000..551e1f37c705
--- /dev/null
+++ b/arch/loongarch/include/asm/smp.h
@@ -0,0 +1,124 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen <chenhuacai@loongson.cn>
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_SMP_H
+#define __ASM_SMP_H
+
+#include <linux/atomic.h>
+#include <linux/bitops.h>
+#include <linux/linkage.h>
+#include <linux/smp.h>
+#include <linux/threads.h>
+#include <linux/cpumask.h>
+
+void loongson3_smp_setup(void);
+void loongson3_prepare_cpus(unsigned int max_cpus);
+void loongson3_boot_secondary(int cpu, struct task_struct *idle);
+void loongson3_init_secondary(void);
+void loongson3_smp_finish(void);
+void loongson3_send_ipi_single(int cpu, unsigned int action);
+void loongson3_send_ipi_mask(const struct cpumask *mask, unsigned int action);
+#ifdef CONFIG_HOTPLUG_CPU
+int loongson3_cpu_disable(void);
+void loongson3_cpu_die(unsigned int cpu);
+#endif
+
+#ifdef CONFIG_SMP
+
+static inline void plat_smp_setup(void)
+{
+	loongson3_smp_setup();
+}
+
+#else /* !CONFIG_SMP */
+
+static inline void plat_smp_setup(void) { }
+
+#endif /* !CONFIG_SMP */
+
+extern int smp_num_siblings;
+extern int num_processors;
+extern int disabled_cpus;
+extern cpumask_t cpu_sibling_map[];
+extern cpumask_t cpu_core_map[];
+extern cpumask_t cpu_foreign_map[];
+
+static inline int raw_smp_processor_id(void)
+{
+#if defined(__VDSO__)
+	extern int vdso_smp_processor_id(void)
+		__compiletime_error("VDSO should not call smp_processor_id()");
+	return vdso_smp_processor_id();
+#else
+	return current_thread_info()->cpu;
+#endif
+}
+#define raw_smp_processor_id raw_smp_processor_id
+
+/* Map from physical cpu id to sequential logical cpu number.  This is
+ * not an identity mapping only when cpus failed to come on-line.	*/
+extern int __cpu_number_map[NR_CPUS];
+#define cpu_number_map(cpu)  __cpu_number_map[cpu]
+
+/* The reverse map from sequential logical cpu number to cpu id.  */
+extern int __cpu_logical_map[NR_CPUS];
+#define cpu_logical_map(cpu)  __cpu_logical_map[cpu]
+
+#define cpu_physical_id(cpu)	cpu_logical_map(cpu)
+
+#define SMP_BOOT_CPU		0x1
+#define SMP_RESCHEDULE		0x2
+#define SMP_CALL_FUNCTION	0x4
+
+struct secondary_data {
+	unsigned long stack;
+	unsigned long thread_info;
+};
+extern struct secondary_data cpuboot_data;
+
+extern asmlinkage void smpboot_entry(void);
+
+extern void calculate_cpu_foreign_map(void);
+
+/*
+ * Generate IPI list text
+ */
+extern void show_ipi_list(struct seq_file *p, int prec);
+
+/*
+ * This function sends a 'reschedule' IPI to another CPU.
+ * It goes straight through and wastes no time serializing
+ * anything. Worst case is that we lose a reschedule ...
+ */
+static inline void smp_send_reschedule(int cpu)
+{
+	loongson3_send_ipi_single(cpu, SMP_RESCHEDULE);
+}
+
+static inline void arch_send_call_function_single_ipi(int cpu)
+{
+	loongson3_send_ipi_single(cpu, SMP_CALL_FUNCTION);
+}
+
+static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask)
+{
+	loongson3_send_ipi_mask(mask, SMP_CALL_FUNCTION);
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+static inline int __cpu_disable(void)
+{
+	return loongson3_cpu_disable();
+}
+
+static inline void __cpu_die(unsigned int cpu)
+{
+	loongson3_cpu_die(cpu);
+}
+
+extern void play_dead(void);
+#endif
+
+#endif /* __ASM_SMP_H */
diff --git a/arch/loongarch/include/asm/stackframe.h b/arch/loongarch/include/asm/stackframe.h
index fed198fbd51d..7bf2f6091f47 100644
--- a/arch/loongarch/include/asm/stackframe.h
+++ b/arch/loongarch/include/asm/stackframe.h
@@ -77,17 +77,24 @@
  * new value in sp.
  */
 	.macro	get_saved_sp docfi=0
-	la.abs	t1, kernelsp
-	move	t0, sp
+	la.abs	  t1, kernelsp
+#ifdef CONFIG_SMP
+	csrrd	  t0, PERCPU_BASE_KS
+	LONG_ADDU t1, t1, t0
+#endif
+	move	  t0, sp
 	.if \docfi
 	.cfi_register sp, t0
 	.endif
-	LONG_L	sp, t1, 0
+	LONG_L	  sp, t1, 0
 	.endm
 
 	.macro	set_saved_sp stackp temp temp2
-	la.abs	\temp, kernelsp
-	LONG_S	\stackp, \temp, 0
+	la.abs	  \temp, kernelsp
+#ifdef CONFIG_SMP
+	LONG_ADDU \temp, \temp, u0
+#endif
+	LONG_S	  \stackp, \temp, 0
 	.endm
 
 	.macro	SAVE_SOME docfi=0
diff --git a/arch/loongarch/include/asm/tlbflush.h b/arch/loongarch/include/asm/tlbflush.h
index 36bd6d11dc2d..a0785e590681 100644
--- a/arch/loongarch/include/asm/tlbflush.h
+++ b/arch/loongarch/include/asm/tlbflush.h
@@ -25,6 +25,17 @@ extern void local_flush_tlb_kernel_range(unsigned long start, unsigned long end)
 extern void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void local_flush_tlb_one(unsigned long vaddr);
 
+#ifdef CONFIG_SMP
+
+extern void flush_tlb_all(void);
+extern void flush_tlb_mm(struct mm_struct *);
+extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long, unsigned long);
+extern void flush_tlb_kernel_range(unsigned long, unsigned long);
+extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
+extern void flush_tlb_one(unsigned long vaddr);
+
+#else /* CONFIG_SMP */
+
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
 #define flush_tlb_range(vma, vmaddr, end)	local_flush_tlb_range(vma, vmaddr, end)
@@ -32,4 +43,6 @@ extern void local_flush_tlb_one(unsigned long vaddr);
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
 #define flush_tlb_one(vaddr)		local_flush_tlb_one(vaddr)
 
+#endif /* CONFIG_SMP */
+
 #endif /* __ASM_TLBFLUSH_H */
diff --git a/arch/loongarch/include/asm/topology.h b/arch/loongarch/include/asm/topology.h
index 9ac71a25207a..9314d7a3998c 100644
--- a/arch/loongarch/include/asm/topology.h
+++ b/arch/loongarch/include/asm/topology.h
@@ -7,7 +7,12 @@
 
 #include <linux/smp.h>
 
-#define cpu_logical_map(cpu)  0
+#ifdef CONFIG_SMP
+#define topology_physical_package_id(cpu)	(cpu_data[cpu].package)
+#define topology_core_id(cpu)			(cpu_core(&cpu_data[cpu]))
+#define topology_core_cpumask(cpu)		(&cpu_core_map[cpu])
+#define topology_sibling_cpumask(cpu)		(&cpu_sibling_map[cpu])
+#endif
 
 #include <asm-generic/topology.h>
 
diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
index ead27a11e8e0..5b17e1e3d6f5 100644
--- a/arch/loongarch/kernel/Makefile
+++ b/arch/loongarch/kernel/Makefile
@@ -19,4 +19,6 @@ obj-$(CONFIG_MODULES)		+= module.o module-sections.o
 
 obj-$(CONFIG_PROC_FS)		+= proc.o
 
+obj-$(CONFIG_SMP)		+= smp.o
+
 CPPFLAGS_vmlinux.lds		:= $(KBUILD_CFLAGS)
diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
index 506ab9912c51..0c7f2d1077a1 100644
--- a/arch/loongarch/kernel/acpi.c
+++ b/arch/loongarch/kernel/acpi.c
@@ -139,6 +139,35 @@ void __init acpi_boot_table_init(void)
 	}
 }
 
+static int set_processor_mask(u32 id, u32 flags)
+{
+
+	int cpu, cpuid = id;
+
+	if (num_processors >= nr_cpu_ids) {
+		pr_warn("acpi: nr_cpus/possible_cpus limit of %i reached."
+			" processor 0x%x ignored.\n", nr_cpu_ids, cpuid);
+
+		return -ENODEV;
+
+	}
+	if (cpuid == loongson_sysconf.boot_cpu_id)
+		cpu = 0;
+	else
+		cpu = cpumask_next_zero(-1, cpu_present_mask);
+
+	if (flags & ACPI_MADT_ENABLED) {
+		num_processors++;
+		set_cpu_possible(cpu, true);
+		set_cpu_present(cpu, true);
+		__cpu_number_map[cpuid] = cpu;
+		__cpu_logical_map[cpu] = cpuid;
+	} else
+		disabled_cpus++;
+
+	return cpu;
+}
+
 static int __init
 acpi_parse_cpuintc(union acpi_subtable_headers *header, const unsigned long end)
 {
@@ -149,6 +178,7 @@ acpi_parse_cpuintc(union acpi_subtable_headers *header, const unsigned long end)
 		return -EINVAL;
 
 	acpi_table_print_madt_entry(&header->common);
+	set_processor_mask(processor->core_id, processor->flags);
 
 	return 0;
 }
@@ -250,7 +280,12 @@ acpi_parse_pch_lpc(union acpi_subtable_headers *header, const unsigned long end)
 
 static void __init acpi_process_madt(void)
 {
-	int error;
+	int i, error;
+
+	for (i = 0; i < NR_CPUS; i++) {
+		__cpu_number_map[i] = -1;
+		__cpu_logical_map[i] = -1;
+	}
 
 	/* Parse MADT CPUINTC entries */
 	error = acpi_table_parse_madt(ACPI_MADT_TYPE_CORE_PIC, acpi_parse_cpuintc, MAX_CORE_PIC);
@@ -336,3 +371,36 @@ void __init arch_reserve_mem_area(acpi_physical_address addr, size_t size)
 {
 	memblock_reserve(addr, size);
 }
+
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+
+#include <acpi/processor.h>
+
+int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id, int *pcpu)
+{
+	int cpu;
+
+	cpu = set_processor_mask(physid, ACPI_MADT_ENABLED);
+	if (cpu < 0) {
+		pr_info(PREFIX "Unable to map CPU to logical cpu number\n");
+		return cpu;
+	}
+
+	*pcpu = cpu;
+
+	return 0;
+}
+EXPORT_SYMBOL(acpi_map_cpu);
+
+int acpi_unmap_cpu(int cpu)
+{
+	set_cpu_present(cpu, false);
+	num_processors--;
+
+	pr_info("cpu%d hot remove!\n", cpu);
+
+	return 0;
+}
+EXPORT_SYMBOL(acpi_unmap_cpu);
+
+#endif /* CONFIG_ACPI_HOTPLUG_CPU */
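Note on set_processor_mask() above: it keeps two tables in step, one from physical core id to logical CPU number and one for the reverse direction, with the boot CPU pinned to logical 0. The following user-space sketch illustrates that bookkeeping under simplifying assumptions: it hands out logical numbers from a counter instead of scanning cpu_present_mask, and all demo_* names and DEMO_NR_CPUS are hypothetical.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 8	/* hypothetical table size, stands in for NR_CPUS */

static int demo_number_map[DEMO_NR_CPUS];	/* physical id -> logical number */
static int demo_logical_map[DEMO_NR_CPUS];	/* logical number -> physical id */
static int demo_next_logical;

/* Mark every slot unused with -1, as acpi_process_madt() does before parsing. */
static void demo_maps_init(void)
{
	size_t i;

	for (i = 0; i < DEMO_NR_CPUS; i++) {
		demo_number_map[i] = -1;
		demo_logical_map[i] = -1;
	}
	demo_next_logical = 1;	/* slot 0 is reserved for the boot CPU */
}

/*
 * Record one enabled CPU. The boot CPU always becomes logical 0; every
 * other CPU takes the next free logical number. Returns the logical
 * number, or -1 if the id is out of range or the table is full.
 */
static int demo_register_cpu(int physid, int boot_physid)
{
	int cpu;

	if (physid < 0 || physid >= DEMO_NR_CPUS)
		return -1;

	if (physid == boot_physid) {
		cpu = 0;
	} else {
		if (demo_next_logical >= DEMO_NR_CPUS)
			return -1;
		cpu = demo_next_logical++;
	}

	demo_number_map[physid] = cpu;
	demo_logical_map[cpu] = physid;
	return cpu;
}
```

With physical id 3 as the boot CPU, registering ids 3, 0 and 1 in that order yields logical numbers 0, 1 and 2, and the two tables stay inverse to each other for every registered entry.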
diff --git a/arch/loongarch/kernel/asm-offsets.c b/arch/loongarch/kernel/asm-offsets.c
index 3531e3c60a6e..7cad66c69e58 100644
--- a/arch/loongarch/kernel/asm-offsets.c
+++ b/arch/loongarch/kernel/asm-offsets.c
@@ -252,3 +252,11 @@ void output_signal_defines(void)
 	DEFINE(_SIGXFSZ, SIGXFSZ);
 	BLANK();
 }
+
+void output_smpboot_defines(void)
+{
+	COMMENT("Linux smp cpu boot offsets.");
+	OFFSET(CPU_BOOT_STACK, secondary_data, stack);
+	OFFSET(CPU_BOOT_TINFO, secondary_data, thread_info);
+	BLANK();
+}
diff --git a/arch/loongarch/kernel/cmpxchg.c b/arch/loongarch/kernel/cmpxchg.c
index 7994489adc79..4c83471c4e47 100644
--- a/arch/loongarch/kernel/cmpxchg.c
+++ b/arch/loongarch/kernel/cmpxchg.c
@@ -93,7 +93,10 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
 	"	or		%1, %1, %6	\n"
 	"	sc.w		%1, %2		\n"
 	"	beqz		%1, 1b		\n"
+	"	b		3f		\n"
 	"2:					\n"
+	__WEAK_LLSC_MB
+	"3:					\n"
 	: "=&r" (old32), "=&r" (temp), "=" GCC_OFF_SMALL_ASM() (*ptr32)
 	: GCC_OFF_SMALL_ASM() (*ptr32), "Jr" (mask), "Jr" (old), "Jr" (new)
 	: "memory");
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
index 361b72e8bfc5..759d7e391231 100644
--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -111,4 +111,34 @@ SYM_CODE_START(kernel_entry)			# kernel entry point
 
 SYM_CODE_END(kernel_entry)
 
+#ifdef CONFIG_SMP
+
+/*
+ * SMP secondary cpus entry point.  The boot cpu passes this address via the
+ * mailbox; sp and tp are then loaded from cpuboot_data below.
+ */
+SYM_CODE_START(smpboot_entry)
+	li.d		t0, CSR_DMW0_INIT	# UC, PLV0
+	csrwr		t0, LOONGARCH_CSR_DMWIN0
+	li.d		t0, CSR_DMW1_INIT	# CA, PLV0
+	csrwr		t0, LOONGARCH_CSR_DMWIN1
+	li.w		t0, 0xb0		# PLV=0, IE=0, PG=1
+	csrwr		t0, LOONGARCH_CSR_CRMD
+	li.w		t0, 0x04		# PLV=0, PIE=1, PWE=0
+	csrwr		t0, LOONGARCH_CSR_PRMD
+	li.w		t0, 0x00		# FPE=0, SXE=0, ASXE=0, BTE=0
+	csrwr		t0, LOONGARCH_CSR_EUEN
+
+	la.abs		t0, cpuboot_data
+	ld.d		sp, t0, CPU_BOOT_STACK
+	ld.d		tp, t0, CPU_BOOT_TINFO
+
+	la.abs		t0, 0f
+	jirl		zero, t0, 0
+0:
+	bl		start_secondary
+SYM_CODE_END(smpboot_entry)
+
+#endif /* CONFIG_SMP */
+
 SYM_ENTRY(kernel_entry_end, SYM_L_GLOBAL, SYM_A_NONE)
diff --git a/arch/loongarch/kernel/irq.c b/arch/loongarch/kernel/irq.c
index 48032ffd9331..fd0de65f7911 100644
--- a/arch/loongarch/kernel/irq.c
+++ b/arch/loongarch/kernel/irq.c
@@ -73,6 +73,7 @@ asmlinkage void spurious_interrupt(void)
 
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
+	show_ipi_list(p, prec);
 	seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count));
 	return 0;
 }
@@ -108,13 +109,21 @@ void __init setup_IRQ(void)
 
 void __init init_IRQ(void)
 {
-	int i;
+	int i, r, ipi_irq;
+	static int ipi_dummy_dev;
 	unsigned int order = get_order(IRQ_STACK_SIZE);
 
 	clear_csr_ecfg(ECFG0_IM);
 	clear_csr_estat(ESTATF_IP);
 
 	setup_IRQ();
+#ifdef CONFIG_SMP
+	ipi_irq = get_ipi_irq();
+	irq_set_percpu_devid(ipi_irq);
+	r = request_percpu_irq(ipi_irq, loongson3_ipi_interrupt, "IPI", &ipi_dummy_dev);
+	if (r < 0)
+		panic("IPI IRQ request failed\n");
+#endif
 
 	for (i = 0; i < NR_IRQS; i++)
 		irq_set_noprobe(i);
diff --git a/arch/loongarch/kernel/proc.c b/arch/loongarch/kernel/proc.c
index 1f2ee5b15b50..a535138d87de 100644
--- a/arch/loongarch/kernel/proc.c
+++ b/arch/loongarch/kernel/proc.c
@@ -35,6 +35,11 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 	unsigned int fp_version = cpu_data[n].fpu_vers;
 	struct proc_cpuinfo_notifier_args proc_cpuinfo_notifier_args;
 
+#ifdef CONFIG_SMP
+	if (!cpu_online(n))
+		return 0;
+#endif
+
 	/*
 	 * For the first processor also print the system type
 	 */
diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c
index 8ac0dcf18be3..4c295bfee79c 100644
--- a/arch/loongarch/kernel/process.c
+++ b/arch/loongarch/kernel/process.c
@@ -53,6 +53,13 @@
 unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE;
 EXPORT_SYMBOL(boot_option_idle_override);
 
+#ifdef CONFIG_HOTPLUG_CPU
+void arch_cpu_idle_dead(void)
+{
+	play_dead();
+}
+#endif
+
 asmlinkage void ret_from_fork(void);
 asmlinkage void ret_from_kernel_thread(void);
 
diff --git a/arch/loongarch/kernel/reset.c b/arch/loongarch/kernel/reset.c
index ef484ce43c5c..2b86469e4718 100644
--- a/arch/loongarch/kernel/reset.c
+++ b/arch/loongarch/kernel/reset.c
@@ -65,16 +65,28 @@ EXPORT_SYMBOL(pm_power_off);
 
 void machine_halt(void)
 {
+#ifdef CONFIG_SMP
+	preempt_disable();
+	smp_send_stop();
+#endif
 	default_halt();
 }
 
 void machine_power_off(void)
 {
+#ifdef CONFIG_SMP
+	preempt_disable();
+	smp_send_stop();
+#endif
 	pm_power_off();
 }
 
 void machine_restart(char *command)
 {
+#ifdef CONFIG_SMP
+	preempt_disable();
+	smp_send_stop();
+#endif
 	do_kernel_restart(command);
 	pm_restart();
 }
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index 13c79c9ce558..7bf9c255d036 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -39,6 +39,7 @@
 #include <asm/pgalloc.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
+#include <asm/smp.h>
 #include <asm/time.h>
 
 #define SMBIOS_BIOSSIZE_OFFSET		0x09
@@ -401,6 +402,29 @@ static int __init reserve_memblock_reserved_regions(void)
 }
 arch_initcall(reserve_memblock_reserved_regions);
 
+#ifdef CONFIG_SMP
+static void __init prefill_possible_map(void)
+{
+	int i, possible;
+
+	possible = num_processors + disabled_cpus;
+	if (possible > nr_cpu_ids)
+		possible = nr_cpu_ids;
+
+	pr_info("SMP: Allowing %d CPUs, %d hotplug CPUs\n",
+			possible, max((possible - num_processors), 0));
+
+	for (i = 0; i < possible; i++)
+		set_cpu_possible(i, true);
+	for (; i < NR_CPUS; i++)
+		set_cpu_possible(i, false);
+
+	nr_cpu_ids = possible;
+}
+#else
+static inline void prefill_possible_map(void) {}
+#endif
+
 void __init setup_arch(char **cmdline_p)
 {
 	cpu_probe();
@@ -417,6 +441,8 @@ void __init setup_arch(char **cmdline_p)
 	arch_mem_init(cmdline_p);
 
 	resource_init();
+	plat_smp_setup();
+	prefill_possible_map();
 
 	paging_init();
 }
diff --git a/arch/loongarch/kernel/smp.c b/arch/loongarch/kernel/smp.c
new file mode 100644
index 000000000000..27704f30754b
--- /dev/null
+++ b/arch/loongarch/kernel/smp.c
@@ -0,0 +1,746 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ *
+ * Derived from MIPS:
+ * Copyright (C) 2000, 2001 Kanoj Sarcar
+ * Copyright (C) 2000, 2001 Ralf Baechle
+ * Copyright (C) 2000, 2001 Silicon Graphics, Inc.
+ * Copyright (C) 2000, 2001, 2003 Broadcom Corporation
+ */
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/seq_file.h>
+#include <linux/smp.h>
+#include <linux/threads.h>
+#include <linux/export.h>
+#include <linux/time.h>
+#include <linux/tracepoint.h>
+#include <linux/sched/hotplug.h>
+#include <linux/sched/task_stack.h>
+
+#include <asm/cpu.h>
+#include <asm/idle.h>
+#include <asm/loongson.h>
+#include <asm/mmu_context.h>
+#include <asm/processor.h>
+#include <asm/setup.h>
+#include <asm/time.h>
+
+int __cpu_number_map[NR_CPUS];   /* Map physical to logical */
+EXPORT_SYMBOL(__cpu_number_map);
+
+int __cpu_logical_map[NR_CPUS];		/* Map logical to physical */
+EXPORT_SYMBOL(__cpu_logical_map);
+
+/* Number of threads (siblings) per CPU core */
+int smp_num_siblings = 1;
+EXPORT_SYMBOL(smp_num_siblings);
+
+/* Representing the threads (siblings) of each logical CPU */
+cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly;
+EXPORT_SYMBOL(cpu_sibling_map);
+
+/* Representing the core map of multi-core chips of each logical CPU */
+cpumask_t cpu_core_map[NR_CPUS] __read_mostly;
+EXPORT_SYMBOL(cpu_core_map);
+
+static DECLARE_COMPLETION(cpu_starting);
+static DECLARE_COMPLETION(cpu_running);
+
+/*
+ * A logical cpu mask containing only one VPE per core to
+ * reduce the number of IPIs on large MT systems.
+ */
+cpumask_t cpu_foreign_map[NR_CPUS] __read_mostly;
+EXPORT_SYMBOL(cpu_foreign_map);
+
+/* representing cpus for which sibling maps can be computed */
+static cpumask_t cpu_sibling_setup_map;
+
+/* representing cpus for which core maps can be computed */
+static cpumask_t cpu_core_setup_map;
+
+struct secondary_data cpuboot_data;
+static DEFINE_PER_CPU(int, cpu_state);
+DEFINE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
+EXPORT_PER_CPU_SYMBOL(irq_stat);
+
+enum ipi_msg_type {
+	IPI_RESCHEDULE,
+	IPI_CALL_FUNCTION,
+};
+
+static const char *ipi_types[NR_IPI] __tracepoint_string = {
+	[IPI_RESCHEDULE] = "Rescheduling interrupts",
+	[IPI_CALL_FUNCTION] = "Call Function interrupts",
+};
+
+void show_ipi_list(struct seq_file *p, int prec)
+{
+	unsigned int cpu, i;
+
+	for (i = 0; i < NR_IPI; i++) {
+		seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i, prec >= 4 ? " " : "");
+		for_each_online_cpu(cpu)
+			seq_printf(p, "%10u ", per_cpu(irq_stat, cpu).ipi_irqs[i]);
+		seq_printf(p, " LoongArch  %d  %s\n", i + 1, ipi_types[i]);
+	}
+}
+
+/* Send mailbox buffer via Mail_Send */
+static void csr_mail_send(uint64_t data, int cpu, int mailbox)
+{
+	uint64_t val;
+
+	/* Send high 32 bits */
+	val = IOCSR_MBUF_SEND_BLOCKING;
+	val |= (IOCSR_MBUF_SEND_BOX_HI(mailbox) << IOCSR_MBUF_SEND_BOX_SHIFT);
+	val |= (cpu << IOCSR_MBUF_SEND_CPU_SHIFT);
+	val |= (data & IOCSR_MBUF_SEND_H32_MASK);
+	iocsr_writeq(val, LOONGARCH_IOCSR_MBUF_SEND);
+
+	/* Send low 32 bits */
+	val = IOCSR_MBUF_SEND_BLOCKING;
+	val |= (IOCSR_MBUF_SEND_BOX_LO(mailbox) << IOCSR_MBUF_SEND_BOX_SHIFT);
+	val |= (cpu << IOCSR_MBUF_SEND_CPU_SHIFT);
+	val |= (data << IOCSR_MBUF_SEND_BUF_SHIFT);
+	iocsr_writeq(val, LOONGARCH_IOCSR_MBUF_SEND);
+}
+
+static u32 ipi_read_clear(int cpu)
+{
+	u32 action;
+
+	/* Load the ipi register to figure out what we're supposed to do */
+	action = iocsr_readl(LOONGARCH_IOCSR_IPI_STATUS);
+	/* Clear the ipi register to clear the interrupt */
+	iocsr_writel(action, LOONGARCH_IOCSR_IPI_CLEAR);
+	smp_mb();
+
+	return action;
+}
+
+static void ipi_write_action(int cpu, u32 action)
+{
+	unsigned int irq = 0;
+
+	while ((irq = ffs(action))) {
+		uint32_t val = IOCSR_IPI_SEND_BLOCKING;
+
+		val |= (irq - 1);
+		val |= (cpu << IOCSR_IPI_SEND_CPU_SHIFT);
+		iocsr_writel(val, LOONGARCH_IOCSR_IPI_SEND);
+		action &= ~BIT(irq - 1);
+	}
+}
+
+void loongson3_send_ipi_single(int cpu, unsigned int action)
+{
+	ipi_write_action(cpu_logical_map(cpu), (u32)action);
+}
+
+void loongson3_send_ipi_mask(const struct cpumask *mask, unsigned int action)
+{
+	unsigned int i;
+
+	for_each_cpu(i, mask)
+		ipi_write_action(cpu_logical_map(i), (u32)action);
+}
+
+irqreturn_t loongson3_ipi_interrupt(int irq, void *dev)
+{
+	unsigned int action;
+	unsigned int cpu = smp_processor_id();
+
+	action = ipi_read_clear(cpu_logical_map(cpu));
+
+	if (action & SMP_RESCHEDULE) {
+		scheduler_ipi();
+		per_cpu(irq_stat, cpu).ipi_irqs[IPI_RESCHEDULE]++;
+	}
+
+	if (action & SMP_CALL_FUNCTION) {
+		generic_smp_call_function_interrupt();
+		per_cpu(irq_stat, cpu).ipi_irqs[IPI_CALL_FUNCTION]++;
+	}
+
+	return IRQ_HANDLED;
+}
+
+void __init loongson3_smp_setup(void)
+{
+	cpu_set_core(&cpu_data[0],
+		     cpu_logical_map(0) % loongson_sysconf.cores_per_package);
+	cpu_set_cluster(&cpu_data[0],
+		     cpu_logical_map(0) / loongson_sysconf.cores_per_package);
+	cpu_data[0].package = cpu_logical_map(0) / loongson_sysconf.cores_per_package;
+
+	iocsr_writel(0xffffffff, LOONGARCH_IOCSR_IPI_EN);
+	pr_info("Detected %i available CPU(s)\n", loongson_sysconf.nr_cpus);
+}
+
+void __init loongson3_prepare_cpus(unsigned int max_cpus)
+{
+	int i = 0;
+
+	for (i = 0; i < loongson_sysconf.nr_cpus; i++) {
+		set_cpu_present(i, true);
+		csr_mail_send(0, __cpu_logical_map[i], 0);
+	}
+
+	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
+}
+
+/*
+ * Set up the PC, SP, and TP of a secondary processor and start it running!
+ */
+void loongson3_boot_secondary(int cpu, struct task_struct *idle)
+{
+	unsigned long entry;
+
+	pr_info("Booting CPU#%d...\n", cpu);
+
+	entry = __pa_symbol((unsigned long)&smpboot_entry);
+	cpuboot_data.stack = (unsigned long)__KSTK_TOS(idle);
+	cpuboot_data.thread_info = (unsigned long)task_thread_info(idle);
+
+	csr_mail_send(entry, cpu_logical_map(cpu), 0);
+
+	loongson3_send_ipi_single(cpu, SMP_BOOT_CPU);
+}
+
+/*
+ * SMP init and finish on secondary CPUs
+ */
+void loongson3_init_secondary(void)
+{
+	unsigned int cpu = smp_processor_id();
+	unsigned int imask = ECFGF_IP0 | ECFGF_IP1 | ECFGF_IP2 |
+			     ECFGF_IPI | ECFGF_PMC | ECFGF_TIMER;
+
+	change_csr_ecfg(ECFG0_IM, imask);
+
+	iocsr_writel(0xffffffff, LOONGARCH_IOCSR_IPI_EN);
+
+	per_cpu(cpu_state, cpu) = CPU_ONLINE;
+	cpu_set_core(&cpu_data[cpu],
+		     cpu_logical_map(cpu) % loongson_sysconf.cores_per_package);
+	cpu_set_cluster(&cpu_data[cpu],
+		     cpu_logical_map(cpu) / loongson_sysconf.cores_per_package);
+	cpu_data[cpu].package =
+		     cpu_logical_map(cpu) / loongson_sysconf.cores_per_package;
+}
+
+void loongson3_smp_finish(void)
+{
+	local_irq_enable();
+	iocsr_writeq(0, LOONGARCH_IOCSR_MBUF0);
+	pr_info("CPU#%d finished\n", smp_processor_id());
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+
+static bool io_master(int cpu)
+{
+	int i, node, master;
+
+	if (cpu == 0)
+		return true;
+
+	for (i = 1; i < loongson_sysconf.nr_io_pics; i++) {
+		node = eiointc_get_node(i);
+		master = cpu_number_map(node * CORES_PER_EIO_NODE);
+		if (cpu == master)
+			return true;
+	}
+
+	return false;
+}
+
+int loongson3_cpu_disable(void)
+{
+	unsigned long flags;
+	unsigned int cpu = smp_processor_id();
+
+	if (io_master(cpu))
+		return -EBUSY;
+
+	set_cpu_online(cpu, false);
+	calculate_cpu_foreign_map();
+	local_irq_save(flags);
+	irq_migrate_all_off_this_cpu();
+	clear_csr_ecfg(ECFG0_IM);
+	local_irq_restore(flags);
+	local_flush_tlb_all();
+
+	return 0;
+}
+
+void loongson3_cpu_die(unsigned int cpu)
+{
+	while (per_cpu(cpu_state, cpu) != CPU_DEAD)
+		cpu_relax();
+
+	mb();
+}
+
+/*
+ * The target CPU should go to XKPRANGE (uncached area) and flush
+ * ICache/DCache/VCache before the control CPU can safely disable its clock.
+ */
+static void loongson3_play_dead(int *state_addr)
+{
+	register int val;
+	register void *addr;
+	register void (*init_fn)(void);
+
+	__asm__ __volatile__(
+		"   li.d %[addr], 0x8000000000000000\n"
+		"1: cacop 0x8, %[addr], 0           \n" /* flush ICache */
+		"   cacop 0x8, %[addr], 1           \n"
+		"   cacop 0x8, %[addr], 2           \n"
+		"   cacop 0x8, %[addr], 3           \n"
+		"   cacop 0x9, %[addr], 0           \n" /* flush DCache */
+		"   cacop 0x9, %[addr], 1           \n"
+		"   cacop 0x9, %[addr], 2           \n"
+		"   cacop 0x9, %[addr], 3           \n"
+		"   addi.w %[sets], %[sets], -1     \n"
+		"   addi.d %[addr], %[addr], 0x40   \n"
+		"   bnez %[sets], 1b                \n"
+		"   li.d %[addr], 0x8000000000000000\n"
+		"2: cacop 0xa, %[addr], 0           \n" /* flush VCache */
+		"   cacop 0xa, %[addr], 1           \n"
+		"   cacop 0xa, %[addr], 2           \n"
+		"   cacop 0xa, %[addr], 3           \n"
+		"   cacop 0xa, %[addr], 4           \n"
+		"   cacop 0xa, %[addr], 5           \n"
+		"   cacop 0xa, %[addr], 6           \n"
+		"   cacop 0xa, %[addr], 7           \n"
+		"   cacop 0xa, %[addr], 8           \n"
+		"   cacop 0xa, %[addr], 9           \n"
+		"   cacop 0xa, %[addr], 10          \n"
+		"   cacop 0xa, %[addr], 11          \n"
+		"   cacop 0xa, %[addr], 12          \n"
+		"   cacop 0xa, %[addr], 13          \n"
+		"   cacop 0xa, %[addr], 14          \n"
+		"   cacop 0xa, %[addr], 15          \n"
+		"   addi.w %[vsets], %[vsets], -1   \n"
+		"   addi.d %[addr], %[addr], 0x40   \n"
+		"   bnez   %[vsets], 2b             \n"
+		"   li.w   %[val], 0x7              \n" /* *state_addr = CPU_DEAD; */
+		"   st.w   %[val], %[state_addr], 0 \n"
+		"   dbar 0                          \n"
+		"   cacop 0x11, %[state_addr], 0    \n" /* flush entry of *state_addr */
+		: [addr] "=&r" (addr), [val] "=&r" (val)
+		: [state_addr] "r" (state_addr),
+		  [sets] "r" (cpu_data[smp_processor_id()].dcache.sets),
+		  [vsets] "r" (cpu_data[smp_processor_id()].vcache.sets));
+
+	local_irq_enable();
+	change_csr_ecfg(ECFG0_IM, ECFGF_IPI);
+
+	__asm__ __volatile__(
+		"   idle      0			    \n"
+		"   li.w      $t0, 0x1020	    \n"
+		"   iocsrrd.d %[init_fn], $t0	    \n" /* Get init PC */
+		: [init_fn] "=&r" (addr)
+		: /* No Input */
+		: "$t0");
+	init_fn = __va(addr);
+
+	init_fn();
+	unreachable();
+}
+
+void play_dead(void)
+{
+	int *state_addr;
+	unsigned int cpu = smp_processor_id();
+	void (*play_dead_uncached)(int *s);
+
+	idle_task_exit();
+	play_dead_uncached = (void *)TO_UNCAC(__pa((unsigned long)loongson3_play_dead));
+	state_addr = &per_cpu(cpu_state, cpu);
+	mb();
+	play_dead_uncached(state_addr);
+}
+
+static int loongson3_enable_clock(unsigned int cpu)
+{
+	uint64_t core_id = cpu_core(&cpu_data[cpu]);
+	uint64_t package_id = cpu_data[cpu].package;
+
+	LOONGSON_FREQCTRL(package_id) |= 1 << (core_id * 4 + 3);
+
+	return 0;
+}
+
+static int loongson3_disable_clock(unsigned int cpu)
+{
+	uint64_t core_id = cpu_core(&cpu_data[cpu]);
+	uint64_t package_id = cpu_data[cpu].package;
+
+	LOONGSON_FREQCTRL(package_id) &= ~(1 << (core_id * 4 + 3));
+
+	return 0;
+}
+
+static int register_loongson3_notifier(void)
+{
+	return cpuhp_setup_state_nocalls(CPUHP_LOONGARCH_SOC_PREPARE,
+					 "loongarch/loongson:prepare",
+					 loongson3_enable_clock,
+					 loongson3_disable_clock);
+}
+early_initcall(register_loongson3_notifier);
+
+#endif
+
+/*
+ * Power management
+ */
+#ifdef CONFIG_PM
+
+static int loongson3_ipi_suspend(void)
+{
+	return 0;
+}
+
+static void loongson3_ipi_resume(void)
+{
+	iocsr_writel(0xffffffff, LOONGARCH_IOCSR_IPI_EN);
+}
+
+static struct syscore_ops loongson3_ipi_syscore_ops = {
+	.resume         = loongson3_ipi_resume,
+	.suspend        = loongson3_ipi_suspend,
+};
+
+/*
+ * Enable boot cpu ipi before enabling nonboot cpus
+ * during syscore_resume.
+ */
+static int __init ipi_pm_init(void)
+{
+	register_syscore_ops(&loongson3_ipi_syscore_ops);
+	return 0;
+}
+
+core_initcall(ipi_pm_init);
+#endif
+
+static inline void set_cpu_sibling_map(int cpu)
+{
+	int i;
+
+	cpumask_set_cpu(cpu, &cpu_sibling_setup_map);
+
+	if (smp_num_siblings <= 1)
+		cpumask_set_cpu(cpu, &cpu_sibling_map[cpu]);
+	else {
+		for_each_cpu(i, &cpu_sibling_setup_map) {
+			if (cpus_are_siblings(cpu, i)) {
+				cpumask_set_cpu(i, &cpu_sibling_map[cpu]);
+				cpumask_set_cpu(cpu, &cpu_sibling_map[i]);
+			}
+		}
+	}
+}
+
+static inline void set_cpu_core_map(int cpu)
+{
+	int i;
+
+	cpumask_set_cpu(cpu, &cpu_core_setup_map);
+
+	for_each_cpu(i, &cpu_core_setup_map) {
+		if (cpu_data[cpu].package == cpu_data[i].package) {
+			cpumask_set_cpu(i, &cpu_core_map[cpu]);
+			cpumask_set_cpu(cpu, &cpu_core_map[i]);
+		}
+	}
+}
+
+/*
+ * Calculate a new cpu_foreign_map mask whenever a
+ * new cpu appears or disappears.
+ */
+void calculate_cpu_foreign_map(void)
+{
+	int i, k, core_present;
+	cpumask_t temp_foreign_map;
+
+	/* Re-calculate the mask */
+	cpumask_clear(&temp_foreign_map);
+	for_each_online_cpu(i) {
+		core_present = 0;
+		for_each_cpu(k, &temp_foreign_map)
+			if (cpus_are_siblings(i, k))
+				core_present = 1;
+		if (!core_present)
+			cpumask_set_cpu(i, &temp_foreign_map);
+	}
+
+	for_each_online_cpu(i)
+		cpumask_andnot(&cpu_foreign_map[i],
+			       &temp_foreign_map, &cpu_sibling_map[i]);
+}
+
+/* Preload SMP state for boot cpu */
+void smp_prepare_boot_cpu(void)
+{
+	unsigned int cpu;
+
+	set_cpu_possible(0, true);
+	set_cpu_online(0, true);
+	set_my_cpu_offset(per_cpu_offset(0));
+
+	for_each_possible_cpu(cpu)
+		set_cpu_numa_node(cpu, 0);
+}
+
+/* called from main before smp_init() */
+void __init smp_prepare_cpus(unsigned int max_cpus)
+{
+	init_new_context(current, &init_mm);
+	current_thread_info()->cpu = 0;
+	loongson3_prepare_cpus(max_cpus);
+	set_cpu_sibling_map(0);
+	set_cpu_core_map(0);
+	calculate_cpu_foreign_map();
+#ifndef CONFIG_HOTPLUG_CPU
+	init_cpu_present(cpu_possible_mask);
+#endif
+}
+
+int __cpu_up(unsigned int cpu, struct task_struct *tidle)
+{
+	loongson3_boot_secondary(cpu, tidle);
+
+	/* Wait for CPU to start and be ready to sync counters */
+	if (!wait_for_completion_timeout(&cpu_starting,
+					 msecs_to_jiffies(5000))) {
+		pr_crit("CPU%u: failed to start\n", cpu);
+		return -EIO;
+	}
+
+	/* Wait for CPU to finish startup & mark itself online before return */
+	wait_for_completion(&cpu_running);
+
+	return 0;
+}
+
+/*
+ * First C code run on the secondary CPUs after being started up by
+ * the master.
+ */
+asmlinkage void start_secondary(void)
+{
+	unsigned int cpu;
+
+	sync_counter();
+	cpu = smp_processor_id();
+	set_my_cpu_offset(per_cpu_offset(cpu));
+
+	cpu_probe();
+	constant_clockevent_init();
+	loongson3_init_secondary();
+
+	set_cpu_sibling_map(cpu);
+	set_cpu_core_map(cpu);
+
+	notify_cpu_starting(cpu);
+
+	/* Notify boot CPU that we're starting */
+	complete(&cpu_starting);
+
+	/* The CPU is running, now mark it online */
+	set_cpu_online(cpu, true);
+
+	calculate_cpu_foreign_map();
+
+	/*
+	 * Notify boot CPU that we're up & online and it can safely return
+	 * from __cpu_up()
+	 */
+	complete(&cpu_running);
+
+	/*
+	 * IRQs will be enabled in loongson3_smp_finish(); enabling them too
+	 * early is dangerous.
+	 */
+	WARN_ON_ONCE(!irqs_disabled());
+	loongson3_smp_finish();
+
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+}
+
+void __init smp_cpus_done(unsigned int max_cpus)
+{
+}
+
+static void stop_this_cpu(void *dummy)
+{
+	set_cpu_online(smp_processor_id(), false);
+	calculate_cpu_foreign_map();
+	local_irq_disable();
+	while (true);
+}
+
+void smp_send_stop(void)
+{
+	smp_call_function(stop_this_cpu, NULL, 0);
+}
+
+int setup_profiling_timer(unsigned int multiplier)
+{
+	return 0;
+}
+
+static void flush_tlb_all_ipi(void *info)
+{
+	local_flush_tlb_all();
+}
+
+void flush_tlb_all(void)
+{
+	on_each_cpu(flush_tlb_all_ipi, NULL, 1);
+}
+
+static void flush_tlb_mm_ipi(void *mm)
+{
+	local_flush_tlb_mm((struct mm_struct *)mm);
+}
+
+void flush_tlb_mm(struct mm_struct *mm)
+{
+	preempt_disable();
+
+	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
+		on_each_cpu_mask(mm_cpumask(mm), flush_tlb_mm_ipi, mm, 1);
+	} else {
+		unsigned int cpu;
+
+		for_each_online_cpu(cpu) {
+			if (cpu != smp_processor_id() && cpu_context(cpu, mm))
+				cpu_context(cpu, mm) = 0;
+		}
+		local_flush_tlb_mm(mm);
+	}
+
+	preempt_enable();
+}
+
+struct flush_tlb_data {
+	struct vm_area_struct *vma;
+	unsigned long addr1;
+	unsigned long addr2;
+};
+
+static void flush_tlb_range_ipi(void *info)
+{
+	struct flush_tlb_data *fd = info;
+
+	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
+}
+
+void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+{
+	struct mm_struct *mm = vma->vm_mm;
+
+	preempt_disable();
+	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
+		struct flush_tlb_data fd = {
+			.vma = vma,
+			.addr1 = start,
+			.addr2 = end,
+		};
+
+		on_each_cpu_mask(mm_cpumask(mm), flush_tlb_range_ipi, &fd, 1);
+	} else {
+		unsigned int cpu;
+		int exec = vma->vm_flags & VM_EXEC;
+
+		for_each_online_cpu(cpu) {
+			/*
+			 * flush_cache_range() will only fully flush icache if
+			 * the VMA is executable, otherwise we must invalidate
+			 * ASID without it appearing to has_valid_asid() as if
+			 * mm has been completely unused by that CPU.
+			 */
+			if (cpu != smp_processor_id() && cpu_context(cpu, mm))
+				cpu_context(cpu, mm) = !exec;
+		}
+		local_flush_tlb_range(vma, start, end);
+	}
+	preempt_enable();
+}
+
+static void flush_tlb_kernel_range_ipi(void *info)
+{
+	struct flush_tlb_data *fd = info;
+
+	local_flush_tlb_kernel_range(fd->addr1, fd->addr2);
+}
+
+void flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+	struct flush_tlb_data fd = {
+		.addr1 = start,
+		.addr2 = end,
+	};
+
+	on_each_cpu(flush_tlb_kernel_range_ipi, &fd, 1);
+}
+
+static void flush_tlb_page_ipi(void *info)
+{
+	struct flush_tlb_data *fd = info;
+
+	local_flush_tlb_page(fd->vma, fd->addr1);
+}
+
+void flush_tlb_page(struct vm_area_struct *vma, unsigned long page)
+{
+	preempt_disable();
+	if ((atomic_read(&vma->vm_mm->mm_users) != 1) || (current->mm != vma->vm_mm)) {
+		struct flush_tlb_data fd = {
+			.vma = vma,
+			.addr1 = page,
+		};
+
+		on_each_cpu_mask(mm_cpumask(vma->vm_mm), flush_tlb_page_ipi, &fd, 1);
+	} else {
+		unsigned int cpu;
+
+		for_each_online_cpu(cpu) {
+			/*
+			 * flush_cache_page() only does partial flushes, so
+			 * invalidate ASID without it appearing to
+			 * has_valid_asid() as if mm has been completely unused
+			 * by that CPU.
+			 */
+			if (cpu != smp_processor_id() && cpu_context(cpu, vma->vm_mm))
+				cpu_context(cpu, vma->vm_mm) = 1;
+		}
+		local_flush_tlb_page(vma, page);
+	}
+	preempt_enable();
+}
+EXPORT_SYMBOL(flush_tlb_page);
+
+static void flush_tlb_one_ipi(void *info)
+{
+	unsigned long vaddr = (unsigned long) info;
+
+	local_flush_tlb_one(vaddr);
+}
+
+void flush_tlb_one(unsigned long vaddr)
+{
+	on_each_cpu(flush_tlb_one_ipi, (void *)vaddr, 1);
+}
+EXPORT_SYMBOL(flush_tlb_one);
diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
index 3b2cbb95875b..ab1a75c0b5a6 100644
--- a/arch/loongarch/kernel/topology.c
+++ b/arch/loongarch/kernel/topology.c
@@ -1,13 +1,52 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/cpu.h>
+#include <linux/cpumask.h>
 #include <linux/init.h>
+#include <linux/node.h>
+#include <linux/nodemask.h>
 #include <linux/percpu.h>
 
-static struct cpu cpu_device;
+static DEFINE_PER_CPU(struct cpu, cpu_devices);
+
+#ifdef CONFIG_HOTPLUG_CPU
+int arch_register_cpu(int cpu)
+{
+	int ret;
+	struct cpu *c = &per_cpu(cpu_devices, cpu);
+
+	c->hotpluggable = 1;
+	ret = register_cpu(c, cpu);
+	if (ret < 0)
+		pr_warn("register_cpu %d failed (%d)\n", cpu, ret);
+
+	return ret;
+}
+EXPORT_SYMBOL(arch_register_cpu);
+
+void arch_unregister_cpu(int cpu)
+{
+	struct cpu *c = &per_cpu(cpu_devices, cpu);
+
+	c->hotpluggable = 0;
+	unregister_cpu(c);
+}
+EXPORT_SYMBOL(arch_unregister_cpu);
+#endif
 
 static int __init topology_init(void)
 {
-	return register_cpu(&cpu_device, 0);
+	int i, ret;
+
+	for_each_present_cpu(i) {
+		struct cpu *c = &per_cpu(cpu_devices, i);
+
+		c->hotpluggable = !!i;
+		ret = register_cpu(c, i);
+		if (ret < 0)
+			pr_warn("topology_init: register_cpu %d failed (%d)\n", i, ret);
+	}
+
+	return 0;
 }
 
 subsys_initcall(topology_init);
diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
index 7da4c4d7c50d..006cbb1bd5c6 100644
--- a/arch/loongarch/kernel/vmlinux.lds.S
+++ b/arch/loongarch/kernel/vmlinux.lds.S
@@ -73,6 +73,10 @@ SECTIONS
 		EXIT_DATA
 	}
 
+#ifdef CONFIG_SMP
+	PERCPU_SECTION(1 << CONFIG_L1_CACHE_SHIFT)
+#endif
+
 	.init.bss : {
 		*(.init.bss)
 	}
diff --git a/arch/loongarch/mm/tlbex.S b/arch/loongarch/mm/tlbex.S
index a4ca4e507ee8..606b7800edc6 100644
--- a/arch/loongarch/mm/tlbex.S
+++ b/arch/loongarch/mm/tlbex.S
@@ -88,7 +88,14 @@ vmalloc_done_load:
 	slli.d	t0, t0, _PTE_T_LOG2
 	add.d	t1, ra, t0
 
+#ifdef CONFIG_SMP
+smp_pgtable_change_load:
+#endif
+#ifdef CONFIG_SMP
+	ll.d	t0, t1, 0
+#else
 	ld.d	t0, t1, 0
+#endif
 	tlbsrch
 
 	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
@@ -96,7 +103,12 @@ vmalloc_done_load:
 	beq	ra, $r0, nopage_tlb_load
 
 	ori	t0, t0, _PAGE_VALID
+#ifdef CONFIG_SMP
+	sc.d	t0, t1, 0
+	beq	t0, $r0, smp_pgtable_change_load
+#else
 	st.d	t0, t1, 0
+#endif
 	ori	t1, t1, 8
 	xori	t1, t1, 8
 	ld.d	t0, t1, 0
@@ -120,14 +132,24 @@ vmalloc_load:
 	 * spots a huge page.
 	 */
 tlb_huge_update_load:
+#ifdef CONFIG_SMP
+	ll.d	t0, t1, 0
+#else
 	ld.d	t0, t1, 0
+#endif
 	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
 	andi	ra, ra, 1
 	beq	ra, $r0, nopage_tlb_load
 	tlbsrch
 
 	ori	t0, t0, _PAGE_VALID
+#ifdef CONFIG_SMP
+	sc.d	t0, t1, 0
+	beq	t0, $r0, tlb_huge_update_load
+	ld.d	t0, t1, 0
+#else
 	st.d	t0, t1, 0
+#endif
 	addu16i.d	t1, $r0, -(CSR_TLBIDX_EHINV >> 16)
 	addi.d	ra, t1, 0
 	csrxchg	ra, t1, LOONGARCH_CSR_TLBIDX
@@ -173,6 +195,7 @@ tlb_huge_update_load:
 	csrxchg		t1, t0, LOONGARCH_CSR_TLBIDX
 
 nopage_tlb_load:
+	dbar	0
 	csrrd	ra, EXCEPTION_KS2
 	la.abs	t0, tlb_do_page_fault_0
 	jirl	$r0, t0, 0
@@ -229,7 +252,14 @@ vmalloc_done_store:
 	slli.d	t0, t0, _PTE_T_LOG2
 	add.d	t1, ra, t0
 
+#ifdef CONFIG_SMP
+smp_pgtable_change_store:
+#endif
+#ifdef CONFIG_SMP
+	ll.d	t0, t1, 0
+#else
 	ld.d	t0, t1, 0
+#endif
 	tlbsrch
 
 	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
@@ -238,7 +268,12 @@ vmalloc_done_store:
 	bne	ra, $r0, nopage_tlb_store
 
 	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
+#ifdef CONFIG_SMP
+	sc.d	t0, t1, 0
+	beq	t0, $r0, smp_pgtable_change_store
+#else
 	st.d	t0, t1, 0
+#endif
 
 	ori	t1, t1, 8
 	xori	t1, t1, 8
@@ -263,7 +298,11 @@ vmalloc_store:
 	 * spots a huge page.
 	 */
 tlb_huge_update_store:
+#ifdef CONFIG_SMP
+	ll.d	t0, t1, 0
+#else
 	ld.d	t0, t1, 0
+#endif
 	srli.d	ra, t0, _PAGE_PRESENT_SHIFT
 	andi	ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
 	xori	ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
@@ -272,7 +311,13 @@ tlb_huge_update_store:
 	tlbsrch
 	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
 
+#ifdef CONFIG_SMP
+	sc.d	t0, t1, 0
+	beq	t0, $r0, tlb_huge_update_store
+	ld.d	t0, t1, 0
+#else
 	st.d	t0, t1, 0
+#endif
 	addu16i.d	t1, $r0, -(CSR_TLBIDX_EHINV >> 16)
 	addi.d	ra, t1, 0
 	csrxchg	ra, t1, LOONGARCH_CSR_TLBIDX
@@ -318,6 +363,7 @@ tlb_huge_update_store:
 	csrxchg		t1, t0, LOONGARCH_CSR_TLBIDX
 
 nopage_tlb_store:
+	dbar	0
 	csrrd	ra, EXCEPTION_KS2
 	la.abs	t0, tlb_do_page_fault_1
 	jirl	$r0, t0, 0
@@ -373,7 +419,14 @@ vmalloc_done_modify:
 	slli.d	t0, t0, _PTE_T_LOG2
 	add.d	t1, ra, t0
 
+#ifdef CONFIG_SMP
+smp_pgtable_change_modify:
+#endif
+#ifdef CONFIG_SMP
+	ll.d	t0, t1, 0
+#else
 	ld.d	t0, t1, 0
+#endif
 	tlbsrch
 
 	srli.d	ra, t0, _PAGE_WRITE_SHIFT
@@ -381,7 +434,12 @@ vmalloc_done_modify:
 	beq	ra, $r0, nopage_tlb_modify
 
 	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
+#ifdef CONFIG_SMP
+	sc.d	t0, t1, 0
+	beq	t0, $r0, smp_pgtable_change_modify
+#else
 	st.d	t0, t1, 0
+#endif
 	ori	t1, t1, 8
 	xori	t1, t1, 8
 	ld.d	t0, t1, 0
@@ -405,7 +463,11 @@ vmalloc_modify:
 	 * build_tlbchange_handler_head spots a huge page.
 	 */
 tlb_huge_update_modify:
+#ifdef CONFIG_SMP
+	ll.d	t0, t1, 0
+#else
 	ld.d	t0, t1, 0
+#endif
 
 	srli.d	ra, t0, _PAGE_WRITE_SHIFT
 	andi	ra, ra, 1
@@ -414,7 +476,13 @@ tlb_huge_update_modify:
 	tlbsrch
 	ori	t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
 
+#ifdef CONFIG_SMP
+	sc.d	t0, t1, 0
+	beq	t0, $r0, tlb_huge_update_modify
+	ld.d	t0, t1, 0
+#else
 	st.d	t0, t1, 0
+#endif
 	/*
 	 * A huge PTE describes an area the size of the
 	 * configured huge page size. This is twice the
@@ -454,6 +522,7 @@ tlb_huge_update_modify:
 	csrxchg	t1, t0, LOONGARCH_CSR_TLBIDX
 
 nopage_tlb_modify:
+	dbar	0
 	csrrd	ra, EXCEPTION_KS2
 	la.abs	t0, tlb_do_page_fault_1
 	jirl	$r0, t0, 0
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 2af7c6587875..8abd28c4f32b 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -130,6 +130,7 @@ enum cpuhp_state {
 	CPUHP_ZCOMP_PREPARE,
 	CPUHP_TIMERS_PREPARE,
 	CPUHP_MIPS_SOC_PREPARE,
+	CPUHP_LOONGARCH_SOC_PREPARE,
 	CPUHP_BP_PREPARE_DYN,
 	CPUHP_BP_PREPARE_DYN_END		= CPUHP_BP_PREPARE_DYN + 20,
 	CPUHP_BRINGUP_CPU,
-- 
2.27.0



* [PATCH V9 23/24] LoongArch: Add Non-Uniform Memory Access (NUMA) support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (21 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 22/24] LoongArch: Add multi-processor (SMP) support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-04-30  9:05 ` [PATCH V9 24/24] LoongArch: Add Loongson-3 default config file Huacai Chen
  2022-05-01  8:19 ` [PATCH V9 00/22] arch: Add basic LoongArch support Bagas Sanjaya
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds Non-Uniform Memory Access (NUMA) support for LoongArch.
LoongArch has 48-bit physical addresses, but the HT I/O bus only supports
40-bit addresses, so we need a custom phys_to_dma() and dma_to_phys() to
extract the 4-bit node id (bits 44~47) from Loongson-3's 48-bit physical
address space and embed it into the 40-bit DMA address. The node id
offset within the 40-bit DMA address is read from the LS7A_DMA_CFG
register.
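
For illustration only, the address translation described above can be
sketched in user-space C. This is not the kernel code itself: the
sketch_* names and NODE_SHIFT are hypothetical, and node_id_offset is
fixed at 36 here as an assumption, whereas the patch derives it from the
LS7A_DMA_CFG register at boot. The round trip is only lossless when the
offset within the node fits below node_id_offset bits.

```c
#include <assert.h>
#include <stdint.h>

/* Node id occupies bits 44~47 of the 48-bit physical address. */
#define NODE_SHIFT 44

/*
 * Assumed value for this sketch; the patch computes the real offset
 * from the LS7A_DMA_CFG register in plat_swiotlb_setup().
 */
static unsigned int node_id_offset = 36;

/* Move the 4-bit node id from bit 44 down to node_id_offset. */
static uint64_t sketch_phys_to_dma(uint64_t paddr)
{
	uint64_t nid = (paddr >> NODE_SHIFT) & 0xf;

	/* XOR clears the node id field, OR re-inserts it lower down. */
	return ((nid << NODE_SHIFT) ^ paddr) | (nid << node_id_offset);
}

/* Inverse: move the node id from node_id_offset back up to bit 44. */
static uint64_t sketch_dma_to_phys(uint64_t daddr)
{
	uint64_t nid = (daddr >> node_id_offset) & 0xf;

	return ((nid << node_id_offset) ^ daddr) | (nid << NODE_SHIFT);
}
```

With these assumptions, physical address 0x200012345000 on node 2 maps
to DMA address 0x2012345000, and dma_to_phys() maps it back.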

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/Kconfig                  |  22 ++
 arch/loongarch/include/asm/bootinfo.h   |   1 +
 arch/loongarch/include/asm/dma-direct.h |  11 +
 arch/loongarch/include/asm/mmzone.h     |  18 +
 arch/loongarch/include/asm/numa.h       |  69 ++++
 arch/loongarch/include/asm/pgtable.h    |  12 +
 arch/loongarch/include/asm/topology.h   |  21 ++
 arch/loongarch/kernel/Makefile          |   2 +
 arch/loongarch/kernel/acpi.c            |  95 +++++
 arch/loongarch/kernel/dma.c             |  40 ++
 arch/loongarch/kernel/module.c          |   1 +
 arch/loongarch/kernel/numa.c            | 461 ++++++++++++++++++++++++
 arch/loongarch/kernel/setup.c           |   6 +-
 arch/loongarch/kernel/smp.c             |  52 ++-
 arch/loongarch/kernel/traps.c           |   4 +-
 arch/loongarch/mm/init.c                |  13 +
 arch/loongarch/mm/tlb.c                 |  35 +-
 arch/loongarch/pci/acpi.c               |   3 +
 18 files changed, 838 insertions(+), 28 deletions(-)
 create mode 100644 arch/loongarch/include/asm/dma-direct.h
 create mode 100644 arch/loongarch/include/asm/mmzone.h
 create mode 100644 arch/loongarch/include/asm/numa.h
 create mode 100644 arch/loongarch/kernel/dma.c
 create mode 100644 arch/loongarch/kernel/numa.c

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 8479d2d43472..6aa73e96f5de 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -8,6 +8,7 @@ config LOONGARCH
 	select ARCH_ENABLE_MEMORY_HOTPLUG
 	select ARCH_ENABLE_MEMORY_HOTREMOVE
 	select ARCH_HAS_ACPI_TABLE_UPGRADE	if ACPI
+	select ARCH_HAS_PHYS_TO_DMA
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_INLINE_READ_LOCK if !PREEMPTION
@@ -42,6 +43,7 @@ config LOONGARCH
 	select ARCH_SUPPORTS_ACPI
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_HUGETLBFS
+	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select ARCH_USE_QUEUED_RWLOCKS
@@ -95,12 +97,15 @@ config LOONGARCH
 	select HAVE_PERF_EVENTS
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RSEQ
+	select HAVE_SETUP_PER_CPU_AREA if NUMA
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_TIF_NOHZ
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
 	select IRQ_FORCED_THREADING
 	select IRQ_LOONGARCH_CPU
 	select MODULES_USE_ELF_RELA if MODULES
+	select NEED_PER_CPU_EMBED_FIRST_CHUNK
+	select NEED_PER_CPU_PAGE_FIRST_CHUNK
 	select PCI
 	select PCI_DOMAINS_GENERIC
 	select PCI_ECAM if ACPI
@@ -112,6 +117,7 @@ config LOONGARCH
 	select SYSCTL_EXCEPTION_TRACE
 	select SWIOTLB
 	select TRACE_IRQFLAGS_SUPPORT
+	select USE_PERCPU_NUMA_NODE_ID
 	select ZONE_DMA32
 
 config 32BIT
@@ -326,6 +332,21 @@ config NR_CPUS
 	  This allows you to specify the maximum number of CPUs which this
 	  kernel will support.
 
+config NUMA
+	bool "NUMA Support"
+	select ACPI_NUMA if ACPI
+	help
+	  Say Y to compile the kernel to support NUMA (Non-Uniform Memory
+	  Access).  This option improves performance on systems with more
+	  than two nodes; on two node systems it is generally better to
+	  leave it disabled; on single node systems leave this option
+	  disabled.
+
+config NODES_SHIFT
+	int
+	default "6"
+	depends on NUMA
+
 config FORCE_MAX_ZONEORDER
 	int "Maximum zone order"
 	range 14 64 if PAGE_SIZE_64KB
@@ -372,6 +393,7 @@ config ARCH_SELECT_MEMORY_MODEL
 
 config ARCH_FLATMEM_ENABLE
 	def_bool y
+	depends on !NUMA
 
 config ARCH_SPARSEMEM_ENABLE
 	def_bool y
diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h
index 74fbba536568..f95db548f8fa 100644
--- a/arch/loongarch/include/asm/bootinfo.h
+++ b/arch/loongarch/include/asm/bootinfo.h
@@ -13,6 +13,7 @@ const char *get_system_type(void);
 extern void early_init(void);
 extern void early_memblock_init(void);
 extern void platform_init(void);
+extern void plat_swiotlb_setup(void);
 
 /*
  * Initial kernel command line, usually setup by fw_init_cmdline()
diff --git a/arch/loongarch/include/asm/dma-direct.h b/arch/loongarch/include/asm/dma-direct.h
new file mode 100644
index 000000000000..75ccd808a2af
--- /dev/null
+++ b/arch/loongarch/include/asm/dma-direct.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _LOONGARCH_DMA_DIRECT_H
+#define _LOONGARCH_DMA_DIRECT_H
+
+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr);
+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr);
+
+#endif /* _LOONGARCH_DMA_DIRECT_H */
diff --git a/arch/loongarch/include/asm/mmzone.h b/arch/loongarch/include/asm/mmzone.h
new file mode 100644
index 000000000000..fe67d0b4b33d
--- /dev/null
+++ b/arch/loongarch/include/asm/mmzone.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Huacai Chen (chenhuacai@loongson.cn)
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#ifndef _ASM_MMZONE_H_
+#define _ASM_MMZONE_H_
+
+#include <asm/page.h>
+#include <asm/numa.h>
+
+extern struct pglist_data *node_data[];
+
+#define NODE_DATA(nid)	(node_data[(nid)])
+
+extern void setup_zero_pages(void);
+
+#endif /* _ASM_MMZONE_H_ */
diff --git a/arch/loongarch/include/asm/numa.h b/arch/loongarch/include/asm/numa.h
new file mode 100644
index 000000000000..8f9c81af7930
--- /dev/null
+++ b/arch/loongarch/include/asm/numa.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * LoongArch specific ACPICA environments and implementation
+ *
+ * Author: Jianmin Lv <lvjianmin@loongson.cn>
+ *         Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+
+#ifndef _ASM_LOONGARCH_NUMA_H
+#define _ASM_LOONGARCH_NUMA_H
+
+#include <linux/nodemask.h>
+
+#define NODE_ADDRSPACE_SHIFT 44
+
+#define pa_to_nid(addr)		(((addr) & 0xf00000000000) >> NODE_ADDRSPACE_SHIFT)
+#define nid_to_addrbase(nid)	(_ULCAST_(nid) << NODE_ADDRSPACE_SHIFT)
+
+#ifdef CONFIG_NUMA
+
+extern int numa_off;
+extern s16 __cpuid_to_node[CONFIG_NR_CPUS];
+extern nodemask_t numa_nodes_parsed __initdata;
+
+struct numa_memblk {
+	u64			start;
+	u64			end;
+	int			nid;
+};
+
+#define NR_NODE_MEMBLKS		(MAX_NUMNODES*2)
+struct numa_meminfo {
+	int			nr_blks;
+	struct numa_memblk	blk[NR_NODE_MEMBLKS];
+};
+
+extern int __init numa_add_memblk(int nodeid, u64 start, u64 end);
+
+extern void __init early_numa_add_cpu(int cpuid, s16 node);
+extern void numa_add_cpu(unsigned int cpu);
+extern void numa_remove_cpu(unsigned int cpu);
+
+static inline void numa_clear_node(int cpu)
+{
+}
+
+static inline void set_cpuid_to_node(int cpuid, s16 node)
+{
+	__cpuid_to_node[cpuid] = node;
+}
+
+extern int early_cpu_to_node(int cpu);
+
+#else
+
+static inline void early_numa_add_cpu(int cpuid, s16 node)	{ }
+static inline void numa_add_cpu(unsigned int cpu)		{ }
+static inline void numa_remove_cpu(unsigned int cpu)		{ }
+
+static inline int early_cpu_to_node(int cpu)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_NUMA */
+
+#endif	/* _ASM_LOONGARCH_NUMA_H */
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index dc3134a4a13c..30838719a349 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -548,6 +548,18 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+#ifdef CONFIG_NUMA_BALANCING
+static inline long pte_protnone(pte_t pte)
+{
+	return (pte_val(pte) & _PAGE_PROTNONE);
+}
+
+static inline long pmd_protnone(pmd_t pmd)
+{
+	return (pmd_val(pmd) & _PAGE_PROTNONE);
+}
+#endif /* CONFIG_NUMA_BALANCING */
+
 /*
  * We provide our own get_unmapped area to cope with the virtual aliasing
  * constraints placed on us by the cache architecture.
diff --git a/arch/loongarch/include/asm/topology.h b/arch/loongarch/include/asm/topology.h
index 9314d7a3998c..6ceff034d522 100644
--- a/arch/loongarch/include/asm/topology.h
+++ b/arch/loongarch/include/asm/topology.h
@@ -7,6 +7,27 @@
 
 #include <linux/smp.h>
 
+#ifdef CONFIG_NUMA
+
+extern cpumask_t cpus_on_node[];
+
+#define cpumask_of_node(node)  (&cpus_on_node[node])
+
+struct pci_bus;
+extern int pcibus_to_node(struct pci_bus *);
+
+#define cpumask_of_pcibus(bus)	(cpu_online_mask)
+
+extern unsigned char node_distances[MAX_NUMNODES][MAX_NUMNODES];
+
+void numa_set_distance(int from, int to, int distance);
+
+#define node_distance(from, to)	(node_distances[(from)][(to)])
+
+#else
+#define pcibus_to_node(bus)	0
+#endif
+
 #ifdef CONFIG_SMP
 #define topology_physical_package_id(cpu)	(cpu_data[cpu].package)
 #define topology_core_id(cpu)			(cpu_core(&cpu_data[cpu]))
diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
index 5b17e1e3d6f5..47e228d53b62 100644
--- a/arch/loongarch/kernel/Makefile
+++ b/arch/loongarch/kernel/Makefile
@@ -21,4 +21,6 @@ obj-$(CONFIG_PROC_FS)		+= proc.o
 
 obj-$(CONFIG_SMP)		+= smp.o
 
+obj-$(CONFIG_NUMA)		+= numa.o
+
 CPPFLAGS_vmlinux.lds		:= $(KBUILD_CFLAGS)
diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
index 0c7f2d1077a1..df1af8847a72 100644
--- a/arch/loongarch/kernel/acpi.c
+++ b/arch/loongarch/kernel/acpi.c
@@ -14,6 +14,7 @@
 #include <linux/memblock.h>
 #include <linux/serial_core.h>
 #include <asm/io.h>
+#include <asm/numa.h>
 #include <asm/loongson.h>
 
 int acpi_disabled;
@@ -367,6 +368,79 @@ int __init acpi_boot_init(void)
 	return 0;
 }
 
+#ifdef CONFIG_ACPI_NUMA
+
+static __init int setup_node(int pxm)
+{
+	return acpi_map_pxm_to_node(pxm);
+}
+
+/*
+ * Callback for SLIT parsing.  pxm_to_node() returns NUMA_NO_NODE for
+ * I/O localities since SRAT does not list them.  I/O localities are
+ * not supported at this point.
+ */
+unsigned int numa_distance_cnt;
+
+static inline unsigned int get_numa_distances_cnt(struct acpi_table_slit *slit)
+{
+	return slit->locality_count;
+}
+
+void __init numa_set_distance(int from, int to, int distance)
+{
+	if ((u8)distance != distance || (from == to && distance != LOCAL_DISTANCE)) {
+		pr_warn_once("Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
+				from, to, distance);
+		return;
+	}
+
+	node_distances[from][to] = distance;
+}
+
+/* Callback for Proximity Domain -> CPUID mapping */
+void __init
+acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
+{
+	int pxm, node;
+
+	if (srat_disabled())
+		return;
+	if (pa->header.length != sizeof(struct acpi_srat_cpu_affinity)) {
+		bad_srat();
+		return;
+	}
+	if ((pa->flags & ACPI_SRAT_CPU_ENABLED) == 0)
+		return;
+	pxm = pa->proximity_domain_lo;
+	if (acpi_srat_revision >= 2) {
+		pxm |= (pa->proximity_domain_hi[0] << 8);
+		pxm |= (pa->proximity_domain_hi[1] << 16);
+		pxm |= (pa->proximity_domain_hi[2] << 24);
+	}
+	node = setup_node(pxm);
+	if (node < 0) {
+		pr_err("SRAT: Too many proximity domains %x\n", pxm);
+		bad_srat();
+		return;
+	}
+
+	if (pa->apic_id >= CONFIG_NR_CPUS) {
+		pr_info("SRAT: PXM %u -> CPU 0x%02x -> Node %u skipped apicid that is too big\n",
+				pxm, pa->apic_id, node);
+		return;
+	}
+
+	early_numa_add_cpu(pa->apic_id, node);
+
+	set_cpuid_to_node(pa->apic_id, node);
+	node_set(node, numa_nodes_parsed);
+	pr_info("SRAT: PXM %u -> CPU 0x%02x -> Node %u\n", pxm, pa->apic_id, node);
+}
+
+void __init acpi_numa_arch_fixup(void) {}
+#endif
+
 void __init arch_reserve_mem_area(acpi_physical_address addr, size_t size)
 {
 	memblock_reserve(addr, size);
@@ -376,6 +450,22 @@ void __init arch_reserve_mem_area(acpi_physical_address addr, size_t size)
 
 #include <acpi/processor.h>
 
+static int __ref acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
+{
+#ifdef CONFIG_ACPI_NUMA
+	int nid;
+
+	nid = acpi_get_node(handle);
+	if (nid != NUMA_NO_NODE) {
+		set_cpuid_to_node(physid, nid);
+		node_set(nid, numa_nodes_parsed);
+		set_cpu_numa_node(cpu, nid);
+		cpumask_set_cpu(cpu, cpumask_of_node(nid));
+	}
+#endif
+	return 0;
+}
+
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id, int *pcpu)
 {
 	int cpu;
@@ -386,6 +476,8 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id, int *pcpu
 		return cpu;
 	}
 
+	acpi_map_cpu2node(handle, cpu, physid);
+
 	*pcpu = cpu;
 
 	return 0;
@@ -394,6 +486,9 @@ EXPORT_SYMBOL(acpi_map_cpu);
 
 int acpi_unmap_cpu(int cpu)
 {
+#ifdef CONFIG_ACPI_NUMA
+	set_cpuid_to_node(cpu_logical_map(cpu), NUMA_NO_NODE);
+#endif
 	set_cpu_present(cpu, false);
 	num_processors--;
 
diff --git a/arch/loongarch/kernel/dma.c b/arch/loongarch/kernel/dma.c
new file mode 100644
index 000000000000..659b8faccaee
--- /dev/null
+++ b/arch/loongarch/kernel/dma.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/dma-direct.h>
+#include <linux/dma-mapping.h>
+#include <linux/dma-map-ops.h>
+#include <linux/swiotlb.h>
+
+#include <asm/bootinfo.h>
+#include <asm/dma.h>
+#include <asm/loongson.h>
+
+/*
+ * We extract 4bit node id (bit 44~47) from Loongson-3's
+ * 48bit physical address space and embed it into 40bit.
+ */
+
+static int node_id_offset;
+
+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+	long nid = (paddr >> 44) & 0xf;
+
+	return ((nid << 44) ^ paddr) | (nid << node_id_offset);
+}
+
+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
+{
+	long nid = (daddr >> node_id_offset) & 0xf;
+
+	return ((nid << node_id_offset) ^ daddr) | (nid << 44);
+}
+
+void __init plat_swiotlb_setup(void)
+{
+	swiotlb_init(1);
+	node_id_offset = ((readl(LS7A_DMA_CFG) & LS7A_DMA_NODE_MASK) >> LS7A_DMA_NODE_SHF) + 36;
+}
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index d2004bcfedcc..7c61ad4c2142 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -11,6 +11,7 @@
 #include <linux/moduleloader.h>
 #include <linux/elf.h>
 #include <linux/mm.h>
+#include <linux/numa.h>
 #include <linux/vmalloc.h>
 #include <linux/slab.h>
 #include <linux/fs.h>
diff --git a/arch/loongarch/kernel/numa.c b/arch/loongarch/kernel/numa.c
new file mode 100644
index 000000000000..228449edb8b9
--- /dev/null
+++ b/arch/loongarch/kernel/numa.c
@@ -0,0 +1,461 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author:  Xiang Gao <gaoxiang@loongson.cn>
+ *          Huacai Chen <chenhuacai@loongson.cn>
+ *
+ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+ */
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/mmzone.h>
+#include <linux/export.h>
+#include <linux/nodemask.h>
+#include <linux/swap.h>
+#include <linux/memblock.h>
+#include <linux/pfn.h>
+#include <linux/acpi.h>
+#include <linux/highmem.h>
+#include <linux/irq.h>
+#include <linux/pci.h>
+#include <asm/bootinfo.h>
+#include <asm/loongson.h>
+#include <asm/numa.h>
+#include <asm/page.h>
+#include <asm/pgalloc.h>
+#include <asm/sections.h>
+#include <asm/time.h>
+
+int numa_off;
+struct pglist_data *node_data[MAX_NUMNODES];
+unsigned char node_distances[MAX_NUMNODES][MAX_NUMNODES];
+
+EXPORT_SYMBOL(node_data);
+EXPORT_SYMBOL(node_distances);
+
+static struct numa_meminfo numa_meminfo;
+cpumask_t cpus_on_node[MAX_NUMNODES];
+cpumask_t phys_cpus_on_node[MAX_NUMNODES];
+EXPORT_SYMBOL(cpus_on_node);
+
+/*
+ * apicid, cpu, node mappings
+ */
+s16 __cpuid_to_node[CONFIG_NR_CPUS] = {
+	[0 ... CONFIG_NR_CPUS - 1] = NUMA_NO_NODE
+};
+EXPORT_SYMBOL(__cpuid_to_node);
+
+nodemask_t numa_nodes_parsed __initdata;
+
+#ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
+unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
+EXPORT_SYMBOL(__per_cpu_offset);
+
+static int __init pcpu_cpu_to_node(int cpu)
+{
+	return early_cpu_to_node(cpu);
+}
+
+static int __init pcpu_cpu_distance(unsigned int from, unsigned int to)
+{
+	if (early_cpu_to_node(from) == early_cpu_to_node(to))
+		return LOCAL_DISTANCE;
+	else
+		return REMOTE_DISTANCE;
+}
+
+void __init pcpu_populate_pte(unsigned long addr)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+	p4d_t *p4d = p4d_offset(pgd, addr);
+	pud_t *pud;
+	pmd_t *pmd;
+
+	if (p4d_none(*p4d)) {
+		pud_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		pgd_populate(&init_mm, pgd, new);
+#ifndef __PAGETABLE_PUD_FOLDED
+		pud_init((unsigned long)new, (unsigned long)invalid_pmd_table);
+#endif
+	}
+
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud)) {
+		pmd_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		pud_populate(&init_mm, pud, new);
+#ifndef __PAGETABLE_PMD_FOLDED
+		pmd_init((unsigned long)new, (unsigned long)invalid_pte_table);
+#endif
+	}
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd)) {
+		pte_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		pmd_populate_kernel(&init_mm, pmd, new);
+	}
+}
+
+void __init setup_per_cpu_areas(void)
+{
+	unsigned long delta;
+	unsigned int cpu;
+	int rc = -EINVAL;
+
+	if (pcpu_chosen_fc == PCPU_FC_AUTO) {
+		if (nr_node_ids >= 8)
+			pcpu_chosen_fc = PCPU_FC_PAGE;
+		else
+			pcpu_chosen_fc = PCPU_FC_EMBED;
+	}
+
+	/*
+	 * Always reserve area for module percpu variables.  That's
+	 * what the legacy allocator did.
+	 */
+	if (pcpu_chosen_fc != PCPU_FC_PAGE) {
+		rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
+					    PERCPU_DYNAMIC_RESERVE, PMD_SIZE,
+					    pcpu_cpu_distance, pcpu_cpu_to_node);
+		if (rc < 0)
+			pr_warn("%s allocator failed (%d), falling back to page size\n",
+				pcpu_fc_names[pcpu_chosen_fc], rc);
+	}
+	if (rc < 0)
+		rc = pcpu_page_first_chunk(PERCPU_MODULE_RESERVE, pcpu_cpu_to_node);
+	if (rc < 0)
+		panic("cannot initialize percpu area (err=%d)", rc);
+
+	delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
+	for_each_possible_cpu(cpu)
+		__per_cpu_offset[cpu] = delta + pcpu_unit_offsets[cpu];
+}
+#endif
+
+/*
+ * Get nodeid by logical cpu number.
+ * __cpuid_to_node maps physical cpu id to node, so we
+ * should use cpu_logical_map(cpu) to index it.
+ *
+ * This routine is only used in the early phase of booting;
+ * after setup_per_cpu_areas() has been called and numa_node
+ * has been initialized, cpu_to_node() is used instead.
+ */
+int early_cpu_to_node(int cpu)
+{
+	int physid = cpu_logical_map(cpu);
+
+	if (physid < 0)
+		return NUMA_NO_NODE;
+
+	return __cpuid_to_node[physid];
+}
+
+void __init early_numa_add_cpu(int cpuid, s16 node)
+{
+	int cpu = __cpu_number_map[cpuid];
+
+	if (cpu < 0)
+		return;
+
+	cpumask_set_cpu(cpu, &cpus_on_node[node]);
+	cpumask_set_cpu(cpuid, &phys_cpus_on_node[node]);
+}
+
+void numa_add_cpu(unsigned int cpu)
+{
+	int nid = cpu_to_node(cpu);
+	cpumask_set_cpu(cpu, &cpus_on_node[nid]);
+}
+
+void numa_remove_cpu(unsigned int cpu)
+{
+	int nid = cpu_to_node(cpu);
+	cpumask_clear_cpu(cpu, &cpus_on_node[nid]);
+}
+
+static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
+				     struct numa_meminfo *mi)
+{
+	/* ignore zero length blks */
+	if (start == end)
+		return 0;
+
+	/* whine about and ignore invalid blks */
+	if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
+		pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
+			   nid, start, end - 1);
+		return 0;
+	}
+
+	if (mi->nr_blks >= NR_NODE_MEMBLKS) {
+		pr_err("NUMA: too many memblk ranges\n");
+		return -EINVAL;
+	}
+
+	mi->blk[mi->nr_blks].start = PFN_ALIGN(start);
+	mi->blk[mi->nr_blks].end = PFN_ALIGN(end - PAGE_SIZE + 1);
+	mi->blk[mi->nr_blks].nid = nid;
+	mi->nr_blks++;
+	return 0;
+}
+
+/**
+ * numa_add_memblk - Add one numa_memblk to numa_meminfo
+ * @nid: NUMA node ID of the new memblk
+ * @start: Start address of the new memblk
+ * @end: End address of the new memblk
+ *
+ * Add a new memblk to the default numa_meminfo.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+int __init numa_add_memblk(int nid, u64 start, u64 end)
+{
+	return numa_add_memblk_to(nid, start, end, &numa_meminfo);
+}
+
+static void __init alloc_node_data(int nid)
+{
+	void *nd;
+	unsigned long nd_pa;
+	size_t nd_sz = roundup(sizeof(pg_data_t), PAGE_SIZE);
+
+	nd_pa = memblock_phys_alloc_try_nid(nd_sz, SMP_CACHE_BYTES, nid);
+	if (!nd_pa) {
+		pr_err("Cannot find %zu Byte for node_data (initial node: %d)\n", nd_sz, nid);
+		return;
+	}
+
+	nd = __va(nd_pa);
+
+	node_data[nid] = nd;
+	memset(nd, 0, sizeof(pg_data_t));
+}
+
+static void __init node_mem_init(unsigned int node)
+{
+	unsigned long start_pfn, end_pfn;
+	unsigned long node_addrspace_offset;
+
+	node_addrspace_offset = nid_to_addrbase(node);
+	pr_info("Node%d's addrspace_offset is 0x%lx\n",
+			node, node_addrspace_offset);
+
+	get_pfn_range_for_nid(node, &start_pfn, &end_pfn);
+	pr_info("Node%d: start_pfn=0x%lx, end_pfn=0x%lx\n",
+		node, start_pfn, end_pfn);
+
+	alloc_node_data(node);
+}
+
+#ifdef CONFIG_ACPI_NUMA
+
+/*
+ * Sanity check to catch more bad NUMA configurations (they are amazingly
+ * common).  Make sure the nodes cover all memory.
+ */
+static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
+{
+	int i;
+	u64 numaram, biosram;
+
+	numaram = 0;
+	for (i = 0; i < mi->nr_blks; i++) {
+		u64 s = mi->blk[i].start >> PAGE_SHIFT;
+		u64 e = mi->blk[i].end >> PAGE_SHIFT;
+
+		numaram += e - s;
+		numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
+		if ((s64)numaram < 0)
+			numaram = 0;
+	}
+	max_pfn = max_low_pfn;
+	biosram = max_pfn - absent_pages_in_range(0, max_pfn);
+
+	BUG_ON((s64)(biosram - numaram) >= (1 << (20 - PAGE_SHIFT)));
+	return true;
+}
+
+static void __init add_node_intersection(u32 node, u64 start, u64 size)
+{
+	static unsigned long num_physpages;
+
+	num_physpages += (size >> PAGE_SHIFT);
+	pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx Bytes\n",
+		node, ADDRESS_TYPE_SYSRAM, start, size);
+	pr_info("       start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
+		start >> PAGE_SHIFT, (start + size) >> PAGE_SHIFT, num_physpages);
+	memblock_set_node(start, size, &memblock.memory, node);
+}
+
+/*
+ * add_numamem_region
+ *
+ * Add a usable memory region described by BIOS. The
+ * routine gets each intersection between BIOS's region
+ * and node's region, and adds them into node's memblock
+ * pool.
+ *
+ */
+static void __init add_numamem_region(u64 start, u64 end)
+{
+	u32 i;
+	u64 ofs = start;
+
+	if (start >= end) {
+		pr_debug("Invalid region: %016llx-%016llx\n", start, end);
+		return;
+	}
+
+	for (i = 0; i < numa_meminfo.nr_blks; i++) {
+		struct numa_memblk *mb = &numa_meminfo.blk[i];
+
+		if (ofs > mb->end)
+			continue;
+
+		if (end > mb->end) {
+			add_node_intersection(mb->nid, ofs, mb->end - ofs);
+			ofs = mb->end;
+		} else {
+			add_node_intersection(mb->nid, ofs, end - ofs);
+			break;
+		}
+	}
+}
+
+static void __init init_node_memblock(void)
+{
+	u32 i, mem_type;
+	u64 mem_end, mem_start, mem_size;
+
+	/* Parse memory information and activate */
+	for (i = 0; i < loongson_mem_map->map_count; i++) {
+		mem_type = loongson_mem_map->map[i].mem_type;
+		mem_start = loongson_mem_map->map[i].mem_start;
+		mem_size = loongson_mem_map->map[i].mem_size;
+		mem_end = loongson_mem_map->map[i].mem_start + mem_size;
+		switch (mem_type) {
+		case ADDRESS_TYPE_SYSRAM:
+			mem_start = PFN_ALIGN(mem_start);
+			mem_end = PFN_ALIGN(mem_end - PAGE_SIZE + 1);
+			add_numamem_region(mem_start, mem_end);
+			break;
+		case ADDRESS_TYPE_ACPI:
+			mem_start = PFN_ALIGN(mem_start);
+			mem_end = PFN_ALIGN(mem_end - PAGE_SIZE + 1);
+			memblock_add(mem_start, mem_size);
+			add_numamem_region(mem_start, mem_end);
+			fallthrough;
+		case ADDRESS_TYPE_RESERVED:
+			pr_info("Resvd: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx Bytes\n",
+					mem_type, mem_start, mem_size);
+			memblock_reserve(mem_start, mem_size);
+			break;
+		}
+	}
+}
+
+static void __init numa_default_distance(void)
+{
+	int row, col;
+
+	for (row = 0; row < MAX_NUMNODES; row++)
+		for (col = 0; col < MAX_NUMNODES; col++) {
+			if (col == row)
+				node_distances[row][col] = LOCAL_DISTANCE;
+			else
+				/* We assume one node per package here!
+				 *
+				 * A SLIT should be used for multiple nodes
+				 * per package to override default setting.
+				 */
+				node_distances[row][col] = REMOTE_DISTANCE;
+	}
+}
+
+static int __init numa_mem_init(int (*init_func)(void))
+{
+	int i;
+	int ret;
+	int node;
+
+	for (i = 0; i < NR_CPUS; i++)
+		set_cpuid_to_node(i, NUMA_NO_NODE);
+
+	numa_default_distance();
+	nodes_clear(numa_nodes_parsed);
+	nodes_clear(node_possible_map);
+	nodes_clear(node_online_map);
+	memset(&numa_meminfo, 0, sizeof(numa_meminfo));
+
+	/* Parse SRAT and SLIT if provided by firmware. */
+	ret = init_func();
+	if (ret < 0)
+		return ret;
+
+	node_possible_map = numa_nodes_parsed;
+	if (WARN_ON(nodes_empty(node_possible_map)))
+		return -EINVAL;
+
+	init_node_memblock();
+	if (numa_meminfo_cover_memory(&numa_meminfo) == false)
+		return -EINVAL;
+
+	for_each_node_mask(node, node_possible_map) {
+		node_mem_init(node);
+		node_set_online(node);
+	}
+	max_low_pfn = PHYS_PFN(memblock_end_of_DRAM());
+
+	return 0;
+}
+#endif
+void __init paging_init(void)
+{
+	unsigned int node;
+	unsigned long zones_size[MAX_NR_ZONES] = {0, };
+
+	for_each_online_node(node) {
+		unsigned long start_pfn, end_pfn;
+
+		get_pfn_range_for_nid(node, &start_pfn, &end_pfn);
+
+		if (end_pfn > max_low_pfn)
+			max_low_pfn = end_pfn;
+	}
+#ifdef CONFIG_ZONE_DMA32
+	zones_size[ZONE_DMA32] = MAX_DMA32_PFN;
+#endif
+	zones_size[ZONE_NORMAL] = max_low_pfn;
+	free_area_init(zones_size);
+}
+
+void __init mem_init(void)
+{
+	high_memory = (void *) __va(get_num_physpages() << PAGE_SHIFT);
+	memblock_free_all();
+	setup_zero_pages();	/* This comes from node 0 */
+}
+
+int pcibus_to_node(struct pci_bus *bus)
+{
+	return dev_to_node(&bus->dev);
+}
+EXPORT_SYMBOL(pcibus_to_node);
+
+void __init fw_init_numa_memory(void)
+{
+	numa_mem_init(acpi_numa_init);
+	setup_nr_node_ids();
+	loongson_sysconf.nr_nodes = nr_node_ids;
+	loongson_sysconf.cores_per_node = cpumask_weight(&phys_cpus_on_node[0]);
+}
+EXPORT_SYMBOL(fw_init_numa_memory);
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index 7bf9c255d036..ea4299134232 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -281,7 +281,11 @@ void __init platform_init(void)
 	acpi_boot_init();
 #endif
 
+#ifndef CONFIG_NUMA
 	fw_init_memory();
+#else
+	fw_init_numa_memory();
+#endif
 	dmi_setup();
 	smbios_parse();
 	pr_info("The BIOS Version: %s\n", b_info.bios_version);
@@ -320,7 +324,7 @@ static void __init arch_mem_init(char **cmdline_p)
 	sparse_init();
 	memblock_set_bottom_up(true);
 
-	swiotlb_init(1);
+	plat_swiotlb_setup();
 
 	dma_contiguous_reserve(PFN_PHYS(max_low_pfn));
 
diff --git a/arch/loongarch/kernel/smp.c b/arch/loongarch/kernel/smp.c
index 27704f30754b..6079ce8c6277 100644
--- a/arch/loongarch/kernel/smp.c
+++ b/arch/loongarch/kernel/smp.c
@@ -25,6 +25,7 @@
 #include <asm/idle.h>
 #include <asm/loongson.h>
 #include <asm/mmu_context.h>
+#include <asm/numa.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/time.h>
@@ -225,6 +226,9 @@ void loongson3_init_secondary(void)
 
 	iocsr_writel(0xffffffff, LOONGARCH_IOCSR_IPI_EN);
 
+#ifdef CONFIG_NUMA
+	numa_add_cpu(cpu);
+#endif
 	per_cpu(cpu_state, cpu) = CPU_ONLINE;
 	cpu_set_core(&cpu_data[cpu],
 		     cpu_logical_map(cpu) % loongson_sysconf.cores_per_package);
@@ -268,6 +272,9 @@ int loongson3_cpu_disable(void)
 	if (io_master(cpu))
 		return -EBUSY;
 
+#ifdef CONFIG_NUMA
+	numa_remove_cpu(cpu);
+#endif
 	set_cpu_online(cpu, false);
 	calculate_cpu_foreign_map();
 	local_irq_save(flags);
@@ -492,14 +499,36 @@ void calculate_cpu_foreign_map(void)
 /* Preload SMP state for boot cpu */
 void smp_prepare_boot_cpu(void)
 {
-	unsigned int cpu;
+	unsigned int cpu, node, rr_node;
 
 	set_cpu_possible(0, true);
 	set_cpu_online(0, true);
 	set_my_cpu_offset(per_cpu_offset(0));
 
-	for_each_possible_cpu(cpu)
-		set_cpu_numa_node(cpu, 0);
+	rr_node = first_node(node_online_map);
+	for_each_possible_cpu(cpu) {
+		node = early_cpu_to_node(cpu);
+
+		/*
+		 * The mapping between present cpus and nodes has been
+		 * built during MADT and SRAT parsing.
+		 *
+		 * If possible cpus = present cpus here, early_cpu_to_node
+		 * will return a valid node.
+		 *
+		 * If possible cpus > present cpus here (e.g. some possible
+		 * cpus will be added by cpu-hotplug later), for possible but
+		 * not present cpus, early_cpu_to_node will return NUMA_NO_NODE,
+		 * and we just map them to online nodes in a round-robin way.
+		 * Once hotplugged, the correct mapping will be built for them.
+		 */
+		if (node != NUMA_NO_NODE)
+			set_cpu_numa_node(cpu, node);
+		else {
+			set_cpu_numa_node(cpu, rr_node);
+			rr_node = next_node_in(rr_node, node_online_map);
+		}
+	}
 }
 
 /* called from main before smp_init() */
@@ -662,17 +691,10 @@ void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned l
 		on_each_cpu_mask(mm_cpumask(mm), flush_tlb_range_ipi, &fd, 1);
 	} else {
 		unsigned int cpu;
-		int exec = vma->vm_flags & VM_EXEC;
 
 		for_each_online_cpu(cpu) {
-			/*
-			 * flush_cache_range() will only fully flush icache if
-			 * the VMA is executable, otherwise we must invalidate
-			 * ASID without it appearing to has_valid_asid() as if
-			 * mm has been completely unused by that CPU.
-			 */
 			if (cpu != smp_processor_id() && cpu_context(cpu, mm))
-				cpu_context(cpu, mm) = !exec;
+				cpu_context(cpu, mm) = 0;
 		}
 		local_flush_tlb_range(vma, start, end);
 	}
@@ -717,14 +739,8 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long page)
 		unsigned int cpu;
 
 		for_each_online_cpu(cpu) {
-			/*
-			 * flush_cache_page() only does partial flushes, so
-			 * invalidate ASID without it appearing to
-			 * has_valid_asid() as if mm has been completely unused
-			 * by that CPU.
-			 */
 			if (cpu != smp_processor_id() && cpu_context(cpu, vma->vm_mm))
-				cpu_context(cpu, vma->vm_mm) = 1;
+				cpu_context(cpu, vma->vm_mm) = 0;
 		}
 		local_flush_tlb_page(vma, page);
 	}
diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
index 182680e08571..d813bc577d96 100644
--- a/arch/loongarch/kernel/traps.c
+++ b/arch/loongarch/kernel/traps.c
@@ -658,7 +658,7 @@ asmlinkage void noinstr do_vint(struct pt_regs *regs, unsigned long sp)
 	irqentry_exit(regs, state);
 }
 
-extern void tlb_init(void);
+extern void tlb_init(int cpu);
 extern void cache_error_setup(void);
 
 unsigned long eentry;
@@ -697,7 +697,7 @@ void per_cpu_trap_init(int cpu)
 		for (i = 0; i < 64; i++)
 			set_handler(i * VECSIZE, handle_reserved, VECSIZE);
 
-	tlb_init();
+	tlb_init(cpu);
 	cpu_cache_init();
 }
 
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index afd6634ce171..7094a68c9b83 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -84,6 +84,7 @@ int __ref page_is_ram(unsigned long pfn)
 	return memblock_is_memory(addr) && !memblock_is_reserved(addr);
 }
 
+#ifndef CONFIG_NUMA
 void __init paging_init(void)
 {
 	unsigned long max_zone_pfns[MAX_NR_ZONES];
@@ -107,6 +108,7 @@ void __init mem_init(void)
 	memblock_free_all();
 	setup_zero_pages();	/* Setup zeroed pages.  */
 }
+#endif /* !CONFIG_NUMA */
 
 void __ref free_initmem(void)
 {
@@ -129,6 +131,17 @@ int arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *params)
 	return ret;
 }
 
+#ifdef CONFIG_NUMA
+int memory_add_physaddr_to_nid(u64 start)
+{
+	int nid;
+
+	nid = pa_to_nid(start);
+	return nid;
+}
+EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
+#endif
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 {
diff --git a/arch/loongarch/mm/tlb.c b/arch/loongarch/mm/tlb.c
index 5201a87937fe..01e64a4a100d 100644
--- a/arch/loongarch/mm/tlb.c
+++ b/arch/loongarch/mm/tlb.c
@@ -250,15 +250,16 @@ static void output_pgtable_bits_defines(void)
 	pr_debug("\n");
 }
 
-void setup_tlb_handler(void)
-{
-	static int run_once = 0;
+static unsigned long pcpu_handlers[NR_CPUS];
+extern long exception_handlers[VECSIZE * 128 / sizeof(long)];
 
+void setup_tlb_handler(int cpu)
+{
 	setup_ptwalker();
 	output_pgtable_bits_defines();
 
 	/* The tlb handlers are generated only once */
-	if (!run_once) {
+	if (cpu == 0) {
 		memcpy((void *)tlbrentry, handle_tlb_refill, 0x80);
 		local_flush_icache_range(tlbrentry, tlbrentry + 0x80);
 		set_handler(EXCCODE_TLBI * VECSIZE, handle_tlb_load, VECSIZE);
@@ -268,15 +269,35 @@ void setup_tlb_handler(void)
 		set_handler(EXCCODE_TLBNR * VECSIZE, handle_tlb_protect, VECSIZE);
 		set_handler(EXCCODE_TLBNX * VECSIZE, handle_tlb_protect, VECSIZE);
 		set_handler(EXCCODE_TLBPE * VECSIZE, handle_tlb_protect, VECSIZE);
-		run_once++;
 	}
+#ifdef CONFIG_NUMA
+	else {
+		void *addr;
+		struct page *page;
+		const int vec_sz = sizeof(exception_handlers);
+
+		if (pcpu_handlers[cpu])
+			return;
+
+		page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, get_order(vec_sz));
+		if (!page)
+			return;
+
+		addr = page_address(page);
+		pcpu_handlers[cpu] = virt_to_phys(addr);
+		memcpy((void *)addr, (void *)eentry, vec_sz);
+		local_flush_icache_range((unsigned long)addr, (unsigned long)addr + vec_sz);
+		csr_writeq(pcpu_handlers[cpu], LOONGARCH_CSR_EENTRY);
+		csr_writeq(pcpu_handlers[cpu] + 80*VECSIZE, LOONGARCH_CSR_TLBRENTRY);
+	}
+#endif
 }
 
-void tlb_init(void)
+void tlb_init(int cpu)
 {
 	write_csr_pagesize(PS_DEFAULT_SIZE);
 	write_csr_stlbpgsize(PS_DEFAULT_SIZE);
 	write_csr_tlbrefill_pagesize(PS_DEFAULT_SIZE);
-	setup_tlb_handler();
+	setup_tlb_handler(cpu);
 	local_flush_tlb_all();
 }
diff --git a/arch/loongarch/pci/acpi.c b/arch/loongarch/pci/acpi.c
index 7cabb8f37218..bf921487333c 100644
--- a/arch/loongarch/pci/acpi.c
+++ b/arch/loongarch/pci/acpi.c
@@ -11,6 +11,7 @@
 #include <linux/pci-ecam.h>
 
 #include <asm/pci.h>
+#include <asm/numa.h>
 #include <asm/loongson.h>
 
 struct pci_root_info {
@@ -27,8 +28,10 @@ int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
 {
 	struct pci_config_window *cfg = bridge->bus->sysdata;
 	struct acpi_device *adev = to_acpi_device(cfg->parent);
+	struct device *bus_dev = &bridge->bus->dev;
 
 	ACPI_COMPANION_SET(&bridge->dev, adev);
+	set_dev_node(bus_dev, pa_to_nid(cfg->res.start));
 
 	return 0;
 }
-- 
2.27.0
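
Note on the smp_prepare_boot_cpu() hunk above: the round-robin fallback
for possible-but-not-present CPUs can be tried out in user space with
the minimal sketch below. It is only an illustration under stated
assumptions, not the kernel code: NO_NODE, assign_nodes() and the plain
arrays are hypothetical stand-ins for NUMA_NO_NODE, the loop over
possible CPUs, and node_online_map.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NODE (-1)	/* hypothetical stand-in for NUMA_NO_NODE */

/*
 * cpu_node[i] holds the node reported by firmware for CPU i, or NO_NODE
 * if the CPU is possible but not present. CPUs without a node are spread
 * over online_nodes[] in round-robin order; known mappings are kept.
 */
static void assign_nodes(int *cpu_node, size_t nr_cpus,
			 const int *online_nodes, size_t nr_online)
{
	size_t rr = 0;	/* index of the next online node to hand out */
	size_t cpu;

	assert(nr_online > 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu_node[cpu] != NO_NODE)
			continue;	/* mapping already known, keep it */
		cpu_node[cpu] = online_nodes[rr];
		rr = (rr + 1) % nr_online;	/* wrap like next_node_in() */
	}
}
```

With two online nodes and three CPUs lacking a mapping, the sketch
hands out node 0, node 1, node 0, which is the spreading behaviour the
comment in the patch describes.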

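
Note on add_numamem_region() above: the way one firmware memory region
is split across node blocks can be sketched in user space as follows.
This is a rough illustration, assuming the node blocks are sorted by
address and do not overlap; struct node_blk, struct piece and
split_region() are hypothetical names, and the sketch records the pieces
in an array instead of calling memblock_set_node(). Unlike the patch, it
skips a block that ends exactly at the current offset, so that no
zero-sized piece is produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node_blk { uint64_t start, end; int nid; };	/* one node range */
struct piece { uint64_t start, size; int nid; };	/* one intersection */

/* Split [start, end) along node block boundaries; returns piece count. */
static size_t split_region(uint64_t start, uint64_t end,
			   const struct node_blk *blk, size_t nr_blks,
			   struct piece *out, size_t max_out)
{
	uint64_t ofs = start;	/* first byte not yet assigned to a node */
	size_t i, n = 0;

	assert(blk && out);

	if (start >= end)
		return 0;	/* invalid or empty region */

	for (i = 0; i < nr_blks && n < max_out; i++) {
		if (ofs >= blk[i].end)
			continue;	/* block lies entirely below ofs */

		if (end > blk[i].end) {
			/* region continues past this block: take the head */
			out[n++] = (struct piece){ ofs, blk[i].end - ofs, blk[i].nid };
			ofs = blk[i].end;
		} else {
			/* region ends inside this block: take the rest */
			out[n++] = (struct piece){ ofs, end - ofs, blk[i].nid };
			break;
		}
	}
	return n;
}
```

For a region that straddles the boundary between two node blocks, the
sketch yields one piece per node, each carrying the node id that the
patch would pass to add_node_intersection().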


* [PATCH V9 24/24] LoongArch: Add Loongson-3 default config file
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (22 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 23/24] LoongArch: Add Non-Uniform Memory Access (NUMA) support Huacai Chen
@ 2022-04-30  9:05 ` Huacai Chen
  2022-05-01  8:19 ` [PATCH V9 00/22] arch: Add basic LoongArch support Bagas Sanjaya
  24 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30  9:05 UTC (permalink / raw)
  To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Huacai Chen

This patch adds a default config file for the LoongArch-based Loongson-3
platform.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/Makefile                    |   2 +
 arch/loongarch/configs/loongson3_defconfig | 770 +++++++++++++++++++++
 2 files changed, 772 insertions(+)
 create mode 100644 arch/loongarch/configs/loongson3_defconfig

diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index 1ed5b8466565..952dc6c163b0 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -5,6 +5,8 @@
 
 boot	:= arch/loongarch/boot
 
+KBUILD_DEFCONFIG := loongson3_defconfig
+
 ifndef CONFIG_SYS_SUPPORTS_ZBOOT
 
 ifndef CONFIG_EFI_STUB
diff --git a/arch/loongarch/configs/loongson3_defconfig b/arch/loongarch/configs/loongson3_defconfig
new file mode 100644
index 000000000000..e7812e20082b
--- /dev/null
+++ b/arch/loongarch/configs/loongson3_defconfig
@@ -0,0 +1,770 @@
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_SYSVIPC=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_BPF_SYSCALL=y
+CONFIG_PREEMPT=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_LOG_BUF_SHIFT=18
+CONFIG_NUMA_BALANCING=y
+CONFIG_MEMCG=y
+CONFIG_BLK_CGROUP=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_CGROUP_PIDS=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CPUSETS=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_BPF=y
+CONFIG_NAMESPACES=y
+CONFIG_USER_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
+CONFIG_SCHED_AUTOGROUP=y
+CONFIG_SYSFS_DEPRECATED=y
+CONFIG_RELAY=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_EXPERT=y
+CONFIG_USERFAULTFD=y
+CONFIG_PERF_EVENTS=y
+# CONFIG_COMPAT_BRK is not set
+CONFIG_LOONGARCH=y
+CONFIG_64BIT=y
+CONFIG_MACH_LOONGSON64=y
+CONFIG_DMI=y
+CONFIG_EFI=y
+CONFIG_SMP=y
+CONFIG_NR_CPUS=64
+CONFIG_NUMA=y
+CONFIG_PAGE_SIZE_16KB=y
+CONFIG_HZ_250=y
+CONFIG_ACPI=y
+CONFIG_ACPI_SPCR_TABLE=y
+CONFIG_ACPI_HOTPLUG_CPU=y
+CONFIG_ACPI_TAD=y
+CONFIG_ACPI_DOCK=y
+CONFIG_ACPI_IPMI=m
+CONFIG_ACPI_PCI_SLOT=y
+CONFIG_ACPI_HOTPLUG_MEMORY=y
+CONFIG_SYSFB_SIMPLEFB=y
+CONFIG_EFI_CAPSULE_LOADER=m
+CONFIG_EFI_TEST=m
+CONFIG_MODULES=y
+CONFIG_MODULE_FORCE_LOAD=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULE_FORCE_UNLOAD=y
+CONFIG_MODVERSIONS=y
+CONFIG_BLK_DEV_THROTTLING=y
+CONFIG_PARTITION_ADVANCED=y
+CONFIG_IOSCHED_BFQ=y
+CONFIG_BFQ_GROUP_IOSCHED=y
+CONFIG_BINFMT_MISC=m
+CONFIG_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y
+CONFIG_MEMORY_HOTREMOVE=y
+CONFIG_KSM=y
+CONFIG_TRANSPARENT_HUGEPAGE=y
+CONFIG_ZSWAP=y
+CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD=y
+CONFIG_ZPOOL=y
+CONFIG_ZBUD=y
+CONFIG_Z3FOLD=y
+CONFIG_ZSMALLOC=m
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_UNIX=y
+CONFIG_XFRM_USER=y
+CONFIG_NET_KEY=y
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_ROUTE_MULTIPATH=y
+CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+CONFIG_IP_PNP_RARP=y
+CONFIG_NET_IPIP=m
+CONFIG_IP_MROUTE=y
+CONFIG_INET_ESP=m
+CONFIG_INET_UDP_DIAG=y
+CONFIG_TCP_CONG_ADVANCED=y
+CONFIG_TCP_CONG_BBR=m
+CONFIG_IPV6_ROUTER_PREF=y
+CONFIG_IPV6_ROUTE_INFO=y
+CONFIG_IPV6_MROUTE=y
+CONFIG_NETWORK_PHY_TIMESTAMPING=y
+CONFIG_NETFILTER=y
+CONFIG_BRIDGE_NETFILTER=m
+CONFIG_NETFILTER_NETLINK_LOG=m
+CONFIG_NF_CONNTRACK=m
+CONFIG_NF_LOG_NETDEV=m
+CONFIG_NF_CONNTRACK_AMANDA=m
+CONFIG_NF_CONNTRACK_FTP=m
+CONFIG_NF_CONNTRACK_NETBIOS_NS=m
+CONFIG_NF_CONNTRACK_TFTP=m
+CONFIG_NF_CT_NETLINK=m
+CONFIG_NF_TABLES=m
+CONFIG_NFT_COUNTER=m
+CONFIG_NFT_CONNLIMIT=m
+CONFIG_NFT_LOG=m
+CONFIG_NFT_LIMIT=m
+CONFIG_NFT_MASQ=m
+CONFIG_NFT_REDIR=m
+CONFIG_NFT_NAT=m
+CONFIG_NFT_TUNNEL=m
+CONFIG_NFT_OBJREF=m
+CONFIG_NFT_QUEUE=m
+CONFIG_NFT_QUOTA=m
+CONFIG_NFT_REJECT=m
+CONFIG_NFT_COMPAT=m
+CONFIG_NFT_HASH=m
+CONFIG_NFT_SOCKET=m
+CONFIG_NFT_OSF=m
+CONFIG_NFT_TPROXY=m
+CONFIG_NETFILTER_XT_SET=m
+CONFIG_NETFILTER_XT_TARGET_AUDIT=m
+CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
+CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
+CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
+CONFIG_NETFILTER_XT_TARGET_CT=m
+CONFIG_NETFILTER_XT_TARGET_DSCP=m
+CONFIG_NETFILTER_XT_TARGET_HMARK=m
+CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
+CONFIG_NETFILTER_XT_TARGET_LED=m
+CONFIG_NETFILTER_XT_TARGET_LOG=m
+CONFIG_NETFILTER_XT_TARGET_MARK=m
+CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
+CONFIG_NETFILTER_XT_TARGET_TRACE=m
+CONFIG_NETFILTER_XT_TARGET_SECMARK=m
+CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
+CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
+CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
+CONFIG_NETFILTER_XT_MATCH_BPF=m
+CONFIG_NETFILTER_XT_MATCH_CGROUP=m
+CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
+CONFIG_NETFILTER_XT_MATCH_COMMENT=m
+CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
+CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
+CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
+CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
+CONFIG_NETFILTER_XT_MATCH_CPU=m
+CONFIG_NETFILTER_XT_MATCH_DCCP=m
+CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
+CONFIG_NETFILTER_XT_MATCH_DSCP=m
+CONFIG_NETFILTER_XT_MATCH_ESP=m
+CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_HELPER=m
+CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
+CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
+CONFIG_NETFILTER_XT_MATCH_IPVS=m
+CONFIG_NETFILTER_XT_MATCH_LENGTH=m
+CONFIG_NETFILTER_XT_MATCH_LIMIT=m
+CONFIG_NETFILTER_XT_MATCH_MAC=m
+CONFIG_NETFILTER_XT_MATCH_MARK=m
+CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
+CONFIG_NETFILTER_XT_MATCH_NFACCT=m
+CONFIG_NETFILTER_XT_MATCH_OSF=m
+CONFIG_NETFILTER_XT_MATCH_OWNER=m
+CONFIG_NETFILTER_XT_MATCH_POLICY=m
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
+CONFIG_NETFILTER_XT_MATCH_QUOTA=m
+CONFIG_NETFILTER_XT_MATCH_RATEEST=m
+CONFIG_NETFILTER_XT_MATCH_REALM=m
+CONFIG_NETFILTER_XT_MATCH_SOCKET=m
+CONFIG_NETFILTER_XT_MATCH_STATE=m
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
+CONFIG_NETFILTER_XT_MATCH_STRING=m
+CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
+CONFIG_NETFILTER_XT_MATCH_TIME=m
+CONFIG_NETFILTER_XT_MATCH_U32=m
+CONFIG_IP_SET=m
+CONFIG_IP_VS=m
+CONFIG_IP_VS_IPV6=y
+CONFIG_IP_VS_PROTO_TCP=y
+CONFIG_IP_VS_PROTO_UDP=y
+CONFIG_IP_VS_RR=m
+CONFIG_IP_VS_NFCT=y
+CONFIG_NF_TABLES_IPV4=y
+CONFIG_NFT_DUP_IPV4=m
+CONFIG_NFT_FIB_IPV4=m
+CONFIG_NF_TABLES_ARP=y
+CONFIG_NF_LOG_ARP=m
+CONFIG_IP_NF_IPTABLES=m
+CONFIG_IP_NF_MATCH_AH=m
+CONFIG_IP_NF_MATCH_ECN=m
+CONFIG_IP_NF_MATCH_RPFILTER=m
+CONFIG_IP_NF_MATCH_TTL=m
+CONFIG_IP_NF_FILTER=m
+CONFIG_IP_NF_TARGET_REJECT=m
+CONFIG_IP_NF_TARGET_SYNPROXY=m
+CONFIG_IP_NF_NAT=m
+CONFIG_IP_NF_TARGET_MASQUERADE=m
+CONFIG_IP_NF_TARGET_NETMAP=m
+CONFIG_IP_NF_TARGET_REDIRECT=m
+CONFIG_IP_NF_MANGLE=m
+CONFIG_IP_NF_TARGET_CLUSTERIP=m
+CONFIG_IP_NF_TARGET_ECN=m
+CONFIG_IP_NF_TARGET_TTL=m
+CONFIG_IP_NF_RAW=m
+CONFIG_IP_NF_SECURITY=m
+CONFIG_IP_NF_ARPTABLES=m
+CONFIG_IP_NF_ARPFILTER=m
+CONFIG_IP_NF_ARP_MANGLE=m
+CONFIG_NF_TABLES_IPV6=y
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP6_NF_MATCH_AH=m
+CONFIG_IP6_NF_MATCH_EUI64=m
+CONFIG_IP6_NF_MATCH_FRAG=m
+CONFIG_IP6_NF_MATCH_OPTS=m
+CONFIG_IP6_NF_MATCH_IPV6HEADER=m
+CONFIG_IP6_NF_MATCH_MH=m
+CONFIG_IP6_NF_MATCH_RPFILTER=m
+CONFIG_IP6_NF_MATCH_RT=m
+CONFIG_IP6_NF_MATCH_SRH=m
+CONFIG_IP6_NF_FILTER=y
+CONFIG_IP6_NF_TARGET_REJECT=m
+CONFIG_IP6_NF_TARGET_SYNPROXY=m
+CONFIG_IP6_NF_MANGLE=m
+CONFIG_IP6_NF_RAW=m
+CONFIG_IP6_NF_SECURITY=m
+CONFIG_IP6_NF_NAT=m
+CONFIG_IP6_NF_TARGET_MASQUERADE=m
+CONFIG_IP6_NF_TARGET_NPT=m
+CONFIG_NF_TABLES_BRIDGE=m
+CONFIG_BRIDGE_NF_EBTABLES=m
+CONFIG_BRIDGE_EBT_BROUTE=m
+CONFIG_BRIDGE_EBT_T_FILTER=m
+CONFIG_BRIDGE_EBT_T_NAT=m
+CONFIG_BRIDGE_EBT_ARP=m
+CONFIG_BRIDGE_EBT_IP=m
+CONFIG_BRIDGE_EBT_IP6=m
+CONFIG_BPFILTER=y
+CONFIG_IP_SCTP=m
+CONFIG_RDS=y
+CONFIG_L2TP=m
+CONFIG_BRIDGE=m
+CONFIG_VLAN_8021Q=m
+CONFIG_VLAN_8021Q_GVRP=y
+CONFIG_VLAN_8021Q_MVRP=y
+CONFIG_NET_SCHED=y
+CONFIG_NET_SCH_HTB=m
+CONFIG_NET_SCH_PRIO=m
+CONFIG_NET_SCH_SFQ=m
+CONFIG_NET_SCH_TBF=m
+CONFIG_NET_SCH_NETEM=m
+CONFIG_NET_SCH_INGRESS=m
+CONFIG_NET_CLS_BASIC=m
+CONFIG_NET_CLS_FW=m
+CONFIG_NET_CLS_U32=m
+CONFIG_NET_CLS_CGROUP=m
+CONFIG_NET_CLS_BPF=m
+CONFIG_NET_CLS_ACT=y
+CONFIG_NET_ACT_POLICE=m
+CONFIG_NET_ACT_GACT=m
+CONFIG_NET_ACT_MIRRED=m
+CONFIG_NET_ACT_IPT=m
+CONFIG_NET_ACT_NAT=m
+CONFIG_NET_ACT_BPF=m
+CONFIG_OPENVSWITCH=m
+CONFIG_NETLINK_DIAG=y
+CONFIG_CGROUP_NET_PRIO=y
+CONFIG_BT=m
+CONFIG_BT_HCIBTUSB=m
+# CONFIG_BT_HCIBTUSB_BCM is not set
+CONFIG_CFG80211=m
+CONFIG_CFG80211_WEXT=y
+CONFIG_MAC80211=m
+CONFIG_RFKILL=m
+CONFIG_RFKILL_INPUT=y
+CONFIG_NET_9P=y
+CONFIG_CEPH_LIB=m
+CONFIG_PCIEPORTBUS=y
+CONFIG_HOTPLUG_PCI_PCIE=y
+CONFIG_PCIEAER=y
+# CONFIG_PCIEASPM is not set
+CONFIG_PCI_IOV=y
+CONFIG_HOTPLUG_PCI=y
+CONFIG_HOTPLUG_PCI_SHPC=y
+CONFIG_PCCARD=m
+CONFIG_YENTA=m
+CONFIG_RAPIDIO=y
+CONFIG_RAPIDIO_TSI721=y
+CONFIG_RAPIDIO_ENABLE_RX_TX_PORTS=y
+CONFIG_RAPIDIO_ENUM_BASIC=m
+CONFIG_RAPIDIO_CHMAN=m
+CONFIG_RAPIDIO_MPORT_CDEV=m
+CONFIG_UEVENT_HELPER=y
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_MTD=m
+CONFIG_MTD_BLOCK=m
+CONFIG_MTD_CFI=m
+CONFIG_MTD_JEDECPROBE=m
+CONFIG_MTD_CFI_INTELEXT=m
+CONFIG_MTD_CFI_AMDSTD=m
+CONFIG_MTD_CFI_STAA=m
+CONFIG_MTD_RAM=m
+CONFIG_MTD_ROM=m
+CONFIG_PARPORT=y
+CONFIG_PARPORT_PC=y
+CONFIG_PARPORT_SERIAL=y
+CONFIG_PARPORT_PC_FIFO=y
+CONFIG_ZRAM=m
+CONFIG_ZRAM_DEF_COMP_ZSTD=y
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_CRYPTOLOOP=y
+CONFIG_BLK_DEV_NBD=m
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BLK_DEV_RAM_SIZE=8192
+CONFIG_BLK_DEV_RBD=m
+CONFIG_BLK_DEV_NVME=y
+CONFIG_EEPROM_AT24=m
+CONFIG_BLK_DEV_SD=y
+CONFIG_BLK_DEV_SR=y
+CONFIG_CHR_DEV_SG=y
+CONFIG_CHR_DEV_SCH=m
+CONFIG_SCSI_CONSTANTS=y
+CONFIG_SCSI_LOGGING=y
+CONFIG_SCSI_SPI_ATTRS=m
+CONFIG_SCSI_FC_ATTRS=m
+CONFIG_SCSI_SAS_ATA=y
+CONFIG_ISCSI_TCP=m
+CONFIG_SCSI_MVSAS=y
+# CONFIG_SCSI_MVSAS_DEBUG is not set
+CONFIG_SCSI_MVSAS_TASKLET=y
+CONFIG_SCSI_MVUMI=y
+CONFIG_MEGARAID_NEWGEN=y
+CONFIG_MEGARAID_MM=y
+CONFIG_MEGARAID_MAILBOX=y
+CONFIG_MEGARAID_LEGACY=y
+CONFIG_MEGARAID_SAS=y
+CONFIG_SCSI_MPT2SAS=y
+CONFIG_LIBFC=m
+CONFIG_LIBFCOE=m
+CONFIG_FCOE=m
+CONFIG_SCSI_QLOGIC_1280=m
+CONFIG_SCSI_QLA_FC=m
+CONFIG_TCM_QLA2XXX=m
+CONFIG_SCSI_QLA_ISCSI=m
+CONFIG_SCSI_LPFC=m
+CONFIG_ATA=y
+CONFIG_SATA_AHCI=y
+CONFIG_SATA_AHCI_PLATFORM=y
+CONFIG_PATA_ATIIXP=y
+CONFIG_PATA_PCMCIA=m
+CONFIG_MD=y
+CONFIG_BLK_DEV_MD=m
+CONFIG_MD_LINEAR=m
+CONFIG_MD_RAID0=m
+CONFIG_MD_RAID1=m
+CONFIG_MD_RAID10=m
+CONFIG_MD_RAID456=m
+CONFIG_MD_MULTIPATH=m
+CONFIG_BCACHE=m
+CONFIG_BLK_DEV_DM=y
+CONFIG_DM_CRYPT=m
+CONFIG_DM_SNAPSHOT=m
+CONFIG_DM_THIN_PROVISIONING=m
+CONFIG_DM_CACHE=m
+CONFIG_DM_WRITECACHE=m
+CONFIG_DM_MIRROR=m
+CONFIG_DM_RAID=m
+CONFIG_DM_ZERO=m
+CONFIG_DM_MULTIPATH=m
+CONFIG_DM_MULTIPATH_QL=m
+CONFIG_DM_MULTIPATH_ST=m
+CONFIG_TARGET_CORE=m
+CONFIG_TCM_IBLOCK=m
+CONFIG_TCM_FILEIO=m
+CONFIG_TCM_PSCSI=m
+CONFIG_TCM_USER2=m
+CONFIG_LOOPBACK_TARGET=m
+CONFIG_ISCSI_TARGET=m
+CONFIG_NETDEVICES=y
+CONFIG_BONDING=m
+CONFIG_DUMMY=y
+CONFIG_WIREGUARD=m
+CONFIG_MACVLAN=m
+CONFIG_MACVTAP=m
+CONFIG_IPVLAN=m
+CONFIG_VXLAN=y
+CONFIG_RIONET=m
+CONFIG_TUN=m
+CONFIG_VETH=m
+# CONFIG_NET_VENDOR_3COM is not set
+# CONFIG_NET_VENDOR_ADAPTEC is not set
+# CONFIG_NET_VENDOR_AGERE is not set
+# CONFIG_NET_VENDOR_ALACRITECH is not set
+# CONFIG_NET_VENDOR_ALTEON is not set
+# CONFIG_NET_VENDOR_AMAZON is not set
+# CONFIG_NET_VENDOR_AMD is not set
+# CONFIG_NET_VENDOR_AQUANTIA is not set
+# CONFIG_NET_VENDOR_ARC is not set
+# CONFIG_NET_VENDOR_ATHEROS is not set
+CONFIG_BNX2=y
+# CONFIG_NET_VENDOR_BROCADE is not set
+# CONFIG_NET_VENDOR_CAVIUM is not set
+CONFIG_CHELSIO_T1=m
+CONFIG_CHELSIO_T1_1G=y
+CONFIG_CHELSIO_T3=m
+CONFIG_CHELSIO_T4=m
+# CONFIG_NET_VENDOR_CIRRUS is not set
+# CONFIG_NET_VENDOR_CISCO is not set
+# CONFIG_NET_VENDOR_DEC is not set
+# CONFIG_NET_VENDOR_DLINK is not set
+# CONFIG_NET_VENDOR_EMULEX is not set
+# CONFIG_NET_VENDOR_EZCHIP is not set
+# CONFIG_NET_VENDOR_I825XX is not set
+CONFIG_E1000=y
+CONFIG_E1000E=y
+CONFIG_IGB=y
+CONFIG_IXGB=y
+CONFIG_IXGBE=y
+# CONFIG_NET_VENDOR_MARVELL is not set
+# CONFIG_NET_VENDOR_MELLANOX is not set
+# CONFIG_NET_VENDOR_MICREL is not set
+# CONFIG_NET_VENDOR_MYRI is not set
+# CONFIG_NET_VENDOR_NATSEMI is not set
+# CONFIG_NET_VENDOR_NETRONOME is not set
+# CONFIG_NET_VENDOR_NVIDIA is not set
+# CONFIG_NET_VENDOR_OKI is not set
+# CONFIG_NET_VENDOR_QLOGIC is not set
+# CONFIG_NET_VENDOR_QUALCOMM is not set
+# CONFIG_NET_VENDOR_RDC is not set
+CONFIG_8139CP=m
+CONFIG_8139TOO=m
+CONFIG_R8169=y
+# CONFIG_NET_VENDOR_RENESAS is not set
+# CONFIG_NET_VENDOR_ROCKER is not set
+# CONFIG_NET_VENDOR_SAMSUNG is not set
+# CONFIG_NET_VENDOR_SEEQ is not set
+# CONFIG_NET_VENDOR_SOLARFLARE is not set
+# CONFIG_NET_VENDOR_SILAN is not set
+# CONFIG_NET_VENDOR_SIS is not set
+# CONFIG_NET_VENDOR_SMSC is not set
+CONFIG_STMMAC_ETH=y
+# CONFIG_NET_VENDOR_SUN is not set
+# CONFIG_NET_VENDOR_TEHUTI is not set
+# CONFIG_NET_VENDOR_TI is not set
+# CONFIG_NET_VENDOR_VIA is not set
+# CONFIG_NET_VENDOR_WIZNET is not set
+# CONFIG_NET_VENDOR_XILINX is not set
+CONFIG_PPP=m
+CONFIG_PPP_BSDCOMP=m
+CONFIG_PPP_DEFLATE=m
+CONFIG_PPP_FILTER=y
+CONFIG_PPP_MPPE=m
+CONFIG_PPP_MULTILINK=y
+CONFIG_PPPOE=m
+CONFIG_PPPOL2TP=m
+CONFIG_PPP_ASYNC=m
+CONFIG_PPP_SYNC_TTY=m
+CONFIG_USB_RTL8150=m
+CONFIG_USB_RTL8152=m
+# CONFIG_USB_NET_AX8817X is not set
+# CONFIG_USB_NET_AX88179_178A is not set
+CONFIG_USB_NET_CDC_EEM=m
+CONFIG_USB_NET_HUAWEI_CDC_NCM=m
+CONFIG_USB_NET_CDC_MBIM=m
+# CONFIG_USB_NET_NET1080 is not set
+# CONFIG_USB_BELKIN is not set
+# CONFIG_USB_ARMLINUX is not set
+# CONFIG_USB_NET_ZAURUS is not set
+CONFIG_ATH9K=m
+CONFIG_ATH9K_HTC=m
+CONFIG_IWLWIFI=m
+CONFIG_IWLDVM=m
+CONFIG_IWLMVM=m
+CONFIG_IWLWIFI_BCAST_FILTERING=y
+CONFIG_HOSTAP=m
+CONFIG_MT7601U=m
+CONFIG_RT2X00=m
+CONFIG_RT2800USB=m
+CONFIG_RTL8192CE=m
+CONFIG_RTL8192SE=m
+CONFIG_RTL8192DE=m
+CONFIG_RTL8723AE=m
+CONFIG_RTL8723BE=m
+CONFIG_RTL8188EE=m
+CONFIG_RTL8192EE=m
+CONFIG_RTL8821AE=m
+CONFIG_RTL8192CU=m
+# CONFIG_RTLWIFI_DEBUG is not set
+CONFIG_RTL8XXXU=m
+CONFIG_ZD1211RW=m
+CONFIG_USB_NET_RNDIS_WLAN=m
+CONFIG_INPUT_MOUSEDEV=y
+CONFIG_INPUT_MOUSEDEV_PSAUX=y
+CONFIG_INPUT_EVDEV=y
+CONFIG_KEYBOARD_XTKBD=m
+CONFIG_MOUSE_PS2_ELANTECH=y
+CONFIG_MOUSE_PS2_SENTELIC=y
+CONFIG_MOUSE_SERIAL=m
+CONFIG_INPUT_MISC=y
+CONFIG_INPUT_UINPUT=m
+CONFIG_SERIO_SERPORT=m
+CONFIG_SERIO_RAW=m
+CONFIG_LEGACY_PTY_COUNT=16
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_NR_UARTS=16
+CONFIG_SERIAL_8250_RUNTIME_UARTS=16
+CONFIG_SERIAL_8250_EXTENDED=y
+CONFIG_SERIAL_8250_MANY_PORTS=y
+CONFIG_SERIAL_8250_SHARE_IRQ=y
+CONFIG_SERIAL_8250_RSA=y
+CONFIG_SERIAL_NONSTANDARD=y
+CONFIG_PRINTER=m
+CONFIG_IPMI_HANDLER=m
+CONFIG_IPMI_DEVICE_INTERFACE=m
+CONFIG_IPMI_SI=m
+CONFIG_HW_RANDOM=y
+CONFIG_I2C_CHARDEV=y
+CONFIG_I2C_PIIX4=y
+CONFIG_I2C_GPIO=y
+CONFIG_SPI=y
+CONFIG_GPIO_SYSFS=y
+CONFIG_GPIO_LOONGSON=y
+CONFIG_SENSORS_LM75=m
+CONFIG_SENSORS_LM93=m
+CONFIG_SENSORS_W83795=m
+CONFIG_SENSORS_W83627HF=m
+CONFIG_RC_CORE=m
+CONFIG_LIRC=y
+CONFIG_RC_DECODERS=y
+CONFIG_IR_NEC_DECODER=m
+CONFIG_IR_RC5_DECODER=m
+CONFIG_IR_RC6_DECODER=m
+CONFIG_IR_JVC_DECODER=m
+CONFIG_IR_SONY_DECODER=m
+CONFIG_IR_SANYO_DECODER=m
+CONFIG_IR_SHARP_DECODER=m
+CONFIG_IR_MCE_KBD_DECODER=m
+CONFIG_IR_XMP_DECODER=m
+CONFIG_IR_IMON_DECODER=m
+CONFIG_MEDIA_SUPPORT=m
+CONFIG_MEDIA_USB_SUPPORT=y
+CONFIG_USB_VIDEO_CLASS=m
+CONFIG_MEDIA_PCI_SUPPORT=y
+CONFIG_VIDEO_BT848=m
+CONFIG_DVB_BT8XX=m
+CONFIG_DRM=y
+CONFIG_DRM_RADEON=m
+CONFIG_DRM_RADEON_USERPTR=y
+CONFIG_DRM_AMDGPU=m
+CONFIG_DRM_AMDGPU_SI=y
+CONFIG_DRM_AMDGPU_CIK=y
+CONFIG_DRM_AMDGPU_USERPTR=y
+CONFIG_DRM_AST=y
+CONFIG_FB=y
+CONFIG_FB_EFI=y
+CONFIG_FB_RADEON=y
+CONFIG_LCD_PLATFORM=m
+# CONFIG_VGA_CONSOLE is not set
+CONFIG_FRAMEBUFFER_CONSOLE=y
+CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
+CONFIG_LOGO=y
+CONFIG_SOUND=y
+CONFIG_SND=y
+CONFIG_SND_SEQUENCER=m
+CONFIG_SND_SEQ_DUMMY=m
+# CONFIG_SND_ISA is not set
+CONFIG_SND_BT87X=m
+CONFIG_SND_BT87X_OVERCLOCK=y
+CONFIG_SND_HDA_INTEL=y
+CONFIG_SND_HDA_HWDEP=y
+CONFIG_SND_HDA_INPUT_BEEP=y
+CONFIG_SND_HDA_PATCH_LOADER=y
+CONFIG_SND_HDA_CODEC_REALTEK=y
+CONFIG_SND_HDA_CODEC_SIGMATEL=y
+CONFIG_SND_HDA_CODEC_HDMI=y
+CONFIG_SND_HDA_CODEC_CONEXANT=y
+CONFIG_SND_USB_AUDIO=m
+CONFIG_HIDRAW=y
+CONFIG_UHID=m
+CONFIG_HID_A4TECH=m
+CONFIG_HID_CHERRY=m
+CONFIG_HID_LOGITECH=m
+CONFIG_HID_LOGITECH_DJ=m
+CONFIG_LOGITECH_FF=y
+CONFIG_LOGIRUMBLEPAD2_FF=y
+CONFIG_LOGIG940_FF=y
+CONFIG_HID_MICROSOFT=m
+CONFIG_HID_MULTITOUCH=m
+CONFIG_HID_SUNPLUS=m
+CONFIG_USB_HIDDEV=y
+CONFIG_USB=y
+CONFIG_USB_OTG=y
+CONFIG_USB_MON=y
+CONFIG_USB_XHCI_HCD=y
+CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_EHCI_ROOT_HUB_TT=y
+CONFIG_USB_EHCI_HCD_PLATFORM=y
+CONFIG_USB_OHCI_HCD=y
+CONFIG_USB_OHCI_HCD_PLATFORM=y
+CONFIG_USB_UHCI_HCD=m
+CONFIG_USB_ACM=m
+CONFIG_USB_PRINTER=m
+CONFIG_USB_STORAGE=m
+CONFIG_USB_STORAGE_REALTEK=m
+CONFIG_USB_UAS=m
+CONFIG_USB_DWC2=y
+CONFIG_USB_DWC2_HOST=y
+CONFIG_USB_SERIAL=m
+CONFIG_USB_SERIAL_CH341=m
+CONFIG_USB_SERIAL_CP210X=m
+CONFIG_USB_SERIAL_FTDI_SIO=m
+CONFIG_USB_SERIAL_PL2303=m
+CONFIG_USB_SERIAL_OPTION=m
+CONFIG_USB_GADGET=y
+CONFIG_INFINIBAND=m
+CONFIG_RTC_CLASS=y
+CONFIG_RTC_DRV_EFI=y
+CONFIG_DMADEVICES=y
+CONFIG_UIO=m
+CONFIG_UIO_PDRV_GENIRQ=m
+CONFIG_UIO_DMEM_GENIRQ=m
+CONFIG_UIO_PCI_GENERIC=m
+# CONFIG_VIRTIO_MENU is not set
+CONFIG_COMEDI=m
+CONFIG_COMEDI_PCI_DRIVERS=m
+CONFIG_COMEDI_8255_PCI=m
+CONFIG_COMEDI_ADL_PCI6208=m
+CONFIG_COMEDI_ADL_PCI7X3X=m
+CONFIG_COMEDI_ADL_PCI8164=m
+CONFIG_COMEDI_ADL_PCI9111=m
+CONFIG_COMEDI_ADL_PCI9118=m
+CONFIG_COMEDI_ADV_PCI1710=m
+CONFIG_COMEDI_ADV_PCI1720=m
+CONFIG_COMEDI_ADV_PCI1723=m
+CONFIG_COMEDI_ADV_PCI1724=m
+CONFIG_COMEDI_ADV_PCI1760=m
+CONFIG_COMEDI_ADV_PCI_DIO=m
+CONFIG_COMEDI_NI_LABPC_PCI=m
+CONFIG_COMEDI_NI_PCIDIO=m
+CONFIG_COMEDI_NI_PCIMIO=m
+CONFIG_STAGING=y
+CONFIG_R8188EU=m
+# CONFIG_88EU_AP_MODE is not set
+CONFIG_PM_DEVFREQ=y
+CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=y
+CONFIG_DEVFREQ_GOV_PERFORMANCE=y
+CONFIG_DEVFREQ_GOV_POWERSAVE=y
+CONFIG_DEVFREQ_GOV_USERSPACE=y
+CONFIG_PWM=y
+CONFIG_EXT2_FS=y
+CONFIG_EXT2_FS_XATTR=y
+CONFIG_EXT2_FS_POSIX_ACL=y
+CONFIG_EXT2_FS_SECURITY=y
+CONFIG_EXT3_FS=y
+CONFIG_EXT3_FS_POSIX_ACL=y
+CONFIG_EXT3_FS_SECURITY=y
+CONFIG_XFS_FS=y
+CONFIG_XFS_QUOTA=y
+CONFIG_XFS_POSIX_ACL=y
+CONFIG_BTRFS_FS=y
+CONFIG_FANOTIFY=y
+CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
+CONFIG_QUOTA=y
+# CONFIG_PRINT_QUOTA_WARNING is not set
+CONFIG_QFMT_V1=m
+CONFIG_QFMT_V2=m
+CONFIG_AUTOFS4_FS=y
+CONFIG_FUSE_FS=m
+CONFIG_OVERLAY_FS=y
+CONFIG_OVERLAY_FS_INDEX=y
+CONFIG_OVERLAY_FS_XINO_AUTO=y
+CONFIG_OVERLAY_FS_METACOPY=y
+CONFIG_FSCACHE=y
+CONFIG_ISO9660_FS=y
+CONFIG_JOLIET=y
+CONFIG_ZISOFS=y
+CONFIG_UDF_FS=y
+CONFIG_MSDOS_FS=m
+CONFIG_VFAT_FS=m
+CONFIG_FAT_DEFAULT_CODEPAGE=936
+CONFIG_FAT_DEFAULT_IOCHARSET="gb2312"
+CONFIG_PROC_KCORE=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_HUGETLBFS=y
+CONFIG_CONFIGFS_FS=y
+CONFIG_HFS_FS=m
+CONFIG_HFSPLUS_FS=m
+CONFIG_CRAMFS=m
+CONFIG_SQUASHFS=y
+CONFIG_SQUASHFS_XATTR=y
+CONFIG_SQUASHFS_LZ4=y
+CONFIG_SQUASHFS_LZO=y
+CONFIG_SQUASHFS_XZ=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_NFS_V4_1=y
+CONFIG_NFS_V4_2=y
+CONFIG_ROOT_NFS=y
+CONFIG_NFSD=y
+CONFIG_NFSD_V3_ACL=y
+CONFIG_NFSD_V4=y
+CONFIG_NFSD_BLOCKLAYOUT=y
+CONFIG_CIFS=m
+# CONFIG_CIFS_DEBUG is not set
+CONFIG_9P_FS=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_CODEPAGE_936=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_UTF8=y
+CONFIG_KEY_DH_OPERATIONS=y
+CONFIG_SECURITY=y
+CONFIG_SECURITY_SELINUX=y
+CONFIG_SECURITY_SELINUX_BOOTPARAM=y
+CONFIG_SECURITY_SELINUX_DISABLE=y
+CONFIG_SECURITY_APPARMOR=y
+CONFIG_SECURITY_YAMA=y
+CONFIG_DEFAULT_SECURITY_DAC=y
+CONFIG_CRYPTO_USER=m
+# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
+CONFIG_CRYPTO_PCRYPT=m
+CONFIG_CRYPTO_CRYPTD=m
+CONFIG_CRYPTO_CHACHA20POLY1305=m
+CONFIG_CRYPTO_HMAC=y
+CONFIG_CRYPTO_VMAC=m
+CONFIG_CRYPTO_TGR192=m
+CONFIG_CRYPTO_WP512=m
+CONFIG_CRYPTO_ANUBIS=m
+CONFIG_CRYPTO_BLOWFISH=m
+CONFIG_CRYPTO_CAST5=m
+CONFIG_CRYPTO_CAST6=m
+CONFIG_CRYPTO_KHAZAD=m
+CONFIG_CRYPTO_SALSA20=m
+CONFIG_CRYPTO_SEED=m
+CONFIG_CRYPTO_SERPENT=m
+CONFIG_CRYPTO_TEA=m
+CONFIG_CRYPTO_TWOFISH=m
+CONFIG_CRYPTO_DEFLATE=m
+CONFIG_CRYPTO_LZO=m
+CONFIG_CRYPTO_842=m
+CONFIG_CRYPTO_LZ4=m
+CONFIG_CRYPTO_LZ4HC=m
+CONFIG_CRYPTO_USER_API_HASH=m
+CONFIG_CRYPTO_USER_API_SKCIPHER=m
+CONFIG_CRYPTO_USER_API_RNG=m
+CONFIG_CRYPTO_USER_API_AEAD=m
+CONFIG_PRINTK_TIME=y
+CONFIG_STRIP_ASM_SYMS=y
+CONFIG_MAGIC_SYSRQ=y
+# CONFIG_SCHED_DEBUG is not set
+CONFIG_SCHEDSTATS=y
+# CONFIG_DEBUG_PREEMPT is not set
+# CONFIG_FTRACE is not set
-- 
2.27.0


* Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-04-30  9:05 ` [PATCH V9 13/24] LoongArch: Add system call support Huacai Chen
@ 2022-04-30  9:44   ` Arnd Bergmann
  2022-04-30 10:05     ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30  9:44 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang

On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
>
> This patch adds system call support and related uaccess.h for LoongArch.
>
> Q: Why keep the __ARCH_WANT_NEW_STAT definition while there is statx?
> A: Until the latest glibc release (2.34), statx is only used for 32-bit
>    platforms, or 64-bit platforms with 32-bit timestamps. I.e., most
>    64-bit platforms still use newstat now.
>
> Q: Why keep the __ARCH_WANT_SYS_CLONE definition while there is clone3?
> A: The latest glibc release (2.34) has some basic support for clone3 but
>    it isn't complete. E.g., pthread_create() and spawni() have been
>    converted to use clone3 but fork() still uses clone. Moreover, some
>    seccomp-related applications still do not work properly with clone3.
>    E.g., the Chromium sandbox cannot work at all and there is no solution
>    for it, which is worse than the fork() story [1].
>
> [1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184

I still think these have to be removed. There is no mainline glibc or musl
port yet, and neither of them should actually be required. Please remove
them here, and modify your libc patches accordingly when you send those
upstream.

       Arnd

* Re: [PATCH V9 16/24] LoongArch: Add misc common routines
  2022-04-30  9:05 ` [PATCH V9 16/24] LoongArch: Add misc common routines Huacai Chen
@ 2022-04-30  9:50   ` Arnd Bergmann
  2022-04-30 10:00     ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30  9:50 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang

On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:

> +unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int size)
> +{
> +       u32 old32, mask, temp;
> +       volatile u32 *ptr32;
> +       unsigned int shift;
> +
> +       /* Check that ptr is naturally aligned */

As discussed, please remove this function and all the references to it.

      Arnd

* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-04-30  9:05 ` [PATCH V9 20/24] LoongArch: Add efistub booting support Huacai Chen
@ 2022-04-30  9:56   ` Arnd Bergmann
  2022-04-30 10:02     ` Huacai Chen
  2022-05-03  7:23     ` Ard Biesheuvel
  0 siblings, 2 replies; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30  9:56 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang, Ard Biesheuvel

On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
>
> This patch adds efistub booting support, which is the standard UEFI boot
> protocol for us to use.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

It's good to see that you completed this. Unfortunately you did not add Ard
Biesheuvel to Cc; he is the one who needs to review this code. Adding him
to Cc now, with the full patch quoted below for him (no more comments
from me there).

      Arnd

> ---
>  arch/loongarch/Kbuild                         |   3 +
>  arch/loongarch/Kconfig                        |   8 +
>  arch/loongarch/Makefile                       |  18 +-
>  arch/loongarch/boot/Makefile                  |  23 +
>  arch/loongarch/kernel/efi-header.S            | 100 +++++
>  arch/loongarch/kernel/head.S                  |  44 +-
>  arch/loongarch/kernel/image-vars.h            |  30 ++
>  arch/loongarch/kernel/vmlinux.lds.S           |  23 +-
>  drivers/firmware/efi/Kconfig                  |   4 +-
>  drivers/firmware/efi/libstub/Makefile         |  14 +-
>  drivers/firmware/efi/libstub/loongarch-stub.c | 425 ++++++++++++++++++
>  include/linux/pe.h                            |   1 +
>  12 files changed, 680 insertions(+), 13 deletions(-)
>  create mode 100644 arch/loongarch/boot/Makefile
>  create mode 100644 arch/loongarch/kernel/efi-header.S
>  create mode 100644 arch/loongarch/kernel/image-vars.h
>  create mode 100644 drivers/firmware/efi/libstub/loongarch-stub.c
>
> diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> index 1ad35aabdd16..ab5373d0a24f 100644
> --- a/arch/loongarch/Kbuild
> +++ b/arch/loongarch/Kbuild
> @@ -1,3 +1,6 @@
>  obj-y += kernel/
>  obj-y += mm/
>  obj-y += vdso/
> +
> +# for cleaning
> +subdir- += boot
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 44b763046893..55225ee5f868 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -265,6 +265,14 @@ config EFI
>           resultant kernel should continue to boot on existing non-EFI
>           platforms.
>
> +config EFI_STUB
> +       bool "EFI boot stub support"
> +       default y
> +       depends on EFI
> +       help
> +         This kernel feature allows the kernel to be loaded directly by
> +         EFI firmware without the use of a bootloader.
> +
>  config FORCE_MAX_ZONEORDER
>         int "Maximum zone order"
>         range 14 64 if PAGE_SIZE_64KB
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index c4b3f53cd276..d88a792dafbe 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -3,6 +3,14 @@
>  # Author: Huacai Chen <chenhuacai@loongson.cn>
>  # Copyright (C) 2020-2022 Loongson Technology Corporation Limited
>
> +boot   := arch/loongarch/boot
> +
> +ifndef CONFIG_EFI_STUB
> +KBUILD_IMAGE   = $(boot)/vmlinux
> +else
> +KBUILD_IMAGE   = $(boot)/vmlinux.efi
> +endif
> +
>  #
>  # Select the object file format to substitute into the linker script.
>  #
> @@ -30,8 +38,6 @@ ld-emul                       = $(64bit-emul)
>  cflags-y               += -mabi=lp64s
>  endif
>
> -all-y                  := vmlinux
> -
>  #
>  # GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
>  # code since it only slows down the whole thing.  At some point we might make
> @@ -75,6 +81,7 @@ endif
>  head-y := arch/loongarch/kernel/head.o
>
>  libs-y += arch/loongarch/lib/
> +libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
>
>  ifeq ($(KBUILD_EXTMOD),)
>  prepare: vdso_prepare
> @@ -86,12 +93,13 @@ PHONY += vdso_install
>  vdso_install:
>         $(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
>
> -all:   $(all-y)
> +all:   $(KBUILD_IMAGE)
>
> -CLEAN_FILES += vmlinux
> +$(KBUILD_IMAGE): vmlinux
> +       $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
>
>  install:
> -       $(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
>         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
>         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
>
> diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> new file mode 100644
> index 000000000000..66f2293c34b2
> --- /dev/null
> +++ b/arch/loongarch/boot/Makefile
> @@ -0,0 +1,23 @@
> +#
> +# arch/loongarch/boot/Makefile
> +#
> +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> +#
> +
> +drop-sections := .comment .note .options .note.gnu.build-id
> +strip-flags   := $(addprefix --remove-section=,$(drop-sections)) -S
> +
> +targets := vmlinux
> +quiet_cmd_strip = STRIP          $@
> +      cmd_strip = $(STRIP) -s $@
> +
> +$(obj)/vmlinux: vmlinux FORCE
> +       $(call if_changed,copy)
> +       $(call if_changed,strip)
> +
> +targets += vmlinux.efi
> +quiet_cmd_eficopy = OBJCOPY $@
> +      cmd_eficopy = $(OBJCOPY) -O binary $(strip-flags) $< $@
> +
> +$(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> +       $(call if_changed,eficopy)
> diff --git a/arch/loongarch/kernel/efi-header.S b/arch/loongarch/kernel/efi-header.S
> new file mode 100644
> index 000000000000..ceb44524944a
> --- /dev/null
> +++ b/arch/loongarch/kernel/efi-header.S
> @@ -0,0 +1,100 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/pe.h>
> +#include <linux/sizes.h>
> +
> +       .macro  __EFI_PE_HEADER
> +       .long   PE_MAGIC
> +coff_header:
> +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> +       .short  section_count                           /* NumberOfSections */
> +       .long   0                                       /* TimeDateStamp */
> +       .long   0                                       /* PointerToSymbolTable */
> +       .long   0                                       /* NumberOfSymbols */
> +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> +
> +optional_header:
> +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> +       .byte   0x02                                    /* MajorLinkerVersion */
> +       .byte   0x14                                    /* MinorLinkerVersion */
> +       .long   __inittext_end - efi_header_end         /* SizeOfCode */
> +       .long   _end - __initdata_begin                 /* SizeOfInitializedData */
> +       .long   0                                       /* SizeOfUninitializedData */
> +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> +       .long   efi_header_end - _head                  /* BaseOfCode */
> +
> +extra_header_fields:
> +       .quad   0                                       /* ImageBase */
> +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> +       .short  0                                       /* MajorOperatingSystemVersion */
> +       .short  0                                       /* MinorOperatingSystemVersion */
> +       .short  0                                       /* MajorImageVersion */
> +       .short  0                                       /* MinorImageVersion */
> +       .short  0                                       /* MajorSubsystemVersion */
> +       .short  0                                       /* MinorSubsystemVersion */
> +       .long   0                                       /* Win32VersionValue */
> +
> +       .long   _end - _head                            /* SizeOfImage */
> +
> +       /* Everything before the kernel image is considered part of the header */
> +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> +       .long   0                                       /* CheckSum */
> +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> +       .short  0                                       /* DllCharacteristics */
> +       .quad   0                                       /* SizeOfStackReserve */
> +       .quad   0                                       /* SizeOfStackCommit */
> +       .quad   0                                       /* SizeOfHeapReserve */
> +       .quad   0                                       /* SizeOfHeapCommit */
> +       .long   0                                       /* LoaderFlags */
> +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> +
> +       .quad   0                                       /* ExportTable */
> +       .quad   0                                       /* ImportTable */
> +       .quad   0                                       /* ResourceTable */
> +       .quad   0                                       /* ExceptionTable */
> +       .quad   0                                       /* CertificationTable */
> +       .quad   0                                       /* BaseRelocationTable */
> +
> +       /* Section table */
> +section_table:
> +       .ascii  ".text\0\0\0"
> +       .long   __inittext_end - efi_header_end         /* VirtualSize */
> +       .long   efi_header_end - _head                  /* VirtualAddress */
> +       .long   __inittext_end - efi_header_end         /* SizeOfRawData */
> +       .long   efi_header_end - _head                  /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_CODE | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> +
> +       .ascii  ".data\0\0\0"
> +       .long   _end - __initdata_begin                 /* VirtualSize */
> +       .long   __initdata_begin - _head                /* VirtualAddress */
> +       .long   _edata - __initdata_begin               /* SizeOfRawData */
> +       .long   __initdata_begin - _head                /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> +
> +       .org 0x20e
> +       .word kernel_version - 512 -  _head
> +
> +       .set    section_count, (. - section_table) / 40
> +efi_header_end:
> +       .endm
> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> index b4a0b28da3e7..361b72e8bfc5 100644
> --- a/arch/loongarch/kernel/head.S
> +++ b/arch/loongarch/kernel/head.S
> @@ -11,11 +11,53 @@
>  #include <asm/regdef.h>
>  #include <asm/loongarch.h>
>  #include <asm/stackframe.h>
> +#include <generated/compile.h>
> +#include <generated/utsrelease.h>
>
> -SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> +#ifdef CONFIG_EFI_STUB
> +
> +#include "efi-header.S"
> +
> +       __HEAD
> +
> +_head:
> +       /* "MZ", MS-DOS header */
> +       .word   MZ_MAGIC
> +       .org    0x28
> +       .ascii  "Loongson\0"
> +       .org    0x3c
> +       /* Offset to the PE header */
> +       .long   pe_header - _head
> +
> +pe_header:
> +       __EFI_PE_HEADER
> +
> +kernel_asize:
> +       .long _end - _text
> +
> +kernel_fsize:
> +       .long _edata - _text
> +
> +kernel_vaddr:
> +       .quad VMLINUX_LOAD_ADDRESS
> +
> +kernel_offset:
> +       .long kernel_offset - _text
> +
> +kernel_version:
> +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> +
> +SYM_L_GLOBAL(kernel_asize)
> +SYM_L_GLOBAL(kernel_fsize)
> +SYM_L_GLOBAL(kernel_vaddr)
> +SYM_L_GLOBAL(kernel_offset)
> +
> +#endif
>
>         __REF
>
> +SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> +
>  SYM_CODE_START(kernel_entry)                   # kernel entry point
>
>         /* Config direct window and set PG */
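
The layout that head.S and efi-header.S emit above (an "MZ" magic at
offset 0, a 32-bit offset to the PE header stored at 0x3c, then the
"PE\0\0" signature followed by the COFF Machine field) could be checked
from user space with something like the sketch below. The names
pe_machine(), rd16() and rd32() are made up for illustration, and the
field offsets are taken from the PE/COFF format rather than from this
patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MZ_MAGIC_LE      0x5a4dU  /* "MZ" read as a little-endian u16 */
#define PE_OFFSET_FIELD  0x3cU    /* file offset of the PE header pointer */

/* Read little-endian integers byte by byte so host endianness is irrelevant. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: return the COFF Machine field of a PE image,
 * or -1 if the buffer does not look like one.
 */
static long pe_machine(const uint8_t *img, size_t len)
{
	uint32_t pe_off;

	if (len < PE_OFFSET_FIELD + 4 || rd16(img) != MZ_MAGIC_LE)
		return -1;

	pe_off = rd32(img + PE_OFFSET_FIELD);
	/* Need the 4-byte signature plus the 2-byte Machine field. */
	if (pe_off > len || len - pe_off < 6)
		return -1;
	if (memcmp(img + pe_off, "PE\0\0", 4) != 0)
		return -1;

	return rd16(img + pe_off + 4);
}
```

With the ".org 0x3c" followed directly by pe_header as in the hunk
above, the stored offset would be 0x40, so the Machine field sits at
file offset 0x44.
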
> diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
> new file mode 100644
> index 000000000000..0162402b6212
> --- /dev/null
> +++ b/arch/loongarch/kernel/image-vars.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __LOONGARCH_KERNEL_IMAGE_VARS_H
> +#define __LOONGARCH_KERNEL_IMAGE_VARS_H
> +
> +#ifdef CONFIG_EFI_STUB
> +
> +__efistub_memcmp               = memcmp;
> +__efistub_memcpy               = memcpy;
> +__efistub_memmove              = memmove;
> +__efistub_memset               = memset;
> +__efistub_strcat               = strcat;
> +__efistub_strcmp               = strcmp;
> +__efistub_strlen               = strlen;
> +__efistub_strncat              = strncat;
> +__efistub_strnstr              = strnstr;
> +__efistub_strnlen              = strnlen;
> +__efistub_strpbrk              = strpbrk;
> +__efistub_strsep               = strsep;
> +__efistub_kernel_entry         = kernel_entry;
> +__efistub_kernel_asize         = kernel_asize;
> +__efistub_kernel_fsize         = kernel_fsize;
> +__efistub_kernel_vaddr         = kernel_vaddr;
> +__efistub_kernel_offset                = kernel_offset;
> +
> +#endif
> +
> +#endif /* __LOONGARCH_KERNEL_IMAGE_VARS_H */
> diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> index 02abfaaa4892..7da4c4d7c50d 100644
> --- a/arch/loongarch/kernel/vmlinux.lds.S
> +++ b/arch/loongarch/kernel/vmlinux.lds.S
> @@ -12,6 +12,14 @@
>  #define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
>
>  #include <asm-generic/vmlinux.lds.h>
> +#include "image-vars.h"
> +
> +/*
> + * Max available Page Size is 64K, so we set SectionAlignment
> + * field of EFI application to 64K.
> + */
> +PECOFF_FILE_ALIGN = 0x200;
> +PECOFF_SEGMENT_ALIGN = 0x10000;
>
>  OUTPUT_ARCH(loongarch)
>  ENTRY(kernel_entry)
> @@ -27,6 +35,9 @@ SECTIONS
>         . = VMLINUX_LOAD_ADDRESS;
>
>         _text = .;
> +       HEAD_TEXT_SECTION
> +
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
>         .text : {
>                 TEXT_TEXT
>                 SCHED_TEXT
> @@ -38,11 +49,12 @@ SECTIONS
>                 *(.fixup)
>                 *(.gnu.warning)
>         } :text = 0
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
>         _etext = .;
>
>         EXCEPTION_TABLE(16)
>
> -       . = ALIGN(PAGE_SIZE);
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
>         __init_begin = .;
>         __inittext_begin = .;
>
> @@ -51,6 +63,7 @@ SECTIONS
>                 EXIT_TEXT
>         }
>
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
>         __inittext_end = .;
>
>         __initdata_begin = .;
> @@ -60,6 +73,10 @@ SECTIONS
>                 EXIT_DATA
>         }
>
> +       .init.bss : {
> +               *(.init.bss)
> +       }
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
>         __initdata_end = .;
>
>         __init_end = .;
> @@ -71,11 +88,11 @@ SECTIONS
>         .sdata : {
>                 *(.sdata)
>         }
> -
> -       . = ALIGN(SZ_64K);
> +       .edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGN); }
>         _edata =  .;
>
>         BSS_SECTION(0, SZ_64K, 8)
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
>
>         _end = .;
>
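
For reference, each ". = ALIGN(PECOFF_SEGMENT_ALIGN);" above rounds the
location counter up to the next 64K boundary, and the .edata_padding
statement rounds up to the 0x200 file alignment. The same arithmetic as
a small C sketch (align_up() is an illustrative name; the two constants
are copied from the linker script in this patch):

```c
#include <stdint.h>

/* Values copied from the linker script in the patch. */
#define PECOFF_FILE_ALIGN     0x200ULL
#define PECOFF_SEGMENT_ALIGN  0x10000ULL

/* Round addr up to the next multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged, so back-to-back
ALIGN statements with the same value add no extra padding.
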
> diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
> index 2c3dac5ecb36..ecb4e0b1295a 100644
> --- a/drivers/firmware/efi/Kconfig
> +++ b/drivers/firmware/efi/Kconfig
> @@ -121,9 +121,9 @@ config EFI_ARMSTUB_DTB_LOADER
>
>  config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
>         bool "Enable the command line initrd loader" if !X86
> -       depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
> -       default y if X86
>         depends on !RISCV
> +       depends on EFI_STUB && (EFI_GENERIC_STUB || X86 || LOONGARCH)
> +       default y if (X86 || LOONGARCH)
>         help
>           Select this config option to add support for the initrd= command
>           line parameter, allowing an initrd that resides on the same volume
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index d0537573501e..663e9d317299 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -26,6 +26,8 @@ cflags-$(CONFIG_ARM)          := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>                                    $(call cc-option,-mno-single-pic-base)
>  cflags-$(CONFIG_RISCV)         := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>                                    -fpic
> +cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> +                                  -fpic
>
>  cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
>
> @@ -55,7 +57,7 @@ KCOV_INSTRUMENT                       := n
>  lib-y                          := efi-stub-helper.o gop.o secureboot.o tpm.o \
>                                    file.o mem.o random.o randomalloc.o pci.o \
>                                    skip_spaces.o lib-cmdline.o lib-ctype.o \
> -                                  alignedmem.o relocate.o vsprintf.o
> +                                  alignedmem.o relocate.o string.o vsprintf.o
>
>  # include the stub's generic dependencies from lib/ when building for ARM/arm64
>  efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> @@ -63,13 +65,15 @@ efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
>  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
>         $(call if_changed_rule,cc_o_c)
>
> -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o \
> +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o \
>                                    $(patsubst %.c,lib-%.o,$(efi-deps-y))
>
>  lib-$(CONFIG_ARM)              += arm32-stub.o
>  lib-$(CONFIG_ARM64)            += arm64-stub.o
>  lib-$(CONFIG_X86)              += x86-stub.o
>  lib-$(CONFIG_RISCV)            += riscv-stub.o
> +lib-$(CONFIG_LOONGARCH)                += loongarch-stub.o
> +
>  CFLAGS_arm32-stub.o            := -DTEXT_OFFSET=$(TEXT_OFFSET)
>
>  # Even when -mbranch-protection=none is set, Clang will generate a
> @@ -125,6 +129,12 @@ STUBCOPY_FLAGS-$(CONFIG_RISCV)     += --prefix-alloc-sections=.init \
>                                    --prefix-symbols=__efistub_
>  STUBCOPY_RELOC-$(CONFIG_RISCV) := R_RISCV_HI20
>
> +# For LoongArch, keep all the symbols in .init section and make sure that no
> +# absolute symbol references exist.
> +STUBCOPY_FLAGS-$(CONFIG_LOONGARCH)     += --prefix-alloc-sections=.init \
> +                                          --prefix-symbols=__efistub_
> +STUBCOPY_RELOC-$(CONFIG_LOONGARCH)     := R_LARCH_MARK_LA
> +
>  $(obj)/%.stub.o: $(obj)/%.o FORCE
>         $(call if_changed,stubcopy)
>
> diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> new file mode 100644
> index 000000000000..399641a0b0cb
> --- /dev/null
> +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> @@ -0,0 +1,425 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Author: Yun Liu <liuyun@loongson.cn>
> + *         Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/efi.h>
> +#include <linux/sort.h>
> +#include <asm/efi.h>
> +#include <asm/addrspace.h>
> +#include <asm/boot_param.h>
> +#include "efistub.h"
> +
> +#define MAX_ARG_COUNT          128
> +#define CMDLINE_MAX_SIZE       0x200
> +
> +static int argc;
> +static char **argv;
> +const efi_system_table_t *efi_system_table;
> +static efi_guid_t screen_info_guid = LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID;
> +static unsigned int map_entry[LOONGSON3_BOOT_MEM_MAP_MAX];
> +static struct efi_mmap mmap_array[EFI_MAX_MEMORY_TYPE][LOONGSON3_BOOT_MEM_MAP_MAX];
> +
> +struct exit_boot_struct {
> +       struct boot_params *bp;
> +       unsigned int *runtime_entry_count;
> +};
> +
> +typedef void (*kernel_entry_t)(int argc, char *argv[], struct boot_params *boot_p);
> +
> +extern int kernel_asize;
> +extern int kernel_fsize;
> +extern int kernel_offset;
> +extern unsigned long kernel_vaddr;
> +extern kernel_entry_t kernel_entry;
> +
> +unsigned char efi_crc8(char *buff, int size)
> +{
> +       int sum, cnt;
> +
> +       for (sum = 0, cnt = 0; cnt < size; cnt++)
> +               sum = (char) (sum + *(buff + cnt));
> +
> +       return (char)(0x100 - sum);
> +}
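
Despite the name, efi_crc8() as written above is an additive checksum
rather than a polynomial CRC: it returns the byte that makes the sum of
all bytes come out to zero modulo 256. A standalone sketch of that
property, using the illustrative names checksum8() and checksum8_ok()
and unsigned bytes throughout:

```c
#include <stddef.h>
#include <stdint.h>

/* Two's-complement checksum: the byte that makes the total sum 0 mod 256. */
static uint8_t checksum8(const uint8_t *buf, size_t len)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = (uint8_t)(sum + buf[i]);

	return (uint8_t)(0x100 - sum);
}

/* A table whose checksum byte is filled in correctly sums to zero. */
static int checksum8_ok(const uint8_t *buf, size_t len)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = (uint8_t)(sum + buf[i]);

	return sum == 0;
}
```

The checksum field has to be zero while the value is computed, which is
why mk_mmap() clears header.checksum before calling efi_crc8().
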
> +
> +struct screen_info *alloc_screen_info(void)
> +{
> +       efi_status_t status;
> +       struct screen_info *si;
> +
> +       status = efi_bs_call(allocate_pool,
> +                       EFI_RUNTIME_SERVICES_DATA, sizeof(*si), (void **)&si);
> +       if (status != EFI_SUCCESS)
> +               return NULL;
> +
> +       status = efi_bs_call(install_configuration_table, &screen_info_guid, si);
> +       if (status == EFI_SUCCESS)
> +               return si;
> +
> +       efi_bs_call(free_pool, si);
> +
> +       return NULL;
> +}
> +
> +static void setup_graphics(void)
> +{
> +       unsigned long size;
> +       efi_status_t status;
> +       efi_guid_t gop_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
> +       void **gop_handle = NULL;
> +       struct screen_info *si = NULL;
> +
> +       size = 0;
> +       status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL,
> +                               &gop_proto, NULL, &size, gop_handle);
> +       if (status == EFI_BUFFER_TOO_SMALL) {
> +               si = alloc_screen_info();
> +               efi_setup_gop(si, &gop_proto, size);
> +       }
> +}
> +
> +struct boot_params *bootparams_init(efi_system_table_t *sys_table)
> +{
> +       efi_status_t status;
> +       struct boot_params *p;
> +       unsigned char sig[8] = {'B', 'P', 'I', '0', '1', '0', '0', '2'};
> +
> +       status = efi_bs_call(allocate_pool, EFI_RUNTIME_SERVICES_DATA, SZ_64K, (void **)&p);
> +       if (status != EFI_SUCCESS)
> +               return NULL;
> +
> +       memset(p, 0, SZ_64K);
> +       memcpy(&p->signature, sig, sizeof(long));
> +
> +       return p;
> +}
> +
> +static unsigned long convert_priv_cmdline(char *cmdline_ptr,
> +               unsigned long rd_addr, unsigned long rd_size)
> +{
> +       unsigned int rdprev_size;
> +       unsigned int cmdline_size;
> +       efi_status_t status;
> +       char *pstr, *substr;
> +       char *initrd_ptr = NULL;
> +       char convert_str[CMDLINE_MAX_SIZE];
> +       static char cmdline_array[CMDLINE_MAX_SIZE];
> +
> +       cmdline_size = strlen(cmdline_ptr);
> +       snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel ");
> +
> +       initrd_ptr = strstr(cmdline_ptr, "initrd=");
> +       if (!initrd_ptr) {
> +               snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel %s", cmdline_ptr);
> +               goto completed;
> +       }
> +       snprintf(convert_str, CMDLINE_MAX_SIZE, " initrd=0x%lx,0x%lx", rd_addr, rd_size);
> +       rdprev_size = cmdline_size - strlen(initrd_ptr);
> +       strncat(cmdline_array, cmdline_ptr, rdprev_size);
> +
> +       cmdline_ptr = strnstr(initrd_ptr, " ", CMDLINE_MAX_SIZE);
> +       strcat(cmdline_array, convert_str);
> +       if (!cmdline_ptr)
> +               goto completed;
> +
> +       strcat(cmdline_array, cmdline_ptr);
> +
> +completed:
> +       status = efi_allocate_pages((MAX_ARG_COUNT + 1) * (sizeof(char *)),
> +                                       (unsigned long *)&argv, ULONG_MAX);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("Alloc argv mmap_array error\n");
> +               return status;
> +       }
> +
> +       argc = 0;
> +       pstr = cmdline_array;
> +
> +       substr = strsep(&pstr, " \t");
> +       while (substr != NULL) {
> +               if (strlen(substr)) {
> +                       argv[argc++] = substr;
> +                       if (argc == MAX_ARG_COUNT) {
> +                               efi_err("Argv mmap_array full!\n");
> +                               break;
> +                       }
> +               }
> +               substr = strsep(&pstr, " \t");
> +       }
> +
> +       return EFI_SUCCESS;
> +}
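
The tail of convert_priv_cmdline() splits the rebuilt command line on
spaces and tabs and stores the non-empty pieces in argv. A minimal
standalone sketch of that tokenization follows; it uses the ISO C
strspn()/strcspn() pair in place of strsep() so it builds without
feature-test macros, and split_args() is an illustrative name, not
something from the patch.

```c
#include <string.h>

/*
 * Split `line` in place on spaces and tabs, storing up to `max` non-empty
 * tokens in `argv`. Returns the number of tokens stored.
 */
static int split_args(char *line, char **argv, int max)
{
	int argc = 0;

	while (argc < max) {
		line += strspn(line, " \t");    /* skip leading separators */
		if (*line == '\0')
			break;

		argv[argc++] = line;
		line += strcspn(line, " \t");   /* advance to the end of the token */
		if (*line == '\0')
			break;

		*line++ = '\0';                 /* terminate the token */
	}

	return argc;
}
```

Unlike the loop in the patch, this variant checks the limit before
storing a token, so it silently stops at `max` entries.
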
> +
> +unsigned int efi_memmap_sort(struct loongsonlist_mem_map *memmap,
> +                       unsigned int index, unsigned int mem_type)
> +{
> +       unsigned int i, t;
> +       unsigned long msize;
> +
> +       for (i = 0; i < map_entry[mem_type]; i = t) {
> +               msize = mmap_array[mem_type][i].mem_size;
> +               for (t = i + 1; t < map_entry[mem_type]; t++) {
> +                       if (mmap_array[mem_type][i].mem_start + msize <
> +                                       mmap_array[mem_type][t].mem_start)
> +                               break;
> +
> +                       msize += mmap_array[mem_type][t].mem_size;
> +               }
> +               memmap->map[index].mem_type = mem_type;
> +               memmap->map[index].mem_start = mmap_array[mem_type][i].mem_start;
> +               memmap->map[index].mem_size = msize;
> +               memmap->map[index].attribute = mmap_array[mem_type][i].attribute;
> +               index++;
> +       }
> +
> +       return index;
> +}
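
efi_memmap_sort() does not reorder anything; it walks entries that are
assumed to be in ascending address order and folds neighbours that
touch or overlap into one range. The sketch below shows the same
coalescing idea in isolation. struct region and merge_regions() are
hypothetical names, and the merge is done on end addresses instead of
by summing sizes as the patch does, so overlapping inputs are not
counted twice.

```c
#include <stddef.h>
#include <stdint.h>

struct region {
	uint64_t start;
	uint64_t size;
};

/*
 * Coalesce regions that touch or overlap. `in` must already be sorted by
 * start address. Writes the result to `out` (capacity >= n) and returns
 * the number of merged regions.
 */
static size_t merge_regions(const struct region *in, size_t n,
			    struct region *out)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t end = in[i].start + in[i].size;

		if (count > 0) {
			struct region *last = &out[count - 1];
			uint64_t last_end = last->start + last->size;

			if (in[i].start <= last_end) {
				/* Extend the previous region if this one reaches further. */
				if (end > last_end)
					last->size = end - last->start;
				continue;
			}
		}
		out[count++] = in[i];
	}

	return count;
}
```
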
> +
> +static efi_status_t mk_mmap(struct efi_boot_memmap *map, struct boot_params *p)
> +{
> +       char checksum;
> +       unsigned int i;
> +       unsigned int nr_desc;
> +       unsigned int mem_type;
> +       unsigned long count;
> +       efi_memory_desc_t *mem_desc;
> +       struct loongsonlist_mem_map *mhp = NULL;
> +
> +       memset(map_entry, 0, sizeof(map_entry));
> +       memset(mmap_array, 0, sizeof(mmap_array));
> +
> +       if (!strncmp((char *)p, "BPI", 3)) {
> +               p->flags |= BPI_FLAGS_UEFI_SUPPORTED;
> +               p->systemtable = (efi_system_table_t *)efi_system_table;
> +               p->extlist_offset = sizeof(*p) + sizeof(unsigned long);
> +               mhp = (struct loongsonlist_mem_map *)((char *)p + p->extlist_offset);
> +
> +               memcpy(&mhp->header.signature, "MEM", sizeof(unsigned long));
> +               mhp->header.length = sizeof(*mhp);
> +               mhp->desc_version = *map->desc_ver;
> +               mhp->map_count = 0;
> +       }
> +       if (!(*(map->map_size)) || !(*(map->desc_size)) || !mhp) {
> +               efi_err("get memory info error\n");
> +               return EFI_INVALID_PARAMETER;
> +       }
> +       nr_desc = *(map->map_size) / *(map->desc_size);
> +
> +       /*
> +        * According to the UEFI spec, the firmware memory map is accurate;
> +        * use it to fill the platform-specific mmap_array structure.
> +        */
> +       for (i = 0; i < nr_desc; i++) {
> +               mem_desc = (efi_memory_desc_t *)((void *)(*map->map) + (i * (*(map->desc_size))));
> +               switch (mem_desc->type) {
> +               case EFI_RESERVED_TYPE:
> +               case EFI_RUNTIME_SERVICES_CODE:
> +               case EFI_RUNTIME_SERVICES_DATA:
> +               case EFI_MEMORY_MAPPED_IO:
> +               case EFI_MEMORY_MAPPED_IO_PORT_SPACE:
> +               case EFI_UNUSABLE_MEMORY:
> +               case EFI_PAL_CODE:
> +                       mem_type = ADDRESS_TYPE_RESERVED;
> +                       break;
> +
> +               case EFI_ACPI_MEMORY_NVS:
> +                       mem_type = ADDRESS_TYPE_NVS;
> +                       break;
> +
> +               case EFI_ACPI_RECLAIM_MEMORY:
> +                       mem_type = ADDRESS_TYPE_ACPI;
> +                       break;
> +
> +               case EFI_LOADER_CODE:
> +               case EFI_LOADER_DATA:
> +               case EFI_PERSISTENT_MEMORY:
> +               case EFI_BOOT_SERVICES_CODE:
> +               case EFI_BOOT_SERVICES_DATA:
> +               case EFI_CONVENTIONAL_MEMORY:
> +                       mem_type = ADDRESS_TYPE_SYSRAM;
> +                       break;
> +
> +               default:
> +                       continue;
> +               }
> +
> +               mmap_array[mem_type][map_entry[mem_type]].mem_type = mem_type;
> +               mmap_array[mem_type][map_entry[mem_type]].mem_start =
> +                                               mem_desc->phys_addr & TO_PHYS_MASK;
> +               mmap_array[mem_type][map_entry[mem_type]].mem_size =
> +                                               mem_desc->num_pages << EFI_PAGE_SHIFT;
> +               mmap_array[mem_type][map_entry[mem_type]].attribute =
> +                                               mem_desc->attribute;
> +               map_entry[mem_type]++;
> +       }
> +
> +       count = mhp->map_count;
> +       /* Sort EFI memmap and add to BPI for kernel */
> +       for (i = 0; i < LOONGSON3_BOOT_MEM_MAP_MAX; i++) {
> +               if (!map_entry[i])
> +                       continue;
> +               count = efi_memmap_sort(mhp, count, i);
> +       }
> +
> +       mhp->map_count = count;
> +       mhp->header.checksum = 0;
> +
> +       checksum = efi_crc8((char *)mhp, mhp->header.length);
> +       mhp->header.checksum = checksum;
> +
> +       return EFI_SUCCESS;
> +}
> +
> +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> +{
> +       efi_status_t status;
> +       struct exit_boot_struct *p = priv;
> +
> +       status = mk_mmap(map, p->bp);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("Make kernel memory map failed!\n");
> +               return status;
> +       }
> +
> +       return EFI_SUCCESS;
> +}
> +
> +static efi_status_t exit_boot_services(struct boot_params *boot_params, void *handle)
> +{
> +       unsigned int desc_version;
> +       unsigned int runtime_entry_count = 0;
> +       unsigned long map_size, key, desc_size, buff_size;
> +       efi_status_t status;
> +       efi_memory_desc_t *mem_map;
> +       struct efi_boot_memmap map;
> +       struct exit_boot_struct priv;
> +
> +       map.map                 = &mem_map;
> +       map.map_size            = &map_size;
> +       map.desc_size           = &desc_size;
> +       map.desc_ver            = &desc_version;
> +       map.key_ptr             = &key;
> +       map.buff_size           = &buff_size;
> +       status = efi_get_memory_map(&map);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("Unable to retrieve UEFI memory map.\n");
> +               return status;
> +       }
> +
> +       priv.bp = boot_params;
> +       priv.runtime_entry_count = &runtime_entry_count;
> +
> +       /* Might as well exit boot services now */
> +       status = efi_exit_boot_services(handle, &map, &priv, exit_boot_func);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       return EFI_SUCCESS;
> +}
> +
> +/*
> + * EFI entry point for the LoongArch EFI stub.
> + */
> +efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table)
> +{
> +       unsigned int cmdline_size = 0;
> +       unsigned long kernel_addr = 0;
> +       unsigned long initrd_addr = 0;
> +       unsigned long initrd_size = 0;
> +       enum efi_secureboot_mode secure_boot;
> +       char *cmdline_ptr = NULL;
> +       struct boot_params *boot_p;
> +       efi_status_t status;
> +       efi_loaded_image_t *image;
> +       efi_guid_t loaded_image_proto;
> +       kernel_entry_t real_kernel_entry;
> +
> +       /* Config Direct Mapping */
> +       csr_writeq(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
> +       csr_writeq(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
> +
> +       efi_system_table = sys_table;
> +       loaded_image_proto = LOADED_IMAGE_PROTOCOL_GUID;
> +       kernel_addr = (unsigned long)&kernel_offset - kernel_offset;
> +       real_kernel_entry = (kernel_entry_t)
> +               ((unsigned long)&kernel_entry - kernel_addr + kernel_vaddr);
> +
> +       /* Check if we were booted by the EFI firmware */
> +       if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
> +               goto fail;
> +
> +       /*
> +        * Get a handle to the loaded image protocol.  This is used to get
> +        * information about the running image, such as size and the command
> +        * line.
> +        */
> +       status = sys_table->boottime->handle_protocol(handle,
> +                                       &loaded_image_proto, (void *)&image);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("Failed to get loaded image protocol\n");
> +               goto fail;
> +       }
> +
> +       /* Get the command line from EFI, using the LOADED_IMAGE protocol. */
> +       cmdline_ptr = efi_convert_cmdline(image, &cmdline_size);
> +       if (!cmdline_ptr) {
> +               efi_err("Getting command line failed!\n");
> +               goto fail_free_cmdline;
> +       }
> +
> +#ifdef CONFIG_CMDLINE_BOOL
> +       if (cmdline_size == 0)
> +               efi_parse_options(CONFIG_CMDLINE);
> +#endif
> +       if (!IS_ENABLED(CONFIG_CMDLINE_OVERRIDE) && cmdline_size > 0)
> +               efi_parse_options(cmdline_ptr);
> +
> +       efi_info("Booting Linux Kernel...\n");
> +
> +       efi_relocate_kernel(&kernel_addr, kernel_fsize, kernel_asize,
> +                           PHYSADDR(kernel_vaddr), SZ_2M, PHYSADDR(kernel_vaddr));
> +
> +       setup_graphics();
> +       secure_boot = efi_get_secureboot();
> +       efi_enable_reset_attack_mitigation();
> +
> +       status = efi_load_initrd(image, &initrd_addr, &initrd_size, SZ_4G, ULONG_MAX);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("Failed to get initrd address!\n");
> +               goto fail_free;
> +       }
> +
> +       status = convert_priv_cmdline(cmdline_ptr, initrd_addr, initrd_size);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("Convert cmdline failed!\n");
> +               goto fail_free;
> +       }
> +
> +       boot_p = bootparams_init(sys_table);
> +       if (!boot_p) {
> +               efi_err("Create BPI struct error!\n");
> +               goto fail;
> +       }
> +
> +       status = exit_boot_services(boot_p, handle);
> +       if (status != EFI_SUCCESS) {
> +               efi_err("exit_boot_services failed!\n");
> +               goto fail_free;
> +       }
> +
> +       real_kernel_entry(argc, argv, boot_p);
> +
> +       return EFI_SUCCESS;
> +
> +fail_free:
> +       efi_free(initrd_size, initrd_addr);
> +
> +fail_free_cmdline:
> +       efi_free(cmdline_size, (unsigned long)cmdline_ptr);
> +
> +fail:
> +       return status;
> +}
> diff --git a/include/linux/pe.h b/include/linux/pe.h
> index daf09ffffe38..f4bb0b6a416d 100644
> --- a/include/linux/pe.h
> +++ b/include/linux/pe.h
> @@ -65,6 +65,7 @@
>  #define        IMAGE_FILE_MACHINE_SH5          0x01a8
>  #define        IMAGE_FILE_MACHINE_THUMB        0x01c2
>  #define        IMAGE_FILE_MACHINE_WCEMIPSV2    0x0169
> +#define        IMAGE_FILE_MACHINE_LOONGARCH    0x6264
>
>  /* flags */
>  #define IMAGE_FILE_RELOCS_STRIPPED           0x0001
> --
> 2.27.0
>
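
A note on the checksum step at the end of mk_mmap() above: despite its
name, efi_crc8() as posted in this patch is not a polynomial CRC. It adds
up the bytes and returns the two's complement of the sum, so a table whose
checksum field holds that value sums to zero modulo 256. The sketch below
is a standalone userspace illustration of that property, not the stub
code. struct demo_header, byte_sum_checksum(), seal_table() and
table_is_valid() are hypothetical names, and validating by "all bytes sum
to zero" is an assumption about the consumer, not something the patch
states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout, for illustration only (no padding). */
struct demo_header {
	uint8_t signature[4];
	uint8_t length;
	uint8_t checksum;
	uint8_t payload[10];
};

/* Additive checksum: two's complement of the byte sum, modulo 256. */
static uint8_t byte_sum_checksum(const void *buf, size_t size)
{
	const uint8_t *p = buf;
	uint8_t sum = 0;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < size; i++)
		sum = (uint8_t)(sum + p[i]);

	return (uint8_t)(0x100u - sum);
}

/* Fill in the checksum field, in the same order mk_mmap() uses. */
static void seal_table(struct demo_header *h)
{
	h->checksum = 0;	/* the field must be zero while summing */
	h->checksum = byte_sum_checksum(h, sizeof(*h));
}

/* Consistent when all bytes, checksum included, sum to zero. */
static int table_is_valid(const struct demo_header *h)
{
	return byte_sum_checksum(h, sizeof(*h)) == 0;
}
```

Zeroing the field before summing matters: otherwise a stale checksum byte
would be folded into the new value.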

* Re: [PATCH V9 16/24] LoongArch: Add misc common routines
  2022-04-30  9:50   ` Arnd Bergmann
@ 2022-04-30 10:00     ` Huacai Chen
  2022-04-30 10:41       ` Arnd Bergmann
  0 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30 10:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang

Hi, Arnd,

On Sat, Apr 30, 2022 at 5:50 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
>
> > +unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int size)
> > +{
> > +       u32 old32, mask, temp;
> > +       volatile u32 *ptr32;
> > +       unsigned int shift;
> > +
> > +       /* Check that ptr is naturally aligned */
>
> As discussed, please remove this function and all the references to it.
It seems that the "generic ticket spinlock" series has not been merged into 5.18 yet?

Huacai
>
>       Arnd
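
For context on the function under discussion: __xchg_small() exchanges a
1- or 2-byte object, and the locals declared in the quoted hunk (old32,
mask, temp, ptr32, shift) suggest the usual approach of looping on the
naturally aligned 32-bit word that contains it. The body is not quoted
here, so the following is only a userspace sketch of that general
technique using C11 atomics, not the patch's implementation.
xchg_u8_in_word() and its lane parameter are hypothetical; a kernel
helper would derive the word pointer and shift from the address instead.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically replace one byte lane of a 32-bit word
 * and return the previous value of that lane. Lane 0 is bits 0..7,
 * lane 3 is bits 24..31.
 */
static uint8_t xchg_u8_in_word(_Atomic uint32_t *word, unsigned int lane,
			       uint8_t val)
{
	unsigned int shift = (lane & 3u) * 8u;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old32 = atomic_load(word);
	uint32_t new32;

	do {
		/* Keep the other three lanes, substitute the target lane. */
		new32 = (old32 & ~mask) | ((uint32_t)val << shift);
		/* On failure, old32 is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(word, &old32, new32));

	return (uint8_t)((old32 & mask) >> shift);
}
```

A weak compare-exchange is sufficient here because the loop already
retries, so a spurious failure only costs one more iteration.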

* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-04-30  9:56   ` Arnd Bergmann
@ 2022-04-30 10:02     ` Huacai Chen
  2022-05-03  7:23     ` Ard Biesheuvel
  1 sibling, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-04-30 10:02 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang,
	Ard Biesheuvel

Hi, Arnd,

On Sat, Apr 30, 2022 at 5:56 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds efistub booting support, which is the standard UEFI boot
> > protocol for us to use.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> It's good to see that you completed this. Unfortunately you did not add Ard
> Biesheuvel to Cc, he is the one who needs to review this code. Adding him
> to Cc now, with the full patch quoted below for him (no more comments
> from me there).
I'm sorry I forgot that.

Huacai
>
>       Arnd
>
> > ---
> >  arch/loongarch/Kbuild                         |   3 +
> >  arch/loongarch/Kconfig                        |   8 +
> >  arch/loongarch/Makefile                       |  18 +-
> >  arch/loongarch/boot/Makefile                  |  23 +
> >  arch/loongarch/kernel/efi-header.S            | 100 +++++
> >  arch/loongarch/kernel/head.S                  |  44 +-
> >  arch/loongarch/kernel/image-vars.h            |  30 ++
> >  arch/loongarch/kernel/vmlinux.lds.S           |  23 +-
> >  drivers/firmware/efi/Kconfig                  |   4 +-
> >  drivers/firmware/efi/libstub/Makefile         |  14 +-
> >  drivers/firmware/efi/libstub/loongarch-stub.c | 425 ++++++++++++++++++
> >  include/linux/pe.h                            |   1 +
> >  12 files changed, 680 insertions(+), 13 deletions(-)
> >  create mode 100644 arch/loongarch/boot/Makefile
> >  create mode 100644 arch/loongarch/kernel/efi-header.S
> >  create mode 100644 arch/loongarch/kernel/image-vars.h
> >  create mode 100644 drivers/firmware/efi/libstub/loongarch-stub.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index 1ad35aabdd16..ab5373d0a24f 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -1,3 +1,6 @@
> >  obj-y += kernel/
> >  obj-y += mm/
> >  obj-y += vdso/
> > +
> > +# for cleaning
> > +subdir- += boot
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 44b763046893..55225ee5f868 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -265,6 +265,14 @@ config EFI
> >           resultant kernel should continue to boot on existing non-EFI
> >           platforms.
> >
> > +config EFI_STUB
> > +       bool "EFI boot stub support"
> > +       default y
> > +       depends on EFI
> > +       help
> > +         This kernel feature allows the kernel to be loaded directly by
> > +         EFI firmware without the use of a bootloader.
> > +
> >  config FORCE_MAX_ZONEORDER
> >         int "Maximum zone order"
> >         range 14 64 if PAGE_SIZE_64KB
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index c4b3f53cd276..d88a792dafbe 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -3,6 +3,14 @@
> >  # Author: Huacai Chen <chenhuacai@loongson.cn>
> >  # Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> >
> > +boot   := arch/loongarch/boot
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinux
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinux.efi
> > +endif
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -30,8 +38,6 @@ ld-emul                       = $(64bit-emul)
> >  cflags-y               += -mabi=lp64s
> >  endif
> >
> > -all-y                  := vmlinux
> > -
> >  #
> >  # GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
> >  # code since it only slows down the whole thing.  At some point we might make
> > @@ -75,6 +81,7 @@ endif
> >  head-y := arch/loongarch/kernel/head.o
> >
> >  libs-y += arch/loongarch/lib/
> > +libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> >
> >  ifeq ($(KBUILD_EXTMOD),)
> >  prepare: vdso_prepare
> > @@ -86,12 +93,13 @@ PHONY += vdso_install
> >  vdso_install:
> >         $(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
> >
> > -all:   $(all-y)
> > +all:   $(KBUILD_IMAGE)
> >
> > -CLEAN_FILES += vmlinux
> > +$(KBUILD_IMAGE): vmlinux
> > +       $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > -       $(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > new file mode 100644
> > index 000000000000..66f2293c34b2
> > --- /dev/null
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -0,0 +1,23 @@
> > +#
> > +# arch/loongarch/boot/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +drop-sections := .comment .note .options .note.gnu.build-id
> > +strip-flags   := $(addprefix --remove-section=,$(drop-sections)) -S
> > +
> > +targets := vmlinux
> > +quiet_cmd_strip = STRIP          $@
> > +      cmd_strip = $(STRIP) -s $@
> > +
> > +$(obj)/vmlinux: vmlinux FORCE
> > +       $(call if_changed,copy)
> > +       $(call if_changed,strip)
> > +
> > +targets += vmlinux.efi
> > +quiet_cmd_eficopy = OBJCOPY $@
> > +      cmd_eficopy = $(OBJCOPY) -O binary $(strip-flags) $< $@
> > +
> > +$(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/kernel/efi-header.S b/arch/loongarch/kernel/efi-header.S
> > new file mode 100644
> > index 000000000000..ceb44524944a
> > --- /dev/null
> > +++ b/arch/loongarch/kernel/efi-header.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:
> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   __inittext_end - efi_header_end         /* SizeOfCode */
> > +       .long   _end - __initdata_begin                 /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */
> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   __inittext_end - efi_header_end         /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   __inittext_end - efi_header_end         /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - __initdata_begin                 /* VirtualSize */
> > +       .long   __initdata_begin - _head                /* VirtualAddress */
> > +       .long   _edata - __initdata_begin               /* SizeOfRawData */
> > +       .long   __initdata_begin - _head                /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
> > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> > index b4a0b28da3e7..361b72e8bfc5 100644
> > --- a/arch/loongarch/kernel/head.S
> > +++ b/arch/loongarch/kernel/head.S
> > @@ -11,11 +11,53 @@
> >  #include <asm/regdef.h>
> >  #include <asm/loongarch.h>
> >  #include <asm/stackframe.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> >
> > -SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "efi-header.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"
> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUX_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)
> > +
> > +#endif
> >
> >         __REF
> >
> > +SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > +
> >  SYM_CODE_START(kernel_entry)                   # kernel entry point
> >
> >         /* Config direct window and set PG */
> > diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
> > new file mode 100644
> > index 000000000000..0162402b6212
> > --- /dev/null
> > +++ b/arch/loongarch/kernel/image-vars.h
> > @@ -0,0 +1,30 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __LOONGARCH_KERNEL_IMAGE_VARS_H
> > +#define __LOONGARCH_KERNEL_IMAGE_VARS_H
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +__efistub_memcmp               = memcmp;
> > +__efistub_memcpy               = memcpy;
> > +__efistub_memmove              = memmove;
> > +__efistub_memset               = memset;
> > +__efistub_strcat               = strcat;
> > +__efistub_strcmp               = strcmp;
> > +__efistub_strlen               = strlen;
> > +__efistub_strncat              = strncat;
> > +__efistub_strnstr              = strnstr;
> > +__efistub_strnlen              = strnlen;
> > +__efistub_strpbrk              = strpbrk;
> > +__efistub_strsep               = strsep;
> > +__efistub_kernel_entry         = kernel_entry;
> > +__efistub_kernel_asize         = kernel_asize;
> > +__efistub_kernel_fsize         = kernel_fsize;
> > +__efistub_kernel_vaddr         = kernel_vaddr;
> > +__efistub_kernel_offset                = kernel_offset;
> > +
> > +#endif
> > +
> > +#endif /* __LOONGARCH_KERNEL_IMAGE_VARS_H */
> > diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> > index 02abfaaa4892..7da4c4d7c50d 100644
> > --- a/arch/loongarch/kernel/vmlinux.lds.S
> > +++ b/arch/loongarch/kernel/vmlinux.lds.S
> > @@ -12,6 +12,14 @@
> >  #define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
> >
> >  #include <asm-generic/vmlinux.lds.h>
> > +#include "image-vars.h"
> > +
> > +/*
> > + * The maximum available page size is 64K, so we set the SectionAlignment
> > + * field of the EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> >
> >  OUTPUT_ARCH(loongarch)
> >  ENTRY(kernel_entry)
> > @@ -27,6 +35,9 @@ SECTIONS
> >         . = VMLINUX_LOAD_ADDRESS;
> >
> >         _text = .;
> > +       HEAD_TEXT_SECTION
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         .text : {
> >                 TEXT_TEXT
> >                 SCHED_TEXT
> > @@ -38,11 +49,12 @@ SECTIONS
> >                 *(.fixup)
> >                 *(.gnu.warning)
> >         } :text = 0
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         _etext = .;
> >
> >         EXCEPTION_TABLE(16)
> >
> > -       . = ALIGN(PAGE_SIZE);
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         __init_begin = .;
> >         __inittext_begin = .;
> >
> > @@ -51,6 +63,7 @@ SECTIONS
> >                 EXIT_TEXT
> >         }
> >
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         __inittext_end = .;
> >
> >         __initdata_begin = .;
> > @@ -60,6 +73,10 @@ SECTIONS
> >                 EXIT_DATA
> >         }
> >
> > +       .init.bss : {
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         __initdata_end = .;
> >
> >         __init_end = .;
> > @@ -71,11 +88,11 @@ SECTIONS
> >         .sdata : {
> >                 *(.sdata)
> >         }
> > -
> > -       . = ALIGN(SZ_64K);
> > +       .edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGN); }
> >         _edata =  .;
> >
> >         BSS_SECTION(0, SZ_64K, 8)
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >
> >         _end = .;
> >
> > diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
> > index 2c3dac5ecb36..ecb4e0b1295a 100644
> > --- a/drivers/firmware/efi/Kconfig
> > +++ b/drivers/firmware/efi/Kconfig
> > @@ -121,9 +121,9 @@ config EFI_ARMSTUB_DTB_LOADER
> >
> >  config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
> >         bool "Enable the command line initrd loader" if !X86
> > -       depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
> > -       default y if X86
> >         depends on !RISCV
> > +       depends on EFI_STUB && (EFI_GENERIC_STUB || X86 || LOONGARCH)
> > +       default y if (X86 || LOONGARCH)
> >         help
> >           Select this config option to add support for the initrd= command
> >           line parameter, allowing an initrd that resides on the same volume
> > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > index d0537573501e..663e9d317299 100644
> > --- a/drivers/firmware/efi/libstub/Makefile
> > +++ b/drivers/firmware/efi/libstub/Makefile
> > @@ -26,6 +26,8 @@ cflags-$(CONFIG_ARM)          := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >                                    $(call cc-option,-mno-single-pic-base)
> >  cflags-$(CONFIG_RISCV)         := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >                                    -fpic
> > +cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > +                                  -fpic
> >
> >  cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> >
> > @@ -55,7 +57,7 @@ KCOV_INSTRUMENT                       := n
> >  lib-y                          := efi-stub-helper.o gop.o secureboot.o tpm.o \
> >                                    file.o mem.o random.o randomalloc.o pci.o \
> >                                    skip_spaces.o lib-cmdline.o lib-ctype.o \
> > -                                  alignedmem.o relocate.o vsprintf.o
> > +                                  alignedmem.o relocate.o string.o vsprintf.o
> >
> >  # include the stub's generic dependencies from lib/ when building for ARM/arm64
> >  efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > @@ -63,13 +65,15 @@ efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> >  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
> >         $(call if_changed_rule,cc_o_c)
> >
> > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o \
> > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o \
> >                                    $(patsubst %.c,lib-%.o,$(efi-deps-y))
> >
> >  lib-$(CONFIG_ARM)              += arm32-stub.o
> >  lib-$(CONFIG_ARM64)            += arm64-stub.o
> >  lib-$(CONFIG_X86)              += x86-stub.o
> >  lib-$(CONFIG_RISCV)            += riscv-stub.o
> > +lib-$(CONFIG_LOONGARCH)                += loongarch-stub.o
> > +
> >  CFLAGS_arm32-stub.o            := -DTEXT_OFFSET=$(TEXT_OFFSET)
> >
> >  # Even when -mbranch-protection=none is set, Clang will generate a
> > @@ -125,6 +129,12 @@ STUBCOPY_FLAGS-$(CONFIG_RISCV)     += --prefix-alloc-sections=.init \
> >                                    --prefix-symbols=__efistub_
> >  STUBCOPY_RELOC-$(CONFIG_RISCV) := R_RISCV_HI20
> >
> > +# For LoongArch, keep all the symbols in .init section and make sure that no
> > +# absolute symbol references exist.
> > +STUBCOPY_FLAGS-$(CONFIG_LOONGARCH)     += --prefix-alloc-sections=.init \
> > +                                          --prefix-symbols=__efistub_
> > +STUBCOPY_RELOC-$(CONFIG_LOONGARCH)     := R_LARCH_MARK_LA
> > +
> >  $(obj)/%.stub.o: $(obj)/%.o FORCE
> >         $(call if_changed,stubcopy)
> >
> > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> > new file mode 100644
> > index 000000000000..399641a0b0cb
> > --- /dev/null
> > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> > @@ -0,0 +1,425 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Author: Yun Liu <liuyun@loongson.cn>
> > + *         Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/efi.h>
> > +#include <linux/sort.h>
> > +#include <asm/efi.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/boot_param.h>
> > +#include "efistub.h"
> > +
> > +#define MAX_ARG_COUNT          128
> > +#define CMDLINE_MAX_SIZE       0x200
> > +
> > +static int argc;
> > +static char **argv;
> > +const efi_system_table_t *efi_system_table;
> > +static efi_guid_t screen_info_guid = LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID;
> > +static unsigned int map_entry[LOONGSON3_BOOT_MEM_MAP_MAX];
> > +static struct efi_mmap mmap_array[EFI_MAX_MEMORY_TYPE][LOONGSON3_BOOT_MEM_MAP_MAX];
> > +
> > +struct exit_boot_struct {
> > +       struct boot_params *bp;
> > +       unsigned int *runtime_entry_count;
> > +};
> > +
> > +typedef void (*kernel_entry_t)(int argc, char *argv[], struct boot_params *boot_p);
> > +
> > +extern int kernel_asize;
> > +extern int kernel_fsize;
> > +extern int kernel_offset;
> > +extern unsigned long kernel_vaddr;
> > +extern kernel_entry_t kernel_entry;
> > +
> > +unsigned char efi_crc8(char *buff, int size)
> > +{
> > +       int sum, cnt;
> > +
> > +       for (sum = 0, cnt = 0; cnt < size; cnt++)
> > +               sum = (char) (sum + *(buff + cnt));
> > +
> > +       return (char)(0x100 - sum);
> > +}
> > +
> > +struct screen_info *alloc_screen_info(void)
> > +{
> > +       efi_status_t status;
> > +       struct screen_info *si;
> > +
> > +       status = efi_bs_call(allocate_pool,
> > +                       EFI_RUNTIME_SERVICES_DATA, sizeof(*si), (void **)&si);
> > +       if (status != EFI_SUCCESS)
> > +               return NULL;
> > +
> > +       status = efi_bs_call(install_configuration_table, &screen_info_guid, si);
> > +       if (status == EFI_SUCCESS)
> > +               return si;
> > +
> > +       efi_bs_call(free_pool, si);
> > +
> > +       return NULL;
> > +}
> > +
> > +static void setup_graphics(void)
> > +{
> > +       unsigned long size;
> > +       efi_status_t status;
> > +       efi_guid_t gop_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
> > +       void **gop_handle = NULL;
> > +       struct screen_info *si = NULL;
> > +
> > +       size = 0;
> > +       status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL,
> > +                               &gop_proto, NULL, &size, gop_handle);
> > +       if (status == EFI_BUFFER_TOO_SMALL) {
> > +               si = alloc_screen_info();
> > +               efi_setup_gop(si, &gop_proto, size);
> > +       }
> > +}
> > +
> > +struct boot_params *bootparams_init(efi_system_table_t *sys_table)
> > +{
> > +       efi_status_t status;
> > +       struct boot_params *p;
> > +       unsigned char sig[8] = {'B', 'P', 'I', '0', '1', '0', '0', '2'};
> > +
> > +       status = efi_bs_call(allocate_pool, EFI_RUNTIME_SERVICES_DATA, SZ_64K, (void **)&p);
> > +       if (status != EFI_SUCCESS)
> > +               return NULL;
> > +
> > +       memset(p, 0, SZ_64K);
> > +       memcpy(&p->signature, sig, sizeof(long));
> > +
> > +       return p;
> > +}
> > +
> > +static unsigned long convert_priv_cmdline(char *cmdline_ptr,
> > +               unsigned long rd_addr, unsigned long rd_size)
> > +{
> > +       unsigned int rdprev_size;
> > +       unsigned int cmdline_size;
> > +       efi_status_t status;
> > +       char *pstr, *substr;
> > +       char *initrd_ptr = NULL;
> > +       char convert_str[CMDLINE_MAX_SIZE];
> > +       static char cmdline_array[CMDLINE_MAX_SIZE];
> > +
> > +       cmdline_size = strlen(cmdline_ptr);
> > +       snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel ");
> > +
> > +       initrd_ptr = strstr(cmdline_ptr, "initrd=");
> > +       if (!initrd_ptr) {
> > +               snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel %s", cmdline_ptr);
> > +               goto completed;
> > +       }
> > +       snprintf(convert_str, CMDLINE_MAX_SIZE, " initrd=0x%lx,0x%lx", rd_addr, rd_size);
> > +       rdprev_size = cmdline_size - strlen(initrd_ptr);
> > +       strncat(cmdline_array, cmdline_ptr, rdprev_size);
> > +
> > +       cmdline_ptr = strnstr(initrd_ptr, " ", CMDLINE_MAX_SIZE);
> > +       strcat(cmdline_array, convert_str);
> > +       if (!cmdline_ptr)
> > +               goto completed;
> > +
> > +       strcat(cmdline_array, cmdline_ptr);
> > +
> > +completed:
> > +       status = efi_allocate_pages((MAX_ARG_COUNT + 1) * (sizeof(char *)),
> > +                                       (unsigned long *)&argv, ULONG_MAX);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Alloc argv mmap_array error\n");
> > +               return status;
> > +       }
> > +
> > +       argc = 0;
> > +       pstr = cmdline_array;
> > +
> > +       substr = strsep(&pstr, " \t");
> > +       while (substr != NULL) {
> > +               if (strlen(substr)) {
> > +                       argv[argc++] = substr;
> > +                       if (argc == MAX_ARG_COUNT) {
> > +                               efi_err("Argv mmap_array full!\n");
> > +                               break;
> > +                       }
> > +               }
> > +               substr = strsep(&pstr, " \t");
> > +       }
> > +
> > +       return EFI_SUCCESS;
> > +}
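
The tokenizing loop at the end of convert_priv_cmdline() can be tried out in
userspace. The sketch below splits a command line in place on spaces and tabs,
skips empty tokens and stops at a fixed capacity. It uses the standard
strspn()/strcspn() pair instead of strsep() purely for portability;
split_cmdline() and its max_args parameter are assumptions made for
illustration. It also NULL-terminates the array, which the quoted loop does
not do explicitly.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split cmdline in place on spaces and tabs. Stores at most max_args - 1
 * tokens in argv, NULL-terminates the array, and returns the token count.
 */
static int split_cmdline(char *cmdline, char **argv, int max_args)
{
	static const char delims[] = " \t";
	char *p = cmdline;
	int argc = 0;

	if (max_args < 1)
		return 0;

	while (argc < max_args - 1) {
		p += strspn(p, delims);		/* skip leading separators */
		if (*p == '\0')
			break;
		argv[argc++] = p;
		p += strcspn(p, delims);	/* advance to the end of the token */
		if (*p == '\0')
			break;
		*p++ = '\0';			/* terminate the token in place */
	}
	argv[argc] = NULL;

	return argc;
}
```

Checking the capacity before storing a token keeps the last slot free for the
terminator, so a caller can size argv as max_args entries and nothing more.
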
> > +
> > +unsigned int efi_memmap_sort(struct loongsonlist_mem_map *memmap,
> > +                       unsigned int index, unsigned int mem_type)
> > +{
> > +       unsigned int i, t;
> > +       unsigned long msize;
> > +
> > +       for (i = 0; i < map_entry[mem_type]; i = t) {
> > +               msize = mmap_array[mem_type][i].mem_size;
> > +               for (t = i + 1; t < map_entry[mem_type]; t++) {
> > +                       if (mmap_array[mem_type][i].mem_start + msize <
> > +                                       mmap_array[mem_type][t].mem_start)
> > +                               break;
> > +
> > +                       msize += mmap_array[mem_type][t].mem_size;
> > +               }
> > +               memmap->map[index].mem_type = mem_type;
> > +               memmap->map[index].mem_start = mmap_array[mem_type][i].mem_start;
> > +               memmap->map[index].mem_size = msize;
> > +               memmap->map[index].attribute = mmap_array[mem_type][i].attribute;
> > +               index++;
> > +       }
> > +
> > +       return index;
> > +}
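
Note that efi_memmap_sort() above does not reorder anything. It walks the
per-type array and coalesces runs of entries that touch or overlap, and it
appears to assume that the entries already arrive in ascending address order.
A standalone sketch of that coalescing step follows; struct range and
merge_ranges() are hypothetical and not taken from the patch. One deliberate
difference: the quoted loop adds the sizes of merged entries, which would
over-count if two entries overlapped, while the sketch extends the range to
the furthest end address instead.

```c
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t size;
};

/*
 * Coalesce ranges that touch or overlap. The input must already be sorted
 * by start address. Merging is done in place; the new count is returned.
 */
static size_t merge_ranges(struct range *r, size_t n)
{
	size_t in, out = 0;

	for (in = 0; in < n; in++) {
		uint64_t end = r[in].start + r[in].size;

		if (out > 0 && r[in].start <= r[out - 1].start + r[out - 1].size) {
			/* Extend the previous range only if this one reaches further. */
			uint64_t prev_end = r[out - 1].start + r[out - 1].size;

			if (end > prev_end)
				r[out - 1].size = end - r[out - 1].start;
		} else {
			r[out++] = r[in];
		}
	}

	return out;
}
```
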
> > +
> > +static efi_status_t mk_mmap(struct efi_boot_memmap *map, struct boot_params *p)
> > +{
> > +       char checksum;
> > +       unsigned int i;
> > +       unsigned int nr_desc;
> > +       unsigned int mem_type;
> > +       unsigned long count;
> > +       efi_memory_desc_t *mem_desc;
> > +       struct loongsonlist_mem_map *mhp = NULL;
> > +
> > +       memset(map_entry, 0, sizeof(map_entry));
> > +       memset(mmap_array, 0, sizeof(mmap_array));
> > +
> > +       if (!strncmp((char *)p, "BPI", 3)) {
> > +               p->flags |= BPI_FLAGS_UEFI_SUPPORTED;
> > +               p->systemtable = (efi_system_table_t *)efi_system_table;
> > +               p->extlist_offset = sizeof(*p) + sizeof(unsigned long);
> > +               mhp = (struct loongsonlist_mem_map *)((char *)p + p->extlist_offset);
> > +
> > +               memcpy(&mhp->header.signature, "MEM", sizeof(unsigned long));
> > +               mhp->header.length = sizeof(*mhp);
> > +               mhp->desc_version = *map->desc_ver;
> > +               mhp->map_count = 0;
> > +       }
> > +       if (!(*(map->map_size)) || !(*(map->desc_size)) || !mhp) {
> > +               efi_err("get memory info error\n");
> > +               return EFI_INVALID_PARAMETER;
> > +       }
> > +       nr_desc = *(map->map_size) / *(map->desc_size);
> > +
> > +       /*
> > +        * According to the UEFI spec, mmap_buf holds the accurate memory map;
> > +        * we can now fill the platform-specific mmap_array from it.
> > +        */
> > +       for (i = 0; i < nr_desc; i++) {
> > +               mem_desc = (efi_memory_desc_t *)((void *)(*map->map) + (i * (*(map->desc_size))));
> > +               switch (mem_desc->type) {
> > +               case EFI_RESERVED_TYPE:
> > +               case EFI_RUNTIME_SERVICES_CODE:
> > +               case EFI_RUNTIME_SERVICES_DATA:
> > +               case EFI_MEMORY_MAPPED_IO:
> > +               case EFI_MEMORY_MAPPED_IO_PORT_SPACE:
> > +               case EFI_UNUSABLE_MEMORY:
> > +               case EFI_PAL_CODE:
> > +                       mem_type = ADDRESS_TYPE_RESERVED;
> > +                       break;
> > +
> > +               case EFI_ACPI_MEMORY_NVS:
> > +                       mem_type = ADDRESS_TYPE_NVS;
> > +                       break;
> > +
> > +               case EFI_ACPI_RECLAIM_MEMORY:
> > +                       mem_type = ADDRESS_TYPE_ACPI;
> > +                       break;
> > +
> > +               case EFI_LOADER_CODE:
> > +               case EFI_LOADER_DATA:
> > +               case EFI_PERSISTENT_MEMORY:
> > +               case EFI_BOOT_SERVICES_CODE:
> > +               case EFI_BOOT_SERVICES_DATA:
> > +               case EFI_CONVENTIONAL_MEMORY:
> > +                       mem_type = ADDRESS_TYPE_SYSRAM;
> > +                       break;
> > +
> > +               default:
> > +                       continue;
> > +               }
> > +
> > +               mmap_array[mem_type][map_entry[mem_type]].mem_type = mem_type;
> > +               mmap_array[mem_type][map_entry[mem_type]].mem_start =
> > +                                               mem_desc->phys_addr & TO_PHYS_MASK;
> > +               mmap_array[mem_type][map_entry[mem_type]].mem_size =
> > +                                               mem_desc->num_pages << EFI_PAGE_SHIFT;
> > +               mmap_array[mem_type][map_entry[mem_type]].attribute =
> > +                                               mem_desc->attribute;
> > +               map_entry[mem_type]++;
> > +       }
> > +
> > +       count = mhp->map_count;
> > +       /* Sort EFI memmap and add to BPI for kernel */
> > +       for (i = 0; i < LOONGSON3_BOOT_MEM_MAP_MAX; i++) {
> > +               if (!map_entry[i])
> > +                       continue;
> > +               count = efi_memmap_sort(mhp, count, i);
> > +       }
> > +
> > +       mhp->map_count = count;
> > +       mhp->header.checksum = 0;
> > +
> > +       checksum = efi_crc8((char *)mhp, mhp->header.length);
> > +       mhp->header.checksum = checksum;
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> > +{
> > +       efi_status_t status;
> > +       struct exit_boot_struct *p = priv;
> > +
> > +       status = mk_mmap(map, p->bp);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Make kernel memory map failed!\n");
> > +               return status;
> > +       }
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +static efi_status_t exit_boot_services(struct boot_params *boot_params, void *handle)
> > +{
> > +       unsigned int desc_version;
> > +       unsigned int runtime_entry_count = 0;
> > +       unsigned long map_size, key, desc_size, buff_size;
> > +       efi_status_t status;
> > +       efi_memory_desc_t *mem_map;
> > +       struct efi_boot_memmap map;
> > +       struct exit_boot_struct priv;
> > +
> > +       map.map                 = &mem_map;
> > +       map.map_size            = &map_size;
> > +       map.desc_size           = &desc_size;
> > +       map.desc_ver            = &desc_version;
> > +       map.key_ptr             = &key;
> > +       map.buff_size           = &buff_size;
> > +       status = efi_get_memory_map(&map);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Unable to retrieve UEFI memory map.\n");
> > +               return status;
> > +       }
> > +
> > +       priv.bp = boot_params;
> > +       priv.runtime_entry_count = &runtime_entry_count;
> > +
> > +       /* Might as well exit boot services now */
> > +       status = efi_exit_boot_services(handle, &map, &priv, exit_boot_func);
> > +       if (status != EFI_SUCCESS)
> > +               return status;
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +/*
> > + * EFI entry point for the LoongArch EFI stub.
> > + */
> > +efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table)
> > +{
> > +       unsigned int cmdline_size = 0;
> > +       unsigned long kernel_addr = 0;
> > +       unsigned long initrd_addr = 0;
> > +       unsigned long initrd_size = 0;
> > +       enum efi_secureboot_mode secure_boot;
> > +       char *cmdline_ptr = NULL;
> > +       struct boot_params *boot_p;
> > +       efi_status_t status;
> > +       efi_loaded_image_t *image;
> > +       efi_guid_t loaded_image_proto;
> > +       kernel_entry_t real_kernel_entry;
> > +
> > +       /* Config Direct Mapping */
> > +       csr_writeq(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
> > +       csr_writeq(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
> > +
> > +       efi_system_table = sys_table;
> > +       loaded_image_proto = LOADED_IMAGE_PROTOCOL_GUID;
> > +       kernel_addr = (unsigned long)&kernel_offset - kernel_offset;
> > +       real_kernel_entry = (kernel_entry_t)
> > +               ((unsigned long)&kernel_entry - kernel_addr + kernel_vaddr);
> > +
> > +       /* Check if we were booted by the EFI firmware */
> > +       if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
> > +               goto fail;
> > +
> > +       /*
> > +        * Get a handle to the loaded image protocol.  This is used to get
> > +        * information about the running image, such as size and the command
> > +        * line.
> > +        */
> > +       status = sys_table->boottime->handle_protocol(handle,
> > +                                       &loaded_image_proto, (void *)&image);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Failed to get loaded image protocol\n");
> > +               goto fail;
> > +       }
> > +
> > +       /* Get the command line from EFI, using the LOADED_IMAGE protocol. */
> > +       cmdline_ptr = efi_convert_cmdline(image, &cmdline_size);
> > +       if (!cmdline_ptr) {
> > +               efi_err("Getting command line failed!\n");
> > +               goto fail_free_cmdline;
> > +       }
> > +
> > +#ifdef CONFIG_CMDLINE_BOOL
> > +       if (cmdline_size == 0)
> > +               efi_parse_options(CONFIG_CMDLINE);
> > +#endif
> > +       if (!IS_ENABLED(CONFIG_CMDLINE_OVERRIDE) && cmdline_size > 0)
> > +               efi_parse_options(cmdline_ptr);
> > +
> > +       efi_info("Booting Linux Kernel...\n");
> > +
> > +       efi_relocate_kernel(&kernel_addr, kernel_fsize, kernel_asize,
> > +                           PHYSADDR(kernel_vaddr), SZ_2M, PHYSADDR(kernel_vaddr));
> > +
> > +       setup_graphics();
> > +       secure_boot = efi_get_secureboot();
> > +       efi_enable_reset_attack_mitigation();
> > +
> > +       status = efi_load_initrd(image, &initrd_addr, &initrd_size, SZ_4G, ULONG_MAX);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Failed to get initrd addr!\n");
> > +               goto fail_free;
> > +       }
> > +
> > +       status = convert_priv_cmdline(cmdline_ptr, initrd_addr, initrd_size);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Convert cmdline failed!\n");
> > +               goto fail_free;
> > +       }
> > +
> > +       boot_p = bootparams_init(sys_table);
> > +       if (!boot_p) {
> > +               efi_err("Create BPI struct error!\n");
> > +               goto fail;
> > +       }
> > +
> > +       status = exit_boot_services(boot_p, handle);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("exit_boot services failed!\n");
> > +               goto fail_free;
> > +       }
> > +
> > +       real_kernel_entry(argc, argv, boot_p);
> > +
> > +       return EFI_SUCCESS;
> > +
> > +fail_free:
> > +       efi_free(initrd_size, initrd_addr);
> > +
> > +fail_free_cmdline:
> > +       efi_free(cmdline_size, (unsigned long)cmdline_ptr);
> > +
> > +fail:
> > +       return status;
> > +}
> > diff --git a/include/linux/pe.h b/include/linux/pe.h
> > index daf09ffffe38..f4bb0b6a416d 100644
> > --- a/include/linux/pe.h
> > +++ b/include/linux/pe.h
> > @@ -65,6 +65,7 @@
> >  #define        IMAGE_FILE_MACHINE_SH5          0x01a8
> >  #define        IMAGE_FILE_MACHINE_THUMB        0x01c2
> >  #define        IMAGE_FILE_MACHINE_WCEMIPSV2    0x0169
> > +#define        IMAGE_FILE_MACHINE_LOONGARCH    0x6264
> >
> >  /* flags */
> >  #define IMAGE_FILE_RELOCS_STRIPPED           0x0001
> > --
> > 2.27.0
> >


* Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-04-30  9:44   ` Arnd Bergmann
@ 2022-04-30 10:05     ` Huacai Chen
  2022-04-30 10:34       ` Arnd Bergmann
  0 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-04-30 10:05 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang

Hi, Arnd,

On Sat, Apr 30, 2022 at 5:45 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds system call support and related uaccess.h for LoongArch.
> >
> > Q: Why keep the __ARCH_WANT_NEW_STAT definition when statx exists?
> > A: As of the latest glibc release (2.34), statx is only used on 32-bit
> >    platforms, or on 64-bit platforms with 32-bit timestamps. I.e., most
> >    64-bit platforms still use newstat.
> >
> > Q: Why keep the __ARCH_WANT_SYS_CLONE definition when clone3 exists?
> > A: The latest glibc release (2.34) has some basic support for clone3, but
> >    it isn't complete. E.g., pthread_create() and spawni() have been
> >    converted to use clone3, but fork() still uses clone. Moreover, some
> >    seccomp-related applications still do not work properly with clone3.
> >    E.g., the Chromium sandbox cannot work at all and there is no solution
> >    for it, which is worse than the fork() story [1].
> >
> > [1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184
>
> I still think these have to be removed. There is no mainline glibc or musl
> port yet, and neither of them should actually be required. Please remove
> them here, and modify your libc patches accordingly when you send those
> upstream.
If this is only a problem that can be resolved by upgrading glibc/musl,
I will remove them. But the Chromium problem (or the sandbox problem in
general) seems to have no solution at the moment.

Huacai
>
>        Arnd


* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-04-30  9:05 ` [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support Huacai Chen
@ 2022-04-30 10:07     ` Arnd Bergmann
  0 siblings, 0 replies; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30 10:07 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang, Linux ARM, Catalin Marinas, Will Deacon,
	linux-riscv, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Ard Biesheuvel, linux-efi

On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
>
> This patch adds zboot (self-extracting compressed kernel) support. All
> existing in-kernel compression algorithms and the EFI stub are supported.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

I have no objections to adding a decompressor in principle, and
the implementation seems reasonable. However, I think we should try to
be consistent between architectures. On both arm64 and riscv, the
maintainers decided to not include a decompressor and instead leave
it up to the boot loader to decompress the kernel and enter it from there.

As I understand it, though, decompression by the boot loader is not part
of the UEFI boot flow, which means that you don't get any compressed
kernel images at all when booting using UEFI (let me know if that is
wrong). I assume this is why you decided to include the decompressor
here after all.
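
For illustration only, this is roughly what the detection side of "leave it to
the boot loader" could look like: the loader inspects the first bytes of the
image and picks a decompressor. The two-byte magic values below are the ones
the kernel's lib/decompress.c uses to recognise compressed initramfs images;
enum comp_format, comp_magics and detect_compression() are hypothetical names,
and this is a sketch rather than the code of any existing boot loader.

```c
#include <stddef.h>
#include <stdint.h>

enum comp_format {
	COMP_NONE, COMP_GZIP, COMP_BZIP2, COMP_LZMA,
	COMP_XZ, COMP_LZO, COMP_LZ4, COMP_ZSTD,
};

static const struct {
	uint8_t magic[2];
	enum comp_format fmt;
} comp_magics[] = {
	{ { 0x1f, 0x8b }, COMP_GZIP  },
	{ { 0x42, 0x5a }, COMP_BZIP2 },	/* "BZ" */
	{ { 0x5d, 0x00 }, COMP_LZMA  },
	{ { 0xfd, 0x37 }, COMP_XZ    },
	{ { 0x89, 0x4c }, COMP_LZO   },
	{ { 0x02, 0x21 }, COMP_LZ4   },	/* legacy LZ4 format */
	{ { 0x28, 0xb5 }, COMP_ZSTD  },
};

/* Return the compression format suggested by the first two bytes of buf. */
static enum comp_format detect_compression(const uint8_t *buf, size_t len)
{
	size_t i;

	if (len < 2)
		return COMP_NONE;

	for (i = 0; i < sizeof(comp_magics) / sizeof(comp_magics[0]); i++) {
		if (buf[0] == comp_magics[i].magic[0] &&
		    buf[1] == comp_magics[i].magic[1])
			return comp_magics[i].fmt;
	}

	return COMP_NONE;
}
```

An uncompressed EFI image starts with the PE/COFF "MZ" signature, so it falls
through to COMP_NONE and could be entered directly.
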

I think we should first aim for consistency here, and handle this the
same way across the modern architectures, either leaving the
decompressor code out, or adding it consistently. Maybe it would
even be possible to have the decompressor code as part of the
EFI stub and share it between the three architectures (x86 and
32-bit arm already support loading compressed kernels using EFI).

Adding the arm64, RISC-V and UEFI maintainers for further discussion here;
see the full patch below.

       Arnd

> ---
>  arch/loongarch/Kbuild                         |   2 +-
>  arch/loongarch/Kconfig                        |  11 ++
>  arch/loongarch/Makefile                       |  26 ++-
>  arch/loongarch/boot/Makefile                  |  55 ++++++
>  arch/loongarch/boot/boot.lds.S                |  64 +++++++
>  arch/loongarch/boot/decompress.c              |  98 +++++++++++
>  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
>  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
>  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
>  arch/loongarch/tools/Makefile                 |  15 ++
>  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
>  arch/loongarch/tools/elf-entry.c              |  66 +++++++
>  12 files changed, 749 insertions(+), 4 deletions(-)
>  create mode 100644 arch/loongarch/boot/boot.lds.S
>  create mode 100644 arch/loongarch/boot/decompress.c
>  create mode 100644 arch/loongarch/boot/string.c
>  create mode 100644 arch/loongarch/boot/zheader.S
>  create mode 100644 arch/loongarch/boot/zkernel.S
>  create mode 100644 arch/loongarch/tools/Makefile
>  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
>  create mode 100644 arch/loongarch/tools/elf-entry.c
>
> diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> index ab5373d0a24f..d907fdd7ca08 100644
> --- a/arch/loongarch/Kbuild
> +++ b/arch/loongarch/Kbuild
> @@ -3,4 +3,4 @@ obj-y += mm/
>  obj-y += vdso/
>
>  # for cleaning
> -subdir- += boot
> +subdir- += boot tools
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 55225ee5f868..6c1042746b2d 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -107,6 +107,7 @@ config LOONGARCH
>         select PERF_USE_VMALLOC
>         select RTC_LIB
>         select SPARSE_IRQ
> +       select SYS_SUPPORTS_ZBOOT
>         select SYSCTL_EXCEPTION_TRACE
>         select SWIOTLB
>         select TRACE_IRQFLAGS_SUPPORT
> @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
>         bool
>         default y
>
> +config SYS_SUPPORTS_ZBOOT
> +       bool
> +       select HAVE_KERNEL_GZIP
> +       select HAVE_KERNEL_BZIP2
> +       select HAVE_KERNEL_LZ4
> +       select HAVE_KERNEL_LZMA
> +       select HAVE_KERNEL_LZO
> +       select HAVE_KERNEL_XZ
> +       select HAVE_KERNEL_ZSTD
> +
>  config MACH_LOONGSON32
>         def_bool 32BIT
>
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index d88a792dafbe..1ed5b8466565 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -5,12 +5,31 @@
>
>  boot   := arch/loongarch/boot
>
> +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> +
>  ifndef CONFIG_EFI_STUB
>  KBUILD_IMAGE   = $(boot)/vmlinux
>  else
>  KBUILD_IMAGE   = $(boot)/vmlinux.efi
>  endif
>
> +else
> +
> +ifndef CONFIG_EFI_STUB
> +KBUILD_IMAGE   = $(boot)/vmlinuz
> +else
> +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> +endif
> +
> +endif
> +
> +load-y         = 0x9000000000200000
> +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> +
> +archscripts: scripts_basic
> +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> +
>  #
>  # Select the object file format to substitute into the linker script.
>  #
> @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
>  cflags-y += -ffreestanding
>  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
>
> -load-y         = 0x9000000000200000
> -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> -
>  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
>
>  KBUILD_AFLAGS  += $(cflags-y)
> @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
>         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
>
>  install:
> +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
>         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> +else
> +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> +endif
>         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
>         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
>
> diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> index 66f2293c34b2..c26a36004ae2 100644
> --- a/arch/loongarch/boot/Makefile
> +++ b/arch/loongarch/boot/Makefile
> @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
>
>  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
>         $(call if_changed,eficopy)
> +
> +# zboot
> +extra-y        += boot.lds
> +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> +
> +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> +
> +BOOT_HEAP_SIZE := 0x400000
> +BOOT_STACK_SIZE        := 0x002000
> +
> +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> +
> +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> +
> +targets += vmlinux.bin
> +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> +       $(call if_changed,objcopy)
> +
> +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> +
> +targets += vmlinux.bin.z
> +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> +       $(call if_changed,$(tool_y))
> +
> +targets += $(notdir $(vmlinuzobjs-y))
> +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> +
> +quiet_cmd_zld = LD      $@
> +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> +
> +targets += vmlinuz
> +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> +       $(call if_changed,zld)
> +       $(call if_changed,strip)
> +
> +targets += vmlinuz.efi
> +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> +       $(call if_changed,eficopy)
> diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> new file mode 100644
> index 000000000000..23e698782afd
> --- /dev/null
> +++ b/arch/loongarch/boot/boot.lds.S
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * ld.script for compressed kernel support of LoongArch
> + *
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include "../kernel/image-vars.h"
> +
> +/*
> + * The maximum available page size is 64K, so we set the SectionAlignment
> + * field of the EFI application to 64K.
> + */
> +PECOFF_FILE_ALIGN = 0x200;
> +PECOFF_SEGMENT_ALIGN = 0x10000;
> +
> +OUTPUT_ARCH(loongarch)
> +ENTRY(kernel_entry)
> +PHDRS {
> +       text PT_LOAD FLAGS(7); /* RWX */
> +}
> +SECTIONS
> +{
> +       . = VMLINUZ_LOAD_ADDRESS;
> +
> +       _text = .;
> +       .head.text : {
> +               *(.head.text)
> +       }
> +
> +       .text : {
> +               *(.text)
> +               *(.init.text)
> +               *(.rodata)
> +       }: text
> +
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> +       _data = .;
> +       .data : {
> +               *(.data)
> +               *(.init.data)
> +               /* Put the compressed image here */
> +               __image_begin = .;
> +               *(.image)
> +               __image_end = .;
> +               CONSTRUCTORS
> +               . = ALIGN(PECOFF_FILE_ALIGN);
> +       }
> +       _edata = .;
> +
> +       .bss : {
> +               *(.bss)
> +               *(.init.bss)
> +       }
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> +       _end = .;
> +
> +       /DISCARD/ : {
> +               *(.options)
> +               *(.comment)
> +               *(.note)
> +       }
> +}
> diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> new file mode 100644
> index 000000000000..8f55fcd8f285
> --- /dev/null
> +++ b/arch/loongarch/boot/decompress.c
> @@ -0,0 +1,98 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +#include <linux/libfdt.h>
> +
> +#include <asm/addrspace.h>
> +
> +/*
> + * These two variables specify the free mem region
> + * that can be used for temporary malloc area
> + */
> +unsigned long free_mem_ptr;
> +unsigned long free_mem_end_ptr;
> +
> +/* The linker tells us where the image is. */
> +extern unsigned char __image_begin, __image_end;
> +
> +#define puts(s) do {} while (0)
> +#define puthex(val) do {} while (0)
> +
> +void error(char *x)
> +{
> +       puts("\n\n");
> +       puts(x);
> +       puts("\n\n -- System halted");
> +
> +       while (1)
> +               ;       /* Halt */
> +}
> +
> +/* activate the code for pre-boot environment */
> +#define STATIC static
> +
> +#include "../../../../lib/ashldi3.c"
> +
> +#ifdef CONFIG_KERNEL_GZIP
> +#include "../../../../lib/decompress_inflate.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_BZIP2
> +#include "../../../../lib/decompress_bunzip2.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZ4
> +#include "../../../../lib/decompress_unlz4.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZMA
> +#include "../../../../lib/decompress_unlzma.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZO
> +#include "../../../../lib/decompress_unlzo.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_XZ
> +#include "../../../../lib/decompress_unxz.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_ZSTD
> +#include "../../../../lib/decompress_unzstd.c"
> +#endif
> +
> +void decompress_kernel(unsigned long boot_heap_start)
> +{
> +       unsigned long zimage_start, zimage_size;
> +
> +       zimage_start = (unsigned long)(&__image_begin);
> +       zimage_size = (unsigned long)(&__image_end) -
> +           (unsigned long)(&__image_begin);
> +
> +       puts("zimage at:     ");
> +       puthex(zimage_start);
> +       puts(" ");
> +       puthex(zimage_size + zimage_start);
> +       puts("\n");
> +
> +       /* This area is reserved for malloc during decompression */
> +       free_mem_ptr = boot_heap_start;
> +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> +
> +       /* Display standard Linux/LoongArch boot prompt */
> +       puts("Uncompressing Linux at load address ");
> +       puthex(VMLINUX_LOAD_ADDRESS);
> +       puts("\n");
> +
> +       /* Decompress the kernel with the configured algorithm */
> +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> +
> +       puts("Now, booting the kernel...\n");
> +}
> diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> new file mode 100644
> index 000000000000..3f746e7c2bb5
> --- /dev/null
> +++ b/arch/loongarch/boot/string.c
> @@ -0,0 +1,166 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * arch/loongarch/boot/string.c
> + *
> + * Very small subset of simple string routines
> + */
> +
> +#include <linux/types.h>
> +
> +void __weak *memset(void *s, int c, size_t n)
> +{
> +       int i;
> +       char *ss = s;
> +
> +       for (i = 0; i < n; i++)
> +               ss[i] = c;
> +       return s;
> +}
> +
> +void __weak *memcpy(void *dest, const void *src, size_t n)
> +{
> +       int i;
> +       const char *s = src;
> +       char *d = dest;
> +
> +       for (i = 0; i < n; i++)
> +               d[i] = s[i];
> +       return dest;
> +}
> +
> +void __weak *memmove(void *dest, const void *src, size_t n)
> +{
> +       int i;
> +       const char *s = src;
> +       char *d = dest;
> +
> +       if (d < s) {
> +               for (i = 0; i < n; i++)
> +                       d[i] = s[i];
> +       } else if (d > s) {
> +               for (i = n - 1; i >= 0; i--)
> +                       d[i] = s[i];
> +       }
> +
> +       return dest;
> +}
> +
> +int __weak memcmp(const void *cs, const void *ct, size_t count)
> +{
> +       int res = 0;
> +       const unsigned char *su1, *su2;
> +
> +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> +               res = *su1 - *su2;
> +               if (res != 0)
> +                       break;
> +       }
> +       return res;
> +}
> +
> +int __weak strcmp(const char *str1, const char *str2)
> +{
> +       int delta = 0;
> +       const unsigned char *s1 = (const unsigned char *)str1;
> +       const unsigned char *s2 = (const unsigned char *)str2;
> +
> +       while (*s1 || *s2) {
> +               delta = *s1 - *s2;
> +               if (delta)
> +                       return delta;
> +               s1++;
> +               s2++;
> +       }
> +       return 0;
> +}
> +
> +size_t __weak strlen(const char *s)
> +{
> +       const char *sc;
> +
> +       for (sc = s; *sc != '\0'; ++sc)
> +               /* nothing */;
> +       return sc - s;
> +}
> +
> +size_t __weak strnlen(const char *s, size_t count)
> +{
> +       const char *sc;
> +
> +       for (sc = s; count-- && *sc != '\0'; ++sc)
> +               /* nothing */;
> +       return sc - s;
> +}
> +
> +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> +{
> +       size_t l2;
> +
> +       l2 = strlen(s2);
> +       if (!l2)
> +               return (char *)s1;
> +       while (len >= l2) {
> +               len--;
> +               if (!memcmp(s1, s2, l2))
> +                       return (char *)s1;
> +               s1++;
> +       }
> +       return NULL;
> +}
> +
> +#undef strcat
> +char * __weak strcat(char *dest, const char *src)
> +{
> +       char *tmp = dest;
> +
> +       while (*dest)
> +               dest++;
> +       while ((*dest++ = *src++) != '\0')
> +               ;
> +       return tmp;
> +}
> +
> +char * __weak strncat(char *dest, const char *src, size_t count)
> +{
> +       char *tmp = dest;
> +
> +       if (count) {
> +               while (*dest)
> +                       dest++;
> +               while ((*dest++ = *src++) != 0) {
> +                       if (--count == 0) {
> +                               *dest = '\0';
> +                               break;
> +                       }
> +               }
> +       }
> +       return tmp;
> +}
> +
> +char * __weak strpbrk(const char *cs, const char *ct)
> +{
> +       const char *sc1, *sc2;
> +
> +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> +                       if (*sc1 == *sc2)
> +                               return (char *)sc1;
> +               }
> +       }
> +       return NULL;
> +}
> +
> +char * __weak strsep(char **s, const char *ct)
> +{
> +       char *sbegin = *s;
> +       char *end;
> +
> +       if (sbegin == NULL)
> +               return NULL;
> +
> +       end = strpbrk(sbegin, ct);
> +       if (end)
> +               *end++ = '\0';
> +       *s = end;
> +       return sbegin;
> +}
> diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> new file mode 100644
> index 000000000000..4bc50d953ec7
> --- /dev/null
> +++ b/arch/loongarch/boot/zheader.S
> @@ -0,0 +1,100 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/pe.h>
> +#include <linux/sizes.h>
> +
> +       .macro  __EFI_PE_HEADER
> +       .long   PE_MAGIC
> +coff_header:
> +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> +       .short  section_count                           /* NumberOfSections */
> +       .long   0                                       /* TimeDateStamp */
> +       .long   0                                       /* PointerToSymbolTable */
> +       .long   0                                       /* NumberOfSymbols */
> +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> +
> +optional_header:
> +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> +       .byte   0x02                                    /* MajorLinkerVersion */
> +       .byte   0x14                                    /* MinorLinkerVersion */
> +       .long   _data - efi_header_end                  /* SizeOfCode */
> +       .long   _end - _data                            /* SizeOfInitializedData */
> +       .long   0                                       /* SizeOfUninitializedData */
> +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> +       .long   efi_header_end - _head                  /* BaseOfCode */
> +
> +extra_header_fields:
> +       .quad   0                                       /* ImageBase */
> +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> +       .short  0                                       /* MajorOperatingSystemVersion */
> +       .short  0                                       /* MinorOperatingSystemVersion */
> +       .short  0                                       /* MajorImageVersion */
> +       .short  0                                       /* MinorImageVersion */
> +       .short  0                                       /* MajorSubsystemVersion */
> +       .short  0                                       /* MinorSubsystemVersion */
> +       .long   0                                       /* Win32VersionValue */
> +
> +       .long   _end - _head                            /* SizeOfImage */
> +
> +       /* Everything before the kernel image is considered part of the header */
> +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> +       .long   0                                       /* CheckSum */
> +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> +       .short  0                                       /* DllCharacteristics */
> +       .quad   0                                       /* SizeOfStackReserve */
> +       .quad   0                                       /* SizeOfStackCommit */
> +       .quad   0                                       /* SizeOfHeapReserve */
> +       .quad   0                                       /* SizeOfHeapCommit */
> +       .long   0                                       /* LoaderFlags */
> +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> +
> +       .quad   0                                       /* ExportTable */
> +       .quad   0                                       /* ImportTable */
> +       .quad   0                                       /* ResourceTable */
> +       .quad   0                                       /* ExceptionTable */
> +       .quad   0                                       /* CertificationTable */
> +       .quad   0                                       /* BaseRelocationTable */
> +
> +       /* Section table */
> +section_table:
> +       .ascii  ".text\0\0\0"
> +       .long   _data - efi_header_end                  /* VirtualSize */
> +       .long   efi_header_end - _head                  /* VirtualAddress */
> +       .long   _data - efi_header_end                  /* SizeOfRawData */
> +       .long   efi_header_end - _head                  /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_CODE | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> +
> +       .ascii  ".data\0\0\0"
> +       .long   _end - _data                            /* VirtualSize */
> +       .long   _data - _head                           /* VirtualAddress */
> +       .long   _edata - _data                          /* SizeOfRawData */
> +       .long   _data - _head                           /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> +
> +       .org 0x20e
> +       .word kernel_version - 512 -  _head
> +
> +       .set    section_count, (. - section_table) / 40
> +efi_header_end:
> +       .endm
> diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> new file mode 100644
> index 000000000000..13a8a14a2328
> --- /dev/null
> +++ b/arch/loongarch/boot/zkernel.S
> @@ -0,0 +1,99 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/init.h>
> +#include <linux/linkage.h>
> +#include <asm/addrspace.h>
> +#include <asm/asm.h>
> +#include <asm/loongarch.h>
> +#include <asm/regdef.h>
> +#include <generated/compile.h>
> +#include <generated/utsrelease.h>
> +
> +#ifdef CONFIG_EFI_STUB
> +
> +#include "zheader.S"
> +
> +       __HEAD
> +
> +_head:
> +       /* "MZ", MS-DOS header */
> +       .word   MZ_MAGIC
> +       .org    0x28
> +       .ascii  "Loongson\0"
> +       .org    0x3c
> +       /* Offset to the PE header */
> +       .long   pe_header - _head
> +
> +pe_header:
> +       __EFI_PE_HEADER
> +
> +kernel_asize:
> +       .long _end - _text
> +
> +kernel_fsize:
> +       .long _edata - _text
> +
> +kernel_vaddr:
> +       .quad VMLINUZ_LOAD_ADDRESS
> +
> +kernel_offset:
> +       .long kernel_offset - _text
> +
> +kernel_version:
> +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> +
> +SYM_L_GLOBAL(kernel_asize)
> +SYM_L_GLOBAL(kernel_fsize)
> +SYM_L_GLOBAL(kernel_vaddr)
> +SYM_L_GLOBAL(kernel_offset)
> +
> +#endif
> +
> +       __INIT
> +
> +SYM_CODE_START(kernel_entry)
> +       /* Save boot rom start args */
> +       move    s0, a0
> +       move    s1, a1
> +       move    s2, a2
> +       move    s3, a3
> +
> +       /* Config Direct Mapping */
> +       li.d    t0, CSR_DMW0_INIT
> +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> +       li.d    t0, CSR_DMW1_INIT
> +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> +
> +       /* Clear BSS */
> +       la.abs  a0, _edata
> +       la.abs  a2, _end
> +1:     st.d    zero, a0, 0
> +       addi.d  a0, a0, 8
> +       bne     a2, a0, 1b
> +
> +       la.abs  a0, .heap          /* heap address */
> +       la.abs  sp, .stack + 8192  /* stack address */
> +
> +       la      ra, 2f
> +       la      t4, decompress_kernel
> +       jirl    zero, t4, 0
> +2:
> +       move    a0, s0
> +       move    a1, s1
> +       move    a2, s2
> +       move    a3, s3
> +       PTR_LI  t4, KERNEL_ENTRY
> +       jirl    zero, t4, 0
> +3:
> +       b       3b
> +SYM_CODE_END(kernel_entry)
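
A note for readers following the "Clear BSS" loop in kernel_entry above: it
stores first and tests afterwards, and it exits on equality (bne), so it
relies on _end - _edata being a nonzero multiple of 8. A rough C equivalent
is sketched below, together with a variant that tolerates an empty range.
The function names are hypothetical and this is only an illustration, not
code from the patch.

```c
#include <stdint.h>

/* Mirrors the assembly: store, advance, then compare for inequality. */
void clear_bss_like_asm(uint64_t *p, uint64_t *end)
{
	do {
		*p++ = 0;
	} while (p != end);
}

/* Variant that checks the bound first, so p == end stores nothing. */
void clear_bss_checked(uint64_t *p, uint64_t *end)
{
	while (p < end)
		*p++ = 0;
}
```

The checked form costs one extra comparison on entry and does not depend on
the two symbols being at least 8 bytes apart.
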
> +
> +       .comm .heap, BOOT_HEAP_SIZE, 4
> +       .comm .stack, BOOT_STACK_SIZE, 4
> +
> +       .align 4
> +       .section .image, "a", %progbits
> +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> new file mode 100644
> index 000000000000..8a6181c82a91
> --- /dev/null
> +++ b/arch/loongarch/tools/Makefile
> @@ -0,0 +1,15 @@
> +#
> +# arch/loongarch/tools/Makefile
> +#
> +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> +#
> +
> +hostprogs := elf-entry
> +PHONY += elf-entry
> +elf-entry: $(obj)/elf-entry
> +       @:
> +
> +hostprogs += calc_vmlinuz_load_addr
> +PHONY += calc_vmlinuz_load_addr
> +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> +       @:
> diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> new file mode 100644
> index 000000000000..5e2ca6b4dff6
> --- /dev/null
> +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> @@ -0,0 +1,51 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <errno.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <sys/stat.h>
> +
> +int main(int argc, char *argv[])
> +{
> +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> +       struct stat sb;
> +
> +       if (argc != 3) {
> +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> +               return EXIT_FAILURE;
> +       }
> +
> +       if (stat(argv[1], &sb) == -1) {
> +               perror("stat");
> +               return EXIT_FAILURE;
> +       }
> +
> +       /* Parse the load address given as a hexadecimal string */
> +       errno = 0;
> +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> +               if (errno != 0)
> +                       perror("sscanf");
> +               else
> +                       fprintf(stderr, "No matching characters\n");
> +
> +               return EXIT_FAILURE;
> +       }
> +
> +       vmlinux_size = (uint64_t)sb.st_size;
> +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> +
> +       /*
> +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> +        * which may be as large as 64KB depending on the kernel configuration.
> +        */
> +
> +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> +
> +       printf("0x%llx\n", vmlinuz_load_addr);
> +
> +       return EXIT_SUCCESS;
> +}
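
A side note on the alignment step above: the expression
0x10000 - vmlinux_size % 0x10000 is always between 1 and 0x10000, so a
vmlinux.bin whose size is already a multiple of 64KB still gets a full
extra 64KB, and the result is aligned only because the load address itself
is 64KB-aligned. A minimal sketch of a conventional round-up follows; the
helper names align_up and vmlinuz_load_addr are hypothetical and this is
not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helper: round addr up to a power-of-two boundary. */
uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Same inputs as the tool: vmlinux load address and vmlinux.bin size. */
uint64_t vmlinuz_load_addr(uint64_t load_addr, uint64_t size)
{
	/* Align the sum, so an unaligned load address is handled too. */
	return align_up(load_addr + size, 0x10000);
}
```

With load address 0x9000000000200000 and a 0x12345-byte image, both this
sketch and the tool produce the same address; they differ only when the
size is an exact multiple of 64KB or the load address is unaligned.
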
> diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> new file mode 100644
> index 000000000000..c80721e0dee1
> --- /dev/null
> +++ b/arch/loongarch/tools/elf-entry.c
> @@ -0,0 +1,66 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <elf.h>
> +#include <inttypes.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +__attribute__((noreturn))
> +static void die(const char *msg)
> +{
> +       fputs(msg, stderr);
> +       exit(EXIT_FAILURE);
> +}
> +
> +int main(int argc, const char *argv[])
> +{
> +       uint64_t entry;
> +       size_t nread;
> +       FILE *file;
> +       union {
> +               Elf32_Ehdr ehdr32;
> +               Elf64_Ehdr ehdr64;
> +       } hdr;
> +
> +       if (argc != 2)
> +               die("Usage: elf-entry <elf-file>\n");
> +
> +       file = fopen(argv[1], "r");
> +       if (!file) {
> +               perror("Unable to open input file");
> +               return EXIT_FAILURE;
> +       }
> +
> +       nread = fread(&hdr, 1, sizeof(hdr), file);
> +       if (nread != sizeof(hdr)) {
> +               fclose(file);
> +               perror("Unable to read input file");
> +               return EXIT_FAILURE;
> +       }
> +
> +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> +               fclose(file);
> +               die("Input is not an ELF\n");
> +       }
> +
> +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> +       case ELFCLASS32:
> +               /* Sign extend to form a canonical address */
> +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> +               break;
> +
> +       case ELFCLASS64:
> +               entry = hdr.ehdr64.e_entry;
> +               break;
> +
> +       default:
> +               fclose(file);
> +               die("Invalid ELF class\n");
> +       }
> +
> +       fclose(file);
> +       printf("0x%016" PRIx64 "\n", entry);
> +
> +       return EXIT_SUCCESS;
> +}
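
For the ELFCLASS32 case above, the double cast is what turns a 32-bit entry
point into a canonical 64-bit address: the value is first reinterpreted as
signed and then widened, so bit 31 is copied into the upper 32 bits. A
small sketch of that step in isolation; canonical_entry32 is a hypothetical
name, and the unsigned-to-signed conversion is assumed to wrap as it does
with GCC and Clang.

```c
#include <stdint.h>

/* Widen a 32-bit ELF entry point to a sign-extended 64-bit address. */
uint64_t canonical_entry32(uint32_t e_entry)
{
	return (uint64_t)(int64_t)(int32_t)e_entry;
}
```
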
> --
> 2.27.0
>


* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
@ 2022-04-30 10:07     ` Arnd Bergmann
  0 siblings, 0 replies; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30 10:07 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang, Linux ARM, Catalin Marinas, Will Deacon,
	linux-riscv, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Ard Biesheuvel, linux-efi

On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
>
> This patch adds zboot (self-extracting compressed kernel) support. All
> existing in-kernel compression algorithms and the EFI stub are supported.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

I have no objections to adding a decompressor in principle, and
the implementation seems reasonable. However, I think we should try to
be consistent between architectures. On both arm64 and riscv, the
maintainers decided to not include a decompressor and instead leave
it up to the boot loader to decompress the kernel and enter it from there.

As I understand it, this is not part of the UEFI boot flow though, so it
means that you don't get any compressed kernel images at all when
booting using UEFI (let me know if that is wrong). I assume this is why
you decided to include the decompressor here after all.

I think we should first aim for consistency here, and handle this the
same way across the modern architectures, either leaving the
decompressor code out, or adding it consistently. Maybe it would
even be possible to have the decompressor code as part of the
EFI stub and share it between the three architectures (x86 and
32-bit arm already support loading compressed kernels using EFI).

Adding the arm64, RISC-V and UEFI maintainers for further discussion here,
see the full patch below.

       Arnd

> ---
>  arch/loongarch/Kbuild                         |   2 +-
>  arch/loongarch/Kconfig                        |  11 ++
>  arch/loongarch/Makefile                       |  26 ++-
>  arch/loongarch/boot/Makefile                  |  55 ++++++
>  arch/loongarch/boot/boot.lds.S                |  64 +++++++
>  arch/loongarch/boot/decompress.c              |  98 +++++++++++
>  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
>  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
>  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
>  arch/loongarch/tools/Makefile                 |  15 ++
>  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
>  arch/loongarch/tools/elf-entry.c              |  66 +++++++
>  12 files changed, 749 insertions(+), 4 deletions(-)
>  create mode 100644 arch/loongarch/boot/boot.lds.S
>  create mode 100644 arch/loongarch/boot/decompress.c
>  create mode 100644 arch/loongarch/boot/string.c
>  create mode 100644 arch/loongarch/boot/zheader.S
>  create mode 100644 arch/loongarch/boot/zkernel.S
>  create mode 100644 arch/loongarch/tools/Makefile
>  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
>  create mode 100644 arch/loongarch/tools/elf-entry.c
>
> diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> index ab5373d0a24f..d907fdd7ca08 100644
> --- a/arch/loongarch/Kbuild
> +++ b/arch/loongarch/Kbuild
> @@ -3,4 +3,4 @@ obj-y += mm/
>  obj-y += vdso/
>
>  # for cleaning
> -subdir- += boot
> +subdir- += boot tools
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 55225ee5f868..6c1042746b2d 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -107,6 +107,7 @@ config LOONGARCH
>         select PERF_USE_VMALLOC
>         select RTC_LIB
>         select SPARSE_IRQ
> +       select SYS_SUPPORTS_ZBOOT
>         select SYSCTL_EXCEPTION_TRACE
>         select SWIOTLB
>         select TRACE_IRQFLAGS_SUPPORT
> @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
>         bool
>         default y
>
> +config SYS_SUPPORTS_ZBOOT
> +       bool
> +       select HAVE_KERNEL_GZIP
> +       select HAVE_KERNEL_BZIP2
> +       select HAVE_KERNEL_LZ4
> +       select HAVE_KERNEL_LZMA
> +       select HAVE_KERNEL_LZO
> +       select HAVE_KERNEL_XZ
> +       select HAVE_KERNEL_ZSTD
> +
>  config MACH_LOONGSON32
>         def_bool 32BIT
>
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index d88a792dafbe..1ed5b8466565 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -5,12 +5,31 @@
>
>  boot   := arch/loongarch/boot
>
> +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> +
>  ifndef CONFIG_EFI_STUB
>  KBUILD_IMAGE   = $(boot)/vmlinux
>  else
>  KBUILD_IMAGE   = $(boot)/vmlinux.efi
>  endif
>
> +else
> +
> +ifndef CONFIG_EFI_STUB
> +KBUILD_IMAGE   = $(boot)/vmlinuz
> +else
> +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> +endif
> +
> +endif
> +
> +load-y         = 0x9000000000200000
> +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> +
> +archscripts: scripts_basic
> +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> +
>  #
>  # Select the object file format to substitute into the linker script.
>  #
> @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
>  cflags-y += -ffreestanding
>  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
>
> -load-y         = 0x9000000000200000
> -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> -
>  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
>
>  KBUILD_AFLAGS  += $(cflags-y)
> @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
>         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
>
>  install:
> +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
>         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> +else
> +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> +endif
>         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
>         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
>
> diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> index 66f2293c34b2..c26a36004ae2 100644
> --- a/arch/loongarch/boot/Makefile
> +++ b/arch/loongarch/boot/Makefile
> @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
>
>  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
>         $(call if_changed,eficopy)
> +
> +# zboot
> +extra-y        += boot.lds
> +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> +
> +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> +
> +BOOT_HEAP_SIZE := 0x400000
> +BOOT_STACK_SIZE        := 0x002000
> +
> +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> +
> +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> +
> +targets += vmlinux.bin
> +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> +       $(call if_changed,objcopy)
> +
> +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> +
> +targets += vmlinux.bin.z
> +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> +       $(call if_changed,$(tool_y))
> +
> +targets += $(notdir $(vmlinuzobjs-y))
> +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> +
> +quiet_cmd_zld = LD      $@
> +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> +
> +targets += vmlinuz
> +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> +       $(call if_changed,zld)
> +       $(call if_changed,strip)
> +
> +targets += vmlinuz.efi
> +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> +       $(call if_changed,eficopy)
> diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> new file mode 100644
> index 000000000000..23e698782afd
> --- /dev/null
> +++ b/arch/loongarch/boot/boot.lds.S
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * ld.script for compressed kernel support of LoongArch
> + *
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include "../kernel/image-vars.h"
> +
> +/*
> + * The maximum available page size is 64K, so we set the SectionAlignment
> + * field of the EFI application to 64K.
> + */
> +PECOFF_FILE_ALIGN = 0x200;
> +PECOFF_SEGMENT_ALIGN = 0x10000;
> +
> +OUTPUT_ARCH(loongarch)
> +ENTRY(kernel_entry)
> +PHDRS {
> +       text PT_LOAD FLAGS(7); /* RWX */
> +}
> +SECTIONS
> +{
> +       . = VMLINUZ_LOAD_ADDRESS;
> +
> +       _text = .;
> +       .head.text : {
> +               *(.head.text)
> +       }
> +
> +       .text : {
> +               *(.text)
> +               *(.init.text)
> +               *(.rodata)
> +       }: text
> +
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> +       _data = .;
> +       .data : {
> +               *(.data)
> +               *(.init.data)
> +               /* Put the compressed image here */
> +               __image_begin = .;
> +               *(.image)
> +               __image_end = .;
> +               CONSTRUCTORS
> +               . = ALIGN(PECOFF_FILE_ALIGN);
> +       }
> +       _edata = .;
> +
> +       .bss : {
> +               *(.bss)
> +               *(.init.bss)
> +       }
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> +       _end = .;
> +
> +       /DISCARD/ : {
> +               *(.options)
> +               *(.comment)
> +               *(.note)
> +       }
> +}
> diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> new file mode 100644
> index 000000000000..8f55fcd8f285
> --- /dev/null
> +++ b/arch/loongarch/boot/decompress.c
> @@ -0,0 +1,98 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +#include <linux/libfdt.h>
> +
> +#include <asm/addrspace.h>
> +
> +/*
> + * These two variables specify the free mem region
> + * that can be used for temporary malloc area
> + */
> +unsigned long free_mem_ptr;
> +unsigned long free_mem_end_ptr;
> +
> +/* The linker tells us where the image is. */
> +extern unsigned char __image_begin, __image_end;
> +
> +#define puts(s) do {} while (0)
> +#define puthex(val) do {} while (0)
> +
> +void error(char *x)
> +{
> +       puts("\n\n");
> +       puts(x);
> +       puts("\n\n -- System halted");
> +
> +       while (1)
> +               ;       /* Halt */
> +}
> +
> +/* activate the code for pre-boot environment */
> +#define STATIC static
> +
> +#include "../../../../lib/ashldi3.c"
> +
> +#ifdef CONFIG_KERNEL_GZIP
> +#include "../../../../lib/decompress_inflate.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_BZIP2
> +#include "../../../../lib/decompress_bunzip2.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZ4
> +#include "../../../../lib/decompress_unlz4.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZMA
> +#include "../../../../lib/decompress_unlzma.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZO
> +#include "../../../../lib/decompress_unlzo.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_XZ
> +#include "../../../../lib/decompress_unxz.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_ZSTD
> +#include "../../../../lib/decompress_unzstd.c"
> +#endif
> +
> +void decompress_kernel(unsigned long boot_heap_start)
> +{
> +       unsigned long zimage_start, zimage_size;
> +
> +       zimage_start = (unsigned long)(&__image_begin);
> +       zimage_size = (unsigned long)(&__image_end) -
> +           (unsigned long)(&__image_begin);
> +
> +       puts("zimage at:     ");
> +       puthex(zimage_start);
> +       puts(" ");
> +       puthex(zimage_size + zimage_start);
> +       puts("\n");
> +
> +       /* This area is reserved for malloc during decompression */
> +       free_mem_ptr = boot_heap_start;
> +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> +
> +       /* Display standard Linux/LoongArch boot prompt */
> +       puts("Uncompressing Linux at load address ");
> +       puthex(VMLINUX_LOAD_ADDRESS);
> +       puts("\n");
> +
> +       /* Decompress the kernel with the configured algorithm */
> +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> +
> +       puts("Now, booting the kernel...\n");
> +}
> diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> new file mode 100644
> index 000000000000..3f746e7c2bb5
> --- /dev/null
> +++ b/arch/loongarch/boot/string.c
> @@ -0,0 +1,166 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * arch/loongarch/boot/string.c
> + *
> + * Very small subset of simple string routines
> + */
> +
> +#include <linux/types.h>
> +
> +void __weak *memset(void *s, int c, size_t n)
> +{
> +       int i;
> +       char *ss = s;
> +
> +       for (i = 0; i < n; i++)
> +               ss[i] = c;
> +       return s;
> +}
> +
> +void __weak *memcpy(void *dest, const void *src, size_t n)
> +{
> +       int i;
> +       const char *s = src;
> +       char *d = dest;
> +
> +       for (i = 0; i < n; i++)
> +               d[i] = s[i];
> +       return dest;
> +}
> +
> +void __weak *memmove(void *dest, const void *src, size_t n)
> +{
> +       int i;
> +       const char *s = src;
> +       char *d = dest;
> +
> +       if (d < s) {
> +               for (i = 0; i < n; i++)
> +                       d[i] = s[i];
> +       } else if (d > s) {
> +               for (i = n - 1; i >= 0; i--)
> +                       d[i] = s[i];
> +       }
> +
> +       return dest;
> +}
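
One detail in the helpers above: the index is an int compared against a
size_t n. That is fine for the sizes seen at boot, but it would misbehave
for n above INT_MAX, and the backward loop in memmove depends on the index
being signed. A sketch of memmove with a size_t index is below;
boot_memmove is a hypothetical name chosen to avoid clashing with libc, and
this is an illustration rather than a proposed replacement.

```c
#include <stddef.h>

/* Hypothetical name; same contract as memmove. */
void *boot_memmove(void *dest, const void *src, size_t n)
{
	const unsigned char *s = src;
	unsigned char *d = dest;

	if (d < s) {
		/* Forward copy: safe when dest lies below src. */
		for (size_t i = 0; i < n; i++)
			d[i] = s[i];
	} else if (d > s) {
		/* Backward copy: count down without going below zero. */
		for (size_t i = n; i > 0; i--)
			d[i - 1] = s[i - 1];
	}
	return dest;
}
```
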
> +
> +int __weak memcmp(const void *cs, const void *ct, size_t count)
> +{
> +       int res = 0;
> +       const unsigned char *su1, *su2;
> +
> +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> +               res = *su1 - *su2;
> +               if (res != 0)
> +                       break;
> +       }
> +       return res;
> +}
> +
> +int __weak strcmp(const char *str1, const char *str2)
> +{
> +       int delta = 0;
> +       const unsigned char *s1 = (const unsigned char *)str1;
> +       const unsigned char *s2 = (const unsigned char *)str2;
> +
> +       while (*s1 || *s2) {
> +               delta = *s1 - *s2;
> +               if (delta)
> +                       return delta;
> +               s1++;
> +               s2++;
> +       }
> +       return 0;
> +}
> +
> +size_t __weak strlen(const char *s)
> +{
> +       const char *sc;
> +
> +       for (sc = s; *sc != '\0'; ++sc)
> +               /* nothing */;
> +       return sc - s;
> +}
> +
> +size_t __weak strnlen(const char *s, size_t count)
> +{
> +       const char *sc;
> +
> +       for (sc = s; count-- && *sc != '\0'; ++sc)
> +               /* nothing */;
> +       return sc - s;
> +}
> +
> +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> +{
> +       size_t l2;
> +
> +       l2 = strlen(s2);
> +       if (!l2)
> +               return (char *)s1;
> +       while (len >= l2) {
> +               len--;
> +               if (!memcmp(s1, s2, l2))
> +                       return (char *)s1;
> +               s1++;
> +       }
> +       return NULL;
> +}
> +
> +#undef strcat
> +char * __weak strcat(char *dest, const char *src)
> +{
> +       char *tmp = dest;
> +
> +       while (*dest)
> +               dest++;
> +       while ((*dest++ = *src++) != '\0')
> +               ;
> +       return tmp;
> +}
> +
> +char * __weak strncat(char *dest, const char *src, size_t count)
> +{
> +       char *tmp = dest;
> +
> +       if (count) {
> +               while (*dest)
> +                       dest++;
> +               while ((*dest++ = *src++) != 0) {
> +                       if (--count == 0) {
> +                               *dest = '\0';
> +                               break;
> +                       }
> +               }
> +       }
> +       return tmp;
> +}
> +
> +char * __weak strpbrk(const char *cs, const char *ct)
> +{
> +       const char *sc1, *sc2;
> +
> +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> +                       if (*sc1 == *sc2)
> +                               return (char *)sc1;
> +               }
> +       }
> +       return NULL;
> +}
> +
> +char * __weak strsep(char **s, const char *ct)
> +{
> +       char *sbegin = *s;
> +       char *end;
> +
> +       if (sbegin == NULL)
> +               return NULL;
> +
> +       end = strpbrk(sbegin, ct);
> +       if (end)
> +               *end++ = '\0';
> +       *s = end;
> +       return sbegin;
> +}
> diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> new file mode 100644
> index 000000000000..4bc50d953ec7
> --- /dev/null
> +++ b/arch/loongarch/boot/zheader.S
> @@ -0,0 +1,100 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/pe.h>
> +#include <linux/sizes.h>
> +
> +       .macro  __EFI_PE_HEADER
> +       .long   PE_MAGIC
> +coff_header:
> +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> +       .short  section_count                           /* NumberOfSections */
> +       .long   0                                       /* TimeDateStamp */
> +       .long   0                                       /* PointerToSymbolTable */
> +       .long   0                                       /* NumberOfSymbols */
> +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> +
> +optional_header:
> +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> +       .byte   0x02                                    /* MajorLinkerVersion */
> +       .byte   0x14                                    /* MinorLinkerVersion */
> +       .long   _data - efi_header_end                  /* SizeOfCode */
> +       .long   _end - _data                            /* SizeOfInitializedData */
> +       .long   0                                       /* SizeOfUninitializedData */
> +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> +       .long   efi_header_end - _head                  /* BaseOfCode */
> +
> +extra_header_fields:
> +       .quad   0                                       /* ImageBase */
> +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> +       .short  0                                       /* MajorOperatingSystemVersion */
> +       .short  0                                       /* MinorOperatingSystemVersion */
> +       .short  0                                       /* MajorImageVersion */
> +       .short  0                                       /* MinorImageVersion */
> +       .short  0                                       /* MajorSubsystemVersion */
> +       .short  0                                       /* MinorSubsystemVersion */
> +       .long   0                                       /* Win32VersionValue */
> +
> +       .long   _end - _head                            /* SizeOfImage */
> +
> +       /* Everything before the kernel image is considered part of the header */
> +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> +       .long   0                                       /* CheckSum */
> +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> +       .short  0                                       /* DllCharacteristics */
> +       .quad   0                                       /* SizeOfStackReserve */
> +       .quad   0                                       /* SizeOfStackCommit */
> +       .quad   0                                       /* SizeOfHeapReserve */
> +       .quad   0                                       /* SizeOfHeapCommit */
> +       .long   0                                       /* LoaderFlags */
> +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> +
> +       .quad   0                                       /* ExportTable */
> +       .quad   0                                       /* ImportTable */
> +       .quad   0                                       /* ResourceTable */
> +       .quad   0                                       /* ExceptionTable */
> +       .quad   0                                       /* CertificationTable */
> +       .quad   0                                       /* BaseRelocationTable */
> +
> +       /* Section table */
> +section_table:
> +       .ascii  ".text\0\0\0"
> +       .long   _data - efi_header_end                  /* VirtualSize */
> +       .long   efi_header_end - _head                  /* VirtualAddress */
> +       .long   _data - efi_header_end                  /* SizeOfRawData */
> +       .long   efi_header_end - _head                  /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_CODE | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> +
> +       .ascii  ".data\0\0\0"
> +       .long   _end - _data                            /* VirtualSize */
> +       .long   _data - _head                           /* VirtualAddress */
> +       .long   _edata - _data                          /* SizeOfRawData */
> +       .long   _data - _head                           /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> +
> +       .org 0x20e
> +       .word kernel_version - 512 -  _head
> +
> +       .set    section_count, (. - section_table) / 40
> +efi_header_end:
> +       .endm
> diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> new file mode 100644
> index 000000000000..13a8a14a2328
> --- /dev/null
> +++ b/arch/loongarch/boot/zkernel.S
> @@ -0,0 +1,99 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/init.h>
> +#include <linux/linkage.h>
> +#include <asm/addrspace.h>
> +#include <asm/asm.h>
> +#include <asm/loongarch.h>
> +#include <asm/regdef.h>
> +#include <generated/compile.h>
> +#include <generated/utsrelease.h>
> +
> +#ifdef CONFIG_EFI_STUB
> +
> +#include "zheader.S"
> +
> +       __HEAD
> +
> +_head:
> +       /* "MZ", MS-DOS header */
> +       .word   MZ_MAGIC
> +       .org    0x28
> +       .ascii  "Loongson\0"
> +       .org    0x3c
> +       /* Offset to the PE header */
> +       .long   pe_header - _head
> +
> +pe_header:
> +       __EFI_PE_HEADER
> +
> +kernel_asize:
> +       .long _end - _text
> +
> +kernel_fsize:
> +       .long _edata - _text
> +
> +kernel_vaddr:
> +       .quad VMLINUZ_LOAD_ADDRESS
> +
> +kernel_offset:
> +       .long kernel_offset - _text
> +
> +kernel_version:
> +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> +
> +SYM_L_GLOBAL(kernel_asize)
> +SYM_L_GLOBAL(kernel_fsize)
> +SYM_L_GLOBAL(kernel_vaddr)
> +SYM_L_GLOBAL(kernel_offset)
> +
> +#endif
> +
> +       __INIT
> +
> +SYM_CODE_START(kernel_entry)
> +       /* Save boot rom start args */
> +       move    s0, a0
> +       move    s1, a1
> +       move    s2, a2
> +       move    s3, a3
> +
> +       /* Config Direct Mapping */
> +       li.d    t0, CSR_DMW0_INIT
> +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> +       li.d    t0, CSR_DMW1_INIT
> +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> +
> +       /* Clear BSS */
> +       la.abs  a0, _edata
> +       la.abs  a2, _end
> +1:     st.d    zero, a0, 0
> +       addi.d  a0, a0, 8
> +       bne     a2, a0, 1b
> +
> +       la.abs  a0, .heap          /* heap address */
> +       la.abs  sp, .stack + 8192  /* stack address */
> +
> +       la      ra, 2f
> +       la      t4, decompress_kernel
> +       jirl    zero, t4, 0
> +2:
> +       move    a0, s0
> +       move    a1, s1
> +       move    a2, s2
> +       move    a3, s3
> +       PTR_LI  t4, KERNEL_ENTRY
> +       jirl    zero, t4, 0
> +3:
> +       b       3b
> +SYM_CODE_END(kernel_entry)
> +
> +       .comm .heap, BOOT_HEAP_SIZE, 4
> +       .comm .stack, BOOT_STACK_SIZE, 4
> +
> +       .align 4
> +       .section .image, "a", %progbits
> +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> new file mode 100644
> index 000000000000..8a6181c82a91
> --- /dev/null
> +++ b/arch/loongarch/tools/Makefile
> @@ -0,0 +1,15 @@
> +#
> +# arch/loongarch/tools/Makefile
> +#
> +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> +#
> +
> +hostprogs := elf-entry
> +PHONY += elf-entry
> +elf-entry: $(obj)/elf-entry
> +       @:
> +
> +hostprogs += calc_vmlinuz_load_addr
> +PHONY += calc_vmlinuz_load_addr
> +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> +       @:
> diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> new file mode 100644
> index 000000000000..5e2ca6b4dff6
> --- /dev/null
> +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> @@ -0,0 +1,51 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <errno.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <sys/stat.h>
> +
> +int main(int argc, char *argv[])
> +{
> +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> +       struct stat sb;
> +
> +       if (argc != 3) {
> +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> +               return EXIT_FAILURE;
> +       }
> +
> +       if (stat(argv[1], &sb) == -1) {
> +               perror("stat");
> +               return EXIT_FAILURE;
> +       }
> +
> +       /* Parse the hexadecimal load address string into an integer */
> +       errno = 0;
> +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> +               if (errno != 0)
> +                       perror("sscanf");
> +               else
> +                       fprintf(stderr, "No matching characters\n");
> +
> +               return EXIT_FAILURE;
> +       }
> +
> +       vmlinux_size = (uint64_t)sb.st_size;
> +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> +
> +       /*
> +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> +        * which may be as large as 64KB depending on the kernel configuration.
> +        */
> +
> +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> +
> +       printf("0x%llx\n", vmlinuz_load_addr);
> +
> +       return EXIT_SUCCESS;
> +}
> diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> new file mode 100644
> index 000000000000..c80721e0dee1
> --- /dev/null
> +++ b/arch/loongarch/tools/elf-entry.c
> @@ -0,0 +1,66 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <elf.h>
> +#include <inttypes.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +__attribute__((noreturn))
> +static void die(const char *msg)
> +{
> +       fputs(msg, stderr);
> +       exit(EXIT_FAILURE);
> +}
> +
> +int main(int argc, const char *argv[])
> +{
> +       uint64_t entry;
> +       size_t nread;
> +       FILE *file;
> +       union {
> +               Elf32_Ehdr ehdr32;
> +               Elf64_Ehdr ehdr64;
> +       } hdr;
> +
> +       if (argc != 2)
> +               die("Usage: elf-entry <elf-file>\n");
> +
> +       file = fopen(argv[1], "r");
> +       if (!file) {
> +               perror("Unable to open input file");
> +               return EXIT_FAILURE;
> +       }
> +
> +       nread = fread(&hdr, 1, sizeof(hdr), file);
> +       if (nread != sizeof(hdr)) {
> +               fclose(file);
> +               perror("Unable to read input file");
> +               return EXIT_FAILURE;
> +       }
> +
> +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> +               fclose(file);
> +               die("Input is not an ELF\n");
> +       }
> +
> +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> +       case ELFCLASS32:
> +               /* Sign extend to form a canonical address */
> +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> +               break;
> +
> +       case ELFCLASS64:
> +               entry = hdr.ehdr64.e_entry;
> +               break;
> +
> +       default:
> +               fclose(file);
> +               die("Invalid ELF class\n");
> +       }
> +
> +       fclose(file);
> +       printf("0x%016" PRIx64 "\n", entry);
> +
> +       return EXIT_SUCCESS;
> +}
> --
> 2.27.0
>
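The rounding step in the quoted calc_vmlinuz_load_addr.c can be checked in isolation. The sketch below is an illustration only and is not part of the patch: patch_round() copies the expression from the tool, while align_up() is a hypothetical helper showing the conventional round-up. Both assume the vmlinux load address is itself 64KB-aligned, as the 0x9000000000200000 in the patch is. Under that assumption the two agree, except when the size is already a multiple of 64KB, where the tool's expression adds one further 64KB block.

```c
#include <assert.h>
#include <stdint.h>

#define VMLINUZ_ALIGN 0x10000ULL	/* 64KB, as in the quoted tool */

/* The expression used by the quoted calc_vmlinuz_load_addr.c. */
static uint64_t patch_round(uint64_t load_addr, uint64_t size)
{
	uint64_t addr = load_addr + size;

	addr += VMLINUZ_ALIGN - size % VMLINUZ_ALIGN;
	return addr;
}

/* Hypothetical helper: conventional round-up to a power-of-two boundary. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

Whether the extra block matters is a judgment call; it only costs address space between the decompressed kernel and the zboot image.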

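The byte loops in the quoted string.c index with "int i" while the length is a size_t, so "i < n" compares signed with unsigned and the index cannot cover lengths above INT_MAX. That is unlikely to matter for the sizes seen in a boot stub, but a variant that counts with size_t throughout avoids the question. A minimal sketch follows; boot_memmove is a hypothetical name used for illustration and does not appear in the patch.

```c
#include <stddef.h>

/* Hypothetical name; copies n bytes and tolerates overlapping buffers. */
void *boot_memmove(void *dest, const void *src, size_t n)
{
	const unsigned char *s = src;
	unsigned char *d = dest;

	if (d < s) {
		/* Destination starts below the source: copy forwards. */
		for (size_t i = 0; i < n; i++)
			d[i] = s[i];
	} else if (d > s) {
		/* Destination starts above the source: copy backwards. */
		for (size_t i = n; i > 0; i--)
			d[i - 1] = s[i - 1];
	}
	return dest;
}
```

Counting down from n and indexing with i - 1 keeps the loop variable unsigned without relying on it ever going negative.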

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
@ 2022-04-30 10:07     ` Arnd Bergmann
  0 siblings, 0 replies; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30 10:07 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang, Linux ARM, Catalin Marinas, Will Deacon,
	linux-riscv, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Ard Biesheuvel, linux-efi

On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
>
> This patch adds zboot (self-extracting compressed kernel) support. All
> existing in-kernel compression algorithms and the EFI stub are supported.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

I have no objections to adding a decompressor in principle, and
the implementation seems reasonable. However, I think we should try to
be consistent between architectures. On both arm64 and riscv, the
maintainers decided not to include a decompressor and instead leave
it up to the boot loader to decompress the kernel and enter it from there.

As I understand it, decompression by the boot loader is not part of the
UEFI boot flow though, which means that you don't get any compressed
kernel images at all when booting using UEFI (let me know if that is
wrong). I assume this is why you decided to include the decompressor
here after all.

I think we should first aim for consistency here, and handle this the
same way across the modern architectures, either leaving the
decompressor code out, or adding it consistently. Maybe it would
even be possible to have the decompressor code as part of the
EFI stub and share it between the three architectures (x86 and
32-bit arm already support loading compressed kernels using EFI).

Adding the arm64, RISC-V and UEFI maintainers for further discussion here;
see the full patch below.

       Arnd

> ---
>  arch/loongarch/Kbuild                         |   2 +-
>  arch/loongarch/Kconfig                        |  11 ++
>  arch/loongarch/Makefile                       |  26 ++-
>  arch/loongarch/boot/Makefile                  |  55 ++++++
>  arch/loongarch/boot/boot.lds.S                |  64 +++++++
>  arch/loongarch/boot/decompress.c              |  98 +++++++++++
>  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
>  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
>  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
>  arch/loongarch/tools/Makefile                 |  15 ++
>  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
>  arch/loongarch/tools/elf-entry.c              |  66 +++++++
>  12 files changed, 749 insertions(+), 4 deletions(-)
>  create mode 100644 arch/loongarch/boot/boot.lds.S
>  create mode 100644 arch/loongarch/boot/decompress.c
>  create mode 100644 arch/loongarch/boot/string.c
>  create mode 100644 arch/loongarch/boot/zheader.S
>  create mode 100644 arch/loongarch/boot/zkernel.S
>  create mode 100644 arch/loongarch/tools/Makefile
>  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
>  create mode 100644 arch/loongarch/tools/elf-entry.c
>
> diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> index ab5373d0a24f..d907fdd7ca08 100644
> --- a/arch/loongarch/Kbuild
> +++ b/arch/loongarch/Kbuild
> @@ -3,4 +3,4 @@ obj-y += mm/
>  obj-y += vdso/
>
>  # for cleaning
> -subdir- += boot
> +subdir- += boot tools
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 55225ee5f868..6c1042746b2d 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -107,6 +107,7 @@ config LOONGARCH
>         select PERF_USE_VMALLOC
>         select RTC_LIB
>         select SPARSE_IRQ
> +       select SYS_SUPPORTS_ZBOOT
>         select SYSCTL_EXCEPTION_TRACE
>         select SWIOTLB
>         select TRACE_IRQFLAGS_SUPPORT
> @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
>         bool
>         default y
>
> +config SYS_SUPPORTS_ZBOOT
> +       bool
> +       select HAVE_KERNEL_GZIP
> +       select HAVE_KERNEL_BZIP2
> +       select HAVE_KERNEL_LZ4
> +       select HAVE_KERNEL_LZMA
> +       select HAVE_KERNEL_LZO
> +       select HAVE_KERNEL_XZ
> +       select HAVE_KERNEL_ZSTD
> +
>  config MACH_LOONGSON32
>         def_bool 32BIT
>
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index d88a792dafbe..1ed5b8466565 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -5,12 +5,31 @@
>
>  boot   := arch/loongarch/boot
>
> +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> +
>  ifndef CONFIG_EFI_STUB
>  KBUILD_IMAGE   = $(boot)/vmlinux
>  else
>  KBUILD_IMAGE   = $(boot)/vmlinux.efi
>  endif
>
> +else
> +
> +ifndef CONFIG_EFI_STUB
> +KBUILD_IMAGE   = $(boot)/vmlinuz
> +else
> +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> +endif
> +
> +endif
> +
> +load-y         = 0x9000000000200000
> +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> +
> +archscripts: scripts_basic
> +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> +
>  #
>  # Select the object file format to substitute into the linker script.
>  #
> @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
>  cflags-y += -ffreestanding
>  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
>
> -load-y         = 0x9000000000200000
> -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> -
>  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
>
>  KBUILD_AFLAGS  += $(cflags-y)
> @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
>         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
>
>  install:
> +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
>         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> +else
> +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> +endif
>         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
>         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
>
> diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> index 66f2293c34b2..c26a36004ae2 100644
> --- a/arch/loongarch/boot/Makefile
> +++ b/arch/loongarch/boot/Makefile
> @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
>
>  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
>         $(call if_changed,eficopy)
> +
> +# zboot
> +extra-y        += boot.lds
> +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> +
> +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> +
> +BOOT_HEAP_SIZE := 0x400000
> +BOOT_STACK_SIZE        := 0x002000
> +
> +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> +
> +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> +
> +targets += vmlinux.bin
> +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> +       $(call if_changed,objcopy)
> +
> +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> +
> +targets += vmlinux.bin.z
> +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> +       $(call if_changed,$(tool_y))
> +
> +targets += $(notdir $(vmlinuzobjs-y))
> +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> +
> +quiet_cmd_zld = LD      $@
> +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> +
> +targets += vmlinuz
> +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> +       $(call if_changed,zld)
> +       $(call if_changed,strip)
> +
> +targets += vmlinuz.efi
> +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> +       $(call if_changed,eficopy)
> diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> new file mode 100644
> index 000000000000..23e698782afd
> --- /dev/null
> +++ b/arch/loongarch/boot/boot.lds.S
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * ld.script for compressed kernel support of LoongArch
> + *
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include "../kernel/image-vars.h"
> +
> +/*
> + * Max available Page Size is 64K, so we set SectionAlignment
> + * field of EFI application to 64K.
> + */
> +PECOFF_FILE_ALIGN = 0x200;
> +PECOFF_SEGMENT_ALIGN = 0x10000;
> +
> +OUTPUT_ARCH(loongarch)
> +ENTRY(kernel_entry)
> +PHDRS {
> +       text PT_LOAD FLAGS(7); /* RWX */
> +}
> +SECTIONS
> +{
> +       . = VMLINUZ_LOAD_ADDRESS;
> +
> +       _text = .;
> +       .head.text : {
> +               *(.head.text)
> +       }
> +
> +       .text : {
> +               *(.text)
> +               *(.init.text)
> +               *(.rodata)
> +       }: text
> +
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> +       _data = .;
> +       .data : {
> +               *(.data)
> +               *(.init.data)
> +               /* Put the compressed image here */
> +               __image_begin = .;
> +               *(.image)
> +               __image_end = .;
> +               CONSTRUCTORS
> +               . = ALIGN(PECOFF_FILE_ALIGN);
> +       }
> +       _edata = .;
> +
> +       .bss : {
> +               *(.bss)
> +               *(.init.bss)
> +       }
> +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> +       _end = .;
> +
> +       /DISCARD/ : {
> +               *(.options)
> +               *(.comment)
> +               *(.note)
> +       }
> +}
> diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> new file mode 100644
> index 000000000000..8f55fcd8f285
> --- /dev/null
> +++ b/arch/loongarch/boot/decompress.c
> @@ -0,0 +1,98 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +#include <linux/libfdt.h>
> +
> +#include <asm/addrspace.h>
> +
> +/*
> + * These two variables specify the free mem region
> + * that can be used for temporary malloc area
> + */
> +unsigned long free_mem_ptr;
> +unsigned long free_mem_end_ptr;
> +
> +/* The linker tells us where the image is. */
> +extern unsigned char __image_begin, __image_end;
> +
> +#define puts(s) do {} while (0)
> +#define puthex(val) do {} while (0)
> +
> +void error(char *x)
> +{
> +       puts("\n\n");
> +       puts(x);
> +       puts("\n\n -- System halted");
> +
> +       while (1)
> +               ;       /* Halt */
> +}
> +
> +/* activate the code for pre-boot environment */
> +#define STATIC static
> +
> +#include "../../../../lib/ashldi3.c"
> +
> +#ifdef CONFIG_KERNEL_GZIP
> +#include "../../../../lib/decompress_inflate.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_BZIP2
> +#include "../../../../lib/decompress_bunzip2.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZ4
> +#include "../../../../lib/decompress_unlz4.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZMA
> +#include "../../../../lib/decompress_unlzma.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_LZO
> +#include "../../../../lib/decompress_unlzo.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_XZ
> +#include "../../../../lib/decompress_unxz.c"
> +#endif
> +
> +#ifdef CONFIG_KERNEL_ZSTD
> +#include "../../../../lib/decompress_unzstd.c"
> +#endif
> +
> +void decompress_kernel(unsigned long boot_heap_start)
> +{
> +       unsigned long zimage_start, zimage_size;
> +
> +       zimage_start = (unsigned long)(&__image_begin);
> +       zimage_size = (unsigned long)(&__image_end) -
> +           (unsigned long)(&__image_begin);
> +
> +       puts("zimage at:     ");
> +       puthex(zimage_start);
> +       puts(" ");
> +       puthex(zimage_size + zimage_start);
> +       puts("\n");
> +
> +       /* This area is reserved for malloc during decompression */
> +       free_mem_ptr = boot_heap_start;
> +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> +
> +       /* Display standard Linux/LoongArch boot prompt */
> +       puts("Uncompressing Linux at load address ");
> +       puthex(VMLINUX_LOAD_ADDRESS);
> +       puts("\n");
> +
> +       /* Decompress the kernel with the configured algorithm */
> +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> +
> +       puts("Now, booting the kernel...\n");
> +}
> diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> new file mode 100644
> index 000000000000..3f746e7c2bb5
> --- /dev/null
> +++ b/arch/loongarch/boot/string.c
> @@ -0,0 +1,166 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * arch/loongarch/boot/string.c
> + *
> + * Very small subset of simple string routines
> + */
> +
> +#include <linux/types.h>
> +
> +void __weak *memset(void *s, int c, size_t n)
> +{
> +       int i;
> +       char *ss = s;
> +
> +       for (i = 0; i < n; i++)
> +               ss[i] = c;
> +       return s;
> +}
> +
> +void __weak *memcpy(void *dest, const void *src, size_t n)
> +{
> +       int i;
> +       const char *s = src;
> +       char *d = dest;
> +
> +       for (i = 0; i < n; i++)
> +               d[i] = s[i];
> +       return dest;
> +}
> +
> +void __weak *memmove(void *dest, const void *src, size_t n)
> +{
> +       int i;
> +       const char *s = src;
> +       char *d = dest;
> +
> +       if (d < s) {
> +               for (i = 0; i < n; i++)
> +                       d[i] = s[i];
> +       } else if (d > s) {
> +               for (i = n - 1; i >= 0; i--)
> +                       d[i] = s[i];
> +       }
> +
> +       return dest;
> +}
> +
> +int __weak memcmp(const void *cs, const void *ct, size_t count)
> +{
> +       int res = 0;
> +       const unsigned char *su1, *su2;
> +
> +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> +               res = *su1 - *su2;
> +               if (res != 0)
> +                       break;
> +       }
> +       return res;
> +}
> +
> +int __weak strcmp(const char *str1, const char *str2)
> +{
> +       int delta = 0;
> +       const unsigned char *s1 = (const unsigned char *)str1;
> +       const unsigned char *s2 = (const unsigned char *)str2;
> +
> +       while (*s1 || *s2) {
> +               delta = *s1 - *s2;
> +               if (delta)
> +                       return delta;
> +               s1++;
> +               s2++;
> +       }
> +       return 0;
> +}
> +
> +size_t __weak strlen(const char *s)
> +{
> +       const char *sc;
> +
> +       for (sc = s; *sc != '\0'; ++sc)
> +               /* nothing */;
> +       return sc - s;
> +}
> +
> +size_t __weak strnlen(const char *s, size_t count)
> +{
> +       const char *sc;
> +
> +       for (sc = s; count-- && *sc != '\0'; ++sc)
> +               /* nothing */;
> +       return sc - s;
> +}
> +
> +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> +{
> +       size_t l2;
> +
> +       l2 = strlen(s2);
> +       if (!l2)
> +               return (char *)s1;
> +       while (len >= l2) {
> +               len--;
> +               if (!memcmp(s1, s2, l2))
> +                       return (char *)s1;
> +               s1++;
> +       }
> +       return NULL;
> +}
> +
> +#undef strcat
> +char * __weak strcat(char *dest, const char *src)
> +{
> +       char *tmp = dest;
> +
> +       while (*dest)
> +               dest++;
> +       while ((*dest++ = *src++) != '\0')
> +               ;
> +       return tmp;
> +}
> +
> +char * __weak strncat(char *dest, const char *src, size_t count)
> +{
> +       char *tmp = dest;
> +
> +       if (count) {
> +               while (*dest)
> +                       dest++;
> +               while ((*dest++ = *src++) != 0) {
> +                       if (--count == 0) {
> +                               *dest = '\0';
> +                               break;
> +                       }
> +               }
> +       }
> +       return tmp;
> +}
> +
> +char * __weak strpbrk(const char *cs, const char *ct)
> +{
> +       const char *sc1, *sc2;
> +
> +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> +                       if (*sc1 == *sc2)
> +                               return (char *)sc1;
> +               }
> +       }
> +       return NULL;
> +}
> +
> +char * __weak strsep(char **s, const char *ct)
> +{
> +       char *sbegin = *s;
> +       char *end;
> +
> +       if (sbegin == NULL)
> +               return NULL;
> +
> +       end = strpbrk(sbegin, ct);
> +       if (end)
> +               *end++ = '\0';
> +       *s = end;
> +       return sbegin;
> +}
> diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> new file mode 100644
> index 000000000000..4bc50d953ec7
> --- /dev/null
> +++ b/arch/loongarch/boot/zheader.S
> @@ -0,0 +1,100 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/pe.h>
> +#include <linux/sizes.h>
> +
> +       .macro  __EFI_PE_HEADER
> +       .long   PE_MAGIC
> +coff_header:
> +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> +       .short  section_count                           /* NumberOfSections */
> +       .long   0                                       /* TimeDateStamp */
> +       .long   0                                       /* PointerToSymbolTable */
> +       .long   0                                       /* NumberOfSymbols */
> +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> +
> +optional_header:
> +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> +       .byte   0x02                                    /* MajorLinkerVersion */
> +       .byte   0x14                                    /* MinorLinkerVersion */
> +       .long   _data - efi_header_end                  /* SizeOfCode */
> +       .long   _end - _data                            /* SizeOfInitializedData */
> +       .long   0                                       /* SizeOfUninitializedData */
> +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> +       .long   efi_header_end - _head                  /* BaseOfCode */
> +
> +extra_header_fields:
> +       .quad   0                                       /* ImageBase */
> +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> +       .short  0                                       /* MajorOperatingSystemVersion */
> +       .short  0                                       /* MinorOperatingSystemVersion */
> +       .short  0                                       /* MajorImageVersion */
> +       .short  0                                       /* MinorImageVersion */
> +       .short  0                                       /* MajorSubsystemVersion */
> +       .short  0                                       /* MinorSubsystemVersion */
> +       .long   0                                       /* Win32VersionValue */
> +
> +       .long   _end - _head                            /* SizeOfImage */
> +
> +       /* Everything before the kernel image is considered part of the header */
> +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> +       .long   0                                       /* CheckSum */
> +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> +       .short  0                                       /* DllCharacteristics */
> +       .quad   0                                       /* SizeOfStackReserve */
> +       .quad   0                                       /* SizeOfStackCommit */
> +       .quad   0                                       /* SizeOfHeapReserve */
> +       .quad   0                                       /* SizeOfHeapCommit */
> +       .long   0                                       /* LoaderFlags */
> +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> +
> +       .quad   0                                       /* ExportTable */
> +       .quad   0                                       /* ImportTable */
> +       .quad   0                                       /* ResourceTable */
> +       .quad   0                                       /* ExceptionTable */
> +       .quad   0                                       /* CertificationTable */
> +       .quad   0                                       /* BaseRelocationTable */
> +
> +       /* Section table */
> +section_table:
> +       .ascii  ".text\0\0\0"
> +       .long   _data - efi_header_end                  /* VirtualSize */
> +       .long   efi_header_end - _head                  /* VirtualAddress */
> +       .long   _data - efi_header_end                  /* SizeOfRawData */
> +       .long   efi_header_end - _head                  /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_CODE | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> +
> +       .ascii  ".data\0\0\0"
> +       .long   _end - _data                            /* VirtualSize */
> +       .long   _data - _head                           /* VirtualAddress */
> +       .long   _edata - _data                          /* SizeOfRawData */
> +       .long   _data - _head                           /* PointerToRawData */
> +
> +       .long   0                                       /* PointerToRelocations */
> +       .long   0                                       /* PointerToLineNumbers */
> +       .short  0                                       /* NumberOfRelocations */
> +       .short  0                                       /* NumberOfLineNumbers */
> +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> +               IMAGE_SCN_MEM_READ | \
> +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> +
> +       .org 0x20e
> +       .word kernel_version - 512 -  _head
> +
> +       .set    section_count, (. - section_table) / 40
> +efi_header_end:
> +       .endm
> diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> new file mode 100644
> index 000000000000..13a8a14a2328
> --- /dev/null
> +++ b/arch/loongarch/boot/zkernel.S
> @@ -0,0 +1,99 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <linux/init.h>
> +#include <linux/linkage.h>
> +#include <asm/addrspace.h>
> +#include <asm/asm.h>
> +#include <asm/loongarch.h>
> +#include <asm/regdef.h>
> +#include <generated/compile.h>
> +#include <generated/utsrelease.h>
> +
> +#ifdef CONFIG_EFI_STUB
> +
> +#include "zheader.S"
> +
> +       __HEAD
> +
> +_head:
> +       /* "MZ", MS-DOS header */
> +       .word   MZ_MAGIC
> +       .org    0x28
> +       .ascii  "Loongson\0"
> +       .org    0x3c
> +       /* Offset to the PE header */
> +       .long   pe_header - _head
> +
> +pe_header:
> +       __EFI_PE_HEADER
> +
> +kernel_asize:
> +       .long _end - _text
> +
> +kernel_fsize:
> +       .long _edata - _text
> +
> +kernel_vaddr:
> +       .quad VMLINUZ_LOAD_ADDRESS
> +
> +kernel_offset:
> +       .long kernel_offset - _text
> +
> +kernel_version:
> +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> +
> +SYM_L_GLOBAL(kernel_asize)
> +SYM_L_GLOBAL(kernel_fsize)
> +SYM_L_GLOBAL(kernel_vaddr)
> +SYM_L_GLOBAL(kernel_offset)
> +
> +#endif
> +
> +       __INIT
> +
> +SYM_CODE_START(kernel_entry)
> +       /* Save boot rom start args */
> +       move    s0, a0
> +       move    s1, a1
> +       move    s2, a2
> +       move    s3, a3
> +
> +       /* Config Direct Mapping */
> +       li.d    t0, CSR_DMW0_INIT
> +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> +       li.d    t0, CSR_DMW1_INIT
> +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> +
> +       /* Clear BSS */
> +       la.abs  a0, _edata
> +       la.abs  a2, _end
> +1:     st.d    zero, a0, 0
> +       addi.d  a0, a0, 8
> +       bne     a2, a0, 1b
> +
> +       la.abs  a0, .heap          /* heap address */
> +       la.abs  sp, .stack + 8192  /* stack address */
> +
> +       la      ra, 2f
> +       la      t4, decompress_kernel
> +       jirl    zero, t4, 0
> +2:
> +       move    a0, s0
> +       move    a1, s1
> +       move    a2, s2
> +       move    a3, s3
> +       PTR_LI  t4, KERNEL_ENTRY
> +       jirl    zero, t4, 0
> +3:
> +       b       3b
> +SYM_CODE_END(kernel_entry)
> +
> +       .comm .heap, BOOT_HEAP_SIZE, 4
> +       .comm .stack, BOOT_STACK_SIZE, 4
> +
> +       .align 4
> +       .section .image, "a", %progbits
> +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> new file mode 100644
> index 000000000000..8a6181c82a91
> --- /dev/null
> +++ b/arch/loongarch/tools/Makefile
> @@ -0,0 +1,15 @@
> +#
> +# arch/loongarch/tools/Makefile
> +#
> +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> +#
> +
> +hostprogs := elf-entry
> +PHONY += elf-entry
> +elf-entry: $(obj)/elf-entry
> +       @:
> +
> +hostprogs += calc_vmlinuz_load_addr
> +PHONY += calc_vmlinuz_load_addr
> +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> +       @:
> diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> new file mode 100644
> index 000000000000..5e2ca6b4dff6
> --- /dev/null
> +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> @@ -0,0 +1,51 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <errno.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <sys/stat.h>
> +
> +int main(int argc, char *argv[])
> +{
> +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> +       struct stat sb;
> +
> +       if (argc != 3) {
> +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> +               return EXIT_FAILURE;
> +       }
> +
> +       if (stat(argv[1], &sb) == -1) {
> +               perror("stat");
> +               return EXIT_FAILURE;
> +       }
> +
> +       /* Convert hex characters to dec number */
> +       errno = 0;
> +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> +               if (errno != 0)
> +                       perror("sscanf");
> +               else
> +                       fprintf(stderr, "No matching characters\n");
> +
> +               return EXIT_FAILURE;
> +       }
> +
> +       vmlinux_size = (uint64_t)sb.st_size;
> +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> +
> +       /*
> +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> +        * which may be as large as 64KB depending on the kernel configuration.
> +        */
> +
> +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> +
> +       printf("0x%llx\n", vmlinuz_load_addr);
> +
> +       return EXIT_SUCCESS;
> +}
> diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> new file mode 100644
> index 000000000000..c80721e0dee1
> --- /dev/null
> +++ b/arch/loongarch/tools/elf-entry.c
> @@ -0,0 +1,66 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <elf.h>
> +#include <inttypes.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +__attribute__((noreturn))
> +static void die(const char *msg)
> +{
> +       fputs(msg, stderr);
> +       exit(EXIT_FAILURE);
> +}
> +
> +int main(int argc, const char *argv[])
> +{
> +       uint64_t entry;
> +       size_t nread;
> +       FILE *file;
> +       union {
> +               Elf32_Ehdr ehdr32;
> +               Elf64_Ehdr ehdr64;
> +       } hdr;
> +
> +       if (argc != 2)
> +               die("Usage: elf-entry <elf-file>\n");
> +
> +       file = fopen(argv[1], "r");
> +       if (!file) {
> +               perror("Unable to open input file");
> +               return EXIT_FAILURE;
> +       }
> +
> +       nread = fread(&hdr, 1, sizeof(hdr), file);
> +       if (nread != sizeof(hdr)) {
> +               fclose(file);
> +               perror("Unable to read input file");
> +               return EXIT_FAILURE;
> +       }
> +
> +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> +               fclose(file);
> +               die("Input is not an ELF\n");
> +       }
> +
> +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> +       case ELFCLASS32:
> +               /* Sign extend to form a canonical address */
> +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> +               break;
> +
> +       case ELFCLASS64:
> +               entry = hdr.ehdr64.e_entry;
> +               break;
> +
> +       default:
> +               fclose(file);
> +               die("Invalid ELF class\n");
> +       }
> +
> +       fclose(file);
> +       printf("0x%016" PRIx64 "\n", entry);
> +
> +       return EXIT_SUCCESS;
> +}
> --
> 2.27.0
>

* Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-04-30 10:05     ` Huacai Chen
@ 2022-04-30 10:34       ` Arnd Bergmann
  2022-05-07 12:11         ` Christian Brauner
  0 siblings, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30 10:34 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Christian Brauner, Linux API

On Sat, Apr 30, 2022 at 12:05 PM Huacai Chen <chenhuacai@gmail.com> wrote:
> On Sat, Apr 30, 2022 at 5:45 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > >
> > > This patch adds system call support and related uaccess.h for LoongArch.
> > >
> > > Q: Why keep the __ARCH_WANT_NEW_STAT definition when there is statx?
> > > A: As of the latest glibc release (2.34), statx is only used on 32-bit
> > >    platforms, or on 64-bit platforms with 32-bit timestamps. I.e., most
> > >    64-bit platforms still use newstat now.
> > >
> > > Q: Why keep the __ARCH_WANT_SYS_CLONE definition when there is clone3?
> > > A: The latest glibc release (2.34) has some basic support for clone3, but
> > >    it isn't complete. E.g., pthread_create() and spawni() have been
> > >    converted to use clone3, but fork() still uses clone. Moreover, some
> > >    seccomp-related applications still cannot work properly with clone3.
> > >    E.g., the Chromium sandbox cannot work at all and there is no solution
> > >    for it, which is worse than the fork() story [1].
> > >
> > > [1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184
> >
> > I still think these have to be removed. There is no mainline glibc or musl
> > port yet, and neither of them should actually be required. Please remove
> > them here, and modify your libc patches accordingly when you send those
> > upstream.
>
> If this is just a problem that can be resolved by upgrading
> glibc/musl, I will remove them. But the Chromium problem (or sandbox
> problem in general) seems to have no solution now.

I added Christian Brauner to Cc now, maybe he has come across the
sandbox problem before and has an idea for a solution.

        Arnd

* Re: [PATCH V9 16/24] LoongArch: Add misc common routines
  2022-04-30 10:00     ` Huacai Chen
@ 2022-04-30 10:41       ` Arnd Bergmann
  2022-04-30 13:22         ` Palmer Dabbelt
  0 siblings, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-04-30 10:41 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Guo Ren, Palmer Dabbelt

On Sat, Apr 30, 2022 at 12:00 PM Huacai Chen <chenhuacai@gmail.com> wrote:
>
> On Sat, Apr 30, 2022 at 5:50 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > > +unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int size)
> > > +{
> > > +       u32 old32, mask, temp;
> > > +       volatile u32 *ptr32;
> > > +       unsigned int shift;
> > > +
> > > +       /* Check that ptr is naturally aligned */
> >
> > As discussed, please remove this function and all the references to it.
>
> It seems that "generic ticket spinlock" hasn't been merged in 5.18?

No, but we can merge it together with the loongarch architecture for 5.19.

I suggested you coordinate with Guo Ren and Palmer Dabbelt about how
to best merge it. The latest version was posted two weeks ago [1], and
it sounds like there are only minor issues to work out and that I can merge
v4 into the asm-generic tree before merging the loongarch code in the
same place.

     Arnd

[1] https://lore.kernel.org/lkml/20220414220214.24556-1-palmer@rivosinc.com/

* Re: [PATCH V9 16/24] LoongArch: Add misc common routines
  2022-04-30 10:41       ` Arnd Bergmann
@ 2022-04-30 13:22         ` Palmer Dabbelt
  2022-05-01  5:12           ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: Palmer Dabbelt @ 2022-04-30 13:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: chenhuacai, Arnd Bergmann, chenhuacai, luto, tglx, peterz, akpm,
	airlied, corbet, Linus Torvalds, linux-arch, linux-doc,
	linux-kernel, lixuefeng, siyanteng, guoren, kernel, jiaxun.yang,
	guoren

On Sat, 30 Apr 2022 03:41:59 PDT (-0700), Arnd Bergmann wrote:
> On Sat, Apr 30, 2022 at 12:00 PM Huacai Chen <chenhuacai@gmail.com> wrote:
>>
>> On Sat, Apr 30, 2022 at 5:50 PM Arnd Bergmann <arnd@arndb.de> wrote:
>> >
>> > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
>> >
>> > > +unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int size)
>> > > +{
>> > > +       u32 old32, mask, temp;
>> > > +       volatile u32 *ptr32;
>> > > +       unsigned int shift;
>> > > +
>> > > +       /* Check that ptr is naturally aligned */
>> >
>> > As discussed, please remove this function and all the references to it.
>>
>> It seems that "generic ticket spinlock" hasn't been merged in 5.18?
>
> No, but we can merge it together with the loongarch architecture for 5.19.
>
> I suggested you coordinate with Guo Ren and Palmer Dabbelt about how
> to best merge it. The latest version was posted two weeks ago [1], and
> it sounds like there are only minor issues to work out and that I can merge
> v4 into the asm-generic tree before merging the loongarch code in the
> same place.
>
>      Arnd
>
> [1] https://lore.kernel.org/lkml/20220414220214.24556-1-palmer@rivosinc.com/

I can just send another version, IIRC it was just that discussion about 
the memory barrier and there's already prototype code so it shouldn't be 
too bad.  I was hoping to do it sooner, sorry.

* Re: [PATCH V9 16/24] LoongArch: Add misc common routines
  2022-04-30 13:22         ` Palmer Dabbelt
@ 2022-05-01  5:12           ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01  5:12 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Li Xuefeng, Yanteng Si, Guo Ren, WANG Xuerui, Jiaxun Yang,
	guoren

Hi, Palmer,

On Sat, Apr 30, 2022 at 9:22 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Sat, 30 Apr 2022 03:41:59 PDT (-0700), Arnd Bergmann wrote:
> > On Sat, Apr 30, 2022 at 12:00 PM Huacai Chen <chenhuacai@gmail.com> wrote:
> >>
> >> On Sat, Apr 30, 2022 at 5:50 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >> >
> >> > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >> >
> >> > > +unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int size)
> >> > > +{
> >> > > +       u32 old32, mask, temp;
> >> > > +       volatile u32 *ptr32;
> >> > > +       unsigned int shift;
> >> > > +
> >> > > +       /* Check that ptr is naturally aligned */
> >> >
> >> > As discussed, please remove this function and all the references to it.
> >>
> >> It seems that "generic ticket spinlock" hasn't been merged in 5.18?
> >
> > No, but we can merge it together with the loongarch architecture for 5.19.
> >
> > I suggested you coordinate with Guo Ren and Palmer Dabbelt about how
> > to best merge it. The latest version was posted two weeks ago [1], and
> > it sounds like there are only minor issues to work out and that I can merge
> > v4 into the asm-generic tree before merging the loongarch code in the
> > same place.
> >
> >      Arnd
> >
> > [1] https://lore.kernel.org/lkml/20220414220214.24556-1-palmer@rivosinc.com/
>
> I can just send another version, IIRC it was just that discussion about
> the memory barrier and there's already prototype code so it shouldn't be
> too bad.  I was hoping to do it sooner, sorry.
I've seen your v4 patches, so I will adjust the LoongArch code to use
generic ticket spinlocks.

Huacai

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-04-30 10:07     ` Arnd Bergmann
  (?)
@ 2022-05-01  5:22       ` Huacai Chen
  -1 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01  5:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang,
	Linux ARM, Catalin Marinas, Will Deacon, linux-riscv,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Ard Biesheuvel,
	linux-efi

Hi, Arnd,

On Sat, Apr 30, 2022 at 7:02 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds zboot (self-extracting compressed kernel) support. All
> > existing in-kernel compression algorithms and the EFI stub are supported.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> I have no objections to adding a decompressor in principle, and
> the implementation seems reasonable. However, I think we should try to
> be consistent between architectures. On both arm64 and riscv, the
> maintainers decided to not include a decompressor and instead leave
> it up to the boot loader to decompress the kernel and enter it from there.
x86, ARM32 and MIPS already support self-extracting kernels, and in
5.17 we even support self-extracting modules. So I think a
self-extracting kernel is better than a plain compressed kernel.

>
> As I understand it, this is not part of the UEFI boot flow though, so it
> means that you don't get any compressed kernel images at all when
> booting using UEFI (let me know if that is wrong). I assume this is why
> you decided to include the decompressor here after all.
>
> I think we should first aim for consistency here, and handle this the
> same way across the modern architectures, either leaving the
> decompressor code out, or adding it consistently. Maybe it would
> even be possible to have the decompressor code as part of the
> EFI stub and share it between the three architectures (x86 and
> 32-bit arm already support loading compressed kernels using EFI).
>
> Adding the arm64, risc-v and uefi maintainers for further discussion here,
> see the full patch below.
Keeping things consistent across architectures (supporting self-extracting
kernels on all modern architectures) looks good to me, but can we do that
after this series? I think it will take a long time to discuss and develop.

Huacai
>
>        Arnd
>
> > ---
> >  arch/loongarch/Kbuild                         |   2 +-
> >  arch/loongarch/Kconfig                        |  11 ++
> >  arch/loongarch/Makefile                       |  26 ++-
> >  arch/loongarch/boot/Makefile                  |  55 ++++++
> >  arch/loongarch/boot/boot.lds.S                |  64 +++++++
> >  arch/loongarch/boot/decompress.c              |  98 +++++++++++
> >  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
> >  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
> >  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
> >  arch/loongarch/tools/Makefile                 |  15 ++
> >  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
> >  arch/loongarch/tools/elf-entry.c              |  66 +++++++
> >  12 files changed, 749 insertions(+), 4 deletions(-)
> >  create mode 100644 arch/loongarch/boot/boot.lds.S
> >  create mode 100644 arch/loongarch/boot/decompress.c
> >  create mode 100644 arch/loongarch/boot/string.c
> >  create mode 100644 arch/loongarch/boot/zheader.S
> >  create mode 100644 arch/loongarch/boot/zkernel.S
> >  create mode 100644 arch/loongarch/tools/Makefile
> >  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
> >  create mode 100644 arch/loongarch/tools/elf-entry.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index ab5373d0a24f..d907fdd7ca08 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -3,4 +3,4 @@ obj-y += mm/
> >  obj-y += vdso/
> >
> >  # for cleaning
> > -subdir- += boot
> > +subdir- += boot tools
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 55225ee5f868..6c1042746b2d 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -107,6 +107,7 @@ config LOONGARCH
> >         select PERF_USE_VMALLOC
> >         select RTC_LIB
> >         select SPARSE_IRQ
> > +       select SYS_SUPPORTS_ZBOOT
> >         select SYSCTL_EXCEPTION_TRACE
> >         select SWIOTLB
> >         select TRACE_IRQFLAGS_SUPPORT
> > @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
> >         bool
> >         default y
> >
> > +config SYS_SUPPORTS_ZBOOT
> > +       bool
> > +       select HAVE_KERNEL_GZIP
> > +       select HAVE_KERNEL_BZIP2
> > +       select HAVE_KERNEL_LZ4
> > +       select HAVE_KERNEL_LZMA
> > +       select HAVE_KERNEL_LZO
> > +       select HAVE_KERNEL_XZ
> > +       select HAVE_KERNEL_ZSTD
> > +
> >  config MACH_LOONGSON32
> >         def_bool 32BIT
> >
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index d88a792dafbe..1ed5b8466565 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -5,12 +5,31 @@
> >
> >  boot   := arch/loongarch/boot
> >
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> > +
> >  ifndef CONFIG_EFI_STUB
> >  KBUILD_IMAGE   = $(boot)/vmlinux
> >  else
> >  KBUILD_IMAGE   = $(boot)/vmlinux.efi
> >  endif
> >
> > +else
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinuz
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> > +endif
> > +
> > +endif
> > +
> > +load-y         = 0x9000000000200000
> > +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +archscripts: scripts_basic
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
> >  cflags-y += -ffreestanding
> >  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> >
> > -load-y         = 0x9000000000200000
> > -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > -
> >  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
> >
> >  KBUILD_AFLAGS  += $(cflags-y)
> > @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
> >         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> >         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +else
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> > +endif
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > index 66f2293c34b2..c26a36004ae2 100644
> > --- a/arch/loongarch/boot/Makefile
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
> >
> >  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> >         $(call if_changed,eficopy)
> > +
> > +# zboot
> > +extra-y        += boot.lds
> > +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> > +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> > +
> > +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> > +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> > +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> > +
> > +BOOT_HEAP_SIZE := 0x400000
> > +BOOT_STACK_SIZE        := 0x002000
> > +
> > +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +targets += vmlinux.bin
> > +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> > +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> > +       $(call if_changed,objcopy)
> > +
> > +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> > +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> > +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> > +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> > +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> > +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> > +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> > +
> > +targets += vmlinux.bin.z
> > +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> > +       $(call if_changed,$(tool_y))
> > +
> > +targets += $(notdir $(vmlinuzobjs-y))
> > +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> > +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> > +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> > +
> > +quiet_cmd_zld = LD      $@
> > +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> > +
> > +targets += vmlinuz
> > +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> > +       $(call if_changed,zld)
> > +       $(call if_changed,strip)
> > +
> > +targets += vmlinuz.efi
> > +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> > new file mode 100644
> > index 000000000000..23e698782afd
> > --- /dev/null
> > +++ b/arch/loongarch/boot/boot.lds.S
> > @@ -0,0 +1,64 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * ld.script for compressed kernel support of LoongArch
> > + *
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include "../kernel/image-vars.h"
> > +
> > +/*
> > + * The maximum available page size is 64K, so we set the SectionAlignment
> > + * field of the EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > +
> > +OUTPUT_ARCH(loongarch)
> > +ENTRY(kernel_entry)
> > +PHDRS {
> > +       text PT_LOAD FLAGS(7); /* RWX */
> > +}
> > +SECTIONS
> > +{
> > +       . = VMLINUZ_LOAD_ADDRESS;
> > +
> > +       _text = .;
> > +       .head.text : {
> > +               *(.head.text)
> > +       }
> > +
> > +       .text : {
> > +               *(.text)
> > +               *(.init.text)
> > +               *(.rodata)
> > +       }: text
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _data = .;
> > +       .data : {
> > +               *(.data)
> > +               *(.init.data)
> > +               /* Put the compressed image here */
> > +               __image_begin = .;
> > +               *(.image)
> > +               __image_end = .;
> > +               CONSTRUCTORS
> > +               . = ALIGN(PECOFF_FILE_ALIGN);
> > +       }
> > +       _edata = .;
> > +
> > +       .bss : {
> > +               *(.bss)
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _end = .;
> > +
> > +       /DISCARD/ : {
> > +               *(.options)
> > +               *(.comment)
> > +               *(.note)
> > +       }
> > +}
> > diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> > new file mode 100644
> > index 000000000000..8f55fcd8f285
> > --- /dev/null
> > +++ b/arch/loongarch/boot/decompress.c
> > @@ -0,0 +1,98 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/types.h>
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +#include <linux/libfdt.h>
> > +
> > +#include <asm/addrspace.h>
> > +
> > +/*
> > + * These two variables specify the free mem region
> > + * that can be used for temporary malloc area
> > + */
> > +unsigned long free_mem_ptr;
> > +unsigned long free_mem_end_ptr;
> > +
> > +/* The linker tells us where the image is. */
> > +extern unsigned char __image_begin, __image_end;
> > +
> > +#define puts(s) do {} while (0)
> > +#define puthex(val) do {} while (0)
> > +
> > +void error(char *x)
> > +{
> > +       puts("\n\n");
> > +       puts(x);
> > +       puts("\n\n -- System halted");
> > +
> > +       while (1)
> > +               ;       /* Halt */
> > +}
> > +
> > +/* activate the code for pre-boot environment */
> > +#define STATIC static
> > +
> > +#include "../../../../lib/ashldi3.c"
> > +
> > +#ifdef CONFIG_KERNEL_GZIP
> > +#include "../../../../lib/decompress_inflate.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_BZIP2
> > +#include "../../../../lib/decompress_bunzip2.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZ4
> > +#include "../../../../lib/decompress_unlz4.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZMA
> > +#include "../../../../lib/decompress_unlzma.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZO
> > +#include "../../../../lib/decompress_unlzo.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_XZ
> > +#include "../../../../lib/decompress_unxz.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_ZSTD
> > +#include "../../../../lib/decompress_unzstd.c"
> > +#endif
> > +
> > +void decompress_kernel(unsigned long boot_heap_start)
> > +{
> > +       unsigned long zimage_start, zimage_size;
> > +
> > +       zimage_start = (unsigned long)(&__image_begin);
> > +       zimage_size = (unsigned long)(&__image_end) -
> > +           (unsigned long)(&__image_begin);
> > +
> > +       puts("zimage at:     ");
> > +       puthex(zimage_start);
> > +       puts(" ");
> > +       puthex(zimage_size + zimage_start);
> > +       puts("\n");
> > +
> > +       /* This area is reserved for malloc during decompression */
> > +       free_mem_ptr = boot_heap_start;
> > +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> > +
> > +       /* Display standard Linux/LoongArch boot prompt */
> > +       puts("Uncompressing Linux at load address ");
> > +       puthex(VMLINUX_LOAD_ADDRESS);
> > +       puts("\n");
> > +
> > +       /* Decompress the kernel with the configured algorithm */
> > +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> > +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> > +
> > +       puts("Now, booting the kernel...\n");
> > +}
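A side note on free_mem_ptr and free_mem_end_ptr in decompress.c above: as far as I understand, the included decompressors take their temporary memory from that window with a simple bump allocator. Below is a minimal hosted sketch of the idea, not the kernel's actual helper. The names boot_heap_init() and boot_malloc() and the 8-byte alignment are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for free_mem_ptr / free_mem_end_ptr. */
static uintptr_t heap_cur;
static uintptr_t heap_end;

/* Record the window [start, start + size) as the temporary heap. */
void boot_heap_init(void *start, size_t size)
{
	assert(start != NULL);
	heap_cur = (uintptr_t)start;
	heap_end = heap_cur + size;
}

/* Hand out the next chunk, rounded up to 8 bytes; NULL when exhausted. */
void *boot_malloc(size_t size)
{
	uintptr_t p = (heap_cur + 7) & ~(uintptr_t)7;

	if (p > heap_end || size > heap_end - p)
		return NULL;
	heap_cur = p + size;
	return (void *)p;
}
```

Nothing is ever freed individually in this sketch; the whole window is simply abandoned once the kernel has been unpacked, which is why a fixed BOOT_HEAP_SIZE region is enough.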
> > diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> > new file mode 100644
> > index 000000000000..3f746e7c2bb5
> > --- /dev/null
> > +++ b/arch/loongarch/boot/string.c
> > @@ -0,0 +1,166 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/loongarch/boot/string.c
> > + *
> > + * Very small subset of simple string routines
> > + */
> > +
> > +#include <linux/types.h>
> > +
> > +void __weak *memset(void *s, int c, size_t n)
> > +{
> > +       int i;
> > +       char *ss = s;
> > +
> > +       for (i = 0; i < n; i++)
> > +               ss[i] = c;
> > +       return s;
> > +}
> > +
> > +void __weak *memcpy(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       for (i = 0; i < n; i++)
> > +               d[i] = s[i];
> > +       return dest;
> > +}
> > +
> > +void __weak *memmove(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       if (d < s) {
> > +               for (i = 0; i < n; i++)
> > +                       d[i] = s[i];
> > +       } else if (d > s) {
> > +               for (i = n - 1; i >= 0; i--)
> > +                       d[i] = s[i];
> > +       }
> > +
> > +       return dest;
> > +}
> > +
> > +int __weak memcmp(const void *cs, const void *ct, size_t count)
> > +{
> > +       int res = 0;
> > +       const unsigned char *su1, *su2;
> > +
> > +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> > +               res = *su1 - *su2;
> > +               if (res != 0)
> > +                       break;
> > +       }
> > +       return res;
> > +}
> > +
> > +int __weak strcmp(const char *str1, const char *str2)
> > +{
> > +       int delta = 0;
> > +       const unsigned char *s1 = (const unsigned char *)str1;
> > +       const unsigned char *s2 = (const unsigned char *)str2;
> > +
> > +       while (*s1 || *s2) {
> > +               delta = *s1 - *s2;
> > +               if (delta)
> > +                       return delta;
> > +               s1++;
> > +               s2++;
> > +       }
> > +       return 0;
> > +}
> > +
> > +size_t __weak strlen(const char *s)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +size_t __weak strnlen(const char *s, size_t count)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; count-- && *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> > +{
> > +       size_t l2;
> > +
> > +       l2 = strlen(s2);
> > +       if (!l2)
> > +               return (char *)s1;
> > +       while (len >= l2) {
> > +               len--;
> > +               if (!memcmp(s1, s2, l2))
> > +                       return (char *)s1;
> > +               s1++;
> > +       }
> > +       return NULL;
> > +}
> > +
> > +#undef strcat
> > +char * __weak strcat(char *dest, const char *src)
> > +{
> > +       char *tmp = dest;
> > +
> > +       while (*dest)
> > +               dest++;
> > +       while ((*dest++ = *src++) != '\0')
> > +               ;
> > +       return tmp;
> > +}
> > +
> > +char * __weak strncat(char *dest, const char *src, size_t count)
> > +{
> > +       char *tmp = dest;
> > +
> > +       if (count) {
> > +               while (*dest)
> > +                       dest++;
> > +               while ((*dest++ = *src++) != 0) {
> > +                       if (--count == 0) {
> > +                               *dest = '\0';
> > +                               break;
> > +                       }
> > +               }
> > +       }
> > +       return tmp;
> > +}
> > +
> > +char * __weak strpbrk(const char *cs, const char *ct)
> > +{
> > +       const char *sc1, *sc2;
> > +
> > +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> > +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> > +                       if (*sc1 == *sc2)
> > +                               return (char *)sc1;
> > +               }
> > +       }
> > +       return NULL;
> > +}
> > +
> > +char * __weak strsep(char **s, const char *ct)
> > +{
> > +       char *sbegin = *s;
> > +       char *end;
> > +
> > +       if (sbegin == NULL)
> > +               return NULL;
> > +
> > +       end = strpbrk(sbegin, ct);
> > +       if (end)
> > +               *end++ = '\0';
> > +       *s = end;
> > +       return sbegin;
> > +}
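One thing I noticed in the string.c quoted above: memset(), memcpy() and memmove() index with an int while the count is a size_t, so the loops only behave for counts up to INT_MAX, and the backward loop in memmove() relies on n - 1 converting to -1 when n is 0. That is probably fine for a boot image, but a variant that counts in size_t avoids the question. A minimal sketch follows; the name boot_memmove() is hypothetical and chosen so it does not clash with libc when built on a host.

```c
#include <assert.h>
#include <stddef.h>

/* Overlap-safe copy that never mixes signed and unsigned indices. */
void *boot_memmove(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	size_t i;

	assert(n == 0 || (dest != NULL && src != NULL));

	if (d < s) {
		/* Destination is lower: copy forwards. */
		for (i = 0; i < n; i++)
			d[i] = s[i];
	} else if (d > s) {
		/* Destination is higher: copy backwards, i counts down to 1. */
		for (i = n; i > 0; i--)
			d[i - 1] = s[i - 1];
	}
	return dest;
}
```

The backward loop runs i from n down to 1 and indexes with i - 1, so it terminates for n == 0 without depending on any wrap-around.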
> > diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> > new file mode 100644
> > index 000000000000..4bc50d953ec7
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zheader.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:
> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   _data - efi_header_end                  /* SizeOfCode */
> > +       .long   _end - _data                            /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */
> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   _data - efi_header_end                  /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   _data - efi_header_end                  /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - _data                            /* VirtualSize */
> > +       .long   _data - _head                           /* VirtualAddress */
> > +       .long   _edata - _data                          /* SizeOfRawData */
> > +       .long   _data - _head                           /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
> > diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> > new file mode 100644
> > index 000000000000..13a8a14a2328
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zkernel.S
> > @@ -0,0 +1,99 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/linkage.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/asm.h>
> > +#include <asm/loongarch.h>
> > +#include <asm/regdef.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "zheader.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"
> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUZ_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)
> > +
> > +#endif
> > +
> > +       __INIT
> > +
> > +SYM_CODE_START(kernel_entry)
> > +       /* Save boot rom start args */
> > +       move    s0, a0
> > +       move    s1, a1
> > +       move    s2, a2
> > +       move    s3, a3
> > +
> > +       /* Config Direct Mapping */
> > +       li.d    t0, CSR_DMW0_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> > +       li.d    t0, CSR_DMW1_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> > +
> > +       /* Clear BSS */
> > +       la.abs  a0, _edata
> > +       la.abs  a2, _end
> > +1:     st.d    zero, a0, 0
> > +       addi.d  a0, a0, 8
> > +       bne     a2, a0, 1b
> > +
> > +       la.abs  a0, .heap          /* heap address */
> > +       la.abs  sp, .stack + 8192  /* stack address */
> > +
> > +       la      ra, 2f
> > +       la      t4, decompress_kernel
> > +       jirl    zero, t4, 0
> > +2:
> > +       move    a0, s0
> > +       move    a1, s1
> > +       move    a2, s2
> > +       move    a3, s3
> > +       PTR_LI  t4, KERNEL_ENTRY
> > +       jirl    zero, t4, 0
> > +3:
> > +       b       3b
> > +SYM_CODE_END(kernel_entry)
> > +
> > +       .comm .heap, BOOT_HEAP_SIZE, 4
> > +       .comm .stack, BOOT_STACK_SIZE, 4
> > +
> > +       .align 4
> > +       .section .image, "a", %progbits
> > +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> > diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> > new file mode 100644
> > index 000000000000..8a6181c82a91
> > --- /dev/null
> > +++ b/arch/loongarch/tools/Makefile
> > @@ -0,0 +1,15 @@
> > +#
> > +# arch/loongarch/tools/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +hostprogs := elf-entry
> > +PHONY += elf-entry
> > +elf-entry: $(obj)/elf-entry
> > +       @:
> > +
> > +hostprogs += calc_vmlinuz_load_addr
> > +PHONY += calc_vmlinuz_load_addr
> > +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> > +       @:
> > diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > new file mode 100644
> > index 000000000000..5e2ca6b4dff6
> > --- /dev/null
> > +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > @@ -0,0 +1,51 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <errno.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> > +       struct stat sb;
> > +
> > +       if (argc != 3) {
> > +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (stat(argv[1], &sb) == -1) {
> > +               perror("stat");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       /* Parse the load address, given as a hexadecimal string */
> > +       errno = 0;
> > +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> > +               if (errno != 0)
> > +                       perror("sscanf");
> > +               else
> > +                       fprintf(stderr, "No matching characters\n");
> > +
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       vmlinux_size = (uint64_t)sb.st_size;
> > +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> > +
> > +       /*
> > +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> > +        * which may be as large as 64KB depending on the kernel configuration.
> > +        */
> > +
> > +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> > +
> > +       printf("0x%llx\n", vmlinuz_load_addr);
> > +
> > +       return EXIT_SUCCESS;
> > +}
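A note on the rounding in calc_vmlinuz_load_addr.c above: adding 0x10000 - size % 0x10000 always advances the address, so when vmlinux.bin is already a multiple of 64KB the result skips one further 64KB block rather than staying put. As far as I can see that is harmless (it only leaves a gap), and both forms land on a 64KB boundary as long as the load address itself is aligned. The sketch below compares the two; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_64K 0x10000ULL

/* Rounding as written in the patch: always moves past the image end. */
uint64_t vmlinuz_addr_patch(uint64_t load_addr, uint64_t size)
{
	uint64_t addr = load_addr + size;

	return addr + (ALIGN_64K - size % ALIGN_64K);
}

/* Conventional round-up: an already aligned end address is kept. */
uint64_t vmlinuz_addr_roundup(uint64_t load_addr, uint64_t size)
{
	uint64_t addr = load_addr + size;

	assert(addr >= load_addr);	/* no wrap-around */
	return (addr + ALIGN_64K - 1) & ~(ALIGN_64K - 1);
}
```

With the load-y value from arch/loongarch/Makefile (0x9000000000200000) and a size of 0x12345, both functions return 0x9000000000220000. With a size of exactly 0x20000, the patch form returns 0x9000000000230000 and the round-up form 0x9000000000220000.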
> > diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> > new file mode 100644
> > index 000000000000..c80721e0dee1
> > --- /dev/null
> > +++ b/arch/loongarch/tools/elf-entry.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <elf.h>
> > +#include <inttypes.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +__attribute__((noreturn))
> > +static void die(const char *msg)
> > +{
> > +       fputs(msg, stderr);
> > +       exit(EXIT_FAILURE);
> > +}
> > +
> > +int main(int argc, const char *argv[])
> > +{
> > +       uint64_t entry;
> > +       size_t nread;
> > +       FILE *file;
> > +       union {
> > +               Elf32_Ehdr ehdr32;
> > +               Elf64_Ehdr ehdr64;
> > +       } hdr;
> > +
> > +       if (argc != 2)
> > +               die("Usage: elf-entry <elf-file>\n");
> > +
> > +       file = fopen(argv[1], "r");
> > +       if (!file) {
> > +               perror("Unable to open input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       nread = fread(&hdr, 1, sizeof(hdr), file);
> > +       if (nread != sizeof(hdr)) {
> > +               fclose(file);
> > +               perror("Unable to read input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> > +               fclose(file);
> > +               die("Input is not an ELF\n");
> > +       }
> > +
> > +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> > +       case ELFCLASS32:
> > +               /* Sign extend to form a canonical address */
> > +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> > +               break;
> > +
> > +       case ELFCLASS64:
> > +               entry = hdr.ehdr64.e_entry;
> > +               break;
> > +
> > +       default:
> > +               fclose(file);
> > +               die("Invalid ELF class\n");
> > +       }
> > +
> > +       fclose(file);
> > +       printf("0x%016" PRIx64 "\n", entry);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > --
> > 2.27.0
> >

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
@ 2022-05-01  5:22       ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01  5:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang,
	Linux ARM, Catalin Marinas, Will Deacon, linux-riscv,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Ard Biesheuvel,
	linux-efi

Hi, Arnd,

On Sat, Apr 30, 2022 at 7:02 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds zboot (self-extracting compressed kernel) support. All
> > existing in-kernel compression algorithms and the EFI stub are supported.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> I have no objections to adding a decompressor in principle, and
> the implementation seems reasonable. However, I think we should try to
> be consistent between architectures. On both arm64 and riscv, the
> maintainers decided to not include a decompressor and instead leave
> it up to the boot loader to decompress the kernel and enter it from there.
X86, ARM32 and MIPS already support self-extracting kernels, and since
5.17 we even support self-extracting modules. So I think a
self-extracting kernel is better than a plain compressed kernel.

>
> As I understand it, this is not part of the UEFI boot flow though, so it
> means that you don't get any compressed kernel images at all when
> booting using UEFI (let me know if that is wrong). I assume this is why
> you decided to include the decompressor here after all.
>
> I think we should first aim for consistency here, and handle this the
> same way across the modern architectures, either leaving the
> decompressor code out, or adding it consistently. Maybe it would
> even be possible to have the decompressor code as part of the
> EFI stub and share it between the three architectures (x86 and
> 32-bit arm already support loading compressed kernels using EFI).
>
> Adding the arm64, risc-v and uefi maintainers for further discussion here,
> see the full patch below.
Staying consistent across architectures (supporting self-extraction on
all modern architectures) sounds good to me, but can we do that after
this series? I think it will take a long time to discuss and develop.

Huacai
>
>        Arnd
>
> > ---
> >  arch/loongarch/Kbuild                         |   2 +-
> >  arch/loongarch/Kconfig                        |  11 ++
> >  arch/loongarch/Makefile                       |  26 ++-
> >  arch/loongarch/boot/Makefile                  |  55 ++++++
> >  arch/loongarch/boot/boot.lds.S                |  64 +++++++
> >  arch/loongarch/boot/decompress.c              |  98 +++++++++++
> >  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
> >  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
> >  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
> >  arch/loongarch/tools/Makefile                 |  15 ++
> >  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
> >  arch/loongarch/tools/elf-entry.c              |  66 +++++++
> >  12 files changed, 749 insertions(+), 4 deletions(-)
> >  create mode 100644 arch/loongarch/boot/boot.lds.S
> >  create mode 100644 arch/loongarch/boot/decompress.c
> >  create mode 100644 arch/loongarch/boot/string.c
> >  create mode 100644 arch/loongarch/boot/zheader.S
> >  create mode 100644 arch/loongarch/boot/zkernel.S
> >  create mode 100644 arch/loongarch/tools/Makefile
> >  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
> >  create mode 100644 arch/loongarch/tools/elf-entry.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index ab5373d0a24f..d907fdd7ca08 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -3,4 +3,4 @@ obj-y += mm/
> >  obj-y += vdso/
> >
> >  # for cleaning
> > -subdir- += boot
> > +subdir- += boot tools
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 55225ee5f868..6c1042746b2d 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -107,6 +107,7 @@ config LOONGARCH
> >         select PERF_USE_VMALLOC
> >         select RTC_LIB
> >         select SPARSE_IRQ
> > +       select SYS_SUPPORTS_ZBOOT
> >         select SYSCTL_EXCEPTION_TRACE
> >         select SWIOTLB
> >         select TRACE_IRQFLAGS_SUPPORT
> > @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
> >         bool
> >         default y
> >
> > +config SYS_SUPPORTS_ZBOOT
> > +       bool
> > +       select HAVE_KERNEL_GZIP
> > +       select HAVE_KERNEL_BZIP2
> > +       select HAVE_KERNEL_LZ4
> > +       select HAVE_KERNEL_LZMA
> > +       select HAVE_KERNEL_LZO
> > +       select HAVE_KERNEL_XZ
> > +       select HAVE_KERNEL_ZSTD
> > +
> >  config MACH_LOONGSON32
> >         def_bool 32BIT
> >
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index d88a792dafbe..1ed5b8466565 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -5,12 +5,31 @@
> >
> >  boot   := arch/loongarch/boot
> >
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> > +
> >  ifndef CONFIG_EFI_STUB
> >  KBUILD_IMAGE   = $(boot)/vmlinux
> >  else
> >  KBUILD_IMAGE   = $(boot)/vmlinux.efi
> >  endif
> >
> > +else
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinuz
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> > +endif
> > +
> > +endif
> > +
> > +load-y         = 0x9000000000200000
> > +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +archscripts: scripts_basic
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
> >  cflags-y += -ffreestanding
> >  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> >
> > -load-y         = 0x9000000000200000
> > -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > -
> >  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
> >
> >  KBUILD_AFLAGS  += $(cflags-y)
> > @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
> >         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> >         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +else
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> > +endif
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > index 66f2293c34b2..c26a36004ae2 100644
> > --- a/arch/loongarch/boot/Makefile
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
> >
> >  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> >         $(call if_changed,eficopy)
> > +
> > +# zboot
> > +extra-y        += boot.lds
> > +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> > +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> > +
> > +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> > +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> > +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> > +
> > +BOOT_HEAP_SIZE := 0x400000
> > +BOOT_STACK_SIZE        := 0x002000
> > +
> > +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +targets += vmlinux.bin
> > +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> > +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> > +       $(call if_changed,objcopy)
> > +
> > +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> > +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> > +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> > +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> > +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> > +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> > +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> > +
> > +targets += vmlinux.bin.z
> > +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> > +       $(call if_changed,$(tool_y))
> > +
> > +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> > +targets += $(notdir $(vmlinuzobjs-y))
> > +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> > +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> > +
> > +quiet_cmd_zld = LD      $@
> > +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> > +
> > +targets += vmlinuz
> > +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> > +       $(call if_changed,zld)
> > +       $(call if_changed,strip)
> > +
> > +targets += vmlinuz.efi
> > +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> > new file mode 100644
> > index 000000000000..23e698782afd
> > --- /dev/null
> > +++ b/arch/loongarch/boot/boot.lds.S
> > @@ -0,0 +1,64 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * ld.script for compressed kernel support of LoongArch
> > + *
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include "../kernel/image-vars.h"
> > +
> > +/*
> > + * The maximum available page size is 64K, so we set the SectionAlignment
> > + * field of the EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > +
> > +OUTPUT_ARCH(loongarch)
> > +ENTRY(kernel_entry)
> > +PHDRS {
> > +       text PT_LOAD FLAGS(7); /* RWX */
> > +}
> > +SECTIONS
> > +{
> > +       . = VMLINUZ_LOAD_ADDRESS;
> > +
> > +       _text = .;
> > +       .head.text : {
> > +               *(.head.text)
> > +       }
> > +
> > +       .text : {
> > +               *(.text)
> > +               *(.init.text)
> > +               *(.rodata)
> > +       }: text
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _data = .;
> > +       .data : {
> > +               *(.data)
> > +               *(.init.data)
> > +               /* Put the compressed image here */
> > +               __image_begin = .;
> > +               *(.image)
> > +               __image_end = .;
> > +               CONSTRUCTORS
> > +               . = ALIGN(PECOFF_FILE_ALIGN);
> > +       }
> > +       _edata = .;
> > +
> > +       .bss : {
> > +               *(.bss)
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _end = .;
> > +
> > +       /DISCARD/ : {
> > +               *(.options)
> > +               *(.comment)
> > +               *(.note)
> > +       }
> > +}
> > diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> > new file mode 100644
> > index 000000000000..8f55fcd8f285
> > --- /dev/null
> > +++ b/arch/loongarch/boot/decompress.c
> > @@ -0,0 +1,98 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/types.h>
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +#include <linux/libfdt.h>
> > +
> > +#include <asm/addrspace.h>
> > +
> > +/*
> > + * These two variables specify the free mem region
> > + * that can be used for temporary malloc area
> > + */
> > +unsigned long free_mem_ptr;
> > +unsigned long free_mem_end_ptr;
> > +
> > +/* The linker tells us where the image is. */
> > +extern unsigned char __image_begin, __image_end;
> > +
> > +#define puts(s) do {} while (0)
> > +#define puthex(val) do {} while (0)
> > +
> > +void error(char *x)
> > +{
> > +       puts("\n\n");
> > +       puts(x);
> > +       puts("\n\n -- System halted");
> > +
> > +       while (1)
> > +               ;       /* Halt */
> > +}
> > +
> > +/* activate the code for pre-boot environment */
> > +#define STATIC static
> > +
> > +#include "../../../../lib/ashldi3.c"
> > +
> > +#ifdef CONFIG_KERNEL_GZIP
> > +#include "../../../../lib/decompress_inflate.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_BZIP2
> > +#include "../../../../lib/decompress_bunzip2.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZ4
> > +#include "../../../../lib/decompress_unlz4.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZMA
> > +#include "../../../../lib/decompress_unlzma.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZO
> > +#include "../../../../lib/decompress_unlzo.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_XZ
> > +#include "../../../../lib/decompress_unxz.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_ZSTD
> > +#include "../../../../lib/decompress_unzstd.c"
> > +#endif
> > +
> > +void decompress_kernel(unsigned long boot_heap_start)
> > +{
> > +       unsigned long zimage_start, zimage_size;
> > +
> > +       zimage_start = (unsigned long)(&__image_begin);
> > +       zimage_size = (unsigned long)(&__image_end) -
> > +           (unsigned long)(&__image_begin);
> > +
> > +       puts("zimage at:     ");
> > +       puthex(zimage_start);
> > +       puts(" ");
> > +       puthex(zimage_size + zimage_start);
> > +       puts("\n");
> > +
> > +       /* This area is reserved for malloc during decompression */
> > +       free_mem_ptr = boot_heap_start;
> > +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> > +
> > +       /* Display standard Linux/LoongArch boot prompt */
> > +       puts("Uncompressing Linux at load address ");
> > +       puthex(VMLINUX_LOAD_ADDRESS);
> > +       puts("\n");
> > +
> > +       /* Decompress the kernel with the configured algorithm */
> > +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> > +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> > +
> > +       puts("Now, booting the kernel...\n");
> > +}
> > diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> > new file mode 100644
> > index 000000000000..3f746e7c2bb5
> > --- /dev/null
> > +++ b/arch/loongarch/boot/string.c
> > @@ -0,0 +1,166 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/loongarch/boot/string.c
> > + *
> > + * Very small subset of simple string routines
> > + */
> > +
> > +#include <linux/types.h>
> > +
> > +void __weak *memset(void *s, int c, size_t n)
> > +{
> > +       size_t i;
> > +       char *ss = s;
> > +
> > +       for (i = 0; i < n; i++)
> > +               ss[i] = c;
> > +       return s;
> > +}
> > +
> > +void __weak *memcpy(void *dest, const void *src, size_t n)
> > +{
> > +       size_t i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       for (i = 0; i < n; i++)
> > +               d[i] = s[i];
> > +       return dest;
> > +}
> > +
> > +void __weak *memmove(void *dest, const void *src, size_t n)
> > +{
> > +       size_t i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       if (d < s) {
> > +               for (i = 0; i < n; i++)
> > +                       d[i] = s[i];
> > +       } else if (d > s) {
> > +               for (i = n; i > 0; i--)
> > +                       d[i - 1] = s[i - 1];
> > +       }
> > +
> > +       return dest;
> > +}
> > +
> > +int __weak memcmp(const void *cs, const void *ct, size_t count)
> > +{
> > +       int res = 0;
> > +       const unsigned char *su1, *su2;
> > +
> > +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> > +               res = *su1 - *su2;
> > +               if (res != 0)
> > +                       break;
> > +       }
> > +       return res;
> > +}
> > +
> > +int __weak strcmp(const char *str1, const char *str2)
> > +{
> > +       int delta = 0;
> > +       const unsigned char *s1 = (const unsigned char *)str1;
> > +       const unsigned char *s2 = (const unsigned char *)str2;
> > +
> > +       while (*s1 || *s2) {
> > +               delta = *s1 - *s2;
> > +               if (delta)
> > +                       return delta;
> > +               s1++;
> > +               s2++;
> > +       }
> > +       return 0;
> > +}
> > +
> > +size_t __weak strlen(const char *s)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +size_t __weak strnlen(const char *s, size_t count)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; count-- && *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> > +{
> > +       size_t l2;
> > +
> > +       l2 = strlen(s2);
> > +       if (!l2)
> > +               return (char *)s1;
> > +       while (len >= l2) {
> > +               len--;
> > +               if (!memcmp(s1, s2, l2))
> > +                       return (char *)s1;
> > +               s1++;
> > +       }
> > +       return NULL;
> > +}
> > +
> > +#undef strcat
> > +char * __weak strcat(char *dest, const char *src)
> > +{
> > +       char *tmp = dest;
> > +
> > +       while (*dest)
> > +               dest++;
> > +       while ((*dest++ = *src++) != '\0')
> > +               ;
> > +       return tmp;
> > +}
> > +
> > +char * __weak strncat(char *dest, const char *src, size_t count)
> > +{
> > +       char *tmp = dest;
> > +
> > +       if (count) {
> > +               while (*dest)
> > +                       dest++;
> > +               while ((*dest++ = *src++) != 0) {
> > +                       if (--count == 0) {
> > +                               *dest = '\0';
> > +                               break;
> > +                       }
> > +               }
> > +       }
> > +       return tmp;
> > +}
> > +
> > +char * __weak strpbrk(const char *cs, const char *ct)
> > +{
> > +       const char *sc1, *sc2;
> > +
> > +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> > +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> > +                       if (*sc1 == *sc2)
> > +                               return (char *)sc1;
> > +               }
> > +       }
> > +       return NULL;
> > +}
> > +
> > +char * __weak strsep(char **s, const char *ct)
> > +{
> > +       char *sbegin = *s;
> > +       char *end;
> > +
> > +       if (sbegin == NULL)
> > +               return NULL;
> > +
> > +       end = strpbrk(sbegin, ct);
> > +       if (end)
> > +               *end++ = '\0';
> > +       *s = end;
> > +       return sbegin;
> > +}
> > diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> > new file mode 100644
> > index 000000000000..4bc50d953ec7
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zheader.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:
> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   _data - efi_header_end                  /* SizeOfCode */
> > +       .long   _end - _data                            /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */
> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   _data - efi_header_end                  /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   _data - efi_header_end                  /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - _data                            /* VirtualSize */
> > +       .long   _data - _head                           /* VirtualAddress */
> > +       .long   _edata - _data                          /* SizeOfRawData */
> > +       .long   _data - _head                           /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
> > diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> > new file mode 100644
> > index 000000000000..13a8a14a2328
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zkernel.S
> > @@ -0,0 +1,99 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/linkage.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/asm.h>
> > +#include <asm/loongarch.h>
> > +#include <asm/regdef.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "zheader.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"
> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUZ_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)
> > +
> > +#endif
> > +
> > +       __INIT
> > +
> > +SYM_CODE_START(kernel_entry)
> > +       /* Save boot rom start args */
> > +       move    s0, a0
> > +       move    s1, a1
> > +       move    s2, a2
> > +       move    s3, a3
> > +
> > +       /* Config Direct Mapping */
> > +       li.d    t0, CSR_DMW0_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> > +       li.d    t0, CSR_DMW1_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> > +
> > +       /* Clear BSS */
> > +       la.abs  a0, _edata
> > +       la.abs  a2, _end
> > +1:     st.d    zero, a0, 0
> > +       addi.d  a0, a0, 8
> > +       bne     a2, a0, 1b
> > +
> > +       la.abs  a0, .heap          /* heap address */
> > +       la.abs  sp, .stack + BOOT_STACK_SIZE  /* top of stack */
> > +
> > +       la      ra, 2f
> > +       la      t4, decompress_kernel
> > +       jirl    zero, t4, 0
> > +2:
> > +       move    a0, s0
> > +       move    a1, s1
> > +       move    a2, s2
> > +       move    a3, s3
> > +       PTR_LI  t4, KERNEL_ENTRY
> > +       jirl    zero, t4, 0
> > +3:
> > +       b       3b
> > +SYM_CODE_END(kernel_entry)
> > +
> > +       .comm .heap, BOOT_HEAP_SIZE, 4
> > +       .comm .stack, BOOT_STACK_SIZE, 4
> > +
> > +       .align 4
> > +       .section .image, "a", %progbits
> > +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> > diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> > new file mode 100644
> > index 000000000000..8a6181c82a91
> > --- /dev/null
> > +++ b/arch/loongarch/tools/Makefile
> > @@ -0,0 +1,15 @@
> > +#
> > +# arch/loongarch/tools/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +hostprogs := elf-entry
> > +PHONY += elf-entry
> > +elf-entry: $(obj)/elf-entry
> > +       @:
> > +
> > +hostprogs += calc_vmlinuz_load_addr
> > +PHONY += calc_vmlinuz_load_addr
> > +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> > +       @:
> > diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > new file mode 100644
> > index 000000000000..5e2ca6b4dff6
> > --- /dev/null
> > +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > @@ -0,0 +1,51 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <errno.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> > +       struct stat sb;
> > +
> > +       if (argc != 3) {
> > +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (stat(argv[1], &sb) == -1) {
> > +               perror("stat");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       /* Parse the hexadecimal load address string into a number */
> > +       errno = 0;
> > +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> > +               if (errno != 0)
> > +                       perror("sscanf");
> > +               else
> > +                       fprintf(stderr, "No matching characters\n");
> > +
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       vmlinux_size = (uint64_t)sb.st_size;
> > +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> > +
> > +       /*
> > +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> > +        * which may be as large as 64KB depending on the kernel configuration.
> > +        */
> > +
> > +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> > +
> > +       printf("0x%llx\n", vmlinuz_load_addr);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> > new file mode 100644
> > index 000000000000..c80721e0dee1
> > --- /dev/null
> > +++ b/arch/loongarch/tools/elf-entry.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <elf.h>
> > +#include <inttypes.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +__attribute__((noreturn))
> > +static void die(const char *msg)
> > +{
> > +       fputs(msg, stderr);
> > +       exit(EXIT_FAILURE);
> > +}
> > +
> > +int main(int argc, const char *argv[])
> > +{
> > +       uint64_t entry;
> > +       size_t nread;
> > +       FILE *file;
> > +       union {
> > +               Elf32_Ehdr ehdr32;
> > +               Elf64_Ehdr ehdr64;
> > +       } hdr;
> > +
> > +       if (argc != 2)
> > +               die("Usage: elf-entry <elf-file>\n");
> > +
> > +       file = fopen(argv[1], "r");
> > +       if (!file) {
> > +               perror("Unable to open input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       nread = fread(&hdr, 1, sizeof(hdr), file);
> > +       if (nread != sizeof(hdr)) {
> > +               perror("Unable to read input file");
> > +               fclose(file);
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> > +               fclose(file);
> > +               die("Input is not an ELF\n");
> > +       }
> > +
> > +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> > +       case ELFCLASS32:
> > +               /* Sign extend to form a canonical address */
> > +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> > +               break;
> > +
> > +       case ELFCLASS64:
> > +               entry = hdr.ehdr64.e_entry;
> > +               break;
> > +
> > +       default:
> > +               fclose(file);
> > +               die("Invalid ELF class\n");
> > +       }
> > +
> > +       fclose(file);
> > +       printf("0x%016" PRIx64 "\n", entry);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > --
> > 2.27.0
> >

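For readers following the load-address arithmetic in
calc_vmlinuz_load_addr.c quoted above, here is a minimal sketch in C. It
is not part of the patch: the function names and the ZBOOT_ALIGN macro
are made up for illustration. The first function reproduces the
expression from the patch, which yields an aligned result only if the
vmlinux load address is itself 64KB-aligned (0x9000000000200000 is). The
second is a conventional round-up, shown only for comparison.

```c
#include <assert.h>
#include <stdint.h>

/* 64KB, the largest page size mentioned in the patch (name is hypothetical) */
#define ZBOOT_ALIGN 0x10000ULL

/*
 * Same arithmetic as the quoted tool: step past the image, then add the
 * distance from the image size to the next 64KB multiple. When the size
 * is already a multiple of 64KB this adds a full extra 64KB.
 */
static uint64_t load_addr_as_posted(uint64_t vmlinux_load_addr,
				    uint64_t vmlinux_size)
{
	uint64_t addr = vmlinux_load_addr + vmlinux_size;

	addr += ZBOOT_ALIGN - vmlinux_size % ZBOOT_ALIGN;
	return addr;
}

/*
 * Hypothetical alternative: round the end of the image up to the next
 * 64KB boundary, leaving no gap when it is already aligned.
 */
static uint64_t load_addr_round_up(uint64_t vmlinux_load_addr,
				   uint64_t vmlinux_size)
{
	uint64_t end = vmlinux_load_addr + vmlinux_size;

	/* The mask trick below only works for a power-of-two alignment */
	assert((ZBOOT_ALIGN & (ZBOOT_ALIGN - 1)) == 0);
	return (end + ZBOOT_ALIGN - 1) & ~(ZBOOT_ALIGN - 1);
}
```

With the load address from arch/loongarch/Makefile and a 0x12345-byte
vmlinux.bin, both functions return 0x9000000000220000. For an image of
exactly 0x20000 bytes the expression from the patch returns
0x9000000000230000, one 64KB slot beyond the round-up result; that looks
harmless, but it does leave a gap.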

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
@ 2022-05-01  5:22       ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01  5:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang,
	Linux ARM, Catalin Marinas, Will Deacon, linux-riscv,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Ard Biesheuvel,
	linux-efi

Hi, Arnd,

On Sat, Apr 30, 2022 at 7:02 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds zboot (self-extracting compressed kernel) support. All
> > existing in-kernel compression algorithms and the EFI stub are supported.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> I have no objections to adding a decompressor in principle, and
> the implementation seems reasonable. However, I think we should try to
> be consistent between architectures. On both arm64 and riscv, the
> maintainers decided to not include a decompressor and instead leave
> it up to the boot loader to decompress the kernel and enter it from there.
x86, 32-bit ARM and MIPS already support self-extracting kernels, and
since 5.17 the kernel can even decompress modules by itself. So I think
a self-extracting kernel is better than a plain compressed image that
depends on the boot loader to unpack it.
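
To make the distinction concrete, here is a toy model in C of what
"self-extracting" means: the image carries both the payload and the code
that unpacks it, so the boot loader only has to jump in. This is a
sketch, not the patch's code. Run-length encoding stands in for
gzip/xz/zstd, and image[], toy_decompress() and toy_self_extract() are
hypothetical names that loosely mirror __image_begin/__image_end,
__decompress() and decompress_kernel().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the payload embedded between __image_begin
 * and __image_end: (count, byte) pairs, i.e. trivial run-length encoding.
 */
static const unsigned char image[] = { 3, 'a', 2, 'b', 1, 'c' };

/*
 * Hypothetical stand-in for __decompress(): expand the RLE pairs into
 * out. Returns the number of bytes written, or -1 if the input is
 * truncated or out is too small.
 */
static long toy_decompress(const unsigned char *in, size_t in_len,
			   unsigned char *out, size_t out_len)
{
	size_t pos = 0;

	if (in_len % 2)
		return -1;
	for (size_t i = 0; i < in_len; i += 2) {
		size_t run = in[i];

		if (run > out_len - pos)
			return -1;
		memset(out + pos, in[i + 1], run);
		pos += run;
	}
	return (long)pos;
}

/*
 * Same shape as decompress_kernel(): locate the embedded payload and
 * unpack it into the load area. A real stub would then jump to the
 * kernel entry point.
 */
static long toy_self_extract(unsigned char *load_area, size_t load_size)
{
	return toy_decompress(image, sizeof(image), load_area, load_size);
}
```

Calling toy_self_extract() with an 8-byte buffer writes "aaabbc" and
returns 6; with a 4-byte buffer it returns -1 instead of overrunning.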

>
> As I understand it, this is not part of the UEFI boot flow though, so it
> means that you don't get any compressed kernel images at all when
> booting using UEFI (let me know if that is wrong). I assume this is why
> you decided to include the decompressor here after all.
>
> I think we should first aim for consistency here, and handle this the
> same way across the modern architectures, either leaving the
> decompressor code out, or adding it consistently. Maybe it would
> even be possible to have the decompressor code as part of the
> EFI stub and share it between the three architectures (x86 and
> 32-bit arm already support loading compressed kernels using EFI).
>
> Adding the arm64, RISC-V and UEFI maintainers for further discussion here;
> see the full patch below.
Keeping things consistent across architectures (supporting
self-extraction on all modern architectures) sounds good to me, but can
we do that after this series? I think it will take a long time to
discuss and develop.

Huacai
>
>        Arnd
>
> > ---
> >  arch/loongarch/Kbuild                         |   2 +-
> >  arch/loongarch/Kconfig                        |  11 ++
> >  arch/loongarch/Makefile                       |  26 ++-
> >  arch/loongarch/boot/Makefile                  |  55 ++++++
> >  arch/loongarch/boot/boot.lds.S                |  64 +++++++
> >  arch/loongarch/boot/decompress.c              |  98 +++++++++++
> >  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
> >  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
> >  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
> >  arch/loongarch/tools/Makefile                 |  15 ++
> >  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
> >  arch/loongarch/tools/elf-entry.c              |  66 +++++++
> >  12 files changed, 749 insertions(+), 4 deletions(-)
> >  create mode 100644 arch/loongarch/boot/boot.lds.S
> >  create mode 100644 arch/loongarch/boot/decompress.c
> >  create mode 100644 arch/loongarch/boot/string.c
> >  create mode 100644 arch/loongarch/boot/zheader.S
> >  create mode 100644 arch/loongarch/boot/zkernel.S
> >  create mode 100644 arch/loongarch/tools/Makefile
> >  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
> >  create mode 100644 arch/loongarch/tools/elf-entry.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index ab5373d0a24f..d907fdd7ca08 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -3,4 +3,4 @@ obj-y += mm/
> >  obj-y += vdso/
> >
> >  # for cleaning
> > -subdir- += boot
> > +subdir- += boot tools
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 55225ee5f868..6c1042746b2d 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -107,6 +107,7 @@ config LOONGARCH
> >         select PERF_USE_VMALLOC
> >         select RTC_LIB
> >         select SPARSE_IRQ
> > +       select SYS_SUPPORTS_ZBOOT
> >         select SYSCTL_EXCEPTION_TRACE
> >         select SWIOTLB
> >         select TRACE_IRQFLAGS_SUPPORT
> > @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
> >         bool
> >         default y
> >
> > +config SYS_SUPPORTS_ZBOOT
> > +       bool
> > +       select HAVE_KERNEL_GZIP
> > +       select HAVE_KERNEL_BZIP2
> > +       select HAVE_KERNEL_LZ4
> > +       select HAVE_KERNEL_LZMA
> > +       select HAVE_KERNEL_LZO
> > +       select HAVE_KERNEL_XZ
> > +       select HAVE_KERNEL_ZSTD
> > +
> >  config MACH_LOONGSON32
> >         def_bool 32BIT
> >
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index d88a792dafbe..1ed5b8466565 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -5,12 +5,31 @@
> >
> >  boot   := arch/loongarch/boot
> >
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> > +
> >  ifndef CONFIG_EFI_STUB
> >  KBUILD_IMAGE   = $(boot)/vmlinux
> >  else
> >  KBUILD_IMAGE   = $(boot)/vmlinux.efi
> >  endif
> >
> > +else
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinuz
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> > +endif
> > +
> > +endif
> > +
> > +load-y         = 0x9000000000200000
> > +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +archscripts: scripts_basic
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
> >  cflags-y += -ffreestanding
> >  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> >
> > -load-y         = 0x9000000000200000
> > -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > -
> >  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
> >
> >  KBUILD_AFLAGS  += $(cflags-y)
> > @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
> >         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> >         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +else
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> > +endif
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > index 66f2293c34b2..c26a36004ae2 100644
> > --- a/arch/loongarch/boot/Makefile
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
> >
> >  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> >         $(call if_changed,eficopy)
> > +
> > +# zboot
> > +extra-y        += boot.lds
> > +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> > +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> > +
> > +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> > +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> > +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> > +
> > +BOOT_HEAP_SIZE := 0x400000
> > +BOOT_STACK_SIZE        := 0x002000
> > +
> > +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +targets += vmlinux.bin
> > +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> > +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> > +       $(call if_changed,objcopy)
> > +
> > +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> > +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> > +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> > +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> > +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> > +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> > +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> > +
> > +targets += vmlinux.bin.z
> > +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> > +       $(call if_changed,$(tool_y))
> > +
> > +targets += $(notdir $(vmlinuzobjs-y))
> > +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> > +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> > +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> > +
> > +quiet_cmd_zld = LD      $@
> > +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> > +
> > +targets += vmlinuz
> > +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> > +       $(call if_changed,zld)
> > +       $(call if_changed,strip)
> > +
> > +targets += vmlinuz.efi
> > +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> > new file mode 100644
> > index 000000000000..23e698782afd
> > --- /dev/null
> > +++ b/arch/loongarch/boot/boot.lds.S
> > @@ -0,0 +1,64 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * ld.script for compressed kernel support of LoongArch
> > + *
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include "../kernel/image-vars.h"
> > +
> > +/*
> > + * The maximum available page size is 64K, so we set the SectionAlignment
> > + * field of the EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > +
> > +OUTPUT_ARCH(loongarch)
> > +ENTRY(kernel_entry)
> > +PHDRS {
> > +       text PT_LOAD FLAGS(7); /* RWX */
> > +}
> > +SECTIONS
> > +{
> > +       . = VMLINUZ_LOAD_ADDRESS;
> > +
> > +       _text = .;
> > +       .head.text : {
> > +               *(.head.text)
> > +       }
> > +
> > +       .text : {
> > +               *(.text)
> > +               *(.init.text)
> > +               *(.rodata)
> > +       }: text
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _data = .;
> > +       .data : {
> > +               *(.data)
> > +               *(.init.data)
> > +               /* Put the compressed image here */
> > +               __image_begin = .;
> > +               *(.image)
> > +               __image_end = .;
> > +               CONSTRUCTORS
> > +               . = ALIGN(PECOFF_FILE_ALIGN);
> > +       }
> > +       _edata = .;
> > +
> > +       .bss : {
> > +               *(.bss)
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _end = .;
> > +
> > +       /DISCARD/ : {
> > +               *(.options)
> > +               *(.comment)
> > +               *(.note)
> > +       }
> > +}
> > diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> > new file mode 100644
> > index 000000000000..8f55fcd8f285
> > --- /dev/null
> > +++ b/arch/loongarch/boot/decompress.c
> > @@ -0,0 +1,98 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/types.h>
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +#include <linux/libfdt.h>
> > +
> > +#include <asm/addrspace.h>
> > +
> > +/*
> > + * These two variables specify the free mem region
> > + * that can be used for temporary malloc area
> > + */
> > +unsigned long free_mem_ptr;
> > +unsigned long free_mem_end_ptr;
> > +
> > +/* The linker tells us where the image is. */
> > +extern unsigned char __image_begin, __image_end;
> > +
> > +#define puts(s) do {} while (0)
> > +#define puthex(val) do {} while (0)
> > +
> > +void error(char *x)
> > +{
> > +       puts("\n\n");
> > +       puts(x);
> > +       puts("\n\n -- System halted");
> > +
> > +       while (1)
> > +               ;       /* Halt */
> > +}
> > +
> > +/* activate the code for pre-boot environment */
> > +#define STATIC static
> > +
> > +#include "../../../../lib/ashldi3.c"
> > +
> > +#ifdef CONFIG_KERNEL_GZIP
> > +#include "../../../../lib/decompress_inflate.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_BZIP2
> > +#include "../../../../lib/decompress_bunzip2.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZ4
> > +#include "../../../../lib/decompress_unlz4.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZMA
> > +#include "../../../../lib/decompress_unlzma.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZO
> > +#include "../../../../lib/decompress_unlzo.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_XZ
> > +#include "../../../../lib/decompress_unxz.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_ZSTD
> > +#include "../../../../lib/decompress_unzstd.c"
> > +#endif
> > +
> > +void decompress_kernel(unsigned long boot_heap_start)
> > +{
> > +       unsigned long zimage_start, zimage_size;
> > +
> > +       zimage_start = (unsigned long)(&__image_begin);
> > +       zimage_size = (unsigned long)(&__image_end) -
> > +           (unsigned long)(&__image_begin);
> > +
> > +       puts("zimage at:     ");
> > +       puthex(zimage_start);
> > +       puts(" ");
> > +       puthex(zimage_size + zimage_start);
> > +       puts("\n");
> > +
> > +       /* This area is reserved for malloc during decompression */
> > +       free_mem_ptr = boot_heap_start;
> > +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> > +
> > +       /* Display standard Linux/LoongArch boot prompt */
> > +       puts("Uncompressing Linux at load address ");
> > +       puthex(VMLINUX_LOAD_ADDRESS);
> > +       puts("\n");
> > +
> > +       /* Decompress the kernel with the configured algorithm */
> > +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> > +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> > +
> > +       puts("Now, booting the kernel...\n");
> > +}
> > diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> > new file mode 100644
> > index 000000000000..3f746e7c2bb5
> > --- /dev/null
> > +++ b/arch/loongarch/boot/string.c
> > @@ -0,0 +1,166 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/loongarch/boot/string.c
> > + *
> > + * Very small subset of simple string routines
> > + */
> > +
> > +#include <linux/types.h>
> > +
> > +void __weak *memset(void *s, int c, size_t n)
> > +{
> > +       int i;
> > +       char *ss = s;
> > +
> > +       for (i = 0; i < n; i++)
> > +               ss[i] = c;
> > +       return s;
> > +}
> > +
> > +void __weak *memcpy(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       for (i = 0; i < n; i++)
> > +               d[i] = s[i];
> > +       return dest;
> > +}
> > +
> > +void __weak *memmove(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       if (d < s) {
> > +               for (i = 0; i < n; i++)
> > +                       d[i] = s[i];
> > +       } else if (d > s) {
> > +               for (i = n - 1; i >= 0; i--)
> > +                       d[i] = s[i];
> > +       }
> > +
> > +       return dest;
> > +}
> > +
> > +int __weak memcmp(const void *cs, const void *ct, size_t count)
> > +{
> > +       int res = 0;
> > +       const unsigned char *su1, *su2;
> > +
> > +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> > +               res = *su1 - *su2;
> > +               if (res != 0)
> > +                       break;
> > +       }
> > +       return res;
> > +}
> > +
> > +int __weak strcmp(const char *str1, const char *str2)
> > +{
> > +       int delta = 0;
> > +       const unsigned char *s1 = (const unsigned char *)str1;
> > +       const unsigned char *s2 = (const unsigned char *)str2;
> > +
> > +       while (*s1 || *s2) {
> > +               delta = *s1 - *s2;
> > +               if (delta)
> > +                       return delta;
> > +               s1++;
> > +               s2++;
> > +       }
> > +       return 0;
> > +}
> > +
> > +size_t __weak strlen(const char *s)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +size_t __weak strnlen(const char *s, size_t count)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; count-- && *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> > +{
> > +       size_t l2;
> > +
> > +       l2 = strlen(s2);
> > +       if (!l2)
> > +               return (char *)s1;
> > +       while (len >= l2) {
> > +               len--;
> > +               if (!memcmp(s1, s2, l2))
> > +                       return (char *)s1;
> > +               s1++;
> > +       }
> > +       return NULL;
> > +}
> > +
> > +#undef strcat
> > +char * __weak strcat(char *dest, const char *src)
> > +{
> > +       char *tmp = dest;
> > +
> > +       while (*dest)
> > +               dest++;
> > +       while ((*dest++ = *src++) != '\0')
> > +               ;
> > +       return tmp;
> > +}
> > +
> > +char * __weak strncat(char *dest, const char *src, size_t count)
> > +{
> > +       char *tmp = dest;
> > +
> > +       if (count) {
> > +               while (*dest)
> > +                       dest++;
> > +               while ((*dest++ = *src++) != 0) {
> > +                       if (--count == 0) {
> > +                               *dest = '\0';
> > +                               break;
> > +                       }
> > +               }
> > +       }
> > +       return tmp;
> > +}
> > +
> > +char * __weak strpbrk(const char *cs, const char *ct)
> > +{
> > +       const char *sc1, *sc2;
> > +
> > +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> > +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> > +                       if (*sc1 == *sc2)
> > +                               return (char *)sc1;
> > +               }
> > +       }
> > +       return NULL;
> > +}
> > +
> > +char * __weak strsep(char **s, const char *ct)
> > +{
> > +       char *sbegin = *s;
> > +       char *end;
> > +
> > +       if (sbegin == NULL)
> > +               return NULL;
> > +
> > +       end = strpbrk(sbegin, ct);
> > +       if (end)
> > +               *end++ = '\0';
> > +       *s = end;
> > +       return sbegin;
> > +}
> > diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> > new file mode 100644
> > index 000000000000..4bc50d953ec7
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zheader.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:
> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   _data - efi_header_end                  /* SizeOfCode */
> > +       .long   _end - _data                            /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */
> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   _data - efi_header_end                  /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   _data - efi_header_end                  /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - _data                            /* VirtualSize */
> > +       .long   _data - _head                           /* VirtualAddress */
> > +       .long   _edata - _data                          /* SizeOfRawData */
> > +       .long   _data - _head                           /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
> > diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> > new file mode 100644
> > index 000000000000..13a8a14a2328
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zkernel.S
> > @@ -0,0 +1,99 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/linkage.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/asm.h>
> > +#include <asm/loongarch.h>
> > +#include <asm/regdef.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "zheader.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"
> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUZ_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)
> > +
> > +#endif
> > +
> > +       __INIT
> > +
> > +SYM_CODE_START(kernel_entry)
> > +       /* Save boot rom start args */
> > +       move    s0, a0
> > +       move    s1, a1
> > +       move    s2, a2
> > +       move    s3, a3
> > +
> > +       /* Config Direct Mapping */
> > +       li.d    t0, CSR_DMW0_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> > +       li.d    t0, CSR_DMW1_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> > +
> > +       /* Clear BSS */
> > +       la.abs  a0, _edata
> > +       la.abs  a2, _end
> > +1:     st.d    zero, a0, 0
> > +       addi.d  a0, a0, 8
> > +       bne     a2, a0, 1b
> > +
> > +       la.abs  a0, .heap          /* heap address */
> > +       la.abs  sp, .stack + 8192  /* stack address */
> > +
> > +       la      ra, 2f
> > +       la      t4, decompress_kernel
> > +       jirl    zero, t4, 0
> > +2:
> > +       move    a0, s0
> > +       move    a1, s1
> > +       move    a2, s2
> > +       move    a3, s3
> > +       PTR_LI  t4, KERNEL_ENTRY
> > +       jirl    zero, t4, 0
> > +3:
> > +       b       3b
> > +SYM_CODE_END(kernel_entry)
> > +
> > +       .comm .heap, BOOT_HEAP_SIZE, 4
> > +       .comm .stack, BOOT_STACK_SIZE, 4
> > +
> > +       .align 4
> > +       .section .image, "a", %progbits
> > +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> > diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> > new file mode 100644
> > index 000000000000..8a6181c82a91
> > --- /dev/null
> > +++ b/arch/loongarch/tools/Makefile
> > @@ -0,0 +1,15 @@
> > +#
> > +# arch/loongarch/tools/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +hostprogs := elf-entry
> > +PHONY += elf-entry
> > +elf-entry: $(obj)/elf-entry
> > +       @:
> > +
> > +hostprogs += calc_vmlinuz_load_addr
> > +PHONY += calc_vmlinuz_load_addr
> > +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> > +       @:
> > diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > new file mode 100644
> > index 000000000000..5e2ca6b4dff6
> > --- /dev/null
> > +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > @@ -0,0 +1,51 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <errno.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> > +       struct stat sb;
> > +
> > +       if (argc != 3) {
> > +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (stat(argv[1], &sb) == -1) {
> > +               perror("stat");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       /* Convert hex characters to dec number */
> > +       errno = 0;
> > +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> > +               if (errno != 0)
> > +                       perror("sscanf");
> > +               else
> > +                       fprintf(stderr, "No matching characters\n");
> > +
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       vmlinux_size = (uint64_t)sb.st_size;
> > +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> > +
> > +       /*
> > +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> > +        * which may be as large as 64KB depending on the kernel configuration.
> > +        */
> > +
> > +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> > +
> > +       printf("0x%llx\n", vmlinuz_load_addr);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> > new file mode 100644
> > index 000000000000..c80721e0dee1
> > --- /dev/null
> > +++ b/arch/loongarch/tools/elf-entry.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <elf.h>
> > +#include <inttypes.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +__attribute__((noreturn))
> > +static void die(const char *msg)
> > +{
> > +       fputs(msg, stderr);
> > +       exit(EXIT_FAILURE);
> > +}
> > +
> > +int main(int argc, const char *argv[])
> > +{
> > +       uint64_t entry;
> > +       size_t nread;
> > +       FILE *file;
> > +       union {
> > +               Elf32_Ehdr ehdr32;
> > +               Elf64_Ehdr ehdr64;
> > +       } hdr;
> > +
> > +       if (argc != 2)
> > +               die("Usage: elf-entry <elf-file>\n");
> > +
> > +       file = fopen(argv[1], "r");
> > +       if (!file) {
> > +               perror("Unable to open input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       nread = fread(&hdr, 1, sizeof(hdr), file);
> > +       if (nread != sizeof(hdr)) {
> > +               fclose(file);
> > +               perror("Unable to read input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> > +               fclose(file);
> > +               die("Input is not an ELF\n");
> > +       }
> > +
> > +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> > +       case ELFCLASS32:
> > +               /* Sign extend to form a canonical address */
> > +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> > +               break;
> > +
> > +       case ELFCLASS64:
> > +               entry = hdr.ehdr64.e_entry;
> > +               break;
> > +
> > +       default:
> > +               fclose(file);
> > +               die("Invalid ELF class\n");
> > +       }
> > +
> > +       fclose(file);
> > +       printf("0x%016" PRIx64 "\n", entry);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > --
> > 2.27.0
> >
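
The load-address arithmetic in calc_vmlinuz_load_addr.c above can be
sketched as a standalone function. This is only an illustration of the
quoted logic, not part of the patch: the function name and the
ZBOOT_ALIGN macro are made up for the example, and the result is only
64KB-aligned if the vmlinux load address itself is (0x9000000000200000
is).

```c
#include <stdint.h>

/* 64KB, the largest PAGE_SIZE the quoted comment mentions (macro name is hypothetical). */
#define ZBOOT_ALIGN 0x10000ULL

/*
 * Same arithmetic as the quoted tool: place the decompressor right
 * after the uncompressed image, then pad based on the image size.
 */
uint64_t vmlinuz_load_addr(uint64_t vmlinux_load_addr, uint64_t vmlinux_size)
{
	uint64_t addr = vmlinux_load_addr + vmlinux_size;

	/* Note: this adds a full 64KB when the size is already a multiple of 64KB. */
	addr += ZBOOT_ALIGN - (vmlinux_size % ZBOOT_ALIGN);
	return addr;
}
```

For example, with the load address 0x9000000000200000 from the Makefile
and a hypothetical vmlinux.bin size of 0x1234567, the function returns
0x9000000001440000.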

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-05-01  5:22       ` Huacai Chen
  (?)
@ 2022-05-01  6:35         ` Russell King (Oracle)
  -1 siblings, 0 replies; 94+ messages in thread
From: Russell King (Oracle) @ 2022-05-01  6:35 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Linux ARM, Catalin Marinas,
	Will Deacon, linux-riscv, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Ard Biesheuvel, linux-efi

On Sun, May 01, 2022 at 01:22:25PM +0800, Huacai Chen wrote:
> Hi, Arnd,
> 
> On Sat, Apr 30, 2022 at 7:02 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > >
> > > This patch adds zboot (self-extracting compressed kernel) support; all
> > > existing in-kernel compression algorithms and the EFI stub are supported.
> > >
> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> >
> > I have no objections to adding a decompressor in principle, and
> > the implementation seems reasonable. However, I think we should try to
> > be consistent between architectures. On both arm64 and riscv, the
> > maintainers decided to not include a decompressor and instead leave
> > it up to the boot loader to decompress the kernel and enter it from there.
> X86, ARM32 and MIPS already support self-extracting kernels, and in
> 5.17 we even support self-extracting modules. So I think a
> self-extracting kernel is better than a pure compressed kernel.

FYI, kernel modules are not self-extracting. They don't contain the code
to do the decompression - that is contained within the kernel, and it is
the kernel that does the decompression. The userspace tooling tells the
kernel that the module is compressed.
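
As a rough sketch of that last point (not taken from kmod or from this
patch series): userspace can pass MODULE_INIT_COMPRESSED_FILE to
finit_module(2), and the kernel then decompresses the file itself. The
helper names below are hypothetical, the suffix check is a simplifying
assumption, and this only works on a kernel built with
CONFIG_MODULE_DECOMPRESS and the matching algorithm, run with
CAP_SYS_MODULE.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value from include/uapi/linux/module.h (Linux 5.17+); defined here in case host headers lack it. */
#ifndef MODULE_INIT_COMPRESSED_FILE
#define MODULE_INIT_COMPRESSED_FILE 4
#endif

/* Hypothetical helper: guess from the file name whether the module is compressed. */
int module_load_flags(const char *path)
{
	size_t len = strlen(path);

	if (len > 3 && (!strcmp(path + len - 3, ".gz") ||
			!strcmp(path + len - 3, ".xz")))
		return MODULE_INIT_COMPRESSED_FILE;
	return 0;
}

/* Hypothetical helper: hand the (possibly compressed) file to the kernel. */
int load_module_file(const char *path, const char *params)
{
	int fd = open(path, O_RDONLY | O_CLOEXEC);
	long ret;

	if (fd < 0)
		return -1;
	/* The module carries no decompressor; the flag asks the kernel to do it. */
	ret = syscall(SYS_finit_module, fd, params, module_load_flags(path));
	close(fd);
	return ret == 0 ? 0 : -1;
}
```

A caller would use something like load_module_file("foo.ko.xz", ""),
where the path is a placeholder.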

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations
  2022-04-30  9:04 ` [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations Huacai Chen
@ 2022-05-01  7:48   ` Bagas Sanjaya
  2022-05-01  8:55     ` Huacai Chen
  2022-05-01  9:32   ` WANG Xuerui
  1 sibling, 1 reply; 94+ messages in thread
From: Bagas Sanjaya @ 2022-05-01  7:48 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang

On Sat, Apr 30, 2022 at 05:04:55PM +0800, Huacai Chen wrote:
> +Instruction names (Mnemonics)
> +-----------------------------
> +
> +We only list the instruction names here, for details please read the references.
> +
> +Arithmetic Operation Instructions::
> +
> +  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
> +  SLT SLTU SLTI SLTUI
> +  AND OR NOR XOR ANDN ORN ANDI ORI XORI
> +  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
> +  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
> +  PCADDI PCADDU12I PCADDU18I
> +  LU12I.W LU32I.D LU52I.D ADDU16I.D
> +
> +Bit-shift Instructions::
> +
> +  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
> +  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
> +
> +Bit-manipulation Instructions::
> +
> +  EXT.W.B EXT.W.H CLO.W CLO.D SLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
> +  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
> +  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
> +  MASKEQZ MASKNEZ
> +
> +Branch Instructions::
> +
> +  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
> +
> +Load/Store Instructions::
> +
> +  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
> +  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
> +  LDPTR.W LDPTR.D STPTR.W STPTR.D
> +  PRELD PRELDX
> +
> +Atomic Operation Instructions::
> +
> +  LL.W SC.W LL.D SC.D
> +  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
> +  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
> +
> +Barrier Instructions::
> +
> +  IBAR DBAR
> +
> +Special Instructions::
> +
> +  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
> +
> +Privileged Instructions::
> +
> +  CSRRD CSRWR CSRXCHG
> +  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
> +  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
> +

Since the above is a list of instruction categories, it would be better to
use an enumerated list. Also, use a ReST label to link to the References
section, like this:

-- >8 --

diff --git a/Documentation/loongarch/introduction.rst b/Documentation/loongarch/introduction.rst
index 420c0d2ebcfbe7..2d83283ecf28b9 100644
--- a/Documentation/loongarch/introduction.rst
+++ b/Documentation/loongarch/introduction.rst
@@ -194,60 +194,61 @@ can see I21L/I21H and I26L/I26H here.
 Instruction names (Mnemonics)
 -----------------------------
 
-We only list the instruction names here, for details please read the references.
+We only list the instruction names here, for details please read the
+:ref:`references <loongarch-references>`.
 
-Arithmetic Operation Instructions::
+1. Arithmetic Operation Instructions::
 
-  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
-  SLT SLTU SLTI SLTUI
-  AND OR NOR XOR ANDN ORN ANDI ORI XORI
-  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
-  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
-  PCADDI PCADDU12I PCADDU18I
-  LU12I.W LU32I.D LU52I.D ADDU16I.D
+     ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
+     SLT SLTU SLTI SLTUI
+     AND OR NOR XOR ANDN ORN ANDI ORI XORI
+     MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
+     MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
+     PCADDI PCADDU12I PCADDU18I
+     LU12I.W LU32I.D LU52I.D ADDU16I.D
 
-Bit-shift Instructions::
+2. Bit-shift Instructions::
 
-  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
-  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
+     SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
+     SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
 
-Bit-manipulation Instructions::
+3. Bit-manipulation Instructions::
 
-  EXT.W.B EXT.W.H CLO.W CLO.D SLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
-  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
-  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
-  MASKEQZ MASKNEZ
+     EXT.W.B EXT.W.H CLO.W CLO.D SLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
+     BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
+     REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
+     MASKEQZ MASKNEZ
 
-Branch Instructions::
+4. Branch Instructions::
 
-  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
+     BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
 
-Load/Store Instructions::
+5. Load/Store Instructions::
 
-  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
-  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
-  LDPTR.W LDPTR.D STPTR.W STPTR.D
-  PRELD PRELDX
+     LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
+     LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
+     LDPTR.W LDPTR.D STPTR.W STPTR.D
+     PRELD PRELDX
 
-Atomic Operation Instructions::
+6. Atomic Operation Instructions::
 
-  LL.W SC.W LL.D SC.D
-  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
-  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
+     LL.W SC.W LL.D SC.D
+     AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
+     AMMAX.W AMMAX.D AMMIN.W AMMIN.D
 
-Barrier Instructions::
+7. Barrier Instructions::
 
-  IBAR DBAR
+     IBAR DBAR
 
-Special Instructions::
+8. Special Instructions::
 
-  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
+     SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
 
-Privileged Instructions::
+9. Privileged Instructions::
 
-  CSRRD CSRWR CSRXCHG
-  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
-  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
+     CSRRD CSRWR CSRXCHG
+     IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
+     CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
 
 Virtual Memory
 ==============
@@ -315,6 +316,8 @@ MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example:
 Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while Loongson-
 3A5000 (and future revisions) are all based on LoongArch.
 
+.. _loongarch-references:
+
 References
 ==========
 

> +
> + +---------------------------------------------+
> + |::                                           |
> + |                                             |
> + |    +-----+     +---------+     +-------+    |
> + |    | IPI | --> | CPUINTC | <-- | Timer |    |
> + |    +-----+     +---------+     +-------+    |
> + |                     ^                       |
> + |                     |                       |
> + |                +---------+     +-------+    |
> + |                | LIOINTC | <-- | UARTs |    |
> + |                +---------+     +-------+    |
> + |                     ^                       |
> + |                     |                       |
> + |               +-----------+                 |
> + |               | HTVECINTC |                 |
> + |               +-----------+                 |
> + |                ^         ^                  |
> + |                |         |                  |
> + |          +---------+ +---------+            |
> + |          | PCH-PIC | | PCH-MSI |            |
> + |          +---------+ +---------+            |
> + |            ^     ^           ^              |
> + |            |     |           |              |
> + |    +---------+ +---------+ +---------+      |
> + |    | PCH-LPC | | Devices | | Devices |      |
> + |    +---------+ +---------+ +---------+      |
> + |         ^                                   |
> + |         |                                   |
> + |    +---------+                              |
> + |    | Devices |                              |
> + |    +---------+                              |
> + |                                             |
> + |                                             |
> + +---------------------------------------------+
> +
...
> +
> + +--------------------------------------------------------+
> + |::                                                      |
> + |                                                        |
> + |         +-----+     +---------+     +-------+          |
> + |         | IPI | --> | CPUINTC | <-- | Timer |          |
> + |         +-----+     +---------+     +-------+          |
> + |                      ^       ^                         |
> + |                      |       |                         |
> + |               +---------+ +---------+     +-------+    |
> + |               | EIOINTC | | LIOINTC | <-- | UARTs |    |
> + |               +---------+ +---------+     +-------+    |
> + |                ^       ^                               |
> + |                |       |                               |
> + |         +---------+ +---------+                        |
> + |         | PCH-PIC | | PCH-MSI |                        |
> + |         +---------+ +---------+                        |
> + |           ^     ^           ^                          |
> + |           |     |           |                          |
> + |   +---------+ +---------+ +---------+                  |
> + |   | PCH-LPC | | Devices | | Devices |                  |
> + |   +---------+ +---------+ +---------+                  |
> + |        ^                                               |
> + |        |                                               |
> + |   +---------+                                          |
> + |   | Devices |                                          |
> + |   +---------+                                          |
> + |                                                        |
> + |                                                        |
> + +--------------------------------------------------------+
> +

I think plain literal blocks are enough for the diagrams above; the table
borders drawn around them are not needed.

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 00/22] arch: Add basic LoongArch support
  2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
                   ` (23 preceding siblings ...)
  2022-04-30  9:05 ` [PATCH V9 24/24] LoongArch: Add Loongson-3 default config file Huacai Chen
@ 2022-05-01  8:19 ` Bagas Sanjaya
  2022-05-01  8:55   ` Huacai Chen
  24 siblings, 1 reply; 94+ messages in thread
From: Bagas Sanjaya @ 2022-05-01  8:19 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang

On Sat, Apr 30, 2022 at 05:04:54PM +0800, Huacai Chen wrote:
> Huacai Chen(24):
>  Documentation: LoongArch: Add basic documentations.
>  Documentation/zh_CN: Add basic LoongArch documentations.
>  LoongArch: Add elf-related definitions.
>  LoongArch: Add writecombine support for drm.
>  LoongArch: Add build infrastructure.
>  LoongArch: Add CPU definition headers.
>  LoongArch: Add atomic/locking headers.
>  LoongArch: Add other common headers.
>  LoongArch: Add boot and setup routines.
>  LoongArch: Add exception/interrupt handling. 
>  LoongArch: Add process management.
>  LoongArch: Add memory management.
>  LoongArch: Add system call support.
>  LoongArch: Add signal handling support.
>  LoongArch: Add elf and module support.
>  LoongArch: Add misc common routines.
>  LoongArch: Add some library functions.
>  LoongArch: Add PCI controller support.
>  LoongArch: Add VDSO and VSYSCALL support.
>  LoongArch: Add efistub booting support.
>  LoongArch: Add zboot (compressed kernel) support.
>  LoongArch: Add multi-processor (SMP) support.
>  LoongArch: Add Non-Uniform Memory Access (NUMA) support.
>  LoongArch: Add Loongson-3 default config file.
> 

I have skimmed through the patch descriptions, and I see that patches
05-24/24 use the descriptive mood ("This patch adds ..."). Please write them
in the imperative mood instead.
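
As a rough illustration of the distinction, here is a minimal Python sketch
of a check that flags descriptions opening in the descriptive mood. The
function name and the regular expression are hypothetical, and this is not an
existing kernel tool; it only looks at the first line of the description.

```python
import re

# Hypothetical pattern: "This patch/commit/change <verb>s ..." at the start.
DESCRIPTIVE = re.compile(r"^\s*This (patch|commit|change)\s+\w+s\b", re.IGNORECASE)


def uses_descriptive_mood(description: str) -> bool:
    """Return True if the first line reads like "This patch adds ..."."""
    stripped = description.strip()
    if not stripped:
        return False
    first_line = stripped.splitlines()[0]
    return bool(DESCRIPTIVE.match(first_line))


if __name__ == "__main__":
    for text in ("This patch adds zboot support.", "Add zboot support."):
        print(repr(text), uses_descriptive_mood(text))
```

With this sketch, "This patch adds zboot support." is flagged, while the
imperative "Add zboot support." is not.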

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-05-01  5:22       ` Huacai Chen
  (?)
@ 2022-05-01  8:33         ` Arnd Bergmann
  -1 siblings, 0 replies; 94+ messages in thread
From: Arnd Bergmann @ 2022-05-01  8:33 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Linux ARM, Catalin Marinas,
	Will Deacon, linux-riscv, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Ard Biesheuvel, linux-efi

On Sun, May 1, 2022 at 7:22 AM Huacai Chen <chenhuacai@gmail.com> wrote:
> On Sat, Apr 30, 2022 at 7:02 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > >
> > > This patch adds zboot (self-extracting compressed kernel) support, all
> > > existing in-kernel compressing algorithm and efistub are supported.
> > >
> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> >
> > I have no objections to adding a decompressor in principle, and
> > the implementation seems reasonable. However, I think we should try to
> > be consistent between architectures. On both arm64 and riscv, the
> > maintainers decided to not include a decompressor and instead leave
> > it up to the boot loader to decompress the kernel and enter it from there.
>
> X86, ARM32 and MIPS already support self-extracting kernel, and in
> 5.17 we even support self-extracting modules. So I think a
> self-extracting kernel is better than a pure compressed kernel.

These three support it because they always have, and it's hard to
remove a feature later without breaking user setups. Among the
architectures merged since the start of the git history in 2005, only
xtensa supports compressed kernels; the rest rely on the boot loader.
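
To make "rely on the boot loader" concrete, here is a minimal Python sketch
of what the loader side has to do when the kernel image carries no
decompressor of its own: recognise the compression format from its leading
magic bytes and unpack the payload before entering it. The function names are
hypothetical, only three formats are covered, and a real boot loader would do
this in its own environment rather than in Python.

```python
import bz2
import gzip
import lzma

# Leading magic bytes of some common compression formats.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
}


def detect_compression(image: bytes) -> str:
    """Return the format name for a known magic, or "none"."""
    for magic, name in MAGIC.items():
        if image.startswith(magic):
            return name
    return "none"


def load_kernel_image(image: bytes) -> bytes:
    """Return the uncompressed payload, decompressing if necessary."""
    fmt = detect_compression(image)
    if fmt == "gzip":
        return gzip.decompress(image)
    if fmt == "bzip2":
        return bz2.decompress(image)
    if fmt == "xz":
        return lzma.decompress(image)
    return image  # already uncompressed


if __name__ == "__main__":
    payload = b"MZ" + bytes(64)  # stand-in for an uncompressed image
    packed = gzip.compress(payload)
    print(detect_compression(packed), load_kernel_image(packed) == payload)
```

A self-extracting kernel moves this logic into the image itself, which is why
the choice affects every boot loader for the architecture.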

> > Adding the arm64, risc-v and uefi maintainers for further discussion here,
> > see full below.
>
> Keeping consistency across architectures (support self-extracting for
> all modern architectures) looks good to me, but can we do that after
> this series? I think that needs a long time to discuss and develop.

Right, just drop this patch then, and we can get back to doing it for
all UEFI users after loongarch is merged.

      Arnd

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-05-01  6:35         ` Russell King (Oracle)
  (?)
@ 2022-05-01  8:46           ` Huacai Chen
  -1 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01  8:46 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Linux ARM, Catalin Marinas,
	Will Deacon, linux-riscv, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Ard Biesheuvel, linux-efi

Hi, Russell,

On Sun, May 1, 2022 at 2:35 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Sun, May 01, 2022 at 01:22:25PM +0800, Huacai Chen wrote:
> > Hi, Arnd,
> >
> > On Sat, Apr 30, 2022 at 7:02 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > > >
> > > > This patch adds zboot (self-extracting compressed kernel) support, all
> > > > existing in-kernel compressing algorithm and efistub are supported.
> > > >
> > > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > >
> > > I have no objections to adding a decompressor in principle, and
> > > the implementation seems reasonable. However, I think we should try to
> > > be consistent between architectures. On both arm64 and riscv, the
> > > maintainers decided to not include a decompressor and instead leave
> > > it up to the boot loader to decompress the kernel and enter it from there.
> > X86, ARM32 and MIPS already support self-extracting kernel, and in
> > 5.17 we even support self-extracting modules. So I think a
> > self-extracting kernel is better than a pure compressed kernel.
>
> FYI, kernel modules are not self-extracting. They don't contain the code
> to do the decompression - that is contained within the kernel, and it is
> the kernel that does the decompression. The userspace tooling tells the
> kernel that the module is compressed.
By "self-extracting" here I mean that no out-of-kernel help is needed:
kernel decompression doesn't need the bootloader, and module decompression
doesn't need kmod.
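
For reference, the split Russell describes (userspace only tells the kernel
that the file is compressed; the kernel does the decompression) might be
sketched as follows in Python. The flag value is taken from
include/uapi/linux/module.h; the helper names and the suffix table are
hypothetical, and the sketch only computes the flags that a loader would pass
to finit_module(); it does not load anything.

```python
import os
from typing import Optional

# From include/uapi/linux/module.h.
MODULE_INIT_COMPRESSED_FILE = 4

# Hypothetical mapping from file suffix to compression algorithm name.
SUFFIX_TO_ALGO = {".gz": "gzip", ".xz": "xz", ".zst": "zstd"}


def module_compression(path: str) -> Optional[str]:
    """Guess the compression algorithm from the module file name."""
    _, suffix = os.path.splitext(path)
    return SUFFIX_TO_ALGO.get(suffix)


def finit_module_flags(path: str, kernel_algo: Optional[str]) -> int:
    """Flags for finit_module(): set the compressed-file bit only when the
    kernel can decompress this file's format itself."""
    algo = module_compression(path)
    if algo is not None and algo == kernel_algo:
        return MODULE_INIT_COMPRESSED_FILE
    return 0  # loader must pass an uncompressed image instead


if __name__ == "__main__":
    print(finit_module_flags("example.ko.xz", "xz"))
    print(finit_module_flags("example.ko.xz", "gzip"))
```

In the second call the kernel's algorithm does not match the file, so the
flag stays clear and the userspace loader would have to decompress the
module itself before handing it to the kernel.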

Huacai
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations
  2022-05-01  7:48   ` Bagas Sanjaya
@ 2022-05-01  8:55     ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01  8:55 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang

Hi, Bagas,

On Sun, May 1, 2022 at 3:49 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On Sat, Apr 30, 2022 at 05:04:55PM +0800, Huacai Chen wrote:
> > +Instruction names (Mnemonics)
> > +-----------------------------
> > +
> > +We only list the instruction names here, for details please read the references.
> > +
> > +Arithmetic Operation Instructions::
> > +
> > +  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
> > +  SLT SLTU SLTI SLTUI
> > +  AND OR NOR XOR ANDN ORN ANDI ORI XORI
> > +  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
> > +  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
> > +  PCADDI PCADDU12I PCADDU18I
> > +  LU12I.W LU32I.D LU52I.D ADDU16I.D
> > +
> > +Bit-shift Instructions::
> > +
> > +  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
> > +  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
> > +
> > +Bit-manipulation Instructions::
> > +
> > +  EXT.W.B EXT.W.H CLO.W CLO.D SLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
> > +  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
> > +  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
> > +  MASKEQZ MASKNEZ
> > +
> > +Branch Instructions::
> > +
> > +  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
> > +
> > +Load/Store Instructions::
> > +
> > +  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
> > +  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
> > +  LDPTR.W LDPTR.D STPTR.W STPTR.D
> > +  PRELD PRELDX
> > +
> > +Atomic Operation Instructions::
> > +
> > +  LL.W SC.W LL.D SC.D
> > +  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
> > +  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
> > +
> > +Barrier Instructions::
> > +
> > +  IBAR DBAR
> > +
> > +Special Instructions::
> > +
> > +  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
> > +
> > +Privileged Instructions::
> > +
> > +  CSRRD CSRWR CSRXCHG
> > +  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
> > +  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
> > +
>
> Since these above are list of instruction categories, it's better to use
> enumerated lists. Also, make use of ReST labels to link to References
> sections, like this:
OK, thanks, let me try.

Huacai
>
> -- >8 --
>
> diff --git a/Documentation/loongarch/introduction.rst b/Documentation/loongarch/introduction.rst
> index 420c0d2ebcfbe7..2d83283ecf28b9 100644
> --- a/Documentation/loongarch/introduction.rst
> +++ b/Documentation/loongarch/introduction.rst
> @@ -194,60 +194,61 @@ can see I21L/I21H and I26L/I26H here.
>  Instruction names (Mnemonics)
>  -----------------------------
>
> -We only list the instruction names here, for details please read the references.
> +We only list the instruction names here, for details please read the
> +:ref:`references <loongarch-references>`.
>
> -Arithmetic Operation Instructions::
> +1. Arithmetic Operation Instructions::
>
> -  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
> -  SLT SLTU SLTI SLTUI
> -  AND OR NOR XOR ANDN ORN ANDI ORI XORI
> -  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
> -  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
> -  PCADDI PCADDU12I PCADDU18I
> -  LU12I.W LU32I.D LU52I.D ADDU16I.D
> +     ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
> +     SLT SLTU SLTI SLTUI
> +     AND OR NOR XOR ANDN ORN ANDI ORI XORI
> +     MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
> +     MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
> +     PCADDI PCADDU12I PCADDU18I
> +     LU12I.W LU32I.D LU52I.D ADDU16I.D
>
> -Bit-shift Instructions::
> +2. Bit-shift Instructions::
>
> -  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
> -  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
> +     SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
> +     SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
>
> -Bit-manipulation Instructions::
> +3. Bit-manipulation Instructions::
>
> -  EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
> -  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
> -  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
> -  MASKEQZ MASKNEZ
> +     EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
> +     BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
> +     REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
> +     MASKEQZ MASKNEZ
>
> -Branch Instructions::
> +4. Branch Instructions::
>
> -  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
> +     BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
>
> -Load/Store Instructions::
> +5. Load/Store Instructions::
>
> -  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
> -  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
> -  LDPTR.W LDPTR.D STPTR.W STPTR.D
> -  PRELD PRELDX
> +     LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
> +     LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
> +     LDPTR.W LDPTR.D STPTR.W STPTR.D
> +     PRELD PRELDX
>
> -Atomic Operation Instructions::
> +6. Atomic Operation Instructions::
>
> -  LL.W SC.W LL.D SC.D
> -  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
> -  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
> +     LL.W SC.W LL.D SC.D
> +     AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
> +     AMMAX.W AMMAX.D AMMIN.W AMMIN.D
>
> -Barrier Instructions::
> +7. Barrier Instructions::
>
> -  IBAR DBAR
> +     IBAR DBAR
>
> -Special Instructions::
> +8. Special Instructions::
>
> -  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
> +     SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
>
> -Privileged Instructions::
> +9. Privileged Instructions::
>
> -  CSRRD CSRWR CSRXCHG
> -  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
> -  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
> +     CSRRD CSRWR CSRXCHG
> +     IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
> +     CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
>
>  Virtual Memory
>  ==============
> @@ -315,6 +316,8 @@ MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example:
>  Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while Loongson-
>  3A5000 (and future revisions) are all based on LoongArch.
>
> +.. _loongarch-references:
> +
>  References
>  ==========
>
>
> > +
> > + +---------------------------------------------+
> > + |::                                           |
> > + |                                             |
> > + |    +-----+     +---------+     +-------+    |
> > + |    | IPI | --> | CPUINTC | <-- | Timer |    |
> > + |    +-----+     +---------+     +-------+    |
> > + |                     ^                       |
> > + |                     |                       |
> > + |                +---------+     +-------+    |
> > + |                | LIOINTC | <-- | UARTs |    |
> > + |                +---------+     +-------+    |
> > + |                     ^                       |
> > + |                     |                       |
> > + |               +-----------+                 |
> > + |               | HTVECINTC |                 |
> > + |               +-----------+                 |
> > + |                ^         ^                  |
> > + |                |         |                  |
> > + |          +---------+ +---------+            |
> > + |          | PCH-PIC | | PCH-MSI |            |
> > + |          +---------+ +---------+            |
> > + |            ^     ^           ^              |
> > + |            |     |           |              |
> > + |    +---------+ +---------+ +---------+      |
> > + |    | PCH-LPC | | Devices | | Devices |      |
> > + |    +---------+ +---------+ +---------+      |
> > + |         ^                                   |
> > + |         |                                   |
> > + |    +---------+                              |
> > + |    | Devices |                              |
> > + |    +---------+                              |
> > + |                                             |
> > + |                                             |
> > + +---------------------------------------------+
> > +
> ...
> > +
> > + +--------------------------------------------------------+
> > + |::                                                      |
> > + |                                                        |
> > + |         +-----+     +---------+     +-------+          |
> > + |         | IPI | --> | CPUINTC | <-- | Timer |          |
> > + |         +-----+     +---------+     +-------+          |
> > + |                      ^       ^                         |
> > + |                      |       |                         |
> > + |               +---------+ +---------+     +-------+    |
> > + |               | EIOINTC | | LIOINTC | <-- | UARTs |    |
> > + |               +---------+ +---------+     +-------+    |
> > + |                ^       ^                               |
> > + |                |       |                               |
> > + |         +---------+ +---------+                        |
> > + |         | PCH-PIC | | PCH-MSI |                        |
> > + |         +---------+ +---------+                        |
> > + |           ^     ^           ^                          |
> > + |           |     |           |                          |
> > + |   +---------+ +---------+ +---------+                  |
> > + |   | PCH-LPC | | Devices | | Devices |                  |
> > + |   +---------+ +---------+ +---------+                  |
> > + |        ^                                               |
> > + |        |                                               |
> > + |   +---------+                                          |
> > + |   | Devices |                                          |
> > + |   +---------+                                          |
> > + |                                                        |
> > + |                                                        |
> > + +--------------------------------------------------------+
> > +
>
> I think plain literal blocks are enough for the diagrams above.
>
> --
> An old man doll... just what I always wanted! - Clara


* Re: [PATCH V9 00/22] arch: Add basic LoongArch support
  2022-05-01  8:19 ` [PATCH V9 00/22] arch: Add basic LoongArch support Bagas Sanjaya
@ 2022-05-01  8:55   ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01  8:55 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang

Hi, Bagas,

On Sun, May 1, 2022 at 4:19 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On Sat, Apr 30, 2022 at 05:04:54PM +0800, Huacai Chen wrote:
> > Huacai Chen(24):
> >  Documentation: LoongArch: Add basic documentations.
> >  Documentation/zh_CN: Add basic LoongArch documentations.
> >  LoongArch: Add elf-related definitions.
> >  LoongArch: Add writecombine support for drm.
> >  LoongArch: Add build infrastructure.
> >  LoongArch: Add CPU definition headers.
> >  LoongArch: Add atomic/locking headers.
> >  LoongArch: Add other common headers.
> >  LoongArch: Add boot and setup routines.
> >  LoongArch: Add exception/interrupt handling.
> >  LoongArch: Add process management.
> >  LoongArch: Add memory management.
> >  LoongArch: Add system call support.
> >  LoongArch: Add signal handling support.
> >  LoongArch: Add elf and module support.
> >  LoongArch: Add misc common routines.
> >  LoongArch: Add some library functions.
> >  LoongArch: Add PCI controller support.
> >  LoongArch: Add VDSO and VSYSCALL support.
> >  LoongArch: Add efistub booting support.
> >  LoongArch: Add zboot (compressed kernel) support.
> >  LoongArch: Add multi-processor (SMP) support.
> >  LoongArch: Add Non-Uniform Memory Access (NUMA) support.
> >  LoongArch: Add Loongson-3 default config file.
> >
>
> I have skimmed through patch descriptions, and I see patch 05-24/24 use
> descriptive mood (This patch adds what...). Please write them in
> imperative mood instead.
OK, thanks, let me try.

Huacai
>
> --
> An old man doll... just what I always wanted! - Clara


* Re: [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations
  2022-04-30  9:04 ` [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations Huacai Chen
  2022-05-01  7:48   ` Bagas Sanjaya
@ 2022-05-01  9:32   ` WANG Xuerui
  2022-05-01 10:17     ` Huacai Chen
  1 sibling, 1 reply; 94+ messages in thread
From: WANG Xuerui @ 2022-05-01  9:32 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang

Hi,

Here's a rough review of the documentation bits, covering both semantics
and English; I'm not a native English speaker though, so more eyes
are welcome.


On 4/30/22 17:04, Huacai Chen wrote:
> Add some basic documentation for LoongArch. LoongArch is a new RISC ISA,
> which is a bit like MIPS or RISC-V. LoongArch includes a reduced 32-bit
> version (LA32R), a standard 32-bit version (LA32S) and a 64-bit version
> (LA64).
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   Documentation/arch.rst                     |   1 +
>   Documentation/loongarch/features.rst       |   3 +
>   Documentation/loongarch/index.rst          |  21 ++
>   Documentation/loongarch/introduction.rst   | 345 +++++++++++++++++++++
>   Documentation/loongarch/irq-chip-model.rst | 168 ++++++++++
>   5 files changed, 538 insertions(+)
>   create mode 100644 Documentation/loongarch/features.rst
>   create mode 100644 Documentation/loongarch/index.rst
>   create mode 100644 Documentation/loongarch/introduction.rst
>   create mode 100644 Documentation/loongarch/irq-chip-model.rst
>
> diff --git a/Documentation/arch.rst b/Documentation/arch.rst
> index 14bcd8294b93..41a66a8b38e4 100644
> --- a/Documentation/arch.rst
> +++ b/Documentation/arch.rst
> @@ -13,6 +13,7 @@ implementation.
>      arm/index
>      arm64/index
>      ia64/index
> +   loongarch/index
>      m68k/index
>      mips/index
>      nios2/index
> diff --git a/Documentation/loongarch/features.rst b/Documentation/loongarch/features.rst
> new file mode 100644
> index 000000000000..ebacade3ea45
> --- /dev/null
> +++ b/Documentation/loongarch/features.rst
> @@ -0,0 +1,3 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. kernel-feat:: $srctree/Documentation/features loongarch
> diff --git a/Documentation/loongarch/index.rst b/Documentation/loongarch/index.rst
> new file mode 100644
> index 000000000000..d127e07a7ed3
> --- /dev/null
> +++ b/Documentation/loongarch/index.rst
> @@ -0,0 +1,21 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +================================
> +LoongArch-specific Documentation
> +================================
> +
> +.. toctree::
> +   :maxdepth: 2
> +   :numbered:
> +
> +   introduction
> +   irq-chip-model
> +
> +   features
> +
> +.. only::  subproject and html
> +
> +   Indices
> +   =======
> +
> +   * :ref:`genindex`
> diff --git a/Documentation/loongarch/introduction.rst b/Documentation/loongarch/introduction.rst
> new file mode 100644
> index 000000000000..420c0d2ebcfb
> --- /dev/null
> +++ b/Documentation/loongarch/introduction.rst
> @@ -0,0 +1,345 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=========================
> +Introduction of LoongArch
> +=========================
> +
> +LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch
> +includes a reduced 32-bit version (LA32R), a standard 32-bit version (LA32S)
> +and a 64-bit version (LA64). LoongArch has 4 privilege levels (PLV0~PLV3),
> +PLV0 is the highest level which used by kernel, and PLV3 is the lowest level
> +which used by applications. This document introduces the registers, basic

The sentence is a bit malformed; better to reword it into two sentences:

"There are 4 privilege levels (PLVs) defined in LoongArch: PLV0~PLV3,
from high to low. The kernel runs at PLV0 while applications run at PLV3."

> +instruction set, virtual memory and some other topics of LoongArch.
> +
> +Registers
> +=========
> +
> +LoongArch registers include general purpose registers (GPRs), floating point
> +registers (FPRs), vector registers (VRs) and control status registers (CSRs)
> +used in privileged mode (PLV0).
Aren't privilege levels other than PLV0 also able to use CSRs?
> +
> +GPRs
> +----
> +
> +LoongArch has 32 GPRs ($r0 - $r31), each one is 32bit wide in LA32 and 64bit
> +wide in LA64. $r0 is always zero, and other registers has no special feature,

"while other registers are not special"

But again, this is not technically true; $r1 ($ra) *is* architecturally 
special, in that the BL instruction has it hard-wired as the link 
register. This sentence may need a little tweak but I currently don't 
have a concrete suggestion.

> +but we actually have an ABI register convention as below.

We may link to the official psABI specification now. When this port was
first announced the documentation was not yet ready, but we now have it
at [1], so by referring to the official bits we can avoid stale
descriptions like...

[1]: 
https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

> +
> +================= =============== =================== ============
> +Name              Alias           Usage               Preserved
> +                                                      across calls
> +================= =============== =================== ============
> +``$r0``           ``$zero``       Constant zero       Unused
> +``$r1``           ``$ra``         Return address      No
> +``$r2``           ``$tp``         TLS                 Unused
> +``$r3``           ``$sp``         Stack pointer       Yes
> +``$r4``-``$r11``  ``$a0``-``$a7`` Argument registers  No
> +``$r4``-``$r5``   ``$v0``-``$v1`` Return value        No
... this (the ABI alias is deprecated in the latest spec), and...
> +``$r12``-``$r20`` ``$t0``-``$t8`` Temp registers      No
> +``$r21``          ``$x``          Reserved            Unused
... this (for one thing, the alias is entirely removed in the latest
spec; for another, the kernel does make use of this register), and...
> +``$r22``          ``$fp``         Frame pointer       Yes
... this (this can also be called $s9 when we don't care about or make 
use of its frame-pointer nature).
> +``$r23``-``$r31`` ``$s0``-``$s8`` Static registers    Yes
> +================= =============== =================== ============
And as described above, while $r21 is reserved in the userspace ABI,
this port does make use of it (as the percpu base register), so we'd
better mention this too.
> +
> +FPRs
> +----
> +
> +LoongArch has 32 FPRs ($f0 - $f31), each one is 64bit wide. We also have an
"each one is 64bit wide" -- what about the possible LA32 and LA64 
distinction, as similarly shown in the GPR section?
> +ABI register convention as below.
> +
> +================= ================== =================== ============
> +Name              Alias              Usage               Preserved
> +                                                         across calls
> +================= ================== =================== ============
> +``$f0``-``$f7``   ``$fa0``-``$fa7``  Argument registers  No
> +``$f0``-``$f1``   ``$fv0``-``$fv1``  Return value        No
Same here -- the $vX and $fvX aliases are deprecated.
> +``$f8``-``$f23``  ``$ft0``-``$ft15`` Temp registers      No
> +``$f24``-``$f31`` ``$fs0``-``$fs7``  Static registers    Yes
> +================= ================== =================== ============
> +
> +VRs
> +----
> +
> +LoongArch has 128bit vector extension (LSX, short for Loongson SIMD eXtension)
> +and 256bit vector extension (LASX, short for Loongson Advanced SIMD eXtension).
> +There are also 32 vector registers, for LSX is $v0 - $v31, and for LASX is $x0
> +- $x31. FPRs and VRs are reused, e.g. the lower 128bits of $x0 is $v0, and the

"for LSX is ..." -- Chinglish; "$v0 ~ $v31 for LSX and $x0 ~ $x31 for 
LASX" would be better.

Also, see what you did here with "$vX"? I know the older names are 
"$vrX" and "$xrX", but the latest reference manual already switched to 
the current naming, so you really can't just continue using "$v[01]" for 
"$a[01]" any more. ;-)

"FPRs and VRs are reused" -- the word "overlap" is better; "FPRs and VRs
overlap; the FPRs share the same storage as the VRs' lower bits" might be a
better expression.

> +lower 64bits of $v0 is $f0, etc.
> +
> +CSRs
> +----
> +
> +CSRs can only be used in privileged mode (PLV0):
> +
> +================= ===================================== ==============
> +Address           Full Name                             Abbrev Name
> +================= ===================================== ==============
> +0x0               Current Mode information              CRMD
> +0x1               Pre-exception Mode information        PRMD
Is the word "information" needed?
> +0x2               Extended Unit Enable                  EUEN
> +0x3               Miscellaneous controller              MISC
"controller"? just remove the word or "control" would be better.
> +0x4               Exception Configuration               ECFG
> +0x5               Exception Status                      ESTAT
> +0x6               Exception Return Address              ERA
> +0x7               Bad Virtual Address                   BADV
> +0x8               Bad Instruction                       BADI
> +0xC               Exception Entry Base address          EENTRY
> +0x10              TLB Index                             TLBIDX
> +0x11              TLB Entry High-order bits             TLBEHI
> +0x12              TLB Entry Low-order bits 0            TLBELO0
> +0x13              TLB Entry Low-order bits 1            TLBELO1
> +0x18              Address Space Identifier              ASID
> +0x19              Page Global Directory address for     PGDL
> +                  Lower half address space
> +0x1A              Page Global Directory address for     PGDH
> +                  Higher half address space
> +0x1B              Page Global Directory address         PGD
> +0x1C              Page Walk Controller for Lower        PWCL
> +                  half address space
> +0x1D              Page Walk Controller for Higher       PWCH
> +                  half address space
> +0x1E              STLB Page Size                        STLBPS
> +0x1F              Reduced Virtual Address Configuration RVACFG
> +0x20              CPU Identifier                        CPUID
> +0x21              Privileged Resource Configuration 1   PRCFG1
> +0x22              Privileged Resource Configuration 2   PRCFG2
> +0x23              Privileged Resource Configuration 3   PRCFG3
> +0x30+n (0≤n≤15)   Data Save register                    SAVEn
These are actually scratch registers, but I imagine you can't use that 
word as it's a bit MIPS-y... The name is less comprehensible but we 
might have no choice.
> +0x40              Timer Identifier                      TID
> +0x41              Timer Configuration                   TCFG
> +0x42              Timer Value                           TVAL
> +0x43              Compensation of Timer Count           CNTC
> +0x44              Timer Interrupt Clearing              TICLR
> +0x60              LLBit Controller                      LLBCTL
"Control" is probably sufficient -- same for other places.
> +0x80              Implementation-specific Controller 1  IMPCTL1
> +0x81              Implementation-specific Controller 2  IMPCTL2
> +0x88              TLB Refill Exception Entry Base       TLBRENTRY
> +                  address
> +0x89              TLB Refill Exception BAD Virtual      TLBRBADV
> +                  address
> +0x8A              TLB Refill Exception Return Address   TLBRERA
> +0x8B              TLB Refill Exception data SAVE        TLBRSAVE
> +                  register
> +0x8C              TLB Refill Exception Entry Low-order  TLBRELO0
> +                  bits 0
> +0x8D              TLB Refill Exception Entry Low-order  TLBRELO1
> +                  bits 1
> +0x8E              TLB Refill Exception Entry High-order TLBREHI
> +                  bits
> +0x8F              TLB Refill Exception Pre-exception    TLBRPRMD
> +                  Mode information
> +0x90              Machine Error Controller              MERRCTL
> +0x91              Machine Error Information 1           MERRINFO1
> +0x92              Machine Error Information 2           MERRINFO2
> +0x93              Machine Error Exception Entry Base    MERRENTRY
> +                  address
> +0x94              Machine Error Exception Return        MERRERA
> +                  address
> +0x95              Machine Error Exception data SAVE     MERRSAVE
> +                  register
It seems you're trying to match capitalization here to the CSR acronym 
-- but the resulting names are inconsistent-looking, such as the "data 
SAVE" here, and...
> +0x98              Cache TAGs                            CTAG
> +0x180+n (0≤n≤3)   Direct Mapping configuration Window n DMWn
...here, and...
> +0x200+2n (0≤n≤31) Performance Monitor Configuration n   PMCFGn
> +0x201+2n (0≤n≤31) Performance Monitor overall Counter n PMCNTn
> +0x300             Memory load/store WatchPoint          MWPC
> +                  overall Controller

here.

It's inconsistent, because otherwise you'd have "CuRrent MoDe" at the 
top of the table, similarly for other entries. As the reference manual 
(Chinese version; this is the authoritative version) actually does NOT 
give full English names for the CSRs (only Chinese full-name and the 
abbreviation), I think we can be a bit lax here and use normal 
capitalization for reading comfort.

> +0x301             Memory load/store WatchPoint          MWPS
> +                  overall Status
> +0x310+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG1
> +                  Configuration 1
> +0x311+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG2
> +                  Configuration 2
> +0x312+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG3
> +                  Configuration 3
> +0x313+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG4
> +                  Configuration 4
> +0x380             Fetch WatchPoint overall Controller   FWPC
> +0x381             Fetch WatchPoint overall Status       FWPS
> +0x390+8n (0≤n≤7)  Fetch WatchPoint n Configuration 1    FWPnCFG1
> +0x391+8n (0≤n≤7)  Fetch WatchPoint n Configuration 2    FWPnCFG2
> +0x392+8n (0≤n≤7)  Fetch WatchPoint n Configuration 3    FWPnCFG3
> +0x393+8n (0≤n≤7)  Fetch WatchPoint n Configuration 4    FWPnCFG4
> +0x500             Debug register                        DBG
> +0x501             Debug Exception Return address        DERA
> +0x502             Debug data SAVE register              DSAVE
> +================= ===================================== ==============
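As an aside for readers: the indexed rows in the table above reduce to
base-plus-stride arithmetic. Below is a minimal Python sketch that uses only
the addresses and index ranges listed in the table; the helper names
(indexed_csr_addr, watchpoint_cfg_addr) are made up for illustration and
are not kernel or toolchain APIs.

```python
# Indexed CSR families from the table: name -> (base address, stride, max index).
INDEXED_CSRS = {
    "SAVE": (0x30, 1, 15),    # 0x30+n  (0 <= n <= 15)
    "DMW": (0x180, 1, 3),     # 0x180+n (0 <= n <= 3)
    "PMCFG": (0x200, 2, 31),  # 0x200+2n (0 <= n <= 31)
    "PMCNT": (0x201, 2, 31),  # 0x201+2n (0 <= n <= 31)
}

# Watchpoint configuration registers: MWPnCFGk at 0x310+8n+(k-1),
# FWPnCFGk at 0x390+8n+(k-1), with 0 <= n <= 7 and 1 <= k <= 4.
WATCHPOINT_BASES = {"MWP": 0x310, "FWP": 0x390}


def indexed_csr_addr(family, n):
    """Return the CSR address of register n in a simple indexed family."""
    base, stride, max_n = INDEXED_CSRS[family]
    if not 0 <= n <= max_n:
        raise ValueError(f"{family}{n}: index out of range 0..{max_n}")
    return base + stride * n


def watchpoint_cfg_addr(kind, n, cfg):
    """Return the CSR address of <kind>nCFG<cfg>, e.g. MWP2CFG3."""
    if not (0 <= n <= 7 and 1 <= cfg <= 4):
        raise ValueError("need 0 <= n <= 7 and 1 <= cfg <= 4")
    return WATCHPOINT_BASES[kind] + 8 * n + (cfg - 1)


if __name__ == "__main__":
    print(hex(indexed_csr_addr("PMCNT", 3)))
    print(hex(watchpoint_cfg_addr("FWP", 7, 4)))
```

The stride of 8 per watchpoint leaves four unused addresses after each group
of four configuration registers, which is why the two tables are kept
separate in this sketch.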
> +
> +ERA, TLBRERA, MERRERA and DERA are sometimes also called EPC, TLBREPC,
> +MERREPC and DEPC.
> +
> +Basic Instruction Set
> +=====================
> +
> +Instruction formats
> +-------------------
> +
> +LoongArch has 32-bit wide instructions, and there are 9 instruction formats::
> +
> +  2R-type:    Opcode + Rj + Rd
> +  3R-type:    Opcode + Rk + Rj + Rd
> +  4R-type:    Opcode + Ra + Rk + Rj + Rd
> +  2RI8-type:  Opcode + I8 + Rj + Rd
> +  2RI12-type: Opcode + I12 + Rj + Rd
> +  2RI14-type: Opcode + I14 + Rj + Rd
> +  2RI16-type: Opcode + I16 + Rj + Rd
> +  1RI21-type: Opcode + I21L + Rj + I21H
> +  I26-type:   Opcode + I26L + I26H
> +
> +Rj and Rk are source operands (register), Rd is destination operand (register),
> +and Ra is the additional operand (register) in 4R-type. I8/I12/I16/I21/I26 are
> +8-bits/12-bits/16-bits/21-bits/26bits immediate data. 21bits/26bits immediate
> +data are split into higher bits and lower bits in an instruction word, so you
> +can see I21L/I21H and I26L/I26H here.
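To illustrate the split immediates described above, here is a small Python
sketch that reassembles I21 and I26 from an instruction word. The exact bit
positions are an assumption taken from my reading of the ISA manual, not from
the text of this patch: the low 16 bits of the immediate sit in bits 25:10 of
the instruction word, and the high bits sit in bits 4:0 (I21H) or bits 9:0
(I26H). The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode_i21(insn):
    """Reassemble the 21-bit immediate of a 1RI21-type instruction word.

    Assumed layout: I21L = insn[25:10], Rj = insn[9:5], I21H = insn[4:0].
    """
    low = (insn >> 10) & 0xFFFF
    high = insn & 0x1F
    return sign_extend((high << 16) | low, 21)


def decode_i26(insn):
    """Reassemble the 26-bit immediate of an I26-type instruction word.

    Assumed layout: I26L = insn[25:10], I26H = insn[9:0].
    """
    low = (insn >> 10) & 0xFFFF
    high = insn & 0x3FF
    return sign_extend((high << 16) | low, 26)


if __name__ == "__main__":
    # Arbitrary example words; the opcode bits are ignored by the decoders.
    print(decode_i26(0x50000400))
    print(decode_i21(0x40000010))
```

Note that in the 1RI21 case the Rj field sits between the two halves, so the
decoder has to skip bits 9:5 rather than take one contiguous field.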
> +
> +Instruction names (Mnemonics)
> +-----------------------------
> +
> +We only list the instruction names here, for details please read the references.
> +
> +Arithmetic Operation Instructions::
> +
> +  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
> +  SLT SLTU SLTI SLTUI
> +  AND OR NOR XOR ANDN ORN ANDI ORI XORI
> +  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
> +  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
> +  PCADDI PCADDU12I PCADDU18I
> +  LU12I.W LU32I.D LU52I.D ADDU16I.D
> +
> +Bit-shift Instructions::
> +
> +  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
> +  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
> +
> +Bit-manipulation Instructions::
> +
> +  EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
> +  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
> +  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
> +  MASKEQZ MASKNEZ
> +
> +Branch Instructions::
> +
> +  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
> +
> +Load/Store Instructions::
> +
> +  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
> +  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
> +  LDPTR.W LDPTR.D STPTR.W STPTR.D
> +  PRELD PRELDX
> +
> +Atomic Operation Instructions::
> +
> +  LL.W SC.W LL.D SC.D
> +  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
> +  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
> +
> +Barrier Instructions::
> +
> +  IBAR DBAR
> +
> +Special Instructions::
> +
> +  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
> +
> +Privileged Instructions::
> +
> +  CSRRD CSRWR CSRXCHG
> +  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
> +  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE

For the whole section, replace with reference to the official 
(translated or not) documentation repo? I believe this is similar to the 
psABI situation explained above.

> +
> +Virtual Memory
> +==============
> +
> +LoongArch can use direct-mapped virtual memory and page-mapped virtual memory.
> +
> +Direct-mapped virtual memory is configured by CSR.DMWn (n=0~3), it has a simple
> +relationship between virtual address (VA) and physical address (PA)::
"... is configured via CSR.DMWn (n=0~3). It specifies a simple 
relationship ..."
> +
> + VA = PA + FixedOffset
> +
> +Page-mapped virtual memory has arbitrary relationship between VA and PA, which
> +is recorded in TLB and page tables. LoongArch's TLB includes a fully-associative
The first sentence is Chinglish. As the basics of paged virtual memory 
should be common sense to kernel developers, could we simplify, or 
better, just somehow get rid of the sentence?
> +MTLB (Multiple Page Size TLB) and set-associative STLB (Single Page Size TLB).
> +
> +By default, the whole virtual address space of LA32 is configured like this:
> +
> +============ =========================== =============================
> +Name         Address Range               Attributes
> +============ =========================== =============================
> +``UVRANGE``  ``0x00000000 - 0x7FFFFFFF`` Page-mapped, Cached, PLV0~3
> +``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` Direct-mapped, Uncached, PLV0
> +``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` Direct-mapped, Cached, PLV0
> +``KVRANGE``  ``0xC0000000 - 0xFFFFFFFF`` Page-mapped, Cached, PLV0
The names sound awfully MIPS-like... I can't find any reference to the 
names here in the reference manual, are these Linux-specific inventions 
only documented here?
> +============ =========================== =============================
> +
> +User mode (PLV3) can only access UVRANGE. For direct-mapped KPRANGE0 and
> +KPRANGE1, PA is equal to VA with bit30~31 cleared. For example, the uncached
> +direct-mapped VA of 0x00001000 is 0x80001000, and the cached direct-mapped
> +VA of 0x00001000 is 0xA0001000.
> +
> +By default, the whole virtual address space of LA64 is configured like this:
> +
> +============ ====================== ======================================
> +Name         Address Range          Attributes
> +============ ====================== ======================================
> +``XUVRANGE`` ``0x0000000000000000 - Page-mapped, Cached, PLV0~3
> +             0x3FFFFFFFFFFFFFFF``
> +``XSPRANGE`` ``0x4000000000000000 - Direct-mapped, Cached / Uncached, PLV0
> +             0x7FFFFFFFFFFFFFFF``
> +``XKPRANGE`` ``0x8000000000000000 - Direct-mapped, Cached / Uncached, PLV0
> +             0xBFFFFFFFFFFFFFFF``
> +``XKVRANGE`` ``0xC000000000000000 - Page-mapped, Cached, PLV0
> +             0xFFFFFFFFFFFFFFFF``
Similarly here.
> +============ ====================== ======================================
> +
> +User mode (PLV3) can only access XUVRANGE. For direct-mapped XSPRANGE and XKPRANGE,
> +PA is equal to VA with bit60~63 cleared, and the cache attributes is configured by
> +bit60~61 (0 is strongly-ordered uncached, 1 is coherent cached, and 2 is weakly-
> +ordered uncached) in VA. Currently we only use XKPRANGE for direct mapping and
> +XSPRANGE is reserved. As an example, the strongly-ordered uncached direct-mapped VA
> +(in XKPRANGE) of 0x00000000 00001000 is 0x80000000 00001000, the coherent cached
> +direct-mapped VA (in XKPRANGE) of 0x00000000 00001000 is 0x90000000 00001000, and
> +the weakly-ordered uncached direct-mapped VA (in XKPRANGE) of 0x00000000 00001000
> +is 0xA0000000 00001000.
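The three LA64 examples above follow directly from the stated rule (cache
attribute in bit60~61 of the VA, PA recovered by clearing bit60~63). A minimal
Python sketch of that arithmetic follows; the constant and function names are
illustrative only, and the base value is simply the start of XKPRANGE from
the table.

```python
XKPRANGE_BASE = 0x8000_0000_0000_0000  # start of XKPRANGE per the table

# Cache attribute encodings for bit60~61, as given in the paragraph above.
CACHE_ATTR = {
    "strongly_ordered_uncached": 0,
    "coherent_cached": 1,
    "weakly_ordered_uncached": 2,
}


def xkprange_va(pa, attr):
    """Return the direct-mapped VA in XKPRANGE for a physical address."""
    if pa < 0 or pa >> 60:
        raise ValueError("PA must fit in bits 0..59")
    return XKPRANGE_BASE | (CACHE_ATTR[attr] << 60) | pa


def direct_mapped_pa(va):
    """Recover the PA from a direct-mapped VA by clearing bit60~63."""
    return va & ((1 << 60) - 1)


if __name__ == "__main__":
    for name in CACHE_ATTR:
        print(f"{name}: {xkprange_va(0x1000, name):#018x}")
```

Running it for PA 0x1000 reproduces the 0x8000..., 0x9000... and 0xA000...
addresses quoted in the patch text.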
> +
> +Relationship of Loongson and LoongArch
> +======================================
> +
> +LoongArch is a RISC ISA which is different from any other existing ones, while
> +Loongson is a family of processors. Loongson includes 3 series: Loongson-1 is
> +the 32-bit processor series, Loongson-2 is the low-end 64-bit processor series,
> +and Loongson-3 is the high-end 64-bit processor series. Old Loongson is based on
> +MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example:
> +Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while Loongson-
> +3A5000 (and future revisions) are all based on LoongArch.
Is this section truly necessary? FWIW, Loongson is first of all a
corporation, in addition to being the name of its series of CPU products,
bridge chip products, browser offering and pretty much everything else.
This paragraph could use a fair bit of clarification; at the very least,
use phrases like "Loongson processors"...
> +
> +References
> +==========
> +
> +Official web site of Loongson and LoongArch (Loongson Technology Corp. Ltd.):
> +
> +  http://www.loongson.cn/index.html
You may omit the "index.html" part...
> +
> +Developer web site of Loongson and LoongArch (Software and Documentation):
> +
> +  http://www.loongnix.cn/index.php
Do you really mean loongnix.cn and not
https://loongson.github.io/LoongArch-Documentation/ ? loongnix.cn is
more of an information portal for users; at least in its current
iteration it has no link to actual documentation, no link to
development repos, and nothing useful for prospective contributors.
> +
> +  https://github.com/loongson
> +
> +Documentation of LoongArch ISA:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-CN.pdf (in Chinese)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-EN.pdf (in English)
> +
> +Documentation of LoongArch ELF ABI:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-CN.pdf (in Chinese)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-EN.pdf (in English)
> +
> +Linux kernel repository of Loongson and LoongArch:
> +
> +  https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git
> diff --git a/Documentation/loongarch/irq-chip-model.rst b/Documentation/loongarch/irq-chip-model.rst
> new file mode 100644
> index 000000000000..bde112b81ace
> --- /dev/null
> +++ b/Documentation/loongarch/irq-chip-model.rst
> @@ -0,0 +1,168 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================================
> +IRQ chip model (hierarchy) of LoongArch
> +=======================================
> +
> +Currently, LoongArch based processors (e.g. Loongson-3A5000) can only work together
> +with LS7A chipsets. The irq chips in LoongArch computers include CPUINTC (CPU Core
> +Interrupt Controller), LIOINTC (Legacy I/O Interrupt Controller), EIOINTC (Extended
> +I/O Interrupt Controller), HTVECINTC (Hyper-Transport Vector Interrupt Controller),
> +PCH-PIC (Main Interrupt Controller in LS7A chipset), PCH-LPC (LPC Interrupt Controller
> +in LS7A chipset) and PCH-MSI (MSI Interrupt Controller).
> +
> +CPUINTC is a per-core controller (in CPU), LIOINTC/EIOINTC/HTVECINTC are per-package
> +controllers (in CPU), while PCH-PIC/PCH-LPC/PCH-MSI are controllers out of CPU (i.e.,
> +in chipsets). These controllers (in other words, irqchips) are linked in a hierarchy,
> +and there are two models of hierarchy (legacy model and extended model).
> +
> +Legacy IRQ model
> +================
> +
> +In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupts go
> +to CPUINTC directly, CPU UART interrupts go to LIOINTC, while all other devices'
> +interrupts go to PCH-PIC/PCH-LPC/PCH-MSI and are gathered by HTVECINTC, then go
> +to LIOINTC, and finally to CPUINTC.
> +
> + +---------------------------------------------+
> + |::                                           |
> + |                                             |
> + |    +-----+     +---------+     +-------+    |
> + |    | IPI | --> | CPUINTC | <-- | Timer |    |
> + |    +-----+     +---------+     +-------+    |
> + |                     ^                       |
> + |                     |                       |
> + |                +---------+     +-------+    |
> + |                | LIOINTC | <-- | UARTs |    |
> + |                +---------+     +-------+    |
> + |                     ^                       |
> + |                     |                       |
> + |               +-----------+                 |
> + |               | HTVECINTC |                 |
> + |               +-----------+                 |
> + |                ^         ^                  |
> + |                |         |                  |
> + |          +---------+ +---------+            |
> + |          | PCH-PIC | | PCH-MSI |            |
> + |          +---------+ +---------+            |
> + |            ^     ^           ^              |
> + |            |     |           |              |
> + |    +---------+ +---------+ +---------+      |
> + |    | PCH-LPC | | Devices | | Devices |      |
> + |    +---------+ +---------+ +---------+      |
> + |         ^                                   |
> + |         |                                   |
> + |    +---------+                              |
> + |    | Devices |                              |
> + |    +---------+                              |
> + |                                             |
> + |                                             |
> + +---------------------------------------------+
> +
> +Extended IRQ model
> +==================
> +
> +In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupts go
> +to CPUINTC directly, CPU UART interrupts go to LIOINTC, while all other devices'
> +interrupts go to PCH-PIC/PCH-LPC/PCH-MSI and are gathered by EIOINTC, then go
> +to CPUINTC directly.
> +
> + +--------------------------------------------------------+
> + |::                                                      |
> + |                                                        |
> + |         +-----+     +---------+     +-------+          |
> + |         | IPI | --> | CPUINTC | <-- | Timer |          |
> + |         +-----+     +---------+     +-------+          |
> + |                      ^       ^                         |
> + |                      |       |                         |
> + |               +---------+ +---------+     +-------+    |
> + |               | EIOINTC | | LIOINTC | <-- | UARTs |    |
> + |               +---------+ +---------+     +-------+    |
> + |                ^       ^                               |
> + |                |       |                               |
> + |         +---------+ +---------+                        |
> + |         | PCH-PIC | | PCH-MSI |                        |
> + |         +---------+ +---------+                        |
> + |           ^     ^           ^                          |
> + |           |     |           |                          |
> + |   +---------+ +---------+ +---------+                  |
> + |   | PCH-LPC | | Devices | | Devices |                  |
> + |   +---------+ +---------+ +---------+                  |
> + |        ^                                               |
> + |        |                                               |
> + |   +---------+                                          |
> + |   | Devices |                                          |
> + |   +---------+                                          |
> + |                                                        |
> + |                                                        |
> + +--------------------------------------------------------+
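To make the difference between the two models concrete, here is a
minimal sketch in C that encodes each diagram as a parent table and walks
from a controller up to CPUINTC. Everything named here (enum irqchip,
legacy_parent, extended_parent, hops_to_cpuintc()) is hypothetical and
exists only for this illustration; it is not how the irqchip drivers in
this series are structured. The parent links are read off the two
diagrams quoted above, including PCH-LPC cascading into PCH-PIC.

```c
#include <assert.h>

/* Hypothetical identifiers, for illustration only. */
enum irqchip {
	CPUINTC, LIOINTC, EIOINTC, HTVECINTC,
	PCH_PIC, PCH_LPC, PCH_MSI,
	NR_IRQCHIPS,
	NO_PARENT = -1, /* root of the hierarchy, or unused in this model */
};

/* parent[chip] is the controller that chip forwards its interrupts to. */
static const int legacy_parent[NR_IRQCHIPS] = {
	[CPUINTC]   = NO_PARENT,
	[LIOINTC]   = CPUINTC,
	[EIOINTC]   = NO_PARENT, /* not part of the legacy model */
	[HTVECINTC] = LIOINTC,
	[PCH_PIC]   = HTVECINTC,
	[PCH_LPC]   = PCH_PIC,
	[PCH_MSI]   = HTVECINTC,
};

static const int extended_parent[NR_IRQCHIPS] = {
	[CPUINTC]   = NO_PARENT,
	[LIOINTC]   = CPUINTC,
	[EIOINTC]   = CPUINTC,
	[HTVECINTC] = NO_PARENT, /* not part of the extended model */
	[PCH_PIC]   = EIOINTC,
	[PCH_LPC]   = PCH_PIC,
	[PCH_MSI]   = EIOINTC,
};

/* Count the hops from a controller up to CPUINTC; -1 if unreachable. */
static int hops_to_cpuintc(const int *parent, int chip)
{
	int hops = 0;

	assert(chip >= 0 && chip < NR_IRQCHIPS);
	while (chip != CPUINTC) {
		chip = parent[chip];
		if (chip == NO_PARENT)
			return -1;
		hops++;
	}
	return hops;
}
```

Under legacy_parent an interrupt from PCH-MSI passes through HTVECINTC
and LIOINTC before reaching CPUINTC, while under extended_parent the
same source reaches CPUINTC through EIOINTC alone.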
> +
> +ACPI-related definitions
> +========================
> +
> +CPUINTC::
> +
> +  ACPI_MADT_TYPE_CORE_PIC;
> +  struct acpi_madt_core_pic;
> +  enum acpi_madt_core_pic_version;
> +
> +LIOINTC::
> +
> +  ACPI_MADT_TYPE_LIO_PIC;
> +  struct acpi_madt_lio_pic;
> +  enum acpi_madt_lio_pic_version;
> +
> +EIOINTC::
> +
> +  ACPI_MADT_TYPE_EIO_PIC;
> +  struct acpi_madt_eio_pic;
> +  enum acpi_madt_eio_pic_version;
> +
> +HTVECINTC::
> +
> +  ACPI_MADT_TYPE_HT_PIC;
> +  struct acpi_madt_ht_pic;
> +  enum acpi_madt_ht_pic_version;
> +
> +PCH-PIC::
> +
> +  ACPI_MADT_TYPE_BIO_PIC;
> +  struct acpi_madt_bio_pic;
> +  enum acpi_madt_bio_pic_version;
> +
> +PCH-MSI::
> +
> +  ACPI_MADT_TYPE_MSI_PIC;
> +  struct acpi_madt_msi_pic;
> +  enum acpi_madt_msi_pic_version;
> +
> +PCH-LPC::
> +
> +  ACPI_MADT_TYPE_LPC_PIC;
> +  struct acpi_madt_lpc_pic;
> +  enum acpi_madt_lpc_pic_version;
> +
> +References
> +==========
> +
> +Documentation of Loongson-3A5000:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-CN.pdf (in Chinese)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-EN.pdf (in English)
> +
> +Documentation of Loongson's LS7A chipset:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-CN.pdf (in Chinese)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-EN.pdf (in English)
> +
> +Attention: CPUINTC is CSR.ECFG/CSR.ESTAT and its interrupt controller described
"Note" may be enough. :-)
> +in Section 7.4 of "LoongArch Reference Manual, Vol 1"; LIOINTC is "Legacy I/O
> +Interrupts" described in Section 11.1 of "Loongson 3A5000 Processor Reference
> +Manual"; EIOINTC is "Extended I/O Interrupts" described in Section 11.2 of
> +"Loongson 3A5000 Processor Reference Manual"; HTVECINTC is "HyperTransport
> +Interrupts" described in Section 14.3 of "Loongson 3A5000 Processor Reference
> +Manual"; PCH-PIC/PCH-MSI is "Interrupt Controller" described in Section 5 of
> +"Loongson 7A1000 Bridge User Manual"; PCH-LPC is "LPC Interrupts" described in
> +Section 24.3 of "Loongson 7A1000 Bridge User Manual".


* Re: [PATCH V9 02/24] Documentation/zh_CN: Add basic LoongArch documentations
  2022-04-30  9:04 ` [PATCH V9 02/24] Documentation/zh_CN: Add basic LoongArch documentations Huacai Chen
@ 2022-05-01  9:38   ` WANG Xuerui
  0 siblings, 0 replies; 94+ messages in thread
From: WANG Xuerui @ 2022-05-01  9:38 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang, Alex Shi

Hi,

On 4/30/22 17:04, Huacai Chen wrote:
> Add some basic documentation (zh_CN version) for LoongArch. LoongArch is
> a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch includes a
> reduced 32-bit version (LA32R), a standard 32-bit version (LA32S) and a
> 64-bit version (LA64).
>
> Reviewed-by: Alex Shi <alexs@kernel.org>
> Reviewed-by: Yanteng Si <siyanteng@loongson.cn>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   Documentation/translations/zh_CN/index.rst    |   1 +
>   .../translations/zh_CN/loongarch/features.rst |   8 +
>   .../translations/zh_CN/loongarch/index.rst    |  26 ++
>   .../zh_CN/loongarch/introduction.rst          | 318 ++++++++++++++++++
>   .../zh_CN/loongarch/irq-chip-model.rst        | 167 +++++++++
>   5 files changed, 520 insertions(+)
>   create mode 100644 Documentation/translations/zh_CN/loongarch/features.rst
>   create mode 100644 Documentation/translations/zh_CN/loongarch/index.rst
>   create mode 100644 Documentation/translations/zh_CN/loongarch/introduction.rst
>   create mode 100644 Documentation/translations/zh_CN/loongarch/irq-chip-model.rst
>
> diff --git a/Documentation/translations/zh_CN/index.rst b/Documentation/translations/zh_CN/index.rst
> index 88d8df957a78..41c59950523c 100644
> --- a/Documentation/translations/zh_CN/index.rst
> +++ b/Documentation/translations/zh_CN/index.rst
> @@ -171,6 +171,7 @@ TODOList:
>      riscv/index
>      openrisc/index
>      parisc/index
> +   loongarch/index
>   
>   TODOList:
>   
> diff --git a/Documentation/translations/zh_CN/loongarch/features.rst b/Documentation/translations/zh_CN/loongarch/features.rst
> new file mode 100644
> index 000000000000..3886e635ec06
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/loongarch/features.rst
> @@ -0,0 +1,8 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: Documentation/loongarch/features.rst
> +:Translator: Huacai Chen <chenhuacai@loongson.cn>
> +
> +.. kernel-feat:: $srctree/Documentation/features loongarch
> diff --git a/Documentation/translations/zh_CN/loongarch/index.rst b/Documentation/translations/zh_CN/loongarch/index.rst
> new file mode 100644
> index 000000000000..367dead02e3a
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/loongarch/index.rst
> @@ -0,0 +1,26 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: Documentation/loongarch/index.rst
> +:Translator: Huacai Chen <chenhuacai@loongson.cn>
> +
> +=================
> +LoongArch特性文档

This title is translated from "LoongArch-specific documentation", so
"特性文档" is wrong: "LoongArch特性文档" means "LoongArch features
documentation". You should say something like "LoongArch架构相关文档"
("LoongArch architecture-related documentation") or
"LoongArch架构相关信息" ("LoongArch architecture-related information")
instead.

(BTW, the translation of the MIPS documentation has the same error, so
that's probably what you based the patch on...)

As for the rest, most comments on the previous patch (the one adding
the English documentation) apply here too, except those concerning
English usage. Please keep the two patches' contents in sync.

(no further comments below)

> +=================
> +
> +.. toctree::
> +   :maxdepth: 2
> +   :numbered:
> +
> +   introduction
> +   irq-chip-model
> +
> +   features
> +
> +.. only::  subproject and html
> +
> +   Indices
> +   =======
> +
> +   * :ref:`genindex`
> diff --git a/Documentation/translations/zh_CN/loongarch/introduction.rst b/Documentation/translations/zh_CN/loongarch/introduction.rst
> new file mode 100644
> index 000000000000..432a6267f1f1
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/loongarch/introduction.rst
> @@ -0,0 +1,318 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: Documentation/loongarch/introduction.rst
> +:Translator: Huacai Chen <chenhuacai@loongson.cn>
> +
> +=============
> +LoongArch介绍
> +=============
> +
> +LoongArch是一种新的RISC ISA,在一定程度上类似于MIPS和RISC-V。LoongArch指令集
> +包括一个精简32位版(LA32R)、一个标准32位版(LA32S)、一个64位版(LA64)。
> +LoongArch有四个特权级(PLV0~PLV3),其中PLV0是最高特权级,用于内核;而PLV3是
> +最低特权级,用于应用程序。本文档介绍了LoongArch的寄存器、基础指令集、虚拟内
> +存以及其他一些主题。
> +
> +寄存器
> +======
> +
> +LoongArch的寄存器包括通用寄存器(GPRs)、浮点寄存器(FPRs)、向量寄存器(VRs)
> +和用于特权模式(PLV0)的控制状态寄存器(CSRs)。
> +
> +通用寄存器
> +----------
> +
> +LoongArch包括32个通用寄存器($r0 - $r31),LA32中每个寄存器为32位宽,LA64中
> +每个寄存器为64位宽。$r0的内容总是0,而其他寄存器没有特殊功能。然而,我们有
> +如下所示的一套ABI寄存器使用约定。
> +
> +================= =============== =================== ==========
> +寄存器名          别名            用途                跨调用保持
> +================= =============== =================== ==========
> +``$r0``           ``$zero``       常量0               不使用
> +``$r1``           ``$ra``         返回地址            否
> +``$r2``           ``$tp``         TLS(线程局部存储) 不使用
> +``$r3``           ``$sp``         栈指针              是
> +``$r4``-``$r11``  ``$a0``-``$a7`` 参数寄存器          否
> +``$r4``-``$r5``   ``$v0``-``$v1`` 返回值              否
> +``$r12``-``$r20`` ``$t0``-``$t8`` 临时寄存器          否
> +``$r21``          ``$x``          保留                不使用
> +``$r22``          ``$fp``         帧指针              是
> +``$r23``-``$r31`` ``$s0``-``$s8`` 静态寄存器          是
> +================= =============== =================== ==========
> +
> +浮点寄存器
> +----------
> +
> +LoongArch有32个浮点寄存器($f0 - $f31),每个寄存器均为64位宽。我们同样
> +有如下所示的一套ABI寄存器使用约定。
> +
> +================= ================== =================== ==========
> +寄存器名          别名               用途                跨调用保持
> +================= ================== =================== ==========
> +``$f0``-``$f7``   ``$fa0``-``$fa7``  参数寄存器          否
> +``$f0``-``$f1``   ``$fv0``-``$fv1``  返回值              否
> +``$f8``-``$f23``  ``$ft0``-``$ft15`` 临时寄存器          否
> +``$f24``-``$f31`` ``$fs0``-``$fs7``  静态寄存器          是
> +================= ================== =================== ==========
> +
> +向量寄存器
> +----------
> +
> +LoongArch拥有128位向量扩展(LSX,全称Loongson SIMD eXtension)和256位向量扩展
> +(LASX,全称Loongson Advanced SIMD eXtension)。共有32个向量寄存器,对于LSX是
> +$v0 - $v31,对于LASX是$x0 - $x31。浮点寄存器和向量寄存器是复用的,比如:$x0的
> +低128位是$v0,而$v0的低64位又是$f0,以此类推。
> +
> +控制状态寄存器
> +--------------
> +
> +控制状态寄存器只用于特权模式(PLV0):
> +
> +================= ==================================== ==========
> +地址              全称描述                             简称
> +================= ==================================== ==========
> +0x0               当前模式信息                         CRMD
> +0x1               异常前模式信息                       PRMD
> +0x2               扩展部件使能                         EUEN
> +0x3               杂项控制                             MISC
> +0x4               异常配置                             ECFG
> +0x5               异常状态                             ESTAT
> +0x6               异常返回地址                         ERA
> +0x7               出错虚拟地址                         BADV
> +0x8               出错指令                             BADI
> +0xC               异常入口地址                         EENTRY
> +0x10              TLB索引                              TLBIDX
> +0x11              TLB表项高位                          TLBEHI
> +0x12              TLB表项低位0                         TLBELO0
> +0x13              TLB表项低位1                         TLBELO1
> +0x18              地址空间标识符                       ASID
> +0x19              低半地址空间页全局目录基址           PGDL
> +0x1A              高半地址空间页全局目录基址           PGDH
> +0x1B              页全局目录基址                       PGD
> +0x1C              页表遍历控制低半部分                 PWCL
> +0x1D              页表遍历控制高半部分                 PWCH
> +0x1E              STLB页大小                           STLBPS
> +0x1F              缩减虚地址配置                       RVACFG
> +0x20              CPU编号                              CPUID
> +0x21              特权资源配置信息1                    PRCFG1
> +0x22              特权资源配置信息2                    PRCFG2
> +0x23              特权资源配置信息3                    PRCFG3
> +0x30+n (0≤n≤15)   数据保存寄存器                       SAVEn
> +0x40              定时器编号                           TID
> +0x41              定时器配置                           TCFG
> +0x42              定时器值                             TVAL
> +0x43              计时器补偿                           CNTC
> +0x44              定时器中断清除                       TICLR
> +0x60              LLBit相关控制                        LLBCTL
> +0x80              实现相关控制1                        IMPCTL1
> +0x81              实现相关控制2                        IMPCTL2
> +0x88              TLB重填异常入口地址                  TLBRENTRY
> +0x89              TLB重填异常出错虚地址                TLBRBADV
> +0x8A              TLB重填异常返回地址                  TLBRERA
> +0x8B              TLB重填异常数据保存                  TLBRSAVE
> +0x8C              TLB重填异常表项低位0                 TLBRELO0
> +0x8D              TLB重填异常表项低位1                 TLBRELO1
> +0x8E              TLB重填异常表项高位                  TLBREHI
> +0x8F              TLB重填异常前模式信息                TLBRPRMD
> +0x90              机器错误控制                         MERRCTL
> +0x91              机器错误信息1                        MERRINFO1
> +0x92              机器错误信息2                        MERRINFO2
> +0x93              机器错误异常入口地址                 MERRENTRY
> +0x94              机器错误异常返回地址                 MERRERA
> +0x95              机器错误异常数据保存                 MERRSAVE
> +0x98              高速缓存标签                         CTAG
> +0x180+n (0≤n≤3)   直接映射配置窗口n                    DMWn
> +0x200+2n (0≤n≤31) 性能监测配置n                        PMCFGn
> +0x201+2n (0≤n≤31) 性能监测计数器n                      PMCNTn
> +0x300             内存读写监视点整体控制               MWPC
> +0x301             内存读写监视点整体状态               MWPS
> +0x310+8n (0≤n≤7)  内存读写监视点n配置1                 MWPnCFG1
> +0x311+8n (0≤n≤7)  内存读写监视点n配置2                 MWPnCFG2
> +0x312+8n (0≤n≤7)  内存读写监视点n配置3                 MWPnCFG3
> +0x313+8n (0≤n≤7)  内存读写监视点n配置4                 MWPnCFG4
> +0x380             取指监视点整体控制                   FWPC
> +0x381             取指监视点整体状态                   FWPS
> +0x390+8n (0≤n≤7)  取指监视点n配置1                     FWPnCFG1
> +0x391+8n (0≤n≤7)  取指监视点n配置2                     FWPnCFG2
> +0x392+8n (0≤n≤7)  取指监视点n配置3                     FWPnCFG3
> +0x393+8n (0≤n≤7)  取指监视点n配置4                     FWPnCFG4
> +0x500             调试寄存器                           DBG
> +0x501             调试异常返回地址                     DERA
> +0x502             调试数据保存                         DSAVE
> +================= ==================================== ==========
> +
> +ERA,TLBRERA,MERRERA和DERA有时也称为EPC,TLBREPC,MERREPC和DEPC。
> +
> +基础指令集
> +==========
> +
> +指令格式
> +--------
> +
> +LoongArch的指令字长为32位,一共有9种指令格式::
> +
> +  2R-type:    Opcode + Rj + Rd
> +  3R-type:    Opcode + Rk + Rj + Rd
> +  4R-type:    Opcode + Ra + Rk + Rj + Rd
> +  2RI8-type:  Opcode + I8 + Rj + Rd
> +  2RI12-type: Opcode + I12 + Rj + Rd
> +  2RI14-type: Opcode + I14 + Rj + Rd
> +  2RI16-type: Opcode + I16 + Rj + Rd
> +  1RI21-type: Opcode + I21L + Rj + I21H
> +  I26-type:   Opcode + I26L + I26H
> +
> +Opcode是指令操作码,Rj和Rk是源操作数(寄存器),Rd是目标操作数(寄存器),Ra是
> +4R-type格式特有的附加操作数(寄存器)。I8/I12/I16/I21/I26分别是8位/12位/16位/
> +21位/26位的立即数。其中21位和26位立即数在指令字中被分割为高位部分与低位部分,
> +所以你们在这里的格式描述中能够看到I21L/I21H和I26L/I26H这样的表述。
> +
> +指令名称(助记符)
> +------------------
> +
> +我们在此只简单罗列一下指令名称,详细信息请阅读参考文献中的文档。
> +
> +算术运算指令::
> +
> +  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
> +  SLT SLTU SLTI SLTUI
> +  AND OR NOR XOR ANDN ORN ANDI ORI XORI
> +  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
> +  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
> +  PCADDI PCADDU12I PCADDU18I
> +  LU12I.W LU32I.D LU52I.D ADDU16I.D
> +
> +移位运算指令::
> +
> +  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
> +  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
> +
> +位域操作指令::
> +
> +  EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
> +  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
> +  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
> +  MASKEQZ MASKNEZ
> +
> +分支转移指令::
> +
> +  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
> +
> +访存读写指令::
> +
> +  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
> +  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
> +  LDPTR.W LDPTR.D STPTR.W STPTR.D
> +  PRELD PRELDX
> +
> +原子操作指令::
> +
> +  LL.W SC.W LL.D SC.D
> +  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
> +  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
> +
> +栅障指令::
> +
> +  IBAR DBAR
> +
> +特殊指令::
> +
> +  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
> +
> +特权指令::
> +
> +  CSRRD CSRWR CSRXCHG
> +  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
> +  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
> +
> +虚拟内存
> +========
> +
> +LoongArch可以使用直接映射虚拟内存和分页映射虚拟内存。
> +
> +直接映射虚拟内存通过CSR.DMWn(n=0~3)来进行配置,虚拟地址(VA)和物理地址(PA)
> +之间有简单的映射关系::
> +
> + VA = PA + 固定偏移
> +
> +分页映射的虚拟地址(VA)和物理地址(PA)有任意的映射关系,这种关系记录在TLB和页
> +表中。LoongArch的TLB包括一个全相联的MTLB(Multiple Page Size TLB,页大小可变)
> +和一个组相联的STLB(Single Page Size TLB,页大小固定)。
> +
> +缺省状态下,LA32的整个虚拟地址空间配置如下:
> +
> +============ =========================== ===========================
> +区段名       地址范围                    属性
> +============ =========================== ===========================
> +``UVRANGE``  ``0x00000000 - 0x7FFFFFFF`` 分页映射, 可缓存, PLV0~3
> +``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` 直接映射, 非缓存, PLV0
> +``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` 直接映射, 可缓存, PLV0
> +``KVRANGE``  ``0xC0000000 - 0xFFFFFFFF`` 分页映射, 可缓存, PLV0
> +============ =========================== ===========================
> +
> +用户态(PLV3)只能访问UVRANGE,对于直接映射的KPRANGE0和KPRANGE1,将虚拟地址的第
> +29~31位清零就等于物理地址。例如:物理地址0x00001000对应的非缓存直接映射虚拟地址
> +是0x80001000,而其可缓存直接映射虚拟地址是0xA0001000。
> +
> +缺省状态下,LA64的整个虚拟地址空间配置如下:
> +
> +============ ====================== ==================================
> +区段名       地址范围               属性
> +============ ====================== ==================================
> +``XUVRANGE`` ``0x0000000000000000 - 分页映射, 可缓存, PLV0~3
> +             0x3FFFFFFFFFFFFFFF``
> +``XSPRANGE`` ``0x4000000000000000 - 直接映射, 可缓存 / 非缓存, PLV0
> +             0x7FFFFFFFFFFFFFFF``
> +``XKPRANGE`` ``0x8000000000000000 - 直接映射, 可缓存 / 非缓存, PLV0
> +             0xBFFFFFFFFFFFFFFF``
> +``XKVRANGE`` ``0xC000000000000000 - 分页映射, 可缓存, PLV0
> +             0xFFFFFFFFFFFFFFFF``
> +============ ====================== ==================================
> +
> +用户态(PLV3)只能访问XUVRANGE,对于直接映射的XSPRANGE和XKPRANGE,将虚拟地址的第
> +60~63位清零就等于物理地址,而其缓存属性是通过虚拟地址的第60~61位配置的(0表示强序
> +非缓存,1表示一致可缓存,2表示弱序非缓存)。目前,我们仅用XKPRANGE来进行直接映射,
> +XSPRANGE保留给以后用。此处给出一个直接映射的例子:物理地址0x00000000 00001000的强
> +序非缓存直接映射虚拟地址是0x80000000 00001000,其一致可缓存直接映射虚拟地址是
> +0x90000000 00001000,而其弱序非缓存直接映射虚拟地址是0xA0000000 00001000。
> +
> +Loongson与LoongArch的关系
> +=========================
> +
> +LoongArch是一种RISC指令集架构(ISA),不同于现存的任何一种ISA,而Loongson(即龙
> +芯)是一个处理器家族。龙芯包括三个系列:Loongson-1(龙芯1号)是32位处理器系列,
> +Loongson-2(龙芯2号)是低端64位处理器系列,而Loongson-3(龙芯3号)是高端64位处理
> +器系列。旧的龙芯处理器基于MIPS架构,而新的龙芯处理器基于LoongArch架构。以龙芯3号
> +为例:龙芯3A1000/3B1500/3A2000/3A3000/3A4000都是兼容MIPS的,而龙芯3A5000(以及将
> +来的型号)都是基于LoongArch的。
> +
> +参考文献
> +========
> +
> +Loongson与LoongArch的官方网站(龙芯中科技术股份有限公司):
> +
> +  http://www.loongson.cn/index.html
> +
> +Loongson与LoongArch的开发者网站(软件与文档资源):
> +
> +  http://www.loongnix.cn/index.php
> +
> +  https://github.com/loongson
> +
> +LoongArch指令集架构的文档:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-CN.pdf (中文版)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-EN.pdf (英文版)
> +
> +LoongArch的ELF ABI文档:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-CN.pdf (中文版)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-EN.pdf (英文版)
> +
> +Loongson与LoongArch的Linux内核源码仓库:
> +
> +  https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git
> diff --git a/Documentation/translations/zh_CN/loongarch/irq-chip-model.rst b/Documentation/translations/zh_CN/loongarch/irq-chip-model.rst
> new file mode 100644
> index 000000000000..54c0c9ebac77
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/loongarch/irq-chip-model.rst
> @@ -0,0 +1,167 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: Documentation/loongarch/irq-chip-model.rst
> +:Translator: Huacai Chen <chenhuacai@loongson.cn>
> +
> +==================================
> +LoongArch的IRQ芯片模型(层级关系)
> +==================================
> +
> +目前,基于LoongArch的处理器(如龙芯3A5000)只能与LS7A芯片组配合工作。LoongArch计算机
> +中的中断控制器(即IRQ芯片)包括CPUINTC(CPU Core Interrupt Controller)、LIOINTC(
> +Legacy I/O Interrupt Controller)、EIOINTC(Extended I/O Interrupt Controller)、
> +HTVECINTC(Hyper-Transport Vector Interrupt Controller)、PCH-PIC(LS7A芯片组的主中
> +断控制器)、PCH-LPC(LS7A芯片组的LPC中断控制器)和PCH-MSI(MSI中断控制器)。
> +
> +CPUINTC是一种CPU内部的每个核本地的中断控制器,LIOINTC/EIOINTC/HTVECINTC是CPU内部的
> +全局中断控制器(每个芯片一个,所有核共享),而PCH-PIC/PCH-LPC/PCH-MSI是CPU外部的中
> +断控制器(在配套芯片组里面)。这些中断控制器(或者说IRQ芯片)以一种层次树的组织形式
> +级联在一起,一共有两种层级关系模型(传统IRQ模型和扩展IRQ模型)。
> +
> +传统IRQ模型
> +===========
> +
> +在这种模型里面,IPI(Inter-Processor Interrupt)和CPU本地时钟中断直接发送到CPUINTC,
> +CPU串口(UARTs)中断发送到LIOINTC,而其他所有设备的中断则分别发送到所连接的PCH-PIC/
> +PCH-LPC/PCH-MSI,然后被HTVECINTC统一收集,再发送到LIOINTC,最后到达CPUINTC。
> +
> + +---------------------------------------------+
> + |::                                           |
> + |                                             |
> + |    +-----+     +---------+     +-------+    |
> + |    | IPI | --> | CPUINTC | <-- | Timer |    |
> + |    +-----+     +---------+     +-------+    |
> + |                     ^                       |
> + |                     |                       |
> + |                +---------+     +-------+    |
> + |                | LIOINTC | <-- | UARTs |    |
> + |                +---------+     +-------+    |
> + |                     ^                       |
> + |                     |                       |
> + |               +-----------+                 |
> + |               | HTVECINTC |                 |
> + |               +-----------+                 |
> + |                ^         ^                  |
> + |                |         |                  |
> + |          +---------+ +---------+            |
> + |          | PCH-PIC | | PCH-MSI |            |
> + |          +---------+ +---------+            |
> + |            ^     ^           ^              |
> + |            |     |           |              |
> + |    +---------+ +---------+ +---------+      |
> + |    | PCH-LPC | | Devices | | Devices |      |
> + |    +---------+ +---------+ +---------+      |
> + |         ^                                   |
> + |         |                                   |
> + |    +---------+                              |
> + |    | Devices |                              |
> + |    +---------+                              |
> + |                                             |
> + |                                             |
> + +---------------------------------------------+
> +
> +扩展IRQ模型
> +===========
> +
> +在这种模型里面,IPI(Inter-Processor Interrupt)和CPU本地时钟中断直接发送到CPUINTC,
> +CPU串口(UARTs)中断发送到LIOINTC,而其他所有设备的中断则分别发送到所连接的PCH-PIC/
> +PCH-LPC/PCH-MSI,然后被EIOINTC统一收集,再直接到达CPUINTC。
> +
> + +--------------------------------------------------------+
> + |::                                                      |
> + |                                                        |
> + |         +-----+     +---------+     +-------+          |
> + |         | IPI | --> | CPUINTC | <-- | Timer |          |
> + |         +-----+     +---------+     +-------+          |
> + |                      ^       ^                         |
> + |                      |       |                         |
> + |               +---------+ +---------+     +-------+    |
> + |               | EIOINTC | | LIOINTC | <-- | UARTs |    |
> + |               +---------+ +---------+     +-------+    |
> + |                ^       ^                               |
> + |                |       |                               |
> + |         +---------+ +---------+                        |
> + |         | PCH-PIC | | PCH-MSI |                        |
> + |         +---------+ +---------+                        |
> + |           ^     ^           ^                          |
> + |           |     |           |                          |
> + |   +---------+ +---------+ +---------+                  |
> + |   | PCH-LPC | | Devices | | Devices |                  |
> + |   +---------+ +---------+ +---------+                  |
> + |        ^                                               |
> + |        |                                               |
> + |   +---------+                                          |
> + |   | Devices |                                          |
> + |   +---------+                                          |
> + |                                                        |
> + |                                                        |
> + +--------------------------------------------------------+
> +
> +ACPI相关的定义
> +==============
> +
> +CPUINTC::
> +
> +  ACPI_MADT_TYPE_CORE_PIC;
> +  struct acpi_madt_core_pic;
> +  enum acpi_madt_core_pic_version;
> +
> +LIOINTC::
> +
> +  ACPI_MADT_TYPE_LIO_PIC;
> +  struct acpi_madt_lio_pic;
> +  enum acpi_madt_lio_pic_version;
> +
> +EIOINTC::
> +
> +  ACPI_MADT_TYPE_EIO_PIC;
> +  struct acpi_madt_eio_pic;
> +  enum acpi_madt_eio_pic_version;
> +
> +HTVECINTC::
> +
> +  ACPI_MADT_TYPE_HT_PIC;
> +  struct acpi_madt_ht_pic;
> +  enum acpi_madt_ht_pic_version;
> +
> +PCH-PIC::
> +
> +  ACPI_MADT_TYPE_BIO_PIC;
> +  struct acpi_madt_bio_pic;
> +  enum acpi_madt_bio_pic_version;
> +
> +PCH-MSI::
> +
> +  ACPI_MADT_TYPE_MSI_PIC;
> +  struct acpi_madt_msi_pic;
> +  enum acpi_madt_msi_pic_version;
> +
> +PCH-LPC::
> +
> +  ACPI_MADT_TYPE_LPC_PIC;
> +  struct acpi_madt_lpc_pic;
> +  enum acpi_madt_lpc_pic_version;
> +
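> +References
> +==========
> +
> +Documentation of Loongson-3A5000:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-CN.pdf (in Chinese)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-EN.pdf (in English)
> +
> +Documentation of Loongson's LS7A chipset:
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-CN.pdf (in Chinese)
> +
> +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-EN.pdf (in English)
> +
> +Note: CPUINTC is the CSR.ECFG/CSR.ESTAT registers and their interrupt control
> +logic described in Section 7.4 of "LoongArch Reference Manual, Vol 1"; LIOINTC
> +is the "Legacy I/O Interrupts" described in Section 11.1 of "Loongson 3A5000
> +Processor Reference Manual"; EIOINTC is the "Extended I/O Interrupts" described
> +in Section 11.2 of "Loongson 3A5000 Processor Reference Manual"; HTVECINTC is
> +the "HyperTransport Interrupts" described in Section 14.3 of "Loongson 3A5000
> +Processor Reference Manual"; PCH-PIC/PCH-MSI is the "Interrupt Controller"
> +described in Chapter 5 of "Loongson 7A1000 Bridge User Manual"; PCH-LPC is the
> +"LPC Interrupts" described in Section 24.3 of "Loongson 7A1000 Bridge User
> +Manual".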


* Re: [PATCH V9 03/24] LoongArch: Add elf-related definitions
  2022-04-30  9:04 ` [PATCH V9 03/24] LoongArch: Add elf-related definitions Huacai Chen
@ 2022-05-01  9:41   ` WANG Xuerui
  2022-05-01 14:27     ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: WANG Xuerui @ 2022-05-01  9:41 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang

Hi,

The commit message title should say "ELF" (proper capitalization).

On 4/30/22 17:04, Huacai Chen wrote:
> Add elf-related definitions for LoongArch, including: EM_LOONGARCH,
> KEXEC_ARCH_LOONGARCH, AUDIT_ARCH_LOONGARCH32, AUDIT_ARCH_LOONGARCH64
> and NT_LOONGARCH_*.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   include/uapi/linux/audit.h  | 2 ++
>   include/uapi/linux/elf-em.h | 1 +
>   include/uapi/linux/elf.h    | 5 +++++
>   include/uapi/linux/kexec.h  | 1 +
>   scripts/sorttable.c         | 5 +++++
>   5 files changed, 14 insertions(+)
>
> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> index 8eda133ca4c1..7c1dc818b1d5 100644
> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -439,6 +439,8 @@ enum {
>   #define AUDIT_ARCH_UNICORE	(EM_UNICORE|__AUDIT_ARCH_LE)
>   #define AUDIT_ARCH_X86_64	(EM_X86_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
>   #define AUDIT_ARCH_XTENSA	(EM_XTENSA)
> +#define AUDIT_ARCH_LOONGARCH32	(EM_LOONGARCH|__AUDIT_ARCH_LE)
> +#define AUDIT_ARCH_LOONGARCH64	(EM_LOONGARCH|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
>   
>   #define AUDIT_PERM_EXEC		1
>   #define AUDIT_PERM_WRITE	2
> diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h
> index f47e853546fa..ef38c2bc5ab7 100644
> --- a/include/uapi/linux/elf-em.h
> +++ b/include/uapi/linux/elf-em.h
> @@ -51,6 +51,7 @@
>   #define EM_RISCV	243	/* RISC-V */
>   #define EM_BPF		247	/* Linux BPF - in-kernel virtual machine */
>   #define EM_CSKY		252	/* C-SKY */
> +#define EM_LOONGARCH	258	/* LoongArch */
>   #define EM_FRV		0x5441	/* Fujitsu FR-V */
>   
>   /*
> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
> index 7ce993e6786c..1e0ae3f554f6 100644
> --- a/include/uapi/linux/elf.h
> +++ b/include/uapi/linux/elf.h
> @@ -436,6 +436,11 @@ typedef struct elf64_shdr {
>   #define NT_MIPS_DSP	0x800		/* MIPS DSP ASE registers */
>   #define NT_MIPS_FP_MODE	0x801		/* MIPS floating-point mode */
>   #define NT_MIPS_MSA	0x802		/* MIPS SIMD registers */
> +#define NT_LOONGARCH_CPUCFG	0xa00	/* LoongArch CPU config registers */
> +#define NT_LOONGARCH_CSR	0xa01	/* LoongArch control and status registers */
> +#define NT_LOONGARCH_LSX	0xa02	/* LoongArch Loongson SIMD Extension registers */
> +#define NT_LOONGARCH_LASX	0xa03	/* LoongArch Loongson Advanced SIMD Extension registers */
> +#define NT_LOONGARCH_LBT	0xa04	/* LoongArch Loongson Binary Translation registers */
These are named NT_LARCH_* in the binutils source; better to keep the names consistent?
>   
>   /* Note types with note name "GNU" */
>   #define NT_GNU_PROPERTY_TYPE_0	5
> diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> index fb7e2ef60825..981016e05cfa 100644
> --- a/include/uapi/linux/kexec.h
> +++ b/include/uapi/linux/kexec.h
> @@ -43,6 +43,7 @@
>   #define KEXEC_ARCH_MIPS    ( 8 << 16)
>   #define KEXEC_ARCH_AARCH64 (183 << 16)
>   #define KEXEC_ARCH_RISCV   (243 << 16)
> +#define KEXEC_ARCH_LOONGARCH	(258 << 16)
>   
>   /* The artificial cap on the number of segments passed to kexec_load. */
>   #define KEXEC_SEGMENT_MAX 16
> diff --git a/scripts/sorttable.c b/scripts/sorttable.c
> index d00504c5f530..fba40e99f354 100644
> --- a/scripts/sorttable.c
> +++ b/scripts/sorttable.c
> @@ -60,6 +60,10 @@
>   #define EM_RISCV	243
>   #endif
>   
> +#ifndef EM_LOONGARCH
> +#define EM_LOONGARCH	258
> +#endif
> +
>   static uint32_t (*r)(const uint32_t *);
>   static uint16_t (*r2)(const uint16_t *);
>   static uint64_t (*r8)(const uint64_t *);
> @@ -313,6 +317,7 @@ static int do_file(char const *const fname, void *addr)
>   	case EM_ARCOMPACT:
>   	case EM_ARCV2:
>   	case EM_ARM:
> +	case EM_LOONGARCH:
>   	case EM_MICROBLAZE:
>   	case EM_MIPS:
>   	case EM_XTENSA:


* Re: [PATCH V9 05/24] LoongArch: Add build infrastructure
  2022-04-30  9:04 ` [PATCH V9 05/24] LoongArch: Add build infrastructure Huacai Chen
@ 2022-05-01 10:09   ` WANG Xuerui
  2022-05-01 12:41     ` Huacai Chen
  2022-05-01 15:43     ` Xi Ruoyao
  0 siblings, 2 replies; 94+ messages in thread
From: WANG Xuerui @ 2022-05-01 10:09 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang


On 4/30/22 17:04, Huacai Chen wrote:
> This patch adds Kbuild, Makefile, Kconfig and link script for LoongArch
> build infrastructure.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   arch/loongarch/.gitignore              |   9 +
>   arch/loongarch/Kbuild                  |   3 +
>   arch/loongarch/Kconfig                 | 351 +++++++++++++++++++++++++
>   arch/loongarch/Kconfig.debug           |   0
>   arch/loongarch/Makefile                |  99 +++++++
>   arch/loongarch/include/asm/Kbuild      |  29 ++
>   arch/loongarch/include/uapi/asm/Kbuild |   2 +
>   arch/loongarch/kernel/Makefile         |  22 ++
>   arch/loongarch/kernel/vmlinux.lds.S    | 100 +++++++
>   arch/loongarch/lib/Makefile            |   7 +
>   arch/loongarch/mm/Makefile             |   9 +
>   arch/loongarch/pci/Makefile            |   7 +
>   scripts/subarch.include                |   2 +-
>   13 files changed, 639 insertions(+), 1 deletion(-)
>   create mode 100644 arch/loongarch/.gitignore
>   create mode 100644 arch/loongarch/Kbuild
>   create mode 100644 arch/loongarch/Kconfig
>   create mode 100644 arch/loongarch/Kconfig.debug
>   create mode 100644 arch/loongarch/Makefile
>   create mode 100644 arch/loongarch/include/asm/Kbuild
>   create mode 100644 arch/loongarch/include/uapi/asm/Kbuild
>   create mode 100644 arch/loongarch/kernel/Makefile
>   create mode 100644 arch/loongarch/kernel/vmlinux.lds.S
>   create mode 100644 arch/loongarch/lib/Makefile
>   create mode 100644 arch/loongarch/mm/Makefile
>   create mode 100644 arch/loongarch/pci/Makefile
>
> diff --git a/arch/loongarch/.gitignore b/arch/loongarch/.gitignore
> new file mode 100644
> index 000000000000..fd88d21e7172
> --- /dev/null
> +++ b/arch/loongarch/.gitignore
> @@ -0,0 +1,9 @@
> +*.lds
> +*.raw
> +calc_vmlinuz_load_addr
> +elf-entry
> +relocs
> +vmlinux*
> +vmlinuz*
> +
> +!kernel/vmlinux.lds.S
This exclude entry is unnecessary?
> diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> new file mode 100644
> index 000000000000..1ad35aabdd16
> --- /dev/null
> +++ b/arch/loongarch/Kbuild
> @@ -0,0 +1,3 @@
> +obj-y += kernel/
> +obj-y += mm/
> +obj-y += vdso/
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> new file mode 100644
> index 000000000000..44b763046893
> --- /dev/null
> +++ b/arch/loongarch/Kconfig
> @@ -0,0 +1,351 @@
> +# SPDX-License-Identifier: GPL-2.0
> +config LOONGARCH
> +	bool
> +	default y
> +	select ACPI_MCFG if ACPI
> +	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
> +	select ARCH_BINFMT_ELF_STATE
> +	select ARCH_ENABLE_MEMORY_HOTPLUG
> +	select ARCH_ENABLE_MEMORY_HOTREMOVE
> +	select ARCH_HAS_ACPI_TABLE_UPGRADE	if ACPI
> +	select ARCH_HAS_PTE_SPECIAL
> +	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
> +	select ARCH_INLINE_READ_LOCK if !PREEMPTION
> +	select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION
> +	select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION
> +	select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION
> +	select ARCH_INLINE_READ_UNLOCK if !PREEMPTION
> +	select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION
> +	select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION
> +	select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION
> +	select ARCH_INLINE_WRITE_LOCK if !PREEMPTION
> +	select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION
> +	select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION
> +	select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION
> +	select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION
> +	select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION
> +	select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION
> +	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION
> +	select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION
> +	select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION
> +	select ARCH_INLINE_SPIN_LOCK if !PREEMPTION
> +	select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION
> +	select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION
> +	select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION
> +	select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION
> +	select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION
> +	select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION
> +	select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION
> +	select ARCH_MIGHT_HAVE_PC_PARPORT
> +	select ARCH_MIGHT_HAVE_PC_SERIO
> +	select ARCH_SPARSEMEM_ENABLE
> +	select ARCH_SUPPORTS_ACPI
> +	select ARCH_SUPPORTS_ATOMIC_RMW
> +	select ARCH_SUPPORTS_HUGETLBFS
> +	select ARCH_USE_BUILTIN_BSWAP
> +	select ARCH_USE_CMPXCHG_LOCKREF
> +	select ARCH_USE_QUEUED_RWLOCKS
> +	select ARCH_USE_QUEUED_SPINLOCKS
> +	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
> +	select ARCH_WANTS_NO_INSTR
> +	select BUILDTIME_TABLE_SORT
> +	select COMMON_CLK
> +	select GENERIC_CLOCKEVENTS
> +	select GENERIC_CMOS_UPDATE
> +	select GENERIC_CPU_AUTOPROBE
> +	select GENERIC_ENTRY
> +	select GENERIC_FIND_FIRST_BIT
> +	select GENERIC_GETTIMEOFDAY
> +	select GENERIC_IRQ_MULTI_HANDLER
> +	select GENERIC_IRQ_PROBE
> +	select GENERIC_IRQ_SHOW
> +	select GENERIC_LIB_ASHLDI3
> +	select GENERIC_LIB_ASHRDI3
> +	select GENERIC_LIB_CMPDI2
> +	select GENERIC_LIB_LSHRDI3
> +	select GENERIC_LIB_UCMPDI2
> +	select GENERIC_PCI_IOMAP
> +	select GENERIC_SCHED_CLOCK
> +	select GENERIC_TIME_VSYSCALL
> +	select GPIOLIB
> +	select HAVE_ARCH_AUDITSYSCALL
> +	select HAVE_ARCH_COMPILER_H
> +	select HAVE_ARCH_MMAP_RND_BITS if MMU
> +	select HAVE_ARCH_SECCOMP_FILTER
> +	select HAVE_ARCH_TRACEHOOK
> +	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> +	select HAVE_ASM_MODVERSIONS
> +	select HAVE_CONTEXT_TRACKING
> +	select HAVE_COPY_THREAD_TLS
> +	select HAVE_DEBUG_KMEMLEAK
> +	select HAVE_DEBUG_STACKOVERFLOW
> +	select HAVE_DMA_CONTIGUOUS
> +	select HAVE_EXIT_THREAD
> +	select HAVE_FAST_GUP
> +	select HAVE_GENERIC_VDSO
> +	select HAVE_IOREMAP_PROT
> +	select HAVE_IRQ_EXIT_ON_IRQ_STACK
> +	select HAVE_IRQ_TIME_ACCOUNTING
> +	select HAVE_MEMBLOCK
> +	select HAVE_MEMBLOCK_NODE_MAP
> +	select HAVE_MOD_ARCH_SPECIFIC
> +	select HAVE_NMI
> +	select HAVE_PCI
> +	select HAVE_PERF_EVENTS
> +	select HAVE_REGS_AND_STACK_ACCESS_API
> +	select HAVE_RSEQ
> +	select HAVE_SYSCALL_TRACEPOINTS
> +	select HAVE_TIF_NOHZ
> +	select HAVE_VIRT_CPU_ACCOUNTING_GEN
> +	select IRQ_FORCED_THREADING
> +	select IRQ_LOONGARCH_CPU
> +	select MODULES_USE_ELF_RELA if MODULES
> +	select PCI
> +	select PCI_DOMAINS_GENERIC
> +	select PCI_ECAM if ACPI
> +	select PCI_MSI_ARCH_FALLBACKS
> +	select PERF_USE_VMALLOC
> +	select RTC_LIB
> +	select SPARSE_IRQ
> +	select SYSCTL_EXCEPTION_TRACE
> +	select SWIOTLB
> +	select TRACE_IRQFLAGS_SUPPORT
> +	select ZONE_DMA32
> +
> +config 32BIT
> +	bool
> +
> +config 64BIT
> +	def_bool y
> +
> +config CPU_HAS_FPU
> +	bool
> +	default y
> +
> +config CPU_HAS_PREFETCH
> +	bool
> +	default y
> +
> +config GENERIC_CALIBRATE_DELAY
> +	def_bool y
> +
> +config GENERIC_CSUM
> +	def_bool y
> +
> +config GENERIC_HWEIGHT
> +	def_bool y
> +
> +config L1_CACHE_SHIFT
> +	int
> +	default "6"
> +
> +config LOCKDEP_SUPPORT
> +	bool
> +	default y
> +
> +config MACH_LOONGSON32
> +	def_bool 32BIT
> +
> +config MACH_LOONGSON64
> +	def_bool 64BIT
These two config symbols are not used anywhere in arch/loongarch, but 
from a quick grep it seems they're sharing the names of the MIPS config 
symbols, on purpose, maybe for sharing code between the MIPS-era 
Loongson models and the LoongArch models. If so, a comment explaining 
this could be beneficial.
> +
> +config PAGE_SIZE_4KB
> +	bool
> +
> +config PAGE_SIZE_16KB
> +	bool
> +
> +config PAGE_SIZE_64KB
> +	bool
> +
> +config PGTABLE_2LEVEL
> +	bool
> +
> +config PGTABLE_3LEVEL
> +	bool
> +
> +config PGTABLE_4LEVEL
> +	bool
> +
> +config PGTABLE_LEVELS
> +	int
> +	default 2 if PGTABLE_2LEVEL
> +	default 3 if PGTABLE_3LEVEL
> +	default 4 if PGTABLE_4LEVEL
> +
> +config SCHED_OMIT_FRAME_POINTER
> +	bool
> +	default y
> +
> +menu "Kernel type"
> +
> +source "kernel/Kconfig.hz"
> +
> +choice
> +	prompt "Page Table Layout"
> +	default 16KB_2LEVEL if 32BIT
> +	default 16KB_3LEVEL if 64BIT
> +	help
> +	  Allows choosing the page table layout, which is a combination
> +	  of page size and page table levels. The virtual memory address
> +	  space bits are determined by the page table layout.
"The size of virtual memory address space"?
> +
> +config 4KB_3LEVEL
> +	bool "4KB with 3 levels"
> +	select PAGE_SIZE_4KB
> +	select PGTABLE_3LEVEL
> +	help
> +	  This option selects 4KB page size with 3 level page tables, which
> +	  support a maximum 39 bits of application virtual memory.
"a maximum of XX bits" -- similarly for all occurrences below.
> +
> +config 4KB_4LEVEL
> +	bool "4KB with 4 levels"
> +	select PAGE_SIZE_4KB
> +	select PGTABLE_4LEVEL
> +	help
> +	  This option selects 4KB page size with 4 level page tables, which
> +	  support a maximum 48 bits of application virtual memory.
> +
> +config 16KB_2LEVEL
> +	bool "16KB with 2 levels"
> +	select PAGE_SIZE_16KB
> +	select PGTABLE_2LEVEL
> +	help
> +	  This option selects 16KB page size with 2 level page tables, which
> +	  support a maximum 36 bits of application virtual memory.
> +
> +config 16KB_3LEVEL
> +	bool "16KB with 3 levels"
> +	select PAGE_SIZE_16KB
> +	select PGTABLE_3LEVEL
> +	help
> +	  This option selects 16KB page size with 3 level page tables, which
> +	  support a maximum 47 bits of application virtual memory.
> +
> +config 64KB_2LEVEL
> +	bool "64KB with 2 levels"
> +	select PAGE_SIZE_64KB
> +	select PGTABLE_2LEVEL
> +	help
> +	  This option selects 64KB page size with 2 level page tables, which
> +	  support a maximum 42 bits of application virtual memory.
> +
> +config 64KB_3LEVEL
> +	bool "64KB with 3 levels"
> +	select PAGE_SIZE_64KB
> +	select PGTABLE_3LEVEL
> +	help
> +	  This option selects 64KB page size with 3 level page tables, which
> +	  support a maximum 55 bits of application virtual memory.
> +
> +endchoice
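
As a side note on where the bit counts in these help texts come from, here
is a minimal sketch of the arithmetic. It assumes 8-byte page table entries
and that every level occupies exactly one page; va_bits() is a hypothetical
helper for illustration only, not something in this patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of virtual address bits covered by a page
 * table layout, assuming 8-byte entries and one page per table at each level.
 */
static inline unsigned int va_bits(unsigned int page_shift, unsigned int levels)
{
	/* A table must hold at least two 8-byte entries. */
	assert(page_shift > 3);

	/*
	 * A page holds 2^(page_shift - 3) entries, so each level resolves
	 * (page_shift - 3) bits; the page offset contributes page_shift bits.
	 */
	return page_shift + levels * (page_shift - 3);
}
```

Under those assumptions, va_bits(14, 3) gives the 47 bits quoted for the
default 16KB/3-level layout, and the other five options work out the same
way.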
> +
> +config DMI
> +	bool "Enable DMI scanning"
> +	select DMI_SCAN_MACHINE_NON_EFI_FALLBACK
> +	default y
> +	help
> +	  Enabled scanning of DMI to identify machine quirks. Say Y
Should be "Enable scanning ..." but the arch/x86 and arch/mips versions 
of this text all have this typo. Might be wise to fix here... then fix 
the other two later.
> +	  here unless you have verified that your setup is not
> +	  affected by entries in the DMI blacklist. Required by PNP
> +	  BIOS code.
Do we have a "PNP BIOS"? I know this is also copied text, but we may 
tweak it to suit our platform.
> +
> +config EFI
> +	bool "EFI runtime service support"
> +	select UCS2_STRING
> +	select EFI_RUNTIME_WRAPPERS
> +	help
> +	  This enables the kernel to use EFI runtime services that are
> +	  available (such as the EFI variable services).
> +
> +	  This option is only useful on systems that have EFI firmware.
> +	  In addition, you should use the latest ELILO loader available
> +	  at <http://elilo.sourceforge.net> in order to take advantage
> +	  of EFI runtime services. However, even with this option, the
Remove mention of ELILO?
> +	  resultant kernel should continue to boot on existing non-EFI
> +	  platforms.
> +
> +config FORCE_MAX_ZONEORDER
> +	int "Maximum zone order"
> +	range 14 64 if PAGE_SIZE_64KB
> +	default "14" if PAGE_SIZE_64KB
> +	range 12 64 if PAGE_SIZE_16KB
> +	default "12" if PAGE_SIZE_16KB
> +	range 11 64
> +	default "11"
> +	help
> +	  The kernel memory allocator divides physically contiguous memory
> +	  blocks into "zones", where each zone is a power of two number of
> +	  pages.  This option selects the largest power of two that the kernel
> +	  keeps in the memory allocator.  If you need to allocate very large
> +	  blocks of physically contiguous memory, then you may need to
> +	  increase this value.
> +
> +	  This config option is actually maximum order plus one. For example,
> +	  a value of 11 means that the largest free memory block is 2^10 pages.
> +
> +	  The page size is not necessarily 4KB.  Keep this in mind
> +	  when choosing a value for this option.
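
To make the "maximum order plus one" remark concrete, a minimal sketch
(max_block_bytes() is a hypothetical helper, not part of the patch) that
turns a page size and a FORCE_MAX_ZONEORDER value into the largest
physically contiguous block size:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of the largest block the allocator
 * keeps, given the page shift and the FORCE_MAX_ZONEORDER value.  The
 * config value is the maximum order plus one, hence the "- 1".
 */
static inline uint64_t max_block_bytes(unsigned int page_shift,
				       unsigned int max_zoneorder)
{
	assert(max_zoneorder >= 1);
	return (uint64_t)1 << (page_shift + max_zoneorder - 1);
}
```

With the defaults in this hunk that is 4 MiB for 4KB pages (order value 11),
32 MiB for 16KB pages (12) and 512 MiB for 64KB pages (14).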
> +
> +config SECCOMP
> +	bool "Enable seccomp to safely compute untrusted bytecode"
> +	depends on PROC_FS
> +	default y
> +	help
> +	  This kernel feature is useful for number crunching applications
> +	  that may need to compute untrusted bytecode during their
> +	  execution. By using pipes or other transports made available to
> +	  the process as file descriptors supporting the read/write
> +	  syscalls, it's possible to isolate those applications in
> +	  their own address space using seccomp. Once seccomp is
> +	  enabled via /proc/<pid>/seccomp, it cannot be disabled
> +	  and the task is only allowed to execute a few safe syscalls
> +	  defined by each seccomp mode.
> +
> +	  If unsure, say Y. Only embedded should say N here.
> +
> +endmenu
> +
> +config ARCH_SELECT_MEMORY_MODEL
> +	def_bool y
> +
> +config ARCH_FLATMEM_ENABLE
> +	def_bool y
> +
> +config ARCH_SPARSEMEM_ENABLE
> +	def_bool y
> +	help
> +	  Say Y to support efficient handling of sparse physical memory,
> +	  for architectures which are either NUMA (Non-Uniform Memory Access)
> +	  or have huge holes in the physical address space for other reasons.
> +	  See <file:Documentation/vm/numa.rst> for more.
> +
> +config ARCH_ENABLE_THP_MIGRATION
> +	def_bool y
> +	depends on TRANSPARENT_HUGEPAGE
> +
> +config ARCH_MEMORY_PROBE
> +	def_bool y
> +	depends on MEMORY_HOTPLUG
> +
> +config MMU
> +	bool
> +	default y
> +
> +config ARCH_MMAP_RND_BITS_MIN
> +	default 12
> +
> +config ARCH_MMAP_RND_BITS_MAX
> +	default 18
> +
> +menu "Bus options"
> +
> +endmenu
> +
> +menu "Power management options"
> +
> +source "drivers/acpi/Kconfig"
> +
> +endmenu
> +
> +source "drivers/firmware/Kconfig"
> diff --git a/arch/loongarch/Kconfig.debug b/arch/loongarch/Kconfig.debug
> new file mode 100644
> index 000000000000..e69de29bb2d1
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> new file mode 100644
> index 000000000000..0a40e79b3265
> --- /dev/null
> +++ b/arch/loongarch/Makefile
> @@ -0,0 +1,99 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Author: Huacai Chen <chenhuacai@loongson.cn>
> +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> +
> +#
> +# Select the object file format to substitute into the linker script.
> +#
> +64bit-tool-archpref	= loongarch64
> +32bit-bfd		= elf32-loongarch
> +64bit-bfd		= elf64-loongarch
> +32bit-emul		= elf32loongarch
> +64bit-emul		= elf64loongarch
> +
> +ifdef CONFIG_64BIT
> +tool-archpref		= $(64bit-tool-archpref)
> +UTS_MACHINE		:= loongarch64
> +endif
> +
> +ifneq ($(SUBARCH),$(ARCH))
> +  ifeq ($(CROSS_COMPILE),)
> +    CROSS_COMPILE := $(call cc-cross-prefix, $(tool-archpref)-linux-  $(tool-archpref)-linux-gnu-  $(tool-archpref)-unknown-linux-gnu-)
> +  endif
> +endif
> +
> +cflags-y += $(call cc-option, -mno-check-zero-division)
> +
> +ifdef CONFIG_64BIT
> +ld-emul			= $(64bit-emul)
> +cflags-y		+= -mabi=lp64s
> +endif
> +
> +all-y			:= vmlinux
> +
> +#
> +# GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
> +# code since it only slows down the whole thing.  At some point we might make
> +# use of global pointer optimizations but their use of $r2 conflicts with
> +# the current pointer optimization.
LoongArch doesn't have any notion of "abicalls"; please remove this MIPS
legacy comment entirely, or at least replace it with something suitable for
LoongArch.
> +#
> +cflags-y			+= -G0 -pipe
> +cflags-y			+= -msoft-float
> +LDFLAGS_vmlinux			+= -G0 -static -n -nostdlib
> +KBUILD_AFLAGS_KERNEL		+= -Wa,-mla-global-with-pcrel
> +KBUILD_CFLAGS_KERNEL		+= -Wa,-mla-global-with-pcrel
> +KBUILD_AFLAGS_MODULE		+= -Wa,-mla-global-with-abs
> +KBUILD_CFLAGS_MODULE		+= -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
These switches are the ones that should receive more love via 
comments... they are here to tell the assembler to emit the "la.global" 
and "la.local" pseudo-insns in a particular "flavor". Why not simply use 
the default? This needs explanation!
> +
> +cflags-y += -ffreestanding
> +cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
Unfortunately we're still working around the LL/SC hardware issue even
after migrating to LoongArch... it might be better to add a comment here too
(something along the lines of "we work around the issue manually in the
handwritten assembly, so no automatic workarounds should kick in").
> +
> +load-y		= 0x9000000000200000
> +bootvars-y	= VMLINUX_LOAD_ADDRESS=$(load-y)
> +
> +drivers-$(CONFIG_PCI)		+= arch/loongarch/pci/
> +
> +KBUILD_AFLAGS	+= $(cflags-y)
> +KBUILD_CFLAGS	+= $(cflags-y)
> +KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y)
> +
> +# This is required to get dwarf unwinding tables into .debug_frame
> +# instead of .eh_frame so we don't discard them.
> +KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
> +KBUILD_CFLAGS += -isystem $(shell $(CC) -print-file-name=include)
> +KBUILD_CFLAGS += $(call cc-option,-mstrict-align)
Please explain the reason for this -mstrict-align request: not all
LoongArch cores support unaligned accesses, and as the kernel we can't rely
on anyone else to provide emulation for them.
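
To illustrate the kind of access at stake, a minimal sketch in plain
standard C of a load that stays safe on a core without unaligned access
support. load_u32_unaligned() is a hypothetical name used only here; in the
kernel proper the get_unaligned() family serves this purpose.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 32-bit value from an address that may not be
 * 4-byte aligned.  Going through memcpy() lets the compiler pick a sequence
 * that is legal for the target instead of assuming an aligned word load.
 */
static inline uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

A direct *(uint32_t *)p on a misaligned pointer is the pattern that would
trap on such cores with nobody below the kernel to fix it up.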
> +
> +KBUILD_LDFLAGS	+= -m $(ld-emul)
> +
> +ifdef CONFIG_LOONGARCH
> +CHECKFLAGS += $(shell $(CC) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
> +	egrep -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \
> +	sed -e "s/^\#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/\$$/&&/g')
> +endif
> +
> +head-y := arch/loongarch/kernel/head.o
> +
> +libs-y += arch/loongarch/lib/
> +
> +prepare: vdso_prepare
> +vdso_prepare: prepare0
> +	$(Q)$(MAKE) $(build)=arch/loongarch/vdso include/generated/vdso-offsets.h
> +
> +PHONY += vdso_install
> +vdso_install:
> +	$(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
> +
> +all:	$(all-y)
> +
> +CLEAN_FILES += vmlinux
> +
> +install:
> +	$(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> +	$(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> +	$(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> +
> +define archhelp
> +	echo '  install              - install kernel into $(INSTALL_PATH)'
> +	echo
> +endef
> diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
> new file mode 100644
> index 000000000000..a0eed6076c79
> --- /dev/null
> +++ b/arch/loongarch/include/asm/Kbuild
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: GPL-2.0
> +generic-y += dma-contiguous.h
> +generic-y += export.h
> +generic-y += mcs_spinlock.h
> +generic-y += parport.h
> +generic-y += early_ioremap.h
> +generic-y += qrwlock.h
> +generic-y += qspinlock.h
> +generic-y += rwsem.h
> +generic-y += segment.h
> +generic-y += user.h
> +generic-y += stat.h
> +generic-y += fcntl.h
> +generic-y += ioctl.h
> +generic-y += ioctls.h
> +generic-y += mman.h
> +generic-y += msgbuf.h
> +generic-y += sembuf.h
> +generic-y += shmbuf.h
> +generic-y += statfs.h
> +generic-y += socket.h
> +generic-y += sockios.h
> +generic-y += termios.h
> +generic-y += termbits.h
> +generic-y += poll.h
> +generic-y += param.h
> +generic-y += posix_types.h
> +generic-y += resource.h
> +generic-y += kvm_para.h
> diff --git a/arch/loongarch/include/uapi/asm/Kbuild b/arch/loongarch/include/uapi/asm/Kbuild
> new file mode 100644
> index 000000000000..4aa680ca2e5f
> --- /dev/null
> +++ b/arch/loongarch/include/uapi/asm/Kbuild
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +generic-y += kvm_para.h
> diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
> new file mode 100644
> index 000000000000..ead27a11e8e0
> --- /dev/null
> +++ b/arch/loongarch/kernel/Makefile
> @@ -0,0 +1,22 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for the Linux/LoongArch kernel.
> +#
> +
> +extra-y		:= head.o vmlinux.lds
> +
> +obj-y		+= cpu-probe.o cacheinfo.o cmdline.o env.o setup.o entry.o genex.o \
> +		   traps.o irq.o idle.o process.o dma.o mem.o io.o reset.o switch.o \
> +		   elf.o rtc.o syscall.o signal.o time.o topology.o cmpxchg.o \
> +		   inst.o ptrace.o vdso.o
> +
> +obj-$(CONFIG_ACPI)		+= acpi.o
> +obj-$(CONFIG_EFI) 		+= efi.o
> +
> +obj-$(CONFIG_CPU_HAS_FPU)	+= fpu.o
> +
> +obj-$(CONFIG_MODULES)		+= module.o module-sections.o
> +
> +obj-$(CONFIG_PROC_FS)		+= proc.o
> +
> +CPPFLAGS_vmlinux.lds		:= $(KBUILD_CFLAGS)
> diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> new file mode 100644
> index 000000000000..02abfaaa4892
> --- /dev/null
> +++ b/arch/loongarch/kernel/vmlinux.lds.S
> @@ -0,0 +1,100 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#include <linux/sizes.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/thread_info.h>
> +
> +#define PAGE_SIZE _PAGE_SIZE
> +
> +/*
> + * Put .bss..swapper_pg_dir as the first thing in .bss. This will
> + * ensure that it has .bss alignment (64K).
> + */
> +#define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
> +
> +#include <asm-generic/vmlinux.lds.h>
> +
> +OUTPUT_ARCH(loongarch)
> +ENTRY(kernel_entry)
> +PHDRS {
> +	text PT_LOAD FLAGS(7);	/* RWX */
> +	note PT_NOTE FLAGS(4);	/* R__ */
> +}
> +
> +jiffies	 = jiffies_64;
> +
> +SECTIONS
> +{
> +	. = VMLINUX_LOAD_ADDRESS;
> +
> +	_text = .;
> +	.text : {
> +		TEXT_TEXT
> +		SCHED_TEXT
> +		CPUIDLE_TEXT
> +		LOCK_TEXT
> +		KPROBES_TEXT
> +		IRQENTRY_TEXT
> +		SOFTIRQENTRY_TEXT
> +		*(.fixup)
> +		*(.gnu.warning)
> +	} :text = 0
> +	_etext = .;
> +
> +	EXCEPTION_TABLE(16)
> +
> +	. = ALIGN(PAGE_SIZE);
> +	__init_begin = .;
> +	__inittext_begin = .;
> +
> +	INIT_TEXT_SECTION(PAGE_SIZE)
> +	.exit.text : {
> +		EXIT_TEXT
> +	}
> +
> +	__inittext_end = .;
> +
> +	__initdata_begin = .;
> +
> +	INIT_DATA_SECTION(16)
> +	.exit.data : {
> +		EXIT_DATA
> +	}
> +
> +	__initdata_end = .;
> +
> +	__init_end = .;
> +
> +	_sdata = .;
> +	RO_DATA(4096)
> +	RW_DATA(1 << CONFIG_L1_CACHE_SHIFT, PAGE_SIZE, THREAD_SIZE)
> +
> +	.sdata : {
> +		*(.sdata)
> +	}
> +
> +	. = ALIGN(SZ_64K);
> +	_edata =  .;
> +
> +	BSS_SECTION(0, SZ_64K, 8)
> +
> +	_end = .;
> +
> +	STABS_DEBUG
> +	DWARF_DEBUG
> +
> +	.gptab.sdata : {
> +		*(.gptab.data)
> +		*(.gptab.sdata)
> +	}
> +	.gptab.sbss : {
> +		*(.gptab.bss)
> +		*(.gptab.sbss)
> +	}
> +
> +	DISCARDS
> +	/DISCARD/ : {
> +		*(.gnu.attributes)
> +		*(.options)
> +		*(.eh_frame)
> +	}
> +}
> diff --git a/arch/loongarch/lib/Makefile b/arch/loongarch/lib/Makefile
> new file mode 100644
> index 000000000000..7f32f3e4a6ec
> --- /dev/null
> +++ b/arch/loongarch/lib/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for LoongArch-specific library files..
One extra period at end of line.
> +#
> +
> +lib-y	+= delay.o memset.o memcpy.o memmove.o \
> +	   clear_user.o copy_user.o dump_tlb.o
> diff --git a/arch/loongarch/mm/Makefile b/arch/loongarch/mm/Makefile
> new file mode 100644
> index 000000000000..8ffc6383f836
> --- /dev/null
> +++ b/arch/loongarch/mm/Makefile
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for the Linux/LoongArch-specific parts of the memory manager.
> +#
> +
> +obj-y				+= init.o cache.o tlb.o tlbex.o extable.o \
> +				   fault.o ioremap.o maccess.o mmap.o pgtable.o page.o
> +
> +obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
> diff --git a/arch/loongarch/pci/Makefile b/arch/loongarch/pci/Makefile
> new file mode 100644
> index 000000000000..8101ef3df71c
> --- /dev/null
> +++ b/arch/loongarch/pci/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for the PCI specific kernel interface routines under Linux.
> +#
> +
> +obj-y				+= pci.o
> +obj-$(CONFIG_ACPI)		+= acpi.o
> diff --git a/scripts/subarch.include b/scripts/subarch.include
> index 776849a3c500..4bd327d0ae42 100644
> --- a/scripts/subarch.include
> +++ b/scripts/subarch.include
> @@ -10,4 +10,4 @@ SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
>   				  -e s/s390x/s390/ \
>   				  -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
>   				  -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ \
> -				  -e s/riscv.*/riscv/)
> +				  -e s/riscv.*/riscv/ -e s/loongarch.*/loongarch/)


* Re: [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations
  2022-05-01  9:32   ` WANG Xuerui
@ 2022-05-01 10:17     ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01 10:17 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Jiaxun Yang

Hi, Xuerui,

On Sun, May 1, 2022 at 5:32 PM WANG Xuerui <kernel@xen0n.name> wrote:
>
> Hi,
>
> Here's some rough review on the documentation bits, both semantic-wise
> and English-wise; I'm not native English speaker though, so more eyes
> are welcome.
>
>
> On 4/30/22 17:04, Huacai Chen wrote:
> > Add some basic documentation for LoongArch. LoongArch is a new RISC ISA,
> > which is a bit like MIPS or RISC-V. LoongArch includes a reduced 32-bit
> > version (LA32R), a standard 32-bit version (LA32S) and a 64-bit version
> > (LA64).
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >   Documentation/arch.rst                     |   1 +
> >   Documentation/loongarch/features.rst       |   3 +
> >   Documentation/loongarch/index.rst          |  21 ++
> >   Documentation/loongarch/introduction.rst   | 345 +++++++++++++++++++++
> >   Documentation/loongarch/irq-chip-model.rst | 168 ++++++++++
> >   5 files changed, 538 insertions(+)
> >   create mode 100644 Documentation/loongarch/features.rst
> >   create mode 100644 Documentation/loongarch/index.rst
> >   create mode 100644 Documentation/loongarch/introduction.rst
> >   create mode 100644 Documentation/loongarch/irq-chip-model.rst
> >
> > diff --git a/Documentation/arch.rst b/Documentation/arch.rst
> > index 14bcd8294b93..41a66a8b38e4 100644
> > --- a/Documentation/arch.rst
> > +++ b/Documentation/arch.rst
> > @@ -13,6 +13,7 @@ implementation.
> >      arm/index
> >      arm64/index
> >      ia64/index
> > +   loongarch/index
> >      m68k/index
> >      mips/index
> >      nios2/index
> > diff --git a/Documentation/loongarch/features.rst b/Documentation/loongarch/features.rst
> > new file mode 100644
> > index 000000000000..ebacade3ea45
> > --- /dev/null
> > +++ b/Documentation/loongarch/features.rst
> > @@ -0,0 +1,3 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +.. kernel-feat:: $srctree/Documentation/features loongarch
> > diff --git a/Documentation/loongarch/index.rst b/Documentation/loongarch/index.rst
> > new file mode 100644
> > index 000000000000..d127e07a7ed3
> > --- /dev/null
> > +++ b/Documentation/loongarch/index.rst
> > @@ -0,0 +1,21 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +================================
> > +LoongArch-specific Documentation
> > +================================
> > +
> > +.. toctree::
> > +   :maxdepth: 2
> > +   :numbered:
> > +
> > +   introduction
> > +   irq-chip-model
> > +
> > +   features
> > +
> > +.. only::  subproject and html
> > +
> > +   Indices
> > +   =======
> > +
> > +   * :ref:`genindex`
> > diff --git a/Documentation/loongarch/introduction.rst b/Documentation/loongarch/introduction.rst
> > new file mode 100644
> > index 000000000000..420c0d2ebcfb
> > --- /dev/null
> > +++ b/Documentation/loongarch/introduction.rst
> > @@ -0,0 +1,345 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +=========================
> > +Introduction of LoongArch
> > +=========================
> > +
> > +LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch
> > +includes a reduced 32-bit version (LA32R), a standard 32-bit version (LA32S)
> > +and a 64-bit version (LA64). LoongArch has 4 privilege levels (PLV0~PLV3),
> > +PLV0 is the highest level which used by kernel, and PLV3 is the lowest level
> > +which used by applications. This document introduces the registers, basic
>
> The sentence is a bit malformed; better to reword it into two sentences:
>
> "There are 4 privilege levels (PLVs) defined in LoongArch: PLV0~PLV3,
> from high to low. The kernel runs at PLV0 while applications run at PLV3."
>
> > +instruction set, virtual memory and some other topics of LoongArch.
> > +
> > +Registers
> > +=========
> > +
> > +LoongArch registers include general purpose registers (GPRs), floating point
> > +registers (FPRs), vector registers (VRs) and control status registers (CSRs)
> > +used in privileged mode (PLV0).
> Aren't privilege levels other than PLV0 also able to use CSRs?
> > +
> > +GPRs
> > +----
> > +
> > +LoongArch has 32 GPRs ($r0 - $r31), each one is 32bit wide in LA32 and 64bit
> > +wide in LA64. $r0 is always zero, and other registers has no special feature,
>
> "while other registers are not special"
>
> But again, this is not technically true; $r1 ($ra) *is* architecturally
> special, in that the BL instruction has it hard-wired as the link
> register. This sentence may need a little tweak but I currently don't
> have a concrete suggestion.
>
> > +but we actually have an ABI register convention as below.
>
> We may link to the official psABI specification now. When this port was
> first announced the documentation was not yet ready, but we now have it
> at [1], so by referring to the official bits we can avoid stale
> descriptions like...
>
> [1]:
> https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
>
> > +
> > +================= =============== =================== ============
> > +Name              Alias           Usage               Preserved
> > +                                                      across calls
> > +================= =============== =================== ============
> > +``$r0``           ``$zero``       Constant zero       Unused
> > +``$r1``           ``$ra``         Return address      No
> > +``$r2``           ``$tp``         TLS                 Unused
> > +``$r3``           ``$sp``         Stack pointer       Yes
> > +``$r4``-``$r11``  ``$a0``-``$a7`` Argument registers  No
> > +``$r4``-``$r5``   ``$v0``-``$v1`` Return value        No
> ... this (the ABI alias is deprecated in the latest spec), and...
> > +``$r12``-``$r20`` ``$t0``-``$t8`` Temp registers      No
> > +``$r21``          ``$x``          Reserved            Unused
> ... this (for one thing, the alias is entirely removed in the latest
> spec; for another, the kernel does make use of this register), and...
> > +``$r22``          ``$fp``         Frame pointer       Yes
> ... this (this can also be called $s9 when we don't care about or make
> use of its frame-pointer nature).
> > +``$r23``-``$r31`` ``$s0``-``$s8`` Static registers    Yes
> > +================= =============== =================== ============
> And as described above, while the $r21 is reserved in the userspace ABI,
> this port does make use of it (as the percpu base register); so we'd
> better mention this too.
> > +
> > +FPRs
> > +----
> > +
> > +LoongArch has 32 FPRs ($f0 - $f31), each one is 64bit wide. We also have an
> "each one is 64bit wide" -- what about the possible LA32 and LA64
> distinction, as similarly shown in the GPR section?
> > +ABI register convention as below.
> > +
> > +================= ================== =================== ============
> > +Name              Alias              Usage               Preserved
> > +                                                         across calls
> > +================= ================== =================== ============
> > +``$f0``-``$f7``   ``$fa0``-``$fa7``  Argument registers  No
> > +``$f0``-``$f1``   ``$fv0``-``$fv1``  Return value        No
> Same here -- the $vX and $fvX aliases are deprecated.
> > +``$f8``-``$f23``  ``$ft0``-``$ft15`` Temp registers      No
> > +``$f24``-``$f31`` ``$fs0``-``$fs7``  Static registers    Yes
> > +================= ================== =================== ============
> > +
> > +VRs
> > +----
> > +
> > +LoongArch has 128bit vector extension (LSX, short for Loongson SIMD eXtension)
> > +and 256bit vector extension (LASX, short for Loongson Advanced SIMD eXtension).
> > +There are also 32 vector registers, for LSX is $v0 - $v31, and for LASX is $x0
> > +- $x31. FPRs and VRs are reused, e.g. the lower 128bits of $x0 is $v0, and the
>
> "for LSX is ..." -- Chinglish; "$v0 ~ $v31 for LSX and $x0 ~ $x31 for
> LASX" would be better.
>
> Also, see what you did here with "$vX"? I know the older names are
> "$vrX" and "$xrX", but the latest reference manual already switched to
> the current naming, so you really can't just continue using "$v[01]" for
> "$a[01]" any more. ;-)
>
> "FPRs and VRs are reused" -- the word "overlap" is better, "FPRs and VRs
> overlap; the FPRs share the same storage as VR's lower bits" might be a
> better expression.
>
> > +lower 64bits of $v0 is $f0, etc.
> > +
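To make the overlap concrete, here is a minimal sketch that models one
register slot as a C union. The type and helper names are hypothetical and
not part of the patch; the only assumption is that index 0 holds the lowest
64 bits, as the paragraph above describes.

```c
#include <stdint.h>

/* Hypothetical model of one 256-bit LASX register slot. */
union vreg {
	uint64_t x[4];	/* $xN: all 256 bits, x[0] is the lowest 64 bits */
	uint64_t v[2];	/* $vN: the lower 128 bits of $xN */
	uint64_t f;	/* $fN: the lower 64 bits of $vN */
};

/* Writing the FPR view changes the low 64 bits of the vector views. */
static inline void vreg_write_fpr(union vreg *r, uint64_t val)
{
	r->f = val;
}

/* Read the lowest 64-bit lane through the LSX view. */
static inline uint64_t vreg_read_lsx_low(const union vreg *r)
{
	return r->v[0];
}
```

A value stored through the FPR view is visible through both vector views,
which is the sharing that the text is trying to describe.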
> > +CSRs
> > +----
> > +
> > +CSRs can only be used in privileged mode (PLV0):
> > +
> > +================= ===================================== ==============
> > +Address           Full Name                             Abbrev Name
> > +================= ===================================== ==============
> > +0x0               Current Mode information              CRMD
> > +0x1               Pre-exception Mode information        PRMD
> is the word "information" needed?
> > +0x2               Extended Unit Enable                  EUEN
> > +0x3               Miscellaneous controller              MISC
> "controller"? just remove the word or "control" would be better.
> > +0x4               Exception Configuration               ECFG
> > +0x5               Exception Status                      ESTAT
> > +0x6               Exception Return Address              ERA
> > +0x7               Bad Virtual Address                   BADV
> > +0x8               Bad Instruction                       BADI
> > +0xC               Exception Entry Base address          EENTRY
> > +0x10              TLB Index                             TLBIDX
> > +0x11              TLB Entry High-order bits             TLBEHI
> > +0x12              TLB Entry Low-order bits 0            TLBELO0
> > +0x13              TLB Entry Low-order bits 1            TLBELO1
> > +0x18              Address Space Identifier              ASID
> > +0x19              Page Global Directory address for     PGDL
> > +                  Lower half address space
> > +0x1A              Page Global Directory address for     PGDH
> > +                  Higher half address space
> > +0x1B              Page Global Directory address         PGD
> > +0x1C              Page Walk Controller for Lower        PWCL
> > +                  half address space
> > +0x1D              Page Walk Controller for Higher       PWCH
> > +                  half address space
> > +0x1E              STLB Page Size                        STLBPS
> > +0x1F              Reduced Virtual Address Configuration RVACFG
> > +0x20              CPU Identifier                        CPUID
> > +0x21              Privileged Resource Configuration 1   PRCFG1
> > +0x22              Privileged Resource Configuration 2   PRCFG2
> > +0x23              Privileged Resource Configuration 3   PRCFG3
> > +0x30+n (0≤n≤15)   Data Save register                    SAVEn
> These are actually scratch registers, but I imagine you can't use that
> word as it's a bit MIPS-y... The name is less comprehensible but we
> might have no choice.
> > +0x40              Timer Identifier                      TID
> > +0x41              Timer Configuration                   TCFG
> > +0x42              Timer Value                           TVAL
> > +0x43              Compensation of Timer Count           CNTC
> > +0x44              Timer Interrupt Clearing              TICLR
> > +0x60              LLBit Controller                      LLBCTL
> "Control" is probably sufficient -- same for other places.
> > +0x80              Implementation-specific Controller 1  IMPCTL1
> > +0x81              Implementation-specific Controller 2  IMPCTL2
> > +0x88              TLB Refill Exception Entry Base       TLBRENTRY
> > +                  address
> > +0x89              TLB Refill Exception BAD Virtual      TLBRBADV
> > +                  address
> > +0x8A              TLB Refill Exception Return Address   TLBRERA
> > +0x8B              TLB Refill Exception data SAVE        TLBRSAVE
> > +                  register
> > +0x8C              TLB Refill Exception Entry Low-order  TLBRELO0
> > +                  bits 0
> > +0x8D              TLB Refill Exception Entry Low-order  TLBRELO1
> > +                  bits 1
> > +0x8E              TLB Refill Exception Entry High-order TLBREHI
> > +                  bits
> > +0x8F              TLB Refill Exception Pre-exception    TLBRPRMD
> > +                  Mode information
> > +0x90              Machine Error Controller              MERRCTL
> > +0x91              Machine Error Information 1           MERRINFO1
> > +0x92              Machine Error Information 2           MERRINFO2
> > +0x93              Machine Error Exception Entry Base    MERRENTRY
> > +                  address
> > +0x94              Machine Error Exception Return        MERRERA
> > +                  address
> > +0x95              Machine Error Exception data SAVE     MERRSAVE
> > +                  register
> It seems you're trying to match capitalization here to the CSR acronym
> -- but the resulting names are inconsistent-looking, such as the "data
> SAVE" here, and...
> > +0x98              Cache TAGs                            CTAG
> > +0x180+n (0≤n≤3)   Direct Mapping configuration Window n DMWn
> ...here, and...
> > +0x200+2n (0≤n≤31) Performance Monitor Configuration n   PMCFGn
> > +0x201+2n (0≤n≤31) Performance Monitor overall Counter n PMCNTn
> > +0x300             Memory load/store WatchPoint          MWPC
> > +                  overall Controller
>
> here.
>
> It's inconsistent, because otherwise you'd have "CuRrent MoDe" at the
> top of the table, similarly for other entries. As the reference manual
> (Chinese version; this is the authoritative version) actually does NOT
> give full English names for the CSRs (only Chinese full-name and the
> abbreviation), I think we can be a bit lax here and use normal
> capitalization for reading comfort.
>
> > +0x301             Memory load/store WatchPoint          MWPS
> > +                  overall Status
> > +0x310+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG1
> > +                  Configuration 1
> > +0x311+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG2
> > +                  Configuration 2
> > +0x312+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG3
> > +                  Configuration 3
> > +0x313+8n (0≤n≤7)  Memory load/store WatchPoint n        MWPnCFG4
> > +                  Configuration 4
> > +0x380             Fetch WatchPoint overall Controller   FWPC
> > +0x381             Fetch WatchPoint overall Status       FWPS
> > +0x390+8n (0≤n≤7)  Fetch WatchPoint n Configuration 1    FWPnCFG1
> > +0x391+8n (0≤n≤7)  Fetch WatchPoint n Configuration 2    FWPnCFG2
> > +0x392+8n (0≤n≤7)  Fetch WatchPoint n Configuration 3    FWPnCFG3
> > +0x393+8n (0≤n≤7)  Fetch WatchPoint n Configuration 4    FWPnCFG4
> > +0x500             Debug register                        DBG
> > +0x501             Debug Exception Return address        DERA
> > +0x502             Debug data SAVE register              DSAVE
> > +================= ===================================== ==============
> > +
> > +ERA, TLBRERA, MERRERA and DERA are sometimes also called EPC, TLBREPC,
> > +MERREPC and DEPC.
> > +
> > +Basic Instruction Set
> > +=====================
> > +
> > +Instruction formats
> > +-------------------
> > +
> > +LoongArch has 32-bit wide instructions, and there are 9 instruction formats::
> > +
> > +  2R-type:    Opcode + Rj + Rd
> > +  3R-type:    Opcode + Rk + Rj + Rd
> > +  4R-type:    Opcode + Ra + Rk + Rj + Rd
> > +  2RI8-type:  Opcode + I8 + Rj + Rd
> > +  2RI12-type: Opcode + I12 + Rj + Rd
> > +  2RI14-type: Opcode + I14 + Rj + Rd
> > +  2RI16-type: Opcode + I16 + Rj + Rd
> > +  1RI21-type: Opcode + I21L + Rj + I21H
> > +  I26-type:   Opcode + I26L + I26H
> > +
> > +Rj and Rk are source register operands, Rd is the destination register operand,
> > +and Ra is the additional register operand in 4R-type. I8/I12/I14/I16/I21/I26 are
> > +8-bit/12-bit/14-bit/16-bit/21-bit/26-bit immediates. The 21-bit and 26-bit
> > +immediates are split into higher and lower bits within the instruction word,
> > +hence I21L/I21H and I26L/I26H above.
> > +
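A short sketch of how these fields could be pulled out of a 32-bit
instruction word may help readers here. The bit positions (Rd in bits 4:0,
Rj in 9:5, Rk in 14:10, Ra in 19:15, I26L in 25:10, I26H in 9:0) are my
reading of the ISA manual rather than something the patch states, and the
helper names are made up for illustration.

```c
#include <stdint.h>

/* Register fields, listed right-to-left as in the format table. */
static inline unsigned int insn_rd(uint32_t insn) { return insn & 0x1f; }
static inline unsigned int insn_rj(uint32_t insn) { return (insn >> 5) & 0x1f; }
static inline unsigned int insn_rk(uint32_t insn) { return (insn >> 10) & 0x1f; }
static inline unsigned int insn_ra(uint32_t insn) { return (insn >> 15) & 0x1f; }

/*
 * I26-type: I26L carries immediate bits 15..0 and I26H carries
 * immediate bits 25..16. Reassemble them and sign-extend from bit 25.
 */
static inline int32_t insn_i26(uint32_t insn)
{
	uint32_t lo = (insn >> 10) & 0xffff;	/* I26L */
	uint32_t hi = insn & 0x3ff;		/* I26H */
	uint32_t imm = (hi << 16) | lo;

	return (int32_t)(imm ^ 0x2000000) - 0x2000000;
}
```

The split immediate is the only non-obvious part: the low half sits above
the high half in the word, so a decoder has to swap them back.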
> > +Instruction names (Mnemonics)
> > +-----------------------------
> > +
> > +We only list the instruction names here, for details please read the references.
> > +
> > +Arithmetic Operation Instructions::
> > +
> > +  ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
> > +  SLT SLTU SLTI SLTUI
> > +  AND OR NOR XOR ANDN ORN ANDI ORI XORI
> > +  MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
> > +  MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
> > +  PCADDI PCADDU12I PCADDU18I
> > +  LU12I.W LU32I.D LU52I.D ADDU16I.D
> > +
> > +Bit-shift Instructions::
> > +
> > +  SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
> > +  SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D
> > +
> > +Bit-manipulation Instructions::
> > +
> > +  EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
> > +  BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
> > +  REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
> > +  MASKEQZ MASKNEZ
> > +
> > +Branch Instructions::
> > +
> > +  BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL
> > +
> > +Load/Store Instructions::
> > +
> > +  LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
> > +  LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
> > +  LDPTR.W LDPTR.D STPTR.W STPTR.D
> > +  PRELD PRELDX
> > +
> > +Atomic Operation Instructions::
> > +
> > +  LL.W SC.W LL.D SC.D
> > +  AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
> > +  AMMAX.W AMMAX.D AMMIN.W AMMIN.D
> > +
> > +Barrier Instructions::
> > +
> > +  IBAR DBAR
> > +
> > +Special Instructions::
> > +
> > +  SYSCALL BREAK CPUCFG NOP IDLE ERTN DBCL RDTIMEL.W RDTIMEH.W RDTIME.D ASRTLE.D ASRTGT.D
> > +
> > +Privileged Instructions::
> > +
> > +  CSRRD CSRWR CSRXCHG
> > +  IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
> > +  CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE
>
> For the whole section, replace with reference to the official
> (translated or not) documentation repo? I believe this is similar to the
> psABI situation explained above.
>
> > +
> > +Virtual Memory
> > +==============
> > +
> > +LoongArch can use direct-mapped virtual memory and page-mapped virtual memory.
> > +
> > +Direct-mapped virtual memory is configured by CSR.DMWn (n=0~3), it has a simple
> > +relationship between virtual address (VA) and physical address (PA)::
> "... is configured via CSR.DMWn (n=0~3). It specifies a simple
> relationship ..."
> > +
> > + VA = PA + FixedOffset
> > +
> > +Page-mapped virtual memory has arbitrary relationship between VA and PA, which
> > +is recorded in TLB and page tables. LoongArch's TLB includes a fully-associative
> The first sentence is Chinglish. As the basics of paged virtual memory
> should be common sense to kernel developers, could we simplify, or
> better, just somehow get rid of the sentence?
> > +MTLB (Multiple Page Size TLB) and set-associative STLB (Single Page Size TLB).
> > +
> > +By default, the whole virtual address space of LA32 is configured like this:
> > +
> > +============ =========================== =============================
> > +Name         Address Range               Attributes
> > +============ =========================== =============================
> > +``UVRANGE``  ``0x00000000 - 0x7FFFFFFF`` Page-mapped, Cached, PLV0~3
> > +``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` Direct-mapped, Uncached, PLV0
> > +``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` Direct-mapped, Cached, PLV0
> > +``KVRANGE``  ``0xC0000000 - 0xFFFFFFFF`` Page-mapped, Cached, PLV0
> The names sound awfully MIPS-like... I can't find any reference to the
> names here in the reference manual, are these Linux-specific inventions
> only documented here?
> > +============ =========================== =============================
> > +
> > +User mode (PLV3) can only access UVRANGE. For direct-mapped KPRANGE0 and
> > +KPRANGE1, PA is equal to VA with bit29~31 cleared. For example, the uncached
> > +direct-mapped VA of 0x00001000 is 0x80001000, and the cached direct-mapped
> > +VA of 0x00001000 is 0xA0001000.
> > +
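For what it's worth, the two examples above imply that the top three bits
(29~31) select the window and the low 29 bits carry the PA. A minimal sketch
of that arithmetic, with macro and function names that are purely
illustrative (they do not exist in the patch):

```c
#include <stdint.h>

#define KPRANGE0_BASE	0x80000000u	/* direct-mapped, uncached */
#define KPRANGE1_BASE	0xA0000000u	/* direct-mapped, cached */
#define KPRANGE_PA_MASK	0x1FFFFFFFu	/* bits 0..28 carry the PA */

/* Uncached direct-mapped VA for a physical address. */
static inline uint32_t la32_uncached_va(uint32_t pa)
{
	return KPRANGE0_BASE | (pa & KPRANGE_PA_MASK);
}

/* Cached direct-mapped VA for a physical address. */
static inline uint32_t la32_cached_va(uint32_t pa)
{
	return KPRANGE1_BASE | (pa & KPRANGE_PA_MASK);
}

/* Recover the PA from a VA in either direct-mapped window. */
static inline uint32_t la32_direct_pa(uint32_t va)
{
	return va & KPRANGE_PA_MASK;
}
```

With PA 0x00001000 this gives 0x80001000 and 0xA0001000, matching the text.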
> > +By default, the whole virtual address space of LA64 is configured like this:
> > +
> > +============ ====================== ======================================
> > +Name         Address Range          Attributes
> > +============ ====================== ======================================
> > +``XUVRANGE`` ``0x0000000000000000 - Page-mapped, Cached, PLV0~3
> > +             0x3FFFFFFFFFFFFFFF``
> > +``XSPRANGE`` ``0x4000000000000000 - Direct-mapped, Cached / Uncached, PLV0
> > +             0x7FFFFFFFFFFFFFFF``
> > +``XKPRANGE`` ``0x8000000000000000 - Direct-mapped, Cached / Uncached, PLV0
> > +             0xBFFFFFFFFFFFFFFF``
> > +``XKVRANGE`` ``0xC000000000000000 - Page-mapped, Cached, PLV0
> > +             0xFFFFFFFFFFFFFFFF``
> Similarly here.
> > +============ ====================== ======================================
> > +
> > +User mode (PLV3) can only access XUVRANGE. For direct-mapped XSPRANGE and XKPRANGE,
> > +PA is equal to VA with bit60~63 cleared, and the cache attribute is configured by
> > +bit60~61 (0 is strongly-ordered uncached, 1 is coherent cached, and 2 is weakly-
> > +ordered uncached) in VA. Currently we only use XKPRANGE for direct mapping and
> > +XSPRANGE is reserved. As an example, the strongly-ordered uncached direct-mapped VA
> > +(in XKPRANGE) of 0x00000000 00001000 is 0x80000000 00001000, the coherent cached
> > +direct-mapped VA (in XKPRANGE) of 0x00000000 00001000 is 0x90000000 00001000, and
> > +the weakly-ordered uncached direct-mapped VA (in XKPRANGE) of 0x00000000 00001000
> > +is 0xA0000000 00001000.
> > +
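The LA64 case can be sketched the same way, assuming the XKPRANGE base of
0x8000000000000000 from the table and the cache attribute in bit60~61 as
described above. The enum and helper names are hypothetical, chosen only for
this illustration.

```c
#include <stdint.h>

/* Cache attribute values placed in VA bits 60..61. */
enum dmw_cache_attr {
	DMW_SUC = 0,	/* strongly-ordered uncached */
	DMW_CC  = 1,	/* coherent cached */
	DMW_WUC = 2,	/* weakly-ordered uncached */
};

#define XKPRANGE_BASE	0x8000000000000000ULL
#define XKPRANGE_PA_MASK	0x0FFFFFFFFFFFFFFFULL	/* bits 0..59 */

/* Direct-mapped VA in XKPRANGE for a PA and cache attribute. */
static inline uint64_t xkprange_va(uint64_t pa, enum dmw_cache_attr attr)
{
	return XKPRANGE_BASE | ((uint64_t)attr << 60) | (pa & XKPRANGE_PA_MASK);
}

/* Recover the PA by clearing bits 60..63. */
static inline uint64_t xkprange_pa(uint64_t va)
{
	return va & XKPRANGE_PA_MASK;
}
```

For PA 0x1000 the three attributes yield 0x8000000000001000,
0x9000000000001000 and 0xA000000000001000, the same values as in the
paragraph.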
> > +Relationship of Loongson and LoongArch
> > +======================================
> > +
> > +LoongArch is a RISC ISA which is different from any other existing ones, while
> > +Loongson is a family of processors. Loongson includes 3 series: Loongson-1 is
> > +the 32-bit processor series, Loongson-2 is the low-end 64-bit processor series,
> > +and Loongson-3 is the high-end 64-bit processor series. Old Loongson is based on
> > +MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example:
> > +Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while Loongson-
> > +3A5000 (and future revisions) are all based on LoongArch.
> Is this section truly necessary? At least FWIW Loongson is first of all,
> a corporation, in addition to its series of CPU products, bridge chip
> products, browser offering and pretty much everything. We could use a
> fair bit of clarification for this paragraph, at least use phrases like
> "Loongson processors"...
> > +
> > +References
> > +==========
> > +
> > +Official web site of Loongson and LoongArch (Loongson Technology Corp. Ltd.):
> > +
> > +  http://www.loongson.cn/index.html
> You may omit the "index.html" part...
> > +
> > +Developer web site of Loongson and LoongArch (Software and Documentation):
> > +
> > +  http://www.loongnix.cn/index.php
> Do you really mean loongnix.cn and not
> https://loongson.github.io/LoongArch-Documentation/ ? Because
> loongnix.cn is more an information portal for users... at least in its
> current iteration there's no link to actual documentation, no link to
> development repos, nothing useful for prospective contributors.
> > +
> > +  https://github.com/loongson
> > +
> > +Documentation of LoongArch ISA:
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-CN.pdf (in Chinese)
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-EN.pdf (in English)
> > +
> > +Documentation of LoongArch ELF ABI:
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-CN.pdf (in Chinese)
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-EN.pdf (in English)
> > +
> > +Linux kernel repository of Loongson and LoongArch:
> > +
> > +  https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git
> > diff --git a/Documentation/loongarch/irq-chip-model.rst b/Documentation/loongarch/irq-chip-model.rst
> > new file mode 100644
> > index 000000000000..bde112b81ace
> > --- /dev/null
> > +++ b/Documentation/loongarch/irq-chip-model.rst
> > @@ -0,0 +1,168 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +=======================================
> > +IRQ chip model (hierarchy) of LoongArch
> > +=======================================
> > +
> > +Currently, LoongArch based processors (e.g. Loongson-3A5000) can only work together
> > +with LS7A chipsets. The irq chips in LoongArch computers include CPUINTC (CPU Core
> > +Interrupt Controller), LIOINTC (Legacy I/O Interrupt Controller), EIOINTC (Extended
> > +I/O Interrupt Controller), HTVECINTC (Hyper-Transport Vector Interrupt Controller),
> > +PCH-PIC (Main Interrupt Controller in LS7A chipset), PCH-LPC (LPC Interrupt Controller
> > +in LS7A chipset) and PCH-MSI (MSI Interrupt Controller).
> > +
> > +CPUINTC is a per-core controller (in CPU), LIOINTC/EIOINTC/HTVECINTC are per-package
> > +controllers (in CPU), while PCH-PIC/PCH-LPC/PCH-MSI are controllers out of CPU (i.e.,
> > +in chipsets). These controllers (in other words, irqchips) are linked in a hierarchy,
> > +and there are two models of hierarchy (legacy model and extended model).
> > +
> > +Legacy IRQ model
> > +================
> > +
> > +In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupts go
> > +to CPUINTC directly, CPU UART interrupts go to LIOINTC, while all other device
> > +interrupts go to PCH-PIC/PCH-LPC/PCH-MSI, are gathered by HTVECINTC, then go
> > +to LIOINTC, and finally to CPUINTC.
> > +
> > + +---------------------------------------------+
> > + |::                                           |
> > + |                                             |
> > + |    +-----+     +---------+     +-------+    |
> > + |    | IPI | --> | CPUINTC | <-- | Timer |    |
> > + |    +-----+     +---------+     +-------+    |
> > + |                     ^                       |
> > + |                     |                       |
> > + |                +---------+     +-------+    |
> > + |                | LIOINTC | <-- | UARTs |    |
> > + |                +---------+     +-------+    |
> > + |                     ^                       |
> > + |                     |                       |
> > + |               +-----------+                 |
> > + |               | HTVECINTC |                 |
> > + |               +-----------+                 |
> > + |                ^         ^                  |
> > + |                |         |                  |
> > + |          +---------+ +---------+            |
> > + |          | PCH-PIC | | PCH-MSI |            |
> > + |          +---------+ +---------+            |
> > + |            ^     ^           ^              |
> > + |            |     |           |              |
> > + |    +---------+ +---------+ +---------+      |
> > + |    | PCH-LPC | | Devices | | Devices |      |
> > + |    +---------+ +---------+ +---------+      |
> > + |         ^                                   |
> > + |         |                                   |
> > + |    +---------+                              |
> > + |    | Devices |                              |
> > + |    +---------+                              |
> > + |                                             |
> > + |                                             |
> > + +---------------------------------------------+
> > +
> > +Extended IRQ model
> > +==================
> > +
> > +In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupts go
> > +to CPUINTC directly, CPU UART interrupts go to LIOINTC, while all other device
> > +interrupts go to PCH-PIC/PCH-LPC/PCH-MSI, are gathered by EIOINTC, and then go
> > +to CPUINTC directly.
> > +
> > + +--------------------------------------------------------+
> > + |::                                                      |
> > + |                                                        |
> > + |         +-----+     +---------+     +-------+          |
> > + |         | IPI | --> | CPUINTC | <-- | Timer |          |
> > + |         +-----+     +---------+     +-------+          |
> > + |                      ^       ^                         |
> > + |                      |       |                         |
> > + |               +---------+ +---------+     +-------+    |
> > + |               | EIOINTC | | LIOINTC | <-- | UARTs |    |
> > + |               +---------+ +---------+     +-------+    |
> > + |                ^       ^                               |
> > + |                |       |                               |
> > + |         +---------+ +---------+                        |
> > + |         | PCH-PIC | | PCH-MSI |                        |
> > + |         +---------+ +---------+                        |
> > + |           ^     ^           ^                          |
> > + |           |     |           |                          |
> > + |   +---------+ +---------+ +---------+                  |
> > + |   | PCH-LPC | | Devices | | Devices |                  |
> > + |   +---------+ +---------+ +---------+                  |
> > + |        ^                                               |
> > + |        |                                               |
> > + |   +---------+                                          |
> > + |   | Devices |                                          |
> > + |   +---------+                                          |
> > + |                                                        |
> > + |                                                        |
> > + +--------------------------------------------------------+
> > +
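The two diagrams can also be read as parent tables. The sketch below encodes
them that way and counts how many controllers an interrupt passes through on
its way to the CPU; the enum, table and function names are invented for this
illustration and are not the kernel's irqchip API.

```c
enum irqchip {
	IRQCHIP_NONE = -1,	/* no parent, or not used in this model */
	CPUINTC,
	LIOINTC,
	EIOINTC,
	HTVECINTC,
	PCH_PIC,
	PCH_MSI,
	PCH_LPC,
	IRQCHIP_NR,
};

/* Legacy model: PCH-* -> HTVECINTC -> LIOINTC -> CPUINTC. */
static const int legacy_parent[IRQCHIP_NR] = {
	[CPUINTC]   = IRQCHIP_NONE,
	[LIOINTC]   = CPUINTC,
	[EIOINTC]   = IRQCHIP_NONE,
	[HTVECINTC] = LIOINTC,
	[PCH_PIC]   = HTVECINTC,
	[PCH_MSI]   = HTVECINTC,
	[PCH_LPC]   = PCH_PIC,
};

/* Extended model: PCH-* -> EIOINTC -> CPUINTC. */
static const int extended_parent[IRQCHIP_NR] = {
	[CPUINTC]   = IRQCHIP_NONE,
	[LIOINTC]   = CPUINTC,
	[EIOINTC]   = CPUINTC,
	[HTVECINTC] = IRQCHIP_NONE,
	[PCH_PIC]   = EIOINTC,
	[PCH_MSI]   = EIOINTC,
	[PCH_LPC]   = PCH_PIC,
};

/*
 * Number of controllers on the path from chip up to and including
 * CPUINTC, or -1 if chip is not connected in the given model.
 */
static int irq_path_len(const int *parent, int chip)
{
	int hops = 1;

	while (chip != CPUINTC) {
		chip = parent[chip];
		if (chip == IRQCHIP_NONE)
			return -1;
		hops++;
	}
	return hops;
}
```

Seen this way, the only difference between the models is which per-package
controller gathers the PCH interrupts, and the extended path is one level
shorter.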
> > +ACPI-related definitions
> > +========================
> > +
> > +CPUINTC::
> > +
> > +  ACPI_MADT_TYPE_CORE_PIC;
> > +  struct acpi_madt_core_pic;
> > +  enum acpi_madt_core_pic_version;
> > +
> > +LIOINTC::
> > +
> > +  ACPI_MADT_TYPE_LIO_PIC;
> > +  struct acpi_madt_lio_pic;
> > +  enum acpi_madt_lio_pic_version;
> > +
> > +EIOINTC::
> > +
> > +  ACPI_MADT_TYPE_EIO_PIC;
> > +  struct acpi_madt_eio_pic;
> > +  enum acpi_madt_eio_pic_version;
> > +
> > +HTVECINTC::
> > +
> > +  ACPI_MADT_TYPE_HT_PIC;
> > +  struct acpi_madt_ht_pic;
> > +  enum acpi_madt_ht_pic_version;
> > +
> > +PCH-PIC::
> > +
> > +  ACPI_MADT_TYPE_BIO_PIC;
> > +  struct acpi_madt_bio_pic;
> > +  enum acpi_madt_bio_pic_version;
> > +
> > +PCH-MSI::
> > +
> > +  ACPI_MADT_TYPE_MSI_PIC;
> > +  struct acpi_madt_msi_pic;
> > +  enum acpi_madt_msi_pic_version;
> > +
> > +PCH-LPC::
> > +
> > +  ACPI_MADT_TYPE_LPC_PIC;
> > +  struct acpi_madt_lpc_pic;
> > +  enum acpi_madt_lpc_pic_version;
> > +
> > +References
> > +==========
> > +
> > +Documentation of Loongson-3A5000:
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-CN.pdf (in Chinese)
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-3A5000-usermanual-1.02-EN.pdf (in English)
> > +
> > +Documentation of Loongson's LS7A chipset:
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-CN.pdf (in Chinese)
> > +
> > +  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-EN.pdf (in English)
> > +
> > +Attention: CPUINTC is CSR.ECFG/CSR.ESTAT and its interrupt controller described
> "Note" may be enough. :-)
Thank you for all your suggestions; I will update the documentation accordingly.

Huacai
> > +in Section 7.4 of "LoongArch Reference Manual, Vol 1"; LIOINTC is "Legacy I/O
> > +Interrupts" described in Section 11.1 of "Loongson 3A5000 Processor Reference
> > +Manual"; EIOINTC is "Extended I/O Interrupts" described in Section 11.2 of
> > +"Loongson 3A5000 Processor Reference Manual"; HTVECINTC is "HyperTransport
> > +Interrupts" described in Section 14.3 of "Loongson 3A5000 Processor Reference
> > +Manual"; PCH-PIC/PCH-MSI is "Interrupt Controller" described in Section 5 of
> > +"Loongson 7A1000 Bridge User Manual"; PCH-LPC is "LPC Interrupts" described in
> > +Section 24.3 of "Loongson 7A1000 Bridge User Manual".


* Re: [PATCH V9 17/24] LoongArch: Add some library functions
  2022-04-30  9:05 ` [PATCH V9 17/24] LoongArch: Add some library functions Huacai Chen
@ 2022-05-01 10:55   ` Guo Ren
  2022-05-01 12:18     ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: Guo Ren @ 2022-05-01 10:55 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, Linux Doc Mailing List, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Xuerui Wang, Jiaxun Yang

On Sat, Apr 30, 2022 at 5:23 PM Huacai Chen <chenhuacai@loongson.cn> wrote:
>
> This patch adds some library functions for LoongArch, including: delay,
> memset, memcpy, memmove, copy_user, strncpy_user, strnlen_user and tlb
> dump functions.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  arch/loongarch/include/asm/delay.h  |  26 +++++++
>  arch/loongarch/include/asm/string.h |  17 +++++
>  arch/loongarch/lib/clear_user.S     |  43 +++++++++++
>  arch/loongarch/lib/copy_user.S      |  47 ++++++++++++
>  arch/loongarch/lib/delay.c          |  43 +++++++++++
>  arch/loongarch/lib/dump_tlb.c       | 111 ++++++++++++++++++++++++++++
>  arch/loongarch/lib/memcpy.S         |  32 ++++++++
>  arch/loongarch/lib/memmove.S        |  45 +++++++++++
>  arch/loongarch/lib/memset.S         |  30 ++++++++
>  9 files changed, 394 insertions(+)
>  create mode 100644 arch/loongarch/include/asm/delay.h
>  create mode 100644 arch/loongarch/include/asm/string.h
>  create mode 100644 arch/loongarch/lib/clear_user.S
>  create mode 100644 arch/loongarch/lib/copy_user.S
>  create mode 100644 arch/loongarch/lib/delay.c
>  create mode 100644 arch/loongarch/lib/dump_tlb.c
>  create mode 100644 arch/loongarch/lib/memcpy.S
>  create mode 100644 arch/loongarch/lib/memmove.S
>  create mode 100644 arch/loongarch/lib/memset.S
>
> diff --git a/arch/loongarch/include/asm/delay.h b/arch/loongarch/include/asm/delay.h
> new file mode 100644
> index 000000000000..016b3aca65cb
> --- /dev/null
> +++ b/arch/loongarch/include/asm/delay.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_DELAY_H
> +#define _ASM_DELAY_H
> +
> +#include <linux/param.h>
> +
> +extern void __delay(unsigned long loops);
> +extern void __ndelay(unsigned long ns);
> +extern void __udelay(unsigned long us);
> +
> +#define ndelay(ns) __ndelay(ns)
> +#define udelay(us) __udelay(us)
> +
> +/* make sure "usecs *= ..." in udelay do not overflow. */
> +#if HZ >= 1000
> +#define MAX_UDELAY_MS  1
> +#elif HZ <= 200
> +#define MAX_UDELAY_MS  5
> +#else
> +#define MAX_UDELAY_MS  (1000 / HZ)
> +#endif
> +
> +#endif /* _ASM_DELAY_H */
> diff --git a/arch/loongarch/include/asm/string.h b/arch/loongarch/include/asm/string.h
> new file mode 100644
> index 000000000000..7b29cc9c70aa
> --- /dev/null
> +++ b/arch/loongarch/include/asm/string.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_STRING_H
> +#define _ASM_STRING_H
> +
> +#define __HAVE_ARCH_MEMSET
> +extern void *memset(void *__s, int __c, size_t __count);
> +
> +#define __HAVE_ARCH_MEMCPY
> +extern void *memcpy(void *__to, __const__ void *__from, size_t __n);
> +
> +#define __HAVE_ARCH_MEMMOVE
> +extern void *memmove(void *__dest, __const__ void *__src, size_t __n);
> +
> +#endif /* _ASM_STRING_H */
> diff --git a/arch/loongarch/lib/clear_user.S b/arch/loongarch/lib/clear_user.S
> new file mode 100644
> index 000000000000..b8168d22ac80
> --- /dev/null
> +++ b/arch/loongarch/lib/clear_user.S
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <asm/asm.h>
> +#include <asm/asmmacro.h>
> +#include <asm/export.h>
> +#include <asm/regdef.h>
> +
> +.macro fixup_ex from, to, offset, fix
> +.if \fix
> +       .section .fixup, "ax"
> +\to:   addi.d  v0, a1, \offset
> +       jr      ra
> +       .previous
> +.endif
> +       .section __ex_table, "a"
> +       PTR     \from\()b, \to\()b
> +       .previous
> +.endm
> +
> +/*
> + * unsigned long __clear_user(void *addr, size_t size)
> + *
> + * a0: addr
> + * a1: size
> + */
> +SYM_FUNC_START(__clear_user)
> +       beqz    a1, 2f
> +
> +1:     st.b    zero, a0, 0
> +       addi.d  a0, a0, 1
> +       addi.d  a1, a1, -1
> +       bgt     a1, zero, 1b
> +
> +2:     move    v0, a1
> +       jr      ra
> +
> +       fixup_ex 1, 3, 0, 1
> +SYM_FUNC_END(__clear_user)
> +
> +EXPORT_SYMBOL(__clear_user)
> diff --git a/arch/loongarch/lib/copy_user.S b/arch/loongarch/lib/copy_user.S
> new file mode 100644
> index 000000000000..43ed26304954
> --- /dev/null
> +++ b/arch/loongarch/lib/copy_user.S
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <asm/asm.h>
> +#include <asm/asmmacro.h>
> +#include <asm/export.h>
> +#include <asm/regdef.h>
> +
> +.macro fixup_ex from, to, offset, fix
> +.if \fix
> +       .section .fixup, "ax"
> +\to:   addi.d  v0, a2, \offset
> +       jr      ra
> +       .previous
> +.endif
> +       .section __ex_table, "a"
> +       PTR     \from\()b, \to\()b
> +       .previous
> +.endm
> +
> +/*
> + * unsigned long __copy_user(void *to, const void *from, size_t n)
> + *
> + * a0: to
> + * a1: from
> + * a2: n
> + */
> +SYM_FUNC_START(__copy_user)
> +       beqz    a2, 3f
> +
> +1:     ld.b    t0, a1, 0
> +2:     st.b    t0, a0, 0
> +       addi.d  a0, a0, 1
> +       addi.d  a1, a1, 1
> +       addi.d  a2, a2, -1
> +       bgt     a2, zero, 1b
> +
> +3:     move    v0, a2
> +       jr      ra
> +
> +       fixup_ex 1, 4, 0, 1
> +       fixup_ex 2, 4, 0, 0
> +SYM_FUNC_END(__copy_user)
> +
> +EXPORT_SYMBOL(__copy_user)
> diff --git a/arch/loongarch/lib/delay.c b/arch/loongarch/lib/delay.c
> new file mode 100644
> index 000000000000..5d856694fcfe
> --- /dev/null
> +++ b/arch/loongarch/lib/delay.c
> @@ -0,0 +1,43 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#include <linux/delay.h>
> +#include <linux/export.h>
> +#include <linux/smp.h>
> +#include <linux/timex.h>
> +
> +#include <asm/compiler.h>
> +#include <asm/processor.h>
> +
> +void __delay(unsigned long cycles)
> +{
> +       u64 t0 = get_cycles();
> +
> +       while ((unsigned long)(get_cycles() - t0) < cycles)
> +               cpu_relax();
> +}
> +EXPORT_SYMBOL(__delay);
> +
> +/*
> + * Division by multiplication: you don't have to worry about
> + * loss of precision.
> + *
> + * Use only for very small delays ( < 1 msec). Should probably use a
> + * lookup table, really, as the multiplications take much too long with
> + * short delays.  This is a "reasonable" implementation, though (and the
> + * first constant multiplications gets optimized away if the delay is
> + * a constant)
> + */
> +
> +void __udelay(unsigned long us)
> +{
> +       __delay((us * 0x000010c7ull * HZ * lpj_fine) >> 32);
> +}
> +EXPORT_SYMBOL(__udelay);
> +
> +void __ndelay(unsigned long ns)
> +{
> +       __delay((ns * 0x00000005ull * HZ * lpj_fine) >> 32);
> +}
> +EXPORT_SYMBOL(__ndelay);
> diff --git a/arch/loongarch/lib/dump_tlb.c b/arch/loongarch/lib/dump_tlb.c
> new file mode 100644
> index 000000000000..cda2c6bc7f09
> --- /dev/null
> +++ b/arch/loongarch/lib/dump_tlb.c
> @@ -0,0 +1,111 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + *
> + * Derived from MIPS:
> + * Copyright (C) 1994, 1995 by Waldorf Electronics, written by Ralf Baechle.
> + * Copyright (C) 1999 by Silicon Graphics, Inc.
> + */
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +
> +#include <asm/loongarch.h>
> +#include <asm/page.h>
> +#include <asm/pgtable.h>
> +#include <asm/tlb.h>
> +
> +void dump_tlb_regs(void)
> +{
> +       const int field = 2 * sizeof(unsigned long);
> +
> +       pr_info("Index    : %0x\n", read_csr_tlbidx());
> +       pr_info("PageSize : %0x\n", read_csr_pagesize());
> +       pr_info("EntryHi  : %0*llx\n", field, read_csr_entryhi());
> +       pr_info("EntryLo0 : %0*llx\n", field, read_csr_entrylo0());
> +       pr_info("EntryLo1 : %0*llx\n", field, read_csr_entrylo1());
> +}
> +
> +static void dump_tlb(int first, int last)
> +{
> +       unsigned long s_entryhi, entryhi, asid;
> +       unsigned long long entrylo0, entrylo1, pa;
> +       unsigned int index;
> +       unsigned int s_index, s_asid;
> +       unsigned int pagesize, c0, c1, i;
> +       unsigned long asidmask = cpu_asid_mask(&current_cpu_data);
> +       int pwidth = 11;
> +       int vwidth = 11;
> +       int asidwidth = DIV_ROUND_UP(ilog2(asidmask) + 1, 4);
> +
> +       s_entryhi = read_csr_entryhi();
> +       s_index = read_csr_tlbidx();
> +       s_asid = read_csr_asid();
> +
> +       for (i = first; i <= last; i++) {
> +               write_csr_index(i);
> +               tlb_read();
> +               pagesize = read_csr_pagesize();
> +               entryhi  = read_csr_entryhi();
> +               entrylo0 = read_csr_entrylo0();
> +               entrylo1 = read_csr_entrylo1();
> +               index = read_csr_tlbidx();
> +               asid = read_csr_asid();
> +
> +               /* EHINV bit marks entire entry as invalid */
> +               if (index & CSR_TLBIDX_EHINV)
> +                       continue;
> +               /*
> +                * ASID takes effect in absence of G (global) bit.
> +                */
> +               if (!((entrylo0 | entrylo1) & ENTRYLO_G) &&
> +                   asid != s_asid)
> +                       continue;
> +
> +               /*
> +                * Only print entries in use
> +                */
> +               pr_info("Index: %2d pgsize=%x ", i, (1 << pagesize));
> +
> +               c0 = (entrylo0 & ENTRYLO_C) >> ENTRYLO_C_SHIFT;
> +               c1 = (entrylo1 & ENTRYLO_C) >> ENTRYLO_C_SHIFT;
> +
> +               pr_cont("va=%0*lx asid=%0*lx",
> +                       vwidth, (entryhi & ~0x1fffUL), asidwidth, asid & asidmask);
> +
> +               /* NR/NX are in awkward places, so mask them off separately */
> +               pa = entrylo0 & ~(ENTRYLO_NR | ENTRYLO_NX);
> +               pa = pa & PAGE_MASK;
> +               pr_cont("\n\t[");
> +               pr_cont("ri=%d xi=%d ",
> +                       (entrylo0 & ENTRYLO_NR) ? 1 : 0,
> +                       (entrylo0 & ENTRYLO_NX) ? 1 : 0);
> +               pr_cont("pa=%0*llx c=%d d=%d v=%d g=%d plv=%lld] [",
> +                       pwidth, pa, c0,
> +                       (entrylo0 & ENTRYLO_D) ? 1 : 0,
> +                       (entrylo0 & ENTRYLO_V) ? 1 : 0,
> +                       (entrylo0 & ENTRYLO_G) ? 1 : 0,
> +                       (entrylo0 & ENTRYLO_PLV) >> ENTRYLO_PLV_SHIFT);
> +               /* NR/NX are in awkward places, so mask them off separately */
> +               pa = entrylo1 & ~(ENTRYLO_NR | ENTRYLO_NX);
> +               pa = pa & PAGE_MASK;
> +               pr_cont("ri=%d xi=%d ",
> +                       (entrylo1 & ENTRYLO_NR) ? 1 : 0,
> +                       (entrylo1 & ENTRYLO_NX) ? 1 : 0);
> +               pr_cont("pa=%0*llx c=%d d=%d v=%d g=%d plv=%lld]\n",
> +                       pwidth, pa, c1,
> +                       (entrylo1 & ENTRYLO_D) ? 1 : 0,
> +                       (entrylo1 & ENTRYLO_V) ? 1 : 0,
> +                       (entrylo1 & ENTRYLO_G) ? 1 : 0,
> +                       (entrylo1 & ENTRYLO_PLV) >> ENTRYLO_PLV_SHIFT);
> +       }
> +       pr_info("\n");
> +
> +       write_csr_entryhi(s_entryhi);
> +       write_csr_tlbidx(s_index);
> +       write_csr_asid(s_asid);
> +}
> +
> +void dump_tlb_all(void)
> +{
> +       dump_tlb(0, current_cpu_data.tlbsize - 1);
> +}
> diff --git a/arch/loongarch/lib/memcpy.S b/arch/loongarch/lib/memcpy.S
> new file mode 100644
> index 000000000000..d53f1148d26b
> --- /dev/null
> +++ b/arch/loongarch/lib/memcpy.S
> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <asm/asmmacro.h>
> +#include <asm/export.h>
> +#include <asm/regdef.h>
> +
> +/*
> + * void *memcpy(void *dst, const void *src, size_t n)
> + *
> + * a0: dst
> + * a1: src
> + * a2: n
> + */
> +SYM_FUNC_START(memcpy)
> +       move    a3, a0
> +       beqz    a2, 2f
> +
> +1:     ld.b    t0, a1, 0
> +       st.b    t0, a0, 0
> +       addi.d  a0, a0, 1
> +       addi.d  a1, a1, 1
> +       addi.d  a2, a2, -1
> +       bgt     a2, zero, 1b
> +
> +2:     move    v0, a3
> +       jr      ra
> +SYM_FUNC_END(memcpy)
> +
> +EXPORT_SYMBOL(memcpy)
> diff --git a/arch/loongarch/lib/memmove.S b/arch/loongarch/lib/memmove.S
> new file mode 100644
> index 000000000000..18907d83a83b
> --- /dev/null
> +++ b/arch/loongarch/lib/memmove.S
> @@ -0,0 +1,45 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <asm/asmmacro.h>
> +#include <asm/export.h>
> +#include <asm/regdef.h>
> +
> +/*
> + * void *rmemcpy(void *dst, const void *src, size_t n)
> + *
> + * a0: dst
> + * a1: src
> + * a2: n
> + */
> +SYM_FUNC_START(rmemcpy)
> +       move    a3, a0
> +       beqz    a2, 2f
> +
> +       add.d   a0, a0, a2
> +       add.d   a1, a1, a2
> +
> +1:     ld.b    t0, a1, -1
> +       st.b    t0, a0, -1
> +       addi.d  a0, a0, -1
> +       addi.d  a1, a1, -1
> +       addi.d  a2, a2, -1
> +       bgt     a2, zero, 1b
> +
> +2:     move    v0, a3
> +       jr      ra
> +SYM_FUNC_END(rmemcpy)
Why not directly use:

lib/string.c:
#ifndef __HAVE_ARCH_MEMCPY
/**
 * memcpy - Copy one area of memory to another
 * @dest: Where to copy to
 * @src: Where to copy from
 * @count: The size of the area.
 *
 * You should not use this function to access IO space, use memcpy_toio()
 * or memcpy_fromio() instead.
 */
void *memcpy(void *dest, const void *src, size_t count)
{
        char *tmp = dest;
        const char *s = src;

        while (count--)
                *tmp++ = *s++;
        return dest;
}
EXPORT_SYMBOL(memcpy);
#endif

Do you want to try a C string implementation?
https://lore.kernel.org/linux-csky/202204051450.UN2k1raL-lkp@intel.com/T/#Z2e.:..:20220404142354.2792428-1-guoren::40kernel.org:1arch:csky:lib:string.c
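
For reference, a minimal sketch of what a byte-wise C version of the quoted
memmove/rmemcpy pair could look like. The name sketch_memmove is hypothetical
(chosen to avoid clashing with libc), and this is neither the lib/string.c
code nor the csky implementation linked above; it only mirrors the
direction choice made in the assembly (forward copy when dst < src, backward
copy when src < dst).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy n bytes from src to dst, tolerating overlap.
 * Addresses are compared as integers so the comparison is well defined
 * even when the two pointers do not refer to the same object.
 */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* dst == src (or nothing to copy): return early, like the "jr ra" path. */
	if (d == s || n == 0)
		return dst;

	if ((uintptr_t)d < (uintptr_t)s) {
		/* Forward copy: every source byte is read before it can be overwritten. */
		while (n--)
			*d++ = *s++;
	} else {
		/* Backward copy, starting from the last byte (the rmemcpy case). */
		d += n;
		s += n;
		while (n--)
			*--d = *--s;
	}

	return dst;
}
```

Whether a compiler turns this into code as compact as the hand-written loops
is something that would have to be measured on real hardware.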

> +
> +SYM_FUNC_START(memmove)
> +       blt     a0, a1, 1f      /* dst < src, memcpy */
> +       blt     a1, a0, 2f      /* src < dst, rmemcpy */
> +       jr      ra              /* dst == src, return */
> +
> +1:     b       memcpy
> +
> +2:     b       rmemcpy
> +SYM_FUNC_END(memmove)
> +
> +EXPORT_SYMBOL(memmove)
> diff --git a/arch/loongarch/lib/memset.S b/arch/loongarch/lib/memset.S
> new file mode 100644
> index 000000000000..3fc3e7da5263
> --- /dev/null
> +++ b/arch/loongarch/lib/memset.S
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#include <asm/asmmacro.h>
> +#include <asm/export.h>
> +#include <asm/regdef.h>
> +
> +/*
> + * void *memset(void *s, int c, size_t n)
> + *
> + * a0: s
> + * a1: c
> + * a2: n
> + */
> +SYM_FUNC_START(memset)
> +       move    a3, a0
> +       beqz    a2, 2f
> +
> +1:     st.b    a1, a0, 0
> +       addi.d  a0, a0, 1
> +       addi.d  a2, a2, -1
> +       bgt     a2, zero, 1b
> +
> +2:     move    v0, a3
> +       jr      ra
> +SYM_FUNC_END(memset)
> +
> +EXPORT_SYMBOL(memset)
> --
> 2.27.0
>


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

* Re: [PATCH V9 06/24] LoongArch: Add CPU definition headers
  2022-04-30  9:05 ` [PATCH V9 06/24] LoongArch: Add CPU definition headers Huacai Chen
@ 2022-05-01 11:05   ` WANG Xuerui
  2022-05-01 15:07     ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: WANG Xuerui @ 2022-05-01 11:05 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang


On 4/30/22 17:05, Huacai Chen wrote:
> This patch adds common headers (CPU definition and address space layout)
> for basic LoongArch support.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   arch/loongarch/include/asm/addrspace.h    |  110 ++
>   arch/loongarch/include/asm/cpu-features.h |   69 +
>   arch/loongarch/include/asm/cpu-info.h     |  136 ++
>   arch/loongarch/include/asm/cpu.h          |  127 ++
>   arch/loongarch/include/asm/fpregdef.h     |   49 +
>   arch/loongarch/include/asm/loongarch.h    | 1528 +++++++++++++++++++++
>   arch/loongarch/include/asm/loongson.h     |  159 +++
>   arch/loongarch/include/asm/regdef.h       |   43 +
>   8 files changed, 2221 insertions(+)
>   create mode 100644 arch/loongarch/include/asm/addrspace.h
>   create mode 100644 arch/loongarch/include/asm/cpu-features.h
>   create mode 100644 arch/loongarch/include/asm/cpu-info.h
>   create mode 100644 arch/loongarch/include/asm/cpu.h
>   create mode 100644 arch/loongarch/include/asm/fpregdef.h
>   create mode 100644 arch/loongarch/include/asm/loongarch.h
>   create mode 100644 arch/loongarch/include/asm/loongson.h
>   create mode 100644 arch/loongarch/include/asm/regdef.h
>
> diff --git a/arch/loongarch/include/asm/addrspace.h b/arch/loongarch/include/asm/addrspace.h
> new file mode 100644
> index 000000000000..e92541629d25
> --- /dev/null
> +++ b/arch/loongarch/include/asm/addrspace.h
> @@ -0,0 +1,110 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
This file obviously comes from the MIPS asm/addrspace.h, with visible 
similarities, so you should add attribution here.
> + */
> +#ifndef _ASM_ADDRSPACE_H
> +#define _ASM_ADDRSPACE_H
> +
> +#include <linux/const.h>
> +
> +#include <asm/loongarch.h>
> +
> +/*
> + * This gives the physical RAM offset.
> + */
> +#ifndef __ASSEMBLY__
> +#ifndef PHYS_OFFSET
> +#define PHYS_OFFSET	_AC(0, UL)
> +#endif
> +extern unsigned long vm_map_base;
> +#endif /* __ASSEMBLY__ */
> +
> +#ifndef IO_BASE
> +#define IO_BASE			CSR_DMW0_BASE
> +#endif
> +
> +#ifndef CAC_BASE
> +#define CAC_BASE		CSR_DMW1_BASE
Could use something less terse than the MIPS name... "CACHED_BASE" 
sounds a lot better while only costing a few more keystrokes.
> +#endif
> +
> +#ifndef UNCAC_BASE
> +#define UNCAC_BASE		CSR_DMW0_BASE
> +#endif
> +
> +#define DMW_PABITS	48
> +#define TO_PHYS_MASK	((1ULL << DMW_PABITS) - 1)
> +
> +/*
> + * Memory above this physical address will be considered highmem.
> + */
> +#ifndef HIGHMEM_START
> +#define HIGHMEM_START		(_AC(1, UL) << _AC(DMW_PABITS, UL))
> +#endif
> +
> +#define TO_PHYS(x)		(	      ((x) & TO_PHYS_MASK))
> +#define TO_CAC(x)		(CAC_BASE   | ((x) & TO_PHYS_MASK))
> +#define TO_UNCAC(x)		(UNCAC_BASE | ((x) & TO_PHYS_MASK))
> +
> +/*
> + * This handles the memory map.
> + */
> +#ifndef PAGE_OFFSET
> +#define PAGE_OFFSET		(CAC_BASE + PHYS_OFFSET)
> +#endif
> +
> +#ifndef FIXADDR_TOP
> +#define FIXADDR_TOP		((unsigned long)(long)(int)0xfffe0000)
> +#endif
> +
> +/*
> + *  Configure language
What's a "configure language"? This seems to be carried over from MIPS 
too, better clarify a bit...
> + */
> +#ifdef __ASSEMBLY__
> +#define _ATYPE_
> +#define _ATYPE32_
> +#define _ATYPE64_
> +#define _CONST64_(x)	x
> +#else
> +#define _ATYPE_		__PTRDIFF_TYPE__
> +#define _ATYPE32_	int
> +#define _ATYPE64_	__s64
> +#ifdef CONFIG_64BIT
> +#define _CONST64_(x)	x ## L
> +#else
> +#define _CONST64_(x)	x ## LL
> +#endif
> +#endif
> +
> +/*
> + *  32/64-bit LoongArch address spaces
> + */
> +#ifdef __ASSEMBLY__
> +#define _ACAST32_
> +#define _ACAST64_
> +#else
> +#define _ACAST32_		(_ATYPE_)(_ATYPE32_)	/* widen if necessary */
> +#define _ACAST64_		(_ATYPE64_)		/* do _not_ narrow */
> +#endif
> +
> +#ifdef CONFIG_32BIT
> +
> +#define UVRANGE			0x00000000
> +#define KPRANGE0		0x80000000
> +#define KPRANGE1		0xa0000000
> +#define KVRANGE			0xc0000000
> +
> +#else
> +
> +#define XUVRANGE		_CONST64_(0x0000000000000000)
> +#define XSPRANGE		_CONST64_(0x4000000000000000)
> +#define XKPRANGE		_CONST64_(0x8000000000000000)
> +#define XKVRANGE		_CONST64_(0xc000000000000000)
> +
> +#endif
> +
> +/*
> + * Returns the physical address of a KPRANGEx / XKPRANGE address
> + */
> +#define PHYSADDR(a)		((_ACAST64_(a)) & TO_PHYS_MASK)
> +
> +#endif /* _ASM_ADDRSPACE_H */
> diff --git a/arch/loongarch/include/asm/cpu-features.h b/arch/loongarch/include/asm/cpu-features.h
> new file mode 100644
> index 000000000000..e29d446112e8
> --- /dev/null
> +++ b/arch/loongarch/include/asm/cpu-features.h
> @@ -0,0 +1,69 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
This is also awfully similar to the MIPS header with the same name. Add 
attribution... I won't repeat myself for the other files out there, but 
make sure all derivative works are appropriately marked!
> + */
> +#ifndef __ASM_CPU_FEATURES_H
> +#define __ASM_CPU_FEATURES_H
> +
> +#include <asm/cpu.h>
> +#include <asm/cpu-info.h>
> +
> +#define cpu_opt(opt)			(cpu_data[0].options & (opt))
> +#define cpu_has(feat)			(cpu_data[0].options & BIT_ULL(feat))
> +
> +#define cpu_has_loongarch		(cpu_has_loongarch32 | cpu_has_loongarch64)
> +#define cpu_has_loongarch32		(cpu_data[0].isa_level & LOONGARCH_CPU_ISA_32BIT)
> +#define cpu_has_loongarch64		(cpu_data[0].isa_level & LOONGARCH_CPU_ISA_64BIT)
> +
> +#define cpu_icache_line_size()		cpu_data[0].icache.linesz
> +#define cpu_dcache_line_size()		cpu_data[0].dcache.linesz
> +#define cpu_vcache_line_size()		cpu_data[0].vcache.linesz
> +#define cpu_scache_line_size()		cpu_data[0].scache.linesz
> +
> +#ifdef CONFIG_32BIT
> +# define cpu_has_64bits			(cpu_data[0].isa_level & LOONGARCH_CPU_ISA_64BIT)
> +# define cpu_vabits			31
> +# define cpu_pabits			31
> +#endif
> +
> +#ifdef CONFIG_64BIT
> +# define cpu_has_64bits			1
> +# define cpu_vabits			cpu_data[0].vabits
> +# define cpu_pabits			cpu_data[0].pabits
> +# define __NEED_ADDRBITS_PROBE
> +#endif
> +
> +/*
> + * SMP assumption: Options of CPU 0 are a superset of all processors.
> + * This is true for all known LoongArch systems.
> + */
> +#define cpu_has_cpucfg		cpu_opt(LOONGARCH_CPU_CPUCFG)
> +#define cpu_has_lam		cpu_opt(LOONGARCH_CPU_LAM)
> +#define cpu_has_ual		cpu_opt(LOONGARCH_CPU_UAL)
> +#define cpu_has_fpu		cpu_opt(LOONGARCH_CPU_FPU)
> +#define cpu_has_lsx		cpu_opt(LOONGARCH_CPU_LSX)
> +#define cpu_has_lasx		cpu_opt(LOONGARCH_CPU_LASX)
> +#define cpu_has_complex		cpu_opt(LOONGARCH_CPU_COMPLEX)
> +#define cpu_has_crypto		cpu_opt(LOONGARCH_CPU_CRYPTO)
> +#define cpu_has_lvz		cpu_opt(LOONGARCH_CPU_LVZ)
> +#define cpu_has_lbt_x86		cpu_opt(LOONGARCH_CPU_LBT_X86)
> +#define cpu_has_lbt_arm		cpu_opt(LOONGARCH_CPU_LBT_ARM)
> +#define cpu_has_lbt_mips	cpu_opt(LOONGARCH_CPU_LBT_MIPS)
> +#define cpu_has_lbt		(cpu_has_lbt_x86|cpu_has_lbt_arm|cpu_has_lbt_mips)
> +#define cpu_has_csr		cpu_opt(LOONGARCH_CPU_CSR)
> +#define cpu_has_tlb		cpu_opt(LOONGARCH_CPU_TLB)
> +#define cpu_has_watch		cpu_opt(LOONGARCH_CPU_WATCH)
> +#define cpu_has_vint		cpu_opt(LOONGARCH_CPU_VINT)
> +#define cpu_has_csripi		cpu_opt(LOONGARCH_CPU_CSRIPI)
> +#define cpu_has_extioi		cpu_opt(LOONGARCH_CPU_EXTIOI)
> +#define cpu_has_prefetch	cpu_opt(LOONGARCH_CPU_PREFETCH)
> +#define cpu_has_pmp		cpu_opt(LOONGARCH_CPU_PMP)
> +#define cpu_has_perf		cpu_opt(LOONGARCH_CPU_PMP)
> +#define cpu_has_scalefreq	cpu_opt(LOONGARCH_CPU_SCALEFREQ)
> +#define cpu_has_flatmode	cpu_opt(LOONGARCH_CPU_FLATMODE)
> +#define cpu_has_eiodecode	cpu_opt(LOONGARCH_CPU_EIODECODE)
> +#define cpu_has_guestid		cpu_opt(LOONGARCH_CPU_GUESTID)
> +#define cpu_has_hypervisor	cpu_opt(LOONGARCH_CPU_HYPERVISOR)
These are all dynamic, according to these definitions, unlike the MIPS 
asm/cpu-features.h where features can be statically overridden by 
individual mach. So we can drop most of these and just write 
cpu_opt(XXX) inline everywhere?
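
To illustrate the suggestion, here is a small standalone sketch of testing
the option mask directly at the call site. The bit numbers are copied from
the quoted asm/cpu.h; struct sketch_cpuinfo and sketch_cpu_opt() are
hypothetical stand-ins for cpu_data[0] and cpu_opt(), and BIT_ULL() is
redefined locally only so that the snippet builds outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's BIT_ULL(); redefined to keep the sketch standalone. */
#define BIT_ULL(nr)		(1ULL << (nr))

/* Bit numbers as in the quoted asm/cpu.h. */
#define CPU_FEATURE_FPU		3
#define CPU_FEATURE_LSX		4
#define CPU_FEATURE_LASX	5

#define LOONGARCH_CPU_FPU	BIT_ULL(CPU_FEATURE_FPU)
#define LOONGARCH_CPU_LSX	BIT_ULL(CPU_FEATURE_LSX)
#define LOONGARCH_CPU_LASX	BIT_ULL(CPU_FEATURE_LASX)

/* Hypothetical stand-in for the "options" member of struct cpuinfo_loongarch. */
struct sketch_cpuinfo {
	uint64_t options;
};

/* Takes a mask (LOONGARCH_CPU_*), not a bit number (CPU_FEATURE_*). */
static inline bool sketch_cpu_opt(const struct sketch_cpuinfo *c, uint64_t opt)
{
	return (c->options & opt) != 0;
}
```

A call site would then read something like
"if (sketch_cpu_opt(&info, LOONGARCH_CPU_LSX))" with no per-feature wrapper
macro in between.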
> +
> +
> +#endif /* __ASM_CPU_FEATURES_H */
> diff --git a/arch/loongarch/include/asm/cpu-info.h b/arch/loongarch/include/asm/cpu-info.h
> new file mode 100644
> index 000000000000..8c173ee5650b
> --- /dev/null
> +++ b/arch/loongarch/include/asm/cpu-info.h
> @@ -0,0 +1,136 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __ASM_CPU_INFO_H
> +#define __ASM_CPU_INFO_H
> +
> +#include <linux/cache.h>
> +#include <linux/types.h>
> +
> +#include <asm/loongarch.h>
> +
> +/*
> + * Descriptor for a cache
> + */
> +struct cache_desc {
> +	unsigned int waysize;	/* Bytes per way */
> +	unsigned short sets;	/* Number of lines per set */
> +	unsigned char ways;	/* Number of ways */
> +	unsigned char linesz;	/* Size of line in bytes */
> +	unsigned char waybit;	/* Bits to select in a cache set */
> +	unsigned char flags;	/* Flags describing cache properties */
> +};
> +
> +struct cpuinfo_loongarch {
> +	u64			asid_cache;
> +	unsigned long		asid_mask;
> +
> +	/*
> +	 * Capability and feature descriptor structure for LoongArch CPU
> +	 */
> +	unsigned long		ases;
Please don't use MIPS acronyms, especially NOT this one. ASE means 
Application-Specific Extension, but LoongArch has no 
"application-specific" adaptations at the moment...
> +	unsigned long long	options;
> +	unsigned int		processor_id;
> +	unsigned int		fpu_vers;
> +	unsigned int		fpu_csr0;
> +	unsigned int		fpu_mask;
> +	unsigned int		cputype;
> +	int			isa_level;
> +	int			tlbsize;
> +	int			tlbsizemtlb;
> +	int			tlbsizestlbsets;
> +	int			tlbsizestlbways;
> +	struct cache_desc	icache; /* Primary I-cache */
> +	struct cache_desc	dcache; /* Primary D or combined I/D cache */
> +	struct cache_desc	vcache; /* Victim cache, between pcache and scache */
> +	struct cache_desc	scache; /* Secondary cache */
> +	struct cache_desc	tcache; /* Tertiary/split secondary cache */
> +	int			package;/* physical package number */
> +	unsigned int		globalnumber;
> +	int			vabits; /* Virtual Address size in bits */
> +	int			pabits; /* Physical Address size in bits */
> +	void			*data;	/* Additional data */
> +	unsigned int		watch_dreg_count;   /* Number data breakpoints */
> +	unsigned int		watch_ireg_count;   /* Number instruction breakpoints */
> +	unsigned int		watch_reg_use_cnt; /* min(NUM_WATCH_REGS, watch_dreg_count + watch_ireg_count), Usable by ptrace */
> +	unsigned int		kscratch_mask; /* Usable KScratch mask. */
> +} __aligned(SMP_CACHE_BYTES);
> +
> +extern struct cpuinfo_loongarch cpu_data[];
> +#define boot_cpu_data cpu_data[0]
> +#define current_cpu_data cpu_data[smp_processor_id()]
> +#define raw_current_cpu_data cpu_data[raw_smp_processor_id()]
> +
> +extern void cpu_probe(void);
> +
> +extern const char *__cpu_family[];
> +extern const char *__cpu_full_name[];
> +#define cpu_family_string()	__cpu_family[raw_smp_processor_id()]
> +#define cpu_full_name_string()	__cpu_full_name[raw_smp_processor_id()]
> +
> +struct seq_file;
> +struct notifier_block;
> +
> +extern int register_proc_cpuinfo_notifier(struct notifier_block *nb);
> +extern int proc_cpuinfo_notifier_call_chain(unsigned long val, void *v);
> +
> +#define proc_cpuinfo_notifier(fn, pri)					\
> +({									\
> +	static struct notifier_block fn##_nb = {			\
> +		.notifier_call = fn,					\
> +		.priority = pri						\
> +	};								\
> +									\
> +	register_proc_cpuinfo_notifier(&fn##_nb);			\
> +})
> +
> +struct proc_cpuinfo_notifier_args {
> +	struct seq_file *m;
> +	unsigned long n;
> +};
> +
> +static inline unsigned int cpu_cluster(struct cpuinfo_loongarch *cpuinfo)
> +{
> +	return (cpuinfo->globalnumber & LOONGARCH_GLOBALNUMBER_CLUSTER) >>
> +		LOONGARCH_GLOBALNUMBER_CLUSTER_SHF;
> +}
> +
> +static inline unsigned int cpu_core(struct cpuinfo_loongarch *cpuinfo)
> +{
> +	return (cpuinfo->globalnumber & LOONGARCH_GLOBALNUMBER_CORE) >>
> +		LOONGARCH_GLOBALNUMBER_CORE_SHF;
> +}
> +
> +extern void cpu_set_cluster(struct cpuinfo_loongarch *cpuinfo, unsigned int cluster);
> +extern void cpu_set_core(struct cpuinfo_loongarch *cpuinfo, unsigned int core);
> +
> +static inline bool cpus_are_siblings(int cpua, int cpub)
> +{
> +	struct cpuinfo_loongarch *infoa = &cpu_data[cpua];
> +	struct cpuinfo_loongarch *infob = &cpu_data[cpub];
> +	unsigned int gnuma, gnumb;
> +
> +	if (infoa->package != infob->package)
> +		return false;
> +
> +	gnuma = infoa->globalnumber & ~LOONGARCH_GLOBALNUMBER_VP;
> +	gnumb = infob->globalnumber & ~LOONGARCH_GLOBALNUMBER_VP;
> +	if (gnuma != gnumb)
> +		return false;
> +
> +	return true;
> +}

Please don't use the "global number" expression anywhere in the port; 
come up with another suitable name.

I was initially confused by all the "GLOBALNUMBER" and "VP" things, 
knowing only that they might be related to something MIPS-specific, 
because the "global numbers" of powerpc and microblaze obviously stand 
for the PCI domain number, as explained by their comments. After some 
further digging it became obvious that GlobalNumber is actually a 
MIPSr6+ configuration register, introduced by the MT ASE and conveying 
information regarding Virtual Processors. Of course LoongArch doesn't 
have anything similar to that...

> +
> +static inline unsigned long cpu_asid_mask(struct cpuinfo_loongarch *cpuinfo)
> +{
> +	return cpuinfo->asid_mask;
> +}
> +
> +static inline void set_cpu_asid_mask(struct cpuinfo_loongarch *cpuinfo,
> +				     unsigned long asid_mask)
> +{
> +	cpuinfo->asid_mask = asid_mask;
> +}
Why keep the accessors when you can just inline the expression at call 
site? MIPS does this because they have to differentiate based on 
CONFIG_MIPS_ASID_BITS_VARIABLE, but LoongArch doesn't behave the same, 
so the 2 functions here should be removed.
> +
> +#endif /* __ASM_CPU_INFO_H */
> diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
> new file mode 100644
> index 000000000000..62e9cb6520a9
> --- /dev/null
> +++ b/arch/loongarch/include/asm/cpu.h
> @@ -0,0 +1,127 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * cpu.h: Values of the PRId register used to match up
> + *	  various LoongArch cpu types.

"PRID"; "CPU".

Similarly please change all other "PRId" to "PRID" to match the 
reference manual...

> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_CPU_H
> +#define _ASM_CPU_H
> +
> +/*
> + * As of the LoongArch specs from Loongson Technology, the PRId register
> + * (CPUCFG.00) is defined in this (backwards compatible) way:
> + *
> + * +----------------+----------------+----------------+----------------+
> + * | Reserved       | Company ID	    | Processor ID   | Revision	      |
> + * +----------------+----------------+----------------+----------------+
> + *  31		 24 23		  16 15		    8 7              0

I can't find the relevant spec... at least not in the Loongson 
3A5000/3B5000 Processor Reference Manual nor the LoongArch reference 
manual. The former only gives the default value of 0x14c010 while the 
latter only mentions the field's name and meaning.

Also, "as of the specs" is broken; you should say "As described in the ...".
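
As a sanity check of the layout drawn in the comment, the following sketch
splits a PRID value using the masks from the quoted header (PRID_COMP_MASK,
PRID_IMP_MASK and PRID_REV_MASK, copied verbatim). The struct and function
names are hypothetical, and the field layout is taken from the patch, not
from any manual. Applied to the 0x14c010 default mentioned above, it yields
company 0x14, processor 0xc0 and revision 0x10.

```c
#include <stdint.h>

/* Values copied from the quoted asm/cpu.h. */
#define PRID_COMP_MASK		0xff0000
#define PRID_COMP_LOONGSON	0x140000
#define PRID_IMP_MASK		0xff00
#define PRID_IMP_LOONGSON_64G	0xc000
#define PRID_REV_MASK		0x00ff

/* Hypothetical container for the three fields of the diagram. */
struct sketch_prid {
	uint8_t company;	/* bits 23:16 */
	uint8_t processor;	/* bits 15:8  */
	uint8_t revision;	/* bits 7:0   */
};

/* Split a raw PRID value into its fields, assuming the layout in the patch. */
static struct sketch_prid sketch_decode_prid(uint32_t prid)
{
	struct sketch_prid p = {
		.company   = (uint8_t)((prid & PRID_COMP_MASK) >> 16),
		.processor = (uint8_t)((prid & PRID_IMP_MASK) >> 8),
		.revision  = (uint8_t)(prid & PRID_REV_MASK),
	};

	return p;
}
```

So the documented default at least matches PRID_COMP_LOONGSON and
PRID_IMP_LOONGSON_64G, even though the layout itself is not spelled out in
the manuals.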

> + *
> + */
> +
> +/*
> + * Assigned Company values for bits 23:16 of the PRId register.
> + */
> +
> +#define PRID_COMP_MASK		0xff0000
> +
> +#define PRID_COMP_LOONGSON	0x140000
> +
> +/*
> + * Assigned Processor ID (implementation) values for bits 15:8 of the PRId
> + * register.  In order to detect a certain CPU type exactly eventually
> + * additional registers may need to be examined.
> + */
> +
> +#define PRID_IMP_MASK		0xff00
> +
> +#define PRID_IMP_LOONGSON_32	0x4200  /* Loongson 32bit */
> +#define PRID_IMP_LOONGSON_64R	0x6100  /* Reduced Loongson 64bit */
> +#define PRID_IMP_LOONGSON_64C	0x6300  /* Classic Loongson 64bit */
Do we even have "classic" LoongArch cores? This scheme surely is carried 
over from the MIPS era, but I don't think any of the "classic" 
Loongson/MIPS cores is getting a LoongArch refresh...
> +#define PRID_IMP_LOONGSON_64G	0xc000  /* Generic Loongson 64bit */
> +#define PRID_IMP_UNKNOWN	0xff00
> +
> +/*
> + * Particular Revision values for bits 7:0 of the PRId register.
> + */
> +
> +#define PRID_REV_MASK		0x00ff
> +
> +#if !defined(__ASSEMBLY__)
> +
> +enum cpu_type_enum {
> +	CPU_UNKNOWN,
> +	CPU_LOONGSON32,
> +	CPU_LOONGSON64,
> +	CPU_LAST
> +};
> +
> +#endif /* !__ASSEMBLY */
> +
> +/*
> + * ISA Level encodings
> + *
> + */
> +
> +#define LOONGARCH_CPU_ISA_LA32R 0x00000001
> +#define LOONGARCH_CPU_ISA_LA32S 0x00000002
> +#define LOONGARCH_CPU_ISA_LA64  0x00000004
> +
> +#define LOONGARCH_CPU_ISA_32BIT (LOONGARCH_CPU_ISA_LA32R | LOONGARCH_CPU_ISA_LA32S)
> +#define LOONGARCH_CPU_ISA_64BIT LOONGARCH_CPU_ISA_LA64
> +
> +/*
> + * CPU Option encodings
> + */
> +#define CPU_FEATURE_CPUCFG		0	/* CPU has CPUCFG */
> +#define CPU_FEATURE_LAM			1	/* CPU has Atomic instructions */
> +#define CPU_FEATURE_UAL			2	/* CPU has Unaligned Access support */
> +#define CPU_FEATURE_FPU			3	/* CPU has FPU */
> +#define CPU_FEATURE_LSX			4	/* CPU has 128bit SIMD instructions */
> +#define CPU_FEATURE_LASX		5	/* CPU has 256bit SIMD instructions */
> +#define CPU_FEATURE_COMPLEX		6	/* CPU has Complex instructions */
> +#define CPU_FEATURE_CRYPTO		7	/* CPU has Crypto instructions */
> +#define CPU_FEATURE_LVZ			8	/* CPU has Virtualization extension */
> +#define CPU_FEATURE_LBT_X86		9	/* CPU has X86 Binary Translation */
> +#define CPU_FEATURE_LBT_ARM		10	/* CPU has ARM Binary Translation */
> +#define CPU_FEATURE_LBT_MIPS		11	/* CPU has MIPS Binary Translation */
> +#define CPU_FEATURE_TLB			12	/* CPU has TLB */
> +#define CPU_FEATURE_CSR			13	/* CPU has CSR feature */
> +#define CPU_FEATURE_WATCH		14	/* CPU has watchpoint registers */
> +#define CPU_FEATURE_VINT		15	/* CPU has vectored interrupts */
> +#define CPU_FEATURE_CSRIPI		16	/* CPU has CSR-IPI */
> +#define CPU_FEATURE_EXTIOI		17	/* CPU has EXT-IOI */
> +#define CPU_FEATURE_PREFETCH		18	/* CPU has prefetch instructions */
> +#define CPU_FEATURE_PMP			19	/* CPU has performance counter */
> +#define CPU_FEATURE_SCALEFREQ		20	/* CPU support scale cpufreq */
> +#define CPU_FEATURE_FLATMODE		21	/* CPU has flatmode */
> +#define CPU_FEATURE_EIODECODE		22	/* CPU has extioi int pin decode mode */
"EXTIOI interrupt pin decoding mode"?
> +#define CPU_FEATURE_GUESTID		23	/* CPU has GuestID feature */
> +#define CPU_FEATURE_HYPERVISOR		24	/* CPU has hypervisor (run in VM) */
"CPU is virtualized (under a hypervisor)"?
> +
> +#define LOONGARCH_CPU_CPUCFG		BIT_ULL(CPU_FEATURE_CPUCFG)
> +#define LOONGARCH_CPU_LAM		BIT_ULL(CPU_FEATURE_LAM)
> +#define LOONGARCH_CPU_UAL		BIT_ULL(CPU_FEATURE_UAL)
> +#define LOONGARCH_CPU_FPU		BIT_ULL(CPU_FEATURE_FPU)
> +#define LOONGARCH_CPU_LSX		BIT_ULL(CPU_FEATURE_LSX)
> +#define LOONGARCH_CPU_LASX		BIT_ULL(CPU_FEATURE_LASX)
> +#define LOONGARCH_CPU_COMPLEX		BIT_ULL(CPU_FEATURE_COMPLEX)
> +#define LOONGARCH_CPU_CRYPTO		BIT_ULL(CPU_FEATURE_CRYPTO)
> +#define LOONGARCH_CPU_LVZ		BIT_ULL(CPU_FEATURE_LVZ)
> +#define LOONGARCH_CPU_LBT_X86		BIT_ULL(CPU_FEATURE_LBT_X86)
> +#define LOONGARCH_CPU_LBT_ARM		BIT_ULL(CPU_FEATURE_LBT_ARM)
> +#define LOONGARCH_CPU_LBT_MIPS		BIT_ULL(CPU_FEATURE_LBT_MIPS)
> +#define LOONGARCH_CPU_TLB		BIT_ULL(CPU_FEATURE_TLB)
> +#define LOONGARCH_CPU_CSR		BIT_ULL(CPU_FEATURE_CSR)
> +#define LOONGARCH_CPU_WATCH		BIT_ULL(CPU_FEATURE_WATCH)
> +#define LOONGARCH_CPU_VINT		BIT_ULL(CPU_FEATURE_VINT)
> +#define LOONGARCH_CPU_CSRIPI		BIT_ULL(CPU_FEATURE_CSRIPI)
> +#define LOONGARCH_CPU_EXTIOI		BIT_ULL(CPU_FEATURE_EXTIOI)
> +#define LOONGARCH_CPU_PREFETCH		BIT_ULL(CPU_FEATURE_PREFETCH)
> +#define LOONGARCH_CPU_PMP		BIT_ULL(CPU_FEATURE_PMP)
> +#define LOONGARCH_CPU_SCALEFREQ		BIT_ULL(CPU_FEATURE_SCALEFREQ)
> +#define LOONGARCH_CPU_FLATMODE		BIT_ULL(CPU_FEATURE_FLATMODE)
> +#define LOONGARCH_CPU_EIODECODE		BIT_ULL(CPU_FEATURE_EIODECODE)
> +#define LOONGARCH_CPU_GUESTID		BIT_ULL(CPU_FEATURE_GUESTID)
> +#define LOONGARCH_CPU_HYPERVISOR	BIT_ULL(CPU_FEATURE_HYPERVISOR)
> +#endif /* _ASM_CPU_H */
> diff --git a/arch/loongarch/include/asm/fpregdef.h b/arch/loongarch/include/asm/fpregdef.h
> new file mode 100644
> index 000000000000..151dc9aee1c6
> --- /dev/null
> +++ b/arch/loongarch/include/asm/fpregdef.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Definitions for the FPU register names
> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_FPREGDEF_H
> +#define _ASM_FPREGDEF_H
> +
> +#define fv0	$f0	/* return value */
> +#define fv1	$f2
> +#define fa0	$f12	/* argument registers */
> +#define fa1	$f13
> +#define fa2	$f14
> +#define fa3	$f15
> +#define fa4	$f16
> +#define fa5	$f17
> +#define fa6	$f18
> +#define fa7	$f19
> +#define ft0	$f4	/* caller saved */
> +#define ft1	$f5
> +#define ft2	$f6
> +#define ft3	$f7
> +#define ft4	$f8
> +#define ft5	$f9
> +#define ft6	$f10
> +#define ft7	$f11
> +#define ft8	$f20
> +#define ft9	$f21
> +#define ft10	$f22
> +#define ft11	$f23
> +#define ft12	$f1
> +#define ft13	$f3
> +#define fs0	$f24	/* callee saved */
> +#define fs1	$f25
> +#define fs2	$f26
> +#define fs3	$f27
> +#define fs4	$f28
> +#define fs5	$f29
> +#define fs6	$f30
> +#define fs7	$f31
This doesn't agree with the current ABI spec, and may need further 
checking. There's no way $f1 could be $ft12, nor could the return value 
registers fail to share storage with the first two argument registers.
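As far as I can tell from the current psABI document, $f0-$f7 are 
fa0-fa7 (with fa0/fa1 doubling as the return value registers), $f8-$f23 
are ft0-ft15, and $f24-$f31 are fs0-fs7. Here is a small sketch of that 
mapping; fpr_abi_name() is a made-up helper used only for illustration, 
and the table should be double-checked against the spec.

```c
#include <stddef.h>

/*
 * Hypothetical lookup (not in the patch): ABI name of floating-point
 * register $fN, following my reading of the psABI register convention.
 */
static const char *fpr_abi_name(unsigned int n)
{
	static const char * const names[32] = {
		/* $f0-$f7: arguments; fa0/fa1 also carry return values */
		"fa0", "fa1", "fa2", "fa3", "fa4", "fa5", "fa6", "fa7",
		/* $f8-$f23: temporaries (caller saved) */
		"ft0", "ft1", "ft2", "ft3", "ft4", "ft5", "ft6", "ft7",
		"ft8", "ft9", "ft10", "ft11", "ft12", "ft13", "ft14", "ft15",
		/* $f24-$f31: static registers (callee saved) */
		"fs0", "fs1", "fs2", "fs3", "fs4", "fs5", "fs6", "fs7",
	};

	return n < 32 ? names[n] : NULL;
}
```

Under this mapping $f1 is fa1, and ft12 lands on $f20.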
> +
> +#define fcsr0	$r0
> +#define fcsr1	$r1
> +#define fcsr2	$r2
> +#define fcsr3	$r3
> +#define vcsr16	$r16
> +
> +#endif /* _ASM_FPREGDEF_H */
> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
> new file mode 100644
> index 000000000000..083e6726d4cb
> --- /dev/null
> +++ b/arch/loongarch/include/asm/loongarch.h
> @@ -0,0 +1,1528 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_LOONGARCH_H
> +#define _ASM_LOONGARCH_H
> +
> +#include <linux/bits.h>
> +#include <linux/linkage.h>
> +#include <linux/types.h>
> +
> +#ifndef __ASSEMBLY__
> +#include <larchintrin.h>
> +
> +/*
> + * parse_r var, r - Helper assembler macro for parsing register names.
> + *
> + * This converts the register name in $n form provided in \r to the
> + * corresponding register number, which is assigned to the variable \var. It is
> + * needed to allow explicit encoding of instructions in inline assembly where
> + * registers are chosen by the compiler in $n form, allowing us to avoid using
> + * fixed register numbers.
> + *
> + * It also allows newer instructions (not implemented by the assembler) to be
> + * transparently implemented using assembler macros, instead of needing separate
> + * cases depending on toolchain support.
> + *
> + * Simple usage example:
> + * __asm__ __volatile__("parse_r addr, %0\n\t"
> + *			"#invtlb op, 0, %0\n\t"
> + *			".word ((0x6498000) | (addr << 10) | (0 << 5) | op)"
> + *			: "=r" (status));
> + */
> +
> +/* Match an individual register number and assign to \var */
> +#define _IFC_REG(n)				\
> +	".ifc	\\r, $r" #n "\n\t"		\
> +	"\\var	= " #n "\n\t"			\
> +	".endif\n\t"
> +
> +__asm__(".macro	parse_r var r\n\t"
> +	"\\var	= -1\n\t"
> +	_IFC_REG(0)  _IFC_REG(1)  _IFC_REG(2)  _IFC_REG(3)
> +	_IFC_REG(4)  _IFC_REG(5)  _IFC_REG(6)  _IFC_REG(7)
> +	_IFC_REG(8)  _IFC_REG(9)  _IFC_REG(10) _IFC_REG(11)
> +	_IFC_REG(12) _IFC_REG(13) _IFC_REG(14) _IFC_REG(15)
> +	_IFC_REG(16) _IFC_REG(17) _IFC_REG(18) _IFC_REG(19)
> +	_IFC_REG(20) _IFC_REG(21) _IFC_REG(22) _IFC_REG(23)
> +	_IFC_REG(24) _IFC_REG(25) _IFC_REG(26) _IFC_REG(27)
> +	_IFC_REG(28) _IFC_REG(29) _IFC_REG(30) _IFC_REG(31)
> +	".iflt	\\var\n\t"
> +	".error	\"Unable to parse register name \\r\"\n\t"
> +	".endif\n\t"
> +	".endm");
> +
> +#undef _IFC_REG
> +
> +/* CPUCFG */
> +static inline u32 read_cpucfg(u32 reg)
> +{
> +	return __cpucfg(reg);
> +}
> +
> +#endif /* !__ASSEMBLY__ */
> +
> +#ifdef __ASSEMBLY__
> +
> +/* LoongArch Registers */
> +#define REG_RA	0x1
> +#define REG_TP	0x2
> +#define REG_SP	0x3
> +#define REG_A0	0x4
> +#define REG_A1	0x5
> +#define REG_A2	0x6
> +#define REG_A3	0x7
> +#define REG_A4	0x8
> +#define REG_A5	0x9
> +#define REG_A6	0xa
> +#define REG_A7	0xb
> +#define REG_V0	REG_A0
> +#define REG_V1	REG_A1
Remove V* aliases per spec.
> +#define REG_T0	0xc
> +#define REG_T1	0xd
> +#define REG_T2	0xe
> +#define REG_T3	0xf
> +#define REG_T4	0x10
> +#define REG_T5	0x11
> +#define REG_T6	0x12
> +#define REG_T7	0x13
> +#define REG_T8	0x14
> +#define REG_U0	0x15
And document U0 ($r21) somewhere.
> +#define REG_FP	0x16
> +#define REG_S0	0x17
> +#define REG_S1	0x18
> +#define REG_S2	0x19
> +#define REG_S3	0x1a
> +#define REG_S4	0x1b
> +#define REG_S5	0x1c
> +#define REG_S6	0x1d
> +#define REG_S7	0x1e
> +#define REG_S8	0x1f
> +
> +#endif /* __ASSEMBLY__ */
> +
> +/* Bit Domains for CPUCFG registers */
"bit fields"?
> +#define LOONGARCH_CPUCFG0		0x0
> +#define  CPUCFG0_PRID			GENMASK(31, 0)
> +
> +#define LOONGARCH_CPUCFG1		0x1
> +#define  CPUCFG1_ISGR32			BIT(0)
> +#define  CPUCFG1_ISGR64			BIT(1)
> +#define  CPUCFG1_PAGING			BIT(2)
> +#define  CPUCFG1_IOCSR			BIT(3)
> +#define  CPUCFG1_PABITS			GENMASK(11, 4)
> +#define  CPUCFG1_VABITS			GENMASK(19, 12)
> +#define  CPUCFG1_UAL			BIT(20)
> +#define  CPUCFG1_RI			BIT(21)
> +#define  CPUCFG1_EP			BIT(22)
> +#define  CPUCFG1_RPLV			BIT(23)
> +#define  CPUCFG1_HUGEPG			BIT(24)
> +#define  CPUCFG1_IOCSRBRD		BIT(25)
> +#define  CPUCFG1_MSGINT			BIT(26)
These names are not consistent with the reference manual. For example, 
bits 1:0 are actually a single enumerated field rather than two flags, 
and bit 2 is called PGMMU in the LoongArch reference manual.
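To illustrate, here is a sketch of how the low bits could be described if 
bits 1:0 are treated as an enumerated field. The macro and helper names 
are my own suggestions, not taken from the patch, and the encodings (0 = 
LA32 Reduced, 1 = LA32, 2 = LA64) are my reading of the manual and should 
be verified against it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Suggested names (assumptions, not in the patch). */
#define CPUCFG1_ARCH_MASK	0x3u		/* bits 1:0, enumerated */
#define CPUCFG1_ARCH_LA32R	0u
#define CPUCFG1_ARCH_LA32S	1u
#define CPUCFG1_ARCH_LA64	2u
#define CPUCFG1_PGMMU		(1u << 2)	/* paging MMU present */

/* Return the architecture enum stored in bits 1:0 of CPUCFG word 1. */
static inline uint32_t cpucfg1_arch(uint32_t cfg1)
{
	return cfg1 & CPUCFG1_ARCH_MASK;
}

/* Return true if the PGMMU bit (bit 2) is set. */
static inline bool cpucfg1_has_pgmmu(uint32_t cfg1)
{
	return (cfg1 & CPUCFG1_PGMMU) != 0;
}
```

A caller would then compare cpucfg1_arch() against CPUCFG1_ARCH_LA64 
instead of testing bit 1 on its own.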
> +
> +#define LOONGARCH_CPUCFG2		0x2
> +#define  CPUCFG2_FP			BIT(0)
> +#define  CPUCFG2_FPSP			BIT(1)
> +#define  CPUCFG2_FPDP			BIT(2)
> +#define  CPUCFG2_FPVERS			GENMASK(5, 3)
> +#define  CPUCFG2_LSX			BIT(6)
> +#define  CPUCFG2_LASX			BIT(7)
> +#define  CPUCFG2_COMPLEX		BIT(8)
> +#define  CPUCFG2_CRYPTO			BIT(9)
> +#define  CPUCFG2_LVZP			BIT(10)
> +#define  CPUCFG2_LVZVER			GENMASK(13, 11)
> +#define  CPUCFG2_LLFTP			BIT(14)
> +#define  CPUCFG2_LLFTPREV		GENMASK(17, 15)
> +#define  CPUCFG2_X86BT			BIT(18)
> +#define  CPUCFG2_ARMBT			BIT(19)
> +#define  CPUCFG2_MIPSBT			BIT(20)
> +#define  CPUCFG2_LSPW			BIT(21)
> +#define  CPUCFG2_LAM			BIT(22)
> +
> +#define LOONGARCH_CPUCFG3		0x3
> +#define  CPUCFG3_CCDMA			BIT(0)
> +#define  CPUCFG3_SFB			BIT(1)
> +#define  CPUCFG3_UCACC			BIT(2)
> +#define  CPUCFG3_LLEXC			BIT(3)
> +#define  CPUCFG3_SCDLY			BIT(4)
> +#define  CPUCFG3_LLDBAR			BIT(5)
> +#define  CPUCFG3_ITLBT			BIT(6)
> +#define  CPUCFG3_ICACHET		BIT(7)
> +#define  CPUCFG3_SPW_LVL		GENMASK(10, 8)
> +#define  CPUCFG3_SPW_HG_HF		BIT(11)
> +#define  CPUCFG3_RVA			BIT(12)
> +#define  CPUCFG3_RVAMAX			GENMASK(16, 13)
> +
> +#define LOONGARCH_CPUCFG4		0x4
> +#define  CPUCFG4_CCFREQ			GENMASK(31, 0)
> +
> +#define LOONGARCH_CPUCFG5		0x5
> +#define  CPUCFG5_CCMUL			GENMASK(15, 0)
> +#define  CPUCFG5_CCDIV			GENMASK(31, 16)
> +
> +#define LOONGARCH_CPUCFG6		0x6
> +#define  CPUCFG6_PMP			BIT(0)
> +#define  CPUCFG6_PAMVER			GENMASK(3, 1)
> +#define  CPUCFG6_PMNUM			GENMASK(7, 4)
> +#define  CPUCFG6_PMBITS			GENMASK(13, 8)
> +#define  CPUCFG6_UPM			BIT(14)
> +
> +#define LOONGARCH_CPUCFG16		0x10
> +#define  CPUCFG16_L1_IUPRE		BIT(0)
> +#define  CPUCFG16_L1_IUUNIFY		BIT(1)
> +#define  CPUCFG16_L1_DPRE		BIT(2)
> +#define  CPUCFG16_L2_IUPRE		BIT(3)
> +#define  CPUCFG16_L2_IUUNIFY		BIT(4)
> +#define  CPUCFG16_L2_IUPRIV		BIT(5)
> +#define  CPUCFG16_L2_IUINCL		BIT(6)
> +#define  CPUCFG16_L2_DPRE		BIT(7)
> +#define  CPUCFG16_L2_DPRIV		BIT(8)
> +#define  CPUCFG16_L2_DINCL		BIT(9)
> +#define  CPUCFG16_L3_IUPRE		BIT(10)
> +#define  CPUCFG16_L3_IUUNIFY		BIT(11)
> +#define  CPUCFG16_L3_IUPRIV		BIT(12)
> +#define  CPUCFG16_L3_IUINCL		BIT(13)
> +#define  CPUCFG16_L3_DPRE		BIT(14)
> +#define  CPUCFG16_L3_DPRIV		BIT(15)
> +#define  CPUCFG16_L3_DINCL		BIT(16)
> +
> +#define LOONGARCH_CPUCFG17		0x11
> +#define  CPUCFG17_L1I_WAYS_M		GENMASK(15, 0)
> +#define  CPUCFG17_L1I_SETS_M		GENMASK(23, 16)
> +#define  CPUCFG17_L1I_SIZE_M		GENMASK(30, 24)
> +#define  CPUCFG17_L1I_WAYS		0
> +#define  CPUCFG17_L1I_SETS		16
> +#define  CPUCFG17_L1I_SIZE		24
> +
> +#define LOONGARCH_CPUCFG18		0x12
> +#define  CPUCFG18_L1D_WAYS_M		GENMASK(15, 0)
> +#define  CPUCFG18_L1D_SETS_M		GENMASK(23, 16)
> +#define  CPUCFG18_L1D_SIZE_M		GENMASK(30, 24)
> +#define  CPUCFG18_L1D_WAYS		0
> +#define  CPUCFG18_L1D_SETS		16
> +#define  CPUCFG18_L1D_SIZE		24
> +
> +#define LOONGARCH_CPUCFG19		0x13
> +#define  CPUCFG19_L2_WAYS_M		GENMASK(15, 0)
> +#define  CPUCFG19_L2_SETS_M		GENMASK(23, 16)
> +#define  CPUCFG19_L2_SIZE_M		GENMASK(30, 24)
> +#define  CPUCFG19_L2_WAYS		0
> +#define  CPUCFG19_L2_SETS		16
> +#define  CPUCFG19_L2_SIZE		24
> +
> +#define LOONGARCH_CPUCFG20		0x14
> +#define  CPUCFG20_L3_WAYS_M		GENMASK(15, 0)
> +#define  CPUCFG20_L3_SETS_M		GENMASK(23, 16)
> +#define  CPUCFG20_L3_SIZE_M		GENMASK(30, 24)
> +#define  CPUCFG20_L3_WAYS		0
> +#define  CPUCFG20_L3_SETS		16
> +#define  CPUCFG20_L3_SIZE		24
> +
> +#define LOONGARCH_CPUCFG48		0x30
> +#define  CPUCFG48_MCSR_LCK		BIT(0)
> +#define  CPUCFG48_NAP_EN		BIT(1)
> +#define  CPUCFG48_VFPU_CG		BIT(2)
> +#define  CPUCFG48_RAM_CG		BIT(3)
> +
> +#ifndef __ASSEMBLY__
> +
> +/* CSR */
> +static __always_inline u32 csr_readl(u32 reg)
> +{
> +	return __csrrd_w(reg);
> +}
> +
> +static __always_inline u64 csr_readq(u32 reg)
> +{
> +	return __csrrd_d(reg);
> +}
> +
> +static __always_inline void csr_writel(u32 val, u32 reg)
> +{
> +	__csrwr_w(val, reg);
> +}
> +
> +static __always_inline void csr_writeq(u64 val, u32 reg)
> +{
> +	__csrwr_d(val, reg);
> +}
> +
> +static __always_inline u32 csr_xchgl(u32 val, u32 mask, u32 reg)
> +{
> +	return __csrxchg_w(val, mask, reg);
> +}
> +
> +static __always_inline u64 csr_xchgq(u64 val, u64 mask, u32 reg)
> +{
> +	return __csrxchg_d(val, mask, reg);
> +}
> +
> +/* IOCSR */
> +static __always_inline u32 iocsr_readl(u32 reg)
> +{
> +	return __iocsrrd_w(reg);
> +}
> +
> +static __always_inline u64 iocsr_readq(u32 reg)
> +{
> +	return __iocsrrd_d(reg);
> +}
> +
> +static __always_inline void iocsr_writel(u32 val, u32 reg)
> +{
> +	__iocsrwr_w(val, reg);
> +}
> +
> +static __always_inline void iocsr_writeq(u64 val, u32 reg)
> +{
> +	__iocsrwr_d(val, reg);
> +}
> +
> +#endif /* !__ASSEMBLY__ */
> +
> +/* CSR register number */
> +
> +/* Basic CSR registers */
> +#define LOONGARCH_CSR_CRMD		0x0	/* Current mode info */
> +#define  CSR_CRMD_WE_SHIFT		9
> +#define  CSR_CRMD_WE			(_ULCAST_(0x1) << CSR_CRMD_WE_SHIFT)
> +#define  CSR_CRMD_DACM_SHIFT		7
> +#define  CSR_CRMD_DACM_WIDTH		2
> +#define  CSR_CRMD_DACM			(_ULCAST_(0x3) << CSR_CRMD_DACM_SHIFT)
> +#define  CSR_CRMD_DACF_SHIFT		5
> +#define  CSR_CRMD_DACF_WIDTH		2
> +#define  CSR_CRMD_DACF			(_ULCAST_(0x3) << CSR_CRMD_DACF_SHIFT)
> +#define  CSR_CRMD_PG_SHIFT		4
> +#define  CSR_CRMD_PG			(_ULCAST_(0x1) << CSR_CRMD_PG_SHIFT)
> +#define  CSR_CRMD_DA_SHIFT		3
> +#define  CSR_CRMD_DA			(_ULCAST_(0x1) << CSR_CRMD_DA_SHIFT)
> +#define  CSR_CRMD_IE_SHIFT		2
> +#define  CSR_CRMD_IE			(_ULCAST_(0x1) << CSR_CRMD_IE_SHIFT)
> +#define  CSR_CRMD_PLV_SHIFT		0
> +#define  CSR_CRMD_PLV_WIDTH		2
> +#define  CSR_CRMD_PLV			(_ULCAST_(0x3) << CSR_CRMD_PLV_SHIFT)
> +
> +#define PLV_KERN			0
> +#define PLV_USER			3
> +#define PLV_MASK			0x3
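Not a change request, just a note on how I understand the SHIFT/mask pairs 
above are meant to be used. This is a sketch only: the crmd_*() helpers 
are hypothetical, and the _ULCAST_() definition below is an assumption, 
since it is not part of the quoted hunk.

```c
/* Assumed definition; the real one lives outside the quoted hunk. */
#define _ULCAST_(x)		((unsigned long)(x))

/* Copied from the quoted header. */
#define CSR_CRMD_IE_SHIFT	2
#define CSR_CRMD_IE		(_ULCAST_(0x1) << CSR_CRMD_IE_SHIFT)
#define CSR_CRMD_PLV_SHIFT	0
#define CSR_CRMD_PLV		(_ULCAST_(0x3) << CSR_CRMD_PLV_SHIFT)
#define PLV_KERN		0
#define PLV_USER		3

/* Hypothetical helper: extract the privilege level from a CRMD value. */
static inline unsigned long crmd_plv(unsigned long crmd)
{
	return (crmd & CSR_CRMD_PLV) >> CSR_CRMD_PLV_SHIFT;
}

/* Hypothetical helper: report whether the global interrupt enable is set. */
static inline int crmd_irqs_enabled(unsigned long crmd)
{
	return (crmd & CSR_CRMD_IE) != 0;
}
```

In the kernel the CRMD value would come from csr_readl(LOONGARCH_CSR_CRMD); 
here it is passed in as a plain argument so the snippet stands alone.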
> +
> +#define LOONGARCH_CSR_PRMD		0x1	/* Prev-exception mode info */
> +#define  CSR_PRMD_PWE_SHIFT		3
> +#define  CSR_PRMD_PWE			(_ULCAST_(0x1) << CSR_PRMD_PWE_SHIFT)
> +#define  CSR_PRMD_PIE_SHIFT		2
> +#define  CSR_PRMD_PIE			(_ULCAST_(0x1) << CSR_PRMD_PIE_SHIFT)
> +#define  CSR_PRMD_PPLV_SHIFT		0
> +#define  CSR_PRMD_PPLV_WIDTH		2
> +#define  CSR_PRMD_PPLV			(_ULCAST_(0x3) << CSR_PRMD_PPLV_SHIFT)
> +
> +#define LOONGARCH_CSR_EUEN		0x2	/* Extended unit enable */
> +#define  CSR_EUEN_LBTEN_SHIFT		3
> +#define  CSR_EUEN_LBTEN			(_ULCAST_(0x1) << CSR_EUEN_LBTEN_SHIFT)
> +#define  CSR_EUEN_LASXEN_SHIFT		2
> +#define  CSR_EUEN_LASXEN		(_ULCAST_(0x1) << CSR_EUEN_LASXEN_SHIFT)
> +#define  CSR_EUEN_LSXEN_SHIFT		1
> +#define  CSR_EUEN_LSXEN			(_ULCAST_(0x1) << CSR_EUEN_LSXEN_SHIFT)
> +#define  CSR_EUEN_FPEN_SHIFT		0
> +#define  CSR_EUEN_FPEN			(_ULCAST_(0x1) << CSR_EUEN_FPEN_SHIFT)
> +
> +#define LOONGARCH_CSR_MISC		0x3	/* Misc config */
> +
> +#define LOONGARCH_CSR_ECFG		0x4	/* Exception config */
> +#define  CSR_ECFG_VS_SHIFT		16
> +#define  CSR_ECFG_VS_WIDTH		3
> +#define  CSR_ECFG_VS			(_ULCAST_(0x7) << CSR_ECFG_VS_SHIFT)
> +#define  CSR_ECFG_IM_SHIFT		0
> +#define  CSR_ECFG_IM_WIDTH		13
> +#define  CSR_ECFG_IM			(_ULCAST_(0x1fff) << CSR_ECFG_IM_SHIFT)
> +
> +#define LOONGARCH_CSR_ESTAT		0x5	/* Exception status */
> +#define  CSR_ESTAT_ESUBCODE_SHIFT	22
> +#define  CSR_ESTAT_ESUBCODE_WIDTH	9
> +#define  CSR_ESTAT_ESUBCODE		(_ULCAST_(0x1ff) << CSR_ESTAT_ESUBCODE_SHIFT)
> +#define  CSR_ESTAT_EXC_SHIFT		16
> +#define  CSR_ESTAT_EXC_WIDTH		6
> +#define  CSR_ESTAT_EXC			(_ULCAST_(0x3f) << CSR_ESTAT_EXC_SHIFT)
> +#define  CSR_ESTAT_IS_SHIFT		0
> +#define  CSR_ESTAT_IS_WIDTH		15
> +#define  CSR_ESTAT_IS			(_ULCAST_(0x7fff) << CSR_ESTAT_IS_SHIFT)
> +
> +#define LOONGARCH_CSR_ERA		0x6	/* ERA */
> +
> +#define LOONGARCH_CSR_BADV		0x7	/* Bad virtual address */
> +
> +#define LOONGARCH_CSR_BADI		0x8	/* Bad instruction */
> +
> +#define LOONGARCH_CSR_EENTRY		0xc	/* Exception entry */
> +
> +/* TLB related CSR registers */
> +#define LOONGARCH_CSR_TLBIDX		0x10	/* TLB Index, EHINV, PageSize, NP */
> +#define  CSR_TLBIDX_EHINV_SHIFT		31
> +#define  CSR_TLBIDX_EHINV		(_ULCAST_(1) << CSR_TLBIDX_EHINV_SHIFT)
> +#define  CSR_TLBIDX_PS_SHIFT		24
> +#define  CSR_TLBIDX_PS_WIDTH		6
> +#define  CSR_TLBIDX_PS			(_ULCAST_(0x3f) << CSR_TLBIDX_PS_SHIFT)
> +#define  CSR_TLBIDX_IDX_SHIFT		0
> +#define  CSR_TLBIDX_IDX_WIDTH		12
> +#define  CSR_TLBIDX_IDX			(_ULCAST_(0xfff) << CSR_TLBIDX_IDX_SHIFT)
> +#define  CSR_TLBIDX_SIZEM		0x3f000000
> +#define  CSR_TLBIDX_SIZE		CSR_TLBIDX_PS_SHIFT
> +#define  CSR_TLBIDX_IDXM		0xfff
> +#define  CSR_INVALID_ENTRY(e)		(CSR_TLBIDX_EHINV | e)
> +
> +#define LOONGARCH_CSR_TLBEHI		0x11	/* TLB EntryHi */
> +
> +#define LOONGARCH_CSR_TLBELO0		0x12	/* TLB EntryLo0 */
> +#define  CSR_TLBLO0_RPLV_SHIFT		63
> +#define  CSR_TLBLO0_RPLV		(_ULCAST_(0x1) << CSR_TLBLO0_RPLV_SHIFT)
> +#define  CSR_TLBLO0_NX_SHIFT		62
> +#define  CSR_TLBLO0_NX			(_ULCAST_(0x1) << CSR_TLBLO0_NX_SHIFT)
> +#define  CSR_TLBLO0_NR_SHIFT		61
> +#define  CSR_TLBLO0_NR			(_ULCAST_(0x1) << CSR_TLBLO0_NR_SHIFT)
> +#define  CSR_TLBLO0_PFN_SHIFT		12
> +#define  CSR_TLBLO0_PFN_WIDTH		36
> +#define  CSR_TLBLO0_PFN			(_ULCAST_(0xfffffffff) << CSR_TLBLO0_PFN_SHIFT)
> +#define  CSR_TLBLO0_GLOBAL_SHIFT	6
> +#define  CSR_TLBLO0_GLOBAL		(_ULCAST_(0x1) << CSR_TLBLO0_GLOBAL_SHIFT)
> +#define  CSR_TLBLO0_CCA_SHIFT		4
> +#define  CSR_TLBLO0_CCA_WIDTH		2
> +#define  CSR_TLBLO0_CCA			(_ULCAST_(0x3) << CSR_TLBLO0_CCA_SHIFT)
> +#define  CSR_TLBLO0_PLV_SHIFT		2
> +#define  CSR_TLBLO0_PLV_WIDTH		2
> +#define  CSR_TLBLO0_PLV			(_ULCAST_(0x3) << CSR_TLBLO0_PLV_SHIFT)
> +#define  CSR_TLBLO0_WE_SHIFT		1
> +#define  CSR_TLBLO0_WE			(_ULCAST_(0x1) << CSR_TLBLO0_WE_SHIFT)
> +#define  CSR_TLBLO0_V_SHIFT		0
> +#define  CSR_TLBLO0_V			(_ULCAST_(0x1) << CSR_TLBLO0_V_SHIFT)
> +
> +#define LOONGARCH_CSR_TLBELO1		0x13	/* TLB EntryLo1 */
> +#define  CSR_TLBLO1_RPLV_SHIFT		63
> +#define  CSR_TLBLO1_RPLV		(_ULCAST_(0x1) << CSR_TLBLO1_RPLV_SHIFT)
> +#define  CSR_TLBLO1_NX_SHIFT		62
> +#define  CSR_TLBLO1_NX			(_ULCAST_(0x1) << CSR_TLBLO1_NX_SHIFT)
> +#define  CSR_TLBLO1_NR_SHIFT		61
> +#define  CSR_TLBLO1_NR			(_ULCAST_(0x1) << CSR_TLBLO1_NR_SHIFT)
> +#define  CSR_TLBLO1_PFN_SHIFT		12
> +#define  CSR_TLBLO1_PFN_WIDTH		36
> +#define  CSR_TLBLO1_PFN			(_ULCAST_(0xfffffffff) << CSR_TLBLO1_PFN_SHIFT)
> +#define  CSR_TLBLO1_GLOBAL_SHIFT	6
> +#define  CSR_TLBLO1_GLOBAL		(_ULCAST_(0x1) << CSR_TLBLO1_GLOBAL_SHIFT)
> +#define  CSR_TLBLO1_CCA_SHIFT		4
> +#define  CSR_TLBLO1_CCA_WIDTH		2
> +#define  CSR_TLBLO1_CCA			(_ULCAST_(0x3) << CSR_TLBLO1_CCA_SHIFT)
> +#define  CSR_TLBLO1_PLV_SHIFT		2
> +#define  CSR_TLBLO1_PLV_WIDTH		2
> +#define  CSR_TLBLO1_PLV			(_ULCAST_(0x3) << CSR_TLBLO1_PLV_SHIFT)
> +#define  CSR_TLBLO1_WE_SHIFT		1
> +#define  CSR_TLBLO1_WE			(_ULCAST_(0x1) << CSR_TLBLO1_WE_SHIFT)
> +#define  CSR_TLBLO1_V_SHIFT		0
> +#define  CSR_TLBLO1_V			(_ULCAST_(0x1) << CSR_TLBLO1_V_SHIFT)
> +
> +#define LOONGARCH_CSR_GTLBC		0x15	/* Guest TLB control */
> +#define  CSR_GTLBC_RID_SHIFT		16
> +#define  CSR_GTLBC_RID_WIDTH		8
> +#define  CSR_GTLBC_RID			(_ULCAST_(0xff) << CSR_GTLBC_RID_SHIFT)
> +#define  CSR_GTLBC_TOTI_SHIFT		13
> +#define  CSR_GTLBC_TOTI			(_ULCAST_(0x1) << CSR_GTLBC_TOTI_SHIFT)
> +#define  CSR_GTLBC_USERID_SHIFT		12
> +#define  CSR_GTLBC_USERID		(_ULCAST_(0x1) << CSR_GTLBC_USERID_SHIFT)
> +#define  CSR_GTLBC_GMTLBSZ_SHIFT	0
> +#define  CSR_GTLBC_GMTLBSZ_WIDTH	6
> +#define  CSR_GTLBC_GMTLBSZ		(_ULCAST_(0x3f) << CSR_GTLBC_GMTLBSZ_SHIFT)
> +
> +#define LOONGARCH_CSR_TRGP		0x16	/* TLBR read guest info */
> +#define  CSR_TRGP_RID_SHIFT		16
> +#define  CSR_TRGP_RID_WIDTH		8
> +#define  CSR_TRGP_RID			(_ULCAST_(0xff) << CSR_TRGP_RID_SHIFT)
> +#define  CSR_TRGP_GTLB_SHIFT		0
> +#define  CSR_TRGP_GTLB			(1 << CSR_TRGP_GTLB_SHIFT)
> +
> +#define LOONGARCH_CSR_ASID		0x18	/* ASID */
> +#define  CSR_ASID_BIT_SHIFT		16	/* ASIDBits */
> +#define  CSR_ASID_BIT_WIDTH		8
> +#define  CSR_ASID_BIT			(_ULCAST_(0xff) << CSR_ASID_BIT_SHIFT)
> +#define  CSR_ASID_ASID_SHIFT		0
> +#define  CSR_ASID_ASID_WIDTH		10
> +#define  CSR_ASID_ASID			(_ULCAST_(0x3ff) << CSR_ASID_ASID_SHIFT)
> +
> +#define LOONGARCH_CSR_PGDL		0x19	/* Page table base address when VA[47] = 0 */
> +
> +#define LOONGARCH_CSR_PGDH		0x1a	/* Page table base address when VA[47] = 1 */
> +
> +#define LOONGARCH_CSR_PGD		0x1b	/* Page table base */
> +
> +#define LOONGARCH_CSR_PWCTL0		0x1c	/* PWCtl0 */
> +#define  CSR_PWCTL0_PTEW_SHIFT		30
> +#define  CSR_PWCTL0_PTEW_WIDTH		2
> +#define  CSR_PWCTL0_PTEW		(_ULCAST_(0x3) << CSR_PWCTL0_PTEW_SHIFT)
> +#define  CSR_PWCTL0_DIR1WIDTH_SHIFT	25
> +#define  CSR_PWCTL0_DIR1WIDTH_WIDTH	5
> +#define  CSR_PWCTL0_DIR1WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR1WIDTH_SHIFT)
> +#define  CSR_PWCTL0_DIR1BASE_SHIFT	20
> +#define  CSR_PWCTL0_DIR1BASE_WIDTH	5
> +#define  CSR_PWCTL0_DIR1BASE		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR1BASE_SHIFT)
> +#define  CSR_PWCTL0_DIR0WIDTH_SHIFT	15
> +#define  CSR_PWCTL0_DIR0WIDTH_WIDTH	5
> +#define  CSR_PWCTL0_DIR0WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR0WIDTH_SHIFT)
> +#define  CSR_PWCTL0_DIR0BASE_SHIFT	10
> +#define  CSR_PWCTL0_DIR0BASE_WIDTH	5
> +#define  CSR_PWCTL0_DIR0BASE		(_ULCAST_(0x1f) << CSR_PWCTL0_DIR0BASE_SHIFT)
> +#define  CSR_PWCTL0_PTWIDTH_SHIFT	5
> +#define  CSR_PWCTL0_PTWIDTH_WIDTH	5
> +#define  CSR_PWCTL0_PTWIDTH		(_ULCAST_(0x1f) << CSR_PWCTL0_PTWIDTH_SHIFT)
> +#define  CSR_PWCTL0_PTBASE_SHIFT	0
> +#define  CSR_PWCTL0_PTBASE_WIDTH	5
> +#define  CSR_PWCTL0_PTBASE		(_ULCAST_(0x1f) << CSR_PWCTL0_PTBASE_SHIFT)
> +
> +#define LOONGARCH_CSR_PWCTL1		0x1d	/* PWCtl1 */
> +#define  CSR_PWCTL1_DIR3WIDTH_SHIFT	18
> +#define  CSR_PWCTL1_DIR3WIDTH_WIDTH	5
> +#define  CSR_PWCTL1_DIR3WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR3WIDTH_SHIFT)
> +#define  CSR_PWCTL1_DIR3BASE_SHIFT	12
> +#define  CSR_PWCTL1_DIR3BASE_WIDTH	5
> +#define  CSR_PWCTL1_DIR3BASE		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR3BASE_SHIFT)
> +#define  CSR_PWCTL1_DIR2WIDTH_SHIFT	6
> +#define  CSR_PWCTL1_DIR2WIDTH_WIDTH	5
> +#define  CSR_PWCTL1_DIR2WIDTH		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR2WIDTH_SHIFT)
> +#define  CSR_PWCTL1_DIR2BASE_SHIFT	0
> +#define  CSR_PWCTL1_DIR2BASE_WIDTH	5
> +#define  CSR_PWCTL1_DIR2BASE		(_ULCAST_(0x1f) << CSR_PWCTL1_DIR2BASE_SHIFT)
> +
> +#define LOONGARCH_CSR_STLBPGSIZE	0x1e
> +#define  CSR_STLBPGSIZE_PS_WIDTH	6
> +#define  CSR_STLBPGSIZE_PS		(_ULCAST_(0x3f))
> +
> +#define LOONGARCH_CSR_RVACFG		0x1f
> +#define  CSR_RVACFG_RDVA_WIDTH		4
> +#define  CSR_RVACFG_RDVA		(_ULCAST_(0xf))
> +
> +/* Config CSR registers */
> +#define LOONGARCH_CSR_CPUID		0x20	/* CPU core id */
> +#define  CSR_CPUID_COREID_WIDTH		9
> +#define  CSR_CPUID_COREID		_ULCAST_(0x1ff)
> +
> +#define LOONGARCH_CSR_PRCFG1		0x21	/* Config1 */
> +#define  CSR_CONF1_VSMAX_SHIFT		12
> +#define  CSR_CONF1_VSMAX_WIDTH		3
> +#define  CSR_CONF1_VSMAX		(_ULCAST_(7) << CSR_CONF1_VSMAX_SHIFT)
> +#define  CSR_CONF1_TMRBITS_SHIFT	4
> +#define  CSR_CONF1_TMRBITS_WIDTH	8
> +#define  CSR_CONF1_TMRBITS		(_ULCAST_(0xff) << CSR_CONF1_TMRBITS_SHIFT)
> +#define  CSR_CONF1_KSNUM_WIDTH		4
> +#define  CSR_CONF1_KSNUM		_ULCAST_(0xf)
> +
> +#define LOONGARCH_CSR_PRCFG2		0x22	/* Config2 */
> +#define  CSR_CONF2_PGMASK_SUPP		0x3ffff000
> +
> +#define LOONGARCH_CSR_PRCFG3		0x23	/* Config3 */
> +#define  CSR_CONF3_STLBIDX_SHIFT	20
> +#define  CSR_CONF3_STLBIDX_WIDTH	6
> +#define  CSR_CONF3_STLBIDX		(_ULCAST_(0x3f) << CSR_CONF3_STLBIDX_SHIFT)
> +#define  CSR_CONF3_STLBWAYS_SHIFT	12
> +#define  CSR_CONF3_STLBWAYS_WIDTH	8
> +#define  CSR_CONF3_STLBWAYS		(_ULCAST_(0xff) << CSR_CONF3_STLBWAYS_SHIFT)
> +#define  CSR_CONF3_MTLBSIZE_SHIFT	4
> +#define  CSR_CONF3_MTLBSIZE_WIDTH	8
> +#define  CSR_CONF3_MTLBSIZE		(_ULCAST_(0xff) << CSR_CONF3_MTLBSIZE_SHIFT)
> +#define  CSR_CONF3_TLBTYPE_SHIFT	0
> +#define  CSR_CONF3_TLBTYPE_WIDTH	4
> +#define  CSR_CONF3_TLBTYPE		(_ULCAST_(0xf) << CSR_CONF3_TLBTYPE_SHIFT)
> +
> +/* Kscratch registers */
> +#define LOONGARCH_CSR_KS0		0x30
> +#define LOONGARCH_CSR_KS1		0x31
> +#define LOONGARCH_CSR_KS2		0x32
> +#define LOONGARCH_CSR_KS3		0x33
> +#define LOONGARCH_CSR_KS4		0x34
> +#define LOONGARCH_CSR_KS5		0x35
> +#define LOONGARCH_CSR_KS6		0x36
> +#define LOONGARCH_CSR_KS7		0x37
> +#define LOONGARCH_CSR_KS8		0x38
> +
> +/* Exception allocated KS0, KS1 and KS2 statically */
> +#define EXCEPTION_KS0			LOONGARCH_CSR_KS0
> +#define EXCEPTION_KS1			LOONGARCH_CSR_KS1
> +#define EXCEPTION_KS2			LOONGARCH_CSR_KS2
> +#define EXC_KSCRATCH_MASK		(1 << 0 | 1 << 1 | 1 << 2)
> +
> +/* Percpu-data base allocated KS3 statically */
> +#define PERCPU_BASE_KS			LOONGARCH_CSR_KS3
> +#define PERCPU_KSCRATCH_MASK		(1 << 3)
> +
> +/* KVM allocated KS4 and KS5 statically */
> +#define KVM_VCPU_KS			LOONGARCH_CSR_KS4
> +#define KVM_TEMP_KS			LOONGARCH_CSR_KS5
> +#define KVM_KSCRATCH_MASK		(1 << 4 | 1 << 5)
> +
> +/* Timer registers */
> +#define LOONGARCH_CSR_TMID		0x40	/* Timer ID */
> +
> +#define LOONGARCH_CSR_TCFG		0x41	/* Timer config */
> +#define  CSR_TCFG_VAL_SHIFT		2
> +#define	 CSR_TCFG_VAL_WIDTH		48
> +#define  CSR_TCFG_VAL			(_ULCAST_(0x3fffffffffff) << CSR_TCFG_VAL_SHIFT)
> +#define  CSR_TCFG_PERIOD_SHIFT		1
> +#define  CSR_TCFG_PERIOD		(_ULCAST_(0x1) << CSR_TCFG_PERIOD_SHIFT)
> +#define  CSR_TCFG_EN			(_ULCAST_(0x1))
> +
> +#define LOONGARCH_CSR_TVAL		0x42	/* Timer value */
> +
> +#define LOONGARCH_CSR_CNTC		0x43	/* Timer offset */
> +
> +#define LOONGARCH_CSR_TINTCLR		0x44	/* Timer interrupt clear */
> +#define  CSR_TINTCLR_TI_SHIFT		0
> +#define  CSR_TINTCLR_TI			(1 << CSR_TINTCLR_TI_SHIFT)
> +
> +/* Guest registers */
> +#define LOONGARCH_CSR_GSTAT		0x50	/* Guest status */
> +#define  CSR_GSTAT_GID_SHIFT		16
> +#define  CSR_GSTAT_GID_WIDTH		8
> +#define  CSR_GSTAT_GID			(_ULCAST_(0xff) << CSR_GSTAT_GID_SHIFT)
> +#define  CSR_GSTAT_GIDBIT_SHIFT		4
> +#define  CSR_GSTAT_GIDBIT_WIDTH		6
> +#define  CSR_GSTAT_GIDBIT		(_ULCAST_(0x3f) << CSR_GSTAT_GIDBIT_SHIFT)
> +#define  CSR_GSTAT_PVM_SHIFT		1
> +#define  CSR_GSTAT_PVM			(_ULCAST_(0x1) << CSR_GSTAT_PVM_SHIFT)
> +#define  CSR_GSTAT_VM_SHIFT		0
> +#define  CSR_GSTAT_VM			(_ULCAST_(0x1) << CSR_GSTAT_VM_SHIFT)
> +
> +#define LOONGARCH_CSR_GCFG		0x51	/* Guest config */
> +#define  CSR_GCFG_GPERF_SHIFT		24
> +#define  CSR_GCFG_GPERF_WIDTH		3
> +#define  CSR_GCFG_GPERF			(_ULCAST_(0x7) << CSR_GCFG_GPERF_SHIFT)
> +#define  CSR_GCFG_GCI_SHIFT		20
> +#define  CSR_GCFG_GCI_WIDTH		2
> +#define  CSR_GCFG_GCI			(_ULCAST_(0x3) << CSR_GCFG_GCI_SHIFT)
> +#define  CSR_GCFG_GCI_ALL		(_ULCAST_(0x0) << CSR_GCFG_GCI_SHIFT)
> +#define  CSR_GCFG_GCI_HIT		(_ULCAST_(0x1) << CSR_GCFG_GCI_SHIFT)
> +#define  CSR_GCFG_GCI_SECURE		(_ULCAST_(0x2) << CSR_GCFG_GCI_SHIFT)
> +#define  CSR_GCFG_GCIP_SHIFT		16
> +#define  CSR_GCFG_GCIP			(_ULCAST_(0xf) << CSR_GCFG_GCIP_SHIFT)
> +#define  CSR_GCFG_GCIP_ALL		(_ULCAST_(0x1) << CSR_GCFG_GCIP_SHIFT)
> +#define  CSR_GCFG_GCIP_HIT		(_ULCAST_(0x1) << (CSR_GCFG_GCIP_SHIFT + 1))
> +#define  CSR_GCFG_GCIP_SECURE		(_ULCAST_(0x1) << (CSR_GCFG_GCIP_SHIFT + 2))
> +#define  CSR_GCFG_TORU_SHIFT		15
> +#define  CSR_GCFG_TORU			(_ULCAST_(0x1) << CSR_GCFG_TORU_SHIFT)
> +#define  CSR_GCFG_TORUP_SHIFT		14
> +#define  CSR_GCFG_TORUP			(_ULCAST_(0x1) << CSR_GCFG_TORUP_SHIFT)
> +#define  CSR_GCFG_TOP_SHIFT		13
> +#define  CSR_GCFG_TOP			(_ULCAST_(0x1) << CSR_GCFG_TOP_SHIFT)
> +#define  CSR_GCFG_TOPP_SHIFT		12
> +#define  CSR_GCFG_TOPP			(_ULCAST_(0x1) << CSR_GCFG_TOPP_SHIFT)
> +#define  CSR_GCFG_TOE_SHIFT		11
> +#define  CSR_GCFG_TOE			(_ULCAST_(0x1) << CSR_GCFG_TOE_SHIFT)
> +#define  CSR_GCFG_TOEP_SHIFT		10
> +#define  CSR_GCFG_TOEP			(_ULCAST_(0x1) << CSR_GCFG_TOEP_SHIFT)
> +#define  CSR_GCFG_TIT_SHIFT		9
> +#define  CSR_GCFG_TIT			(_ULCAST_(0x1) << CSR_GCFG_TIT_SHIFT)
> +#define  CSR_GCFG_TITP_SHIFT		8
> +#define  CSR_GCFG_TITP			(_ULCAST_(0x1) << CSR_GCFG_TITP_SHIFT)
> +#define  CSR_GCFG_SIT_SHIFT		7
> +#define  CSR_GCFG_SIT			(_ULCAST_(0x1) << CSR_GCFG_SIT_SHIFT)
> +#define  CSR_GCFG_SITP_SHIFT		6
> +#define  CSR_GCFG_SITP			(_ULCAST_(0x1) << CSR_GCFG_SITP_SHIFT)
> +#define  CSR_GCFG_MATC_SHITF		4
> +#define  CSR_GCFG_MATC_WIDTH		2
> +#define  CSR_GCFG_MATC_MASK		(_ULCAST_(0x3) << CSR_GCFG_MATC_SHITF)
> +#define  CSR_GCFG_MATC_GUEST		(_ULCAST_(0x0) << CSR_GCFG_MATC_SHITF)
> +#define  CSR_GCFG_MATC_ROOT		(_ULCAST_(0x1) << CSR_GCFG_MATC_SHITF)
> +#define  CSR_GCFG_MATC_NEST		(_ULCAST_(0x2) << CSR_GCFG_MATC_SHITF)
> +
> +#define LOONGARCH_CSR_GINTC		0x52	/* Guest interrupt control */
> +#define  CSR_GINTC_HC_SHIFT		16
> +#define  CSR_GINTC_HC_WIDTH		8
> +#define  CSR_GINTC_HC			(_ULCAST_(0xff) << CSR_GINTC_HC_SHIFT)
> +#define  CSR_GINTC_PIP_SHIFT		8
> +#define  CSR_GINTC_PIP_WIDTH		8
> +#define  CSR_GINTC_PIP			(_ULCAST_(0xff) << CSR_GINTC_PIP_SHIFT)
> +#define  CSR_GINTC_VIP_SHIFT		0
> +#define  CSR_GINTC_VIP_WIDTH		8
> +#define  CSR_GINTC_VIP			(_ULCAST_(0xff))
> +
> +#define LOONGARCH_CSR_GCNTC		0x53	/* Guest timer offset */
> +
> +/* LLBCTL register */
> +#define LOONGARCH_CSR_LLBCTL		0x60	/* LLBit control */
> +#define  CSR_LLBCTL_ROLLB_SHIFT		0
> +#define  CSR_LLBCTL_ROLLB		(_ULCAST_(1) << CSR_LLBCTL_ROLLB_SHIFT)
> +#define  CSR_LLBCTL_WCLLB_SHIFT		1
> +#define  CSR_LLBCTL_WCLLB		(_ULCAST_(1) << CSR_LLBCTL_WCLLB_SHIFT)
> +#define  CSR_LLBCTL_KLO_SHIFT		2
> +#define  CSR_LLBCTL_KLO			(_ULCAST_(1) << CSR_LLBCTL_KLO_SHIFT)
> +
> +/* Implement dependent */
> +#define LOONGARCH_CSR_IMPCTL1		0x80	/* Loongson config1 */
> +#define  CSR_MISPEC_SHIFT		20
> +#define  CSR_MISPEC_WIDTH		8
> +#define  CSR_MISPEC			(_ULCAST_(0xff) << CSR_MISPEC_SHIFT)
> +#define  CSR_SSEN_SHIFT			18
> +#define  CSR_SSEN			(_ULCAST_(1) << CSR_SSEN_SHIFT)
> +#define  CSR_SCRAND_SHIFT		17
> +#define  CSR_SCRAND			(_ULCAST_(1) << CSR_SCRAND_SHIFT)
> +#define  CSR_LLEXCL_SHIFT		16
> +#define  CSR_LLEXCL			(_ULCAST_(1) << CSR_LLEXCL_SHIFT)
> +#define  CSR_DISVC_SHIFT		15
> +#define  CSR_DISVC			(_ULCAST_(1) << CSR_DISVC_SHIFT)
> +#define  CSR_VCLRU_SHIFT		14
> +#define  CSR_VCLRU			(_ULCAST_(1) << CSR_VCLRU_SHIFT)
> +#define  CSR_DCLRU_SHIFT		13
> +#define  CSR_DCLRU			(_ULCAST_(1) << CSR_DCLRU_SHIFT)
> +#define  CSR_FASTLDQ_SHIFT		12
> +#define  CSR_FASTLDQ			(_ULCAST_(1) << CSR_FASTLDQ_SHIFT)
> +#define  CSR_USERCAC_SHIFT		11
> +#define  CSR_USERCAC			(_ULCAST_(1) << CSR_USERCAC_SHIFT)
> +#define  CSR_ANTI_MISPEC_SHIFT		10
> +#define  CSR_ANTI_MISPEC		(_ULCAST_(1) << CSR_ANTI_MISPEC_SHIFT)
> +#define  CSR_AUTO_FLUSHSFB_SHIFT	9
> +#define  CSR_AUTO_FLUSHSFB		(_ULCAST_(1) << CSR_AUTO_FLUSHSFB_SHIFT)
> +#define  CSR_STFILL_SHIFT		8
> +#define  CSR_STFILL			(_ULCAST_(1) << CSR_STFILL_SHIFT)
> +#define  CSR_LIFEP_SHIFT		7
> +#define  CSR_LIFEP			(_ULCAST_(1) << CSR_LIFEP_SHIFT)
> +#define  CSR_LLSYNC_SHIFT		6
> +#define  CSR_LLSYNC			(_ULCAST_(1) << CSR_LLSYNC_SHIFT)
> +#define  CSR_BRBTDIS_SHIFT		5
> +#define  CSR_BRBTDIS			(_ULCAST_(1) << CSR_BRBTDIS_SHIFT)
> +#define  CSR_RASDIS_SHIFT		4
> +#define  CSR_RASDIS			(_ULCAST_(1) << CSR_RASDIS_SHIFT)
> +#define  CSR_STPRE_SHIFT		2
> +#define  CSR_STPRE_WIDTH		2
> +#define  CSR_STPRE			(_ULCAST_(3) << CSR_STPRE_SHIFT)
> +#define  CSR_INSTPRE_SHIFT		1
> +#define  CSR_INSTPRE			(_ULCAST_(1) << CSR_INSTPRE_SHIFT)
> +#define  CSR_DATAPRE_SHIFT		0
> +#define  CSR_DATAPRE			(_ULCAST_(1) << CSR_DATAPRE_SHIFT)
> +
> +#define LOONGARCH_CSR_IMPCTL2		0x81	/* Loongson config2 */
> +#define  CSR_FLUSH_MTLB_SHIFT		0
> +#define  CSR_FLUSH_MTLB			(_ULCAST_(1) << CSR_FLUSH_MTLB_SHIFT)
> +#define  CSR_FLUSH_STLB_SHIFT		1
> +#define  CSR_FLUSH_STLB			(_ULCAST_(1) << CSR_FLUSH_STLB_SHIFT)
> +#define  CSR_FLUSH_DTLB_SHIFT		2
> +#define  CSR_FLUSH_DTLB			(_ULCAST_(1) << CSR_FLUSH_DTLB_SHIFT)
> +#define  CSR_FLUSH_ITLB_SHIFT		3
> +#define  CSR_FLUSH_ITLB			(_ULCAST_(1) << CSR_FLUSH_ITLB_SHIFT)
> +#define  CSR_FLUSH_BTAC_SHIFT		4
> +#define  CSR_FLUSH_BTAC			(_ULCAST_(1) << CSR_FLUSH_BTAC_SHIFT)
> +
> +#define LOONGARCH_CSR_GNMI		0x82
> +
> +/* TLB Refill registers */
> +#define LOONGARCH_CSR_TLBRENTRY		0x88	/* TLB refill exception entry */
> +#define LOONGARCH_CSR_TLBRBADV		0x89	/* TLB refill badvaddr */
> +#define LOONGARCH_CSR_TLBRERA		0x8a	/* TLB refill ERA */
> +#define LOONGARCH_CSR_TLBRSAVE		0x8b	/* KScratch for TLB refill exception */
> +#define LOONGARCH_CSR_TLBRELO0		0x8c	/* TLB refill entrylo0 */
> +#define LOONGARCH_CSR_TLBRELO1		0x8d	/* TLB refill entrylo1 */
> +#define LOONGARCH_CSR_TLBREHI		0x8e	/* TLB refill entryhi */
> +#define  CSR_TLBREHI_PS_SHIFT		0
> +#define  CSR_TLBREHI_PS			(_ULCAST_(0x3f) << CSR_TLBREHI_PS_SHIFT)
> +#define LOONGARCH_CSR_TLBRPRMD		0x8f	/* TLB refill mode info */
> +
> +/* Machine Error registers */
> +#define LOONGARCH_CSR_MERRCTL		0x90	/* MERRCTL */
> +#define LOONGARCH_CSR_MERRINFO1		0x91	/* MError info1 */
> +#define LOONGARCH_CSR_MERRINFO2		0x92	/* MError info2 */
> +#define LOONGARCH_CSR_MERRENTRY		0x93	/* MError exception entry */
> +#define LOONGARCH_CSR_MERRERA		0x94	/* MError exception ERA */
> +#define LOONGARCH_CSR_MERRSAVE		0x95	/* KScratch for machine error exception */
> +
> +#define LOONGARCH_CSR_CTAG		0x98	/* TagLo + TagHi */
> +
> +#define LOONGARCH_CSR_PRID		0xc0
> +
> +/* Shadow MCSR : 0xc0 ~ 0xff */
> +#define LOONGARCH_CSR_MCSR0		0xc0	/* CPUCFG0 and CPUCFG1 */
> +#define  MCSR0_INT_IMPL_SHIFT		58
> +#define  MCSR0_INT_IMPL			0
> +#define  MCSR0_IOCSR_BRD_SHIFT		57
> +#define  MCSR0_IOCSR_BRD		(_ULCAST_(1) << MCSR0_IOCSR_BRD_SHIFT)
> +#define  MCSR0_HUGEPG_SHIFT		56
> +#define  MCSR0_HUGEPG			(_ULCAST_(1) << MCSR0_HUGEPG_SHIFT)
> +#define  MCSR0_RPLMTLB_SHIFT		55
> +#define  MCSR0_RPLMTLB			(_ULCAST_(1) << MCSR0_RPLMTLB_SHIFT)
> +#define  MCSR0_EP_SHIFT			54
> +#define  MCSR0_EP			(_ULCAST_(1) << MCSR0_EP_SHIFT)
> +#define  MCSR0_RI_SHIFT			53
> +#define  MCSR0_RI			(_ULCAST_(1) << MCSR0_RI_SHIFT)
> +#define  MCSR0_UAL_SHIFT		52
> +#define  MCSR0_UAL			(_ULCAST_(1) << MCSR0_UAL_SHIFT)
> +#define  MCSR0_VABIT_SHIFT		44
> +#define  MCSR0_VABIT_WIDTH		8
> +#define  MCSR0_VABIT			(_ULCAST_(0xff) << MCSR0_VABIT_SHIFT)
> +#define  VABIT_DEFAULT			0x2f
> +#define  MCSR0_PABIT_SHIFT		36
> +#define  MCSR0_PABIT_WIDTH		8
> +#define  MCSR0_PABIT			(_ULCAST_(0xff) << MCSR0_PABIT_SHIFT)
> +#define  PABIT_DEFAULT			0x2f
> +#define  MCSR0_IOCSR_SHIFT		35
> +#define  MCSR0_IOCSR			(_ULCAST_(1) << MCSR0_IOCSR_SHIFT)
> +#define  MCSR0_PAGING_SHIFT		34
> +#define  MCSR0_PAGING			(_ULCAST_(1) << MCSR0_PAGING_SHIFT)
> +#define  MCSR0_GR64_SHIFT		33
> +#define  MCSR0_GR64			(_ULCAST_(1) << MCSR0_GR64_SHIFT)
> +#define  GR64_DEFAULT			1
> +#define  MCSR0_GR32_SHIFT		32
> +#define  MCSR0_GR32			(_ULCAST_(1) << MCSR0_GR32_SHIFT)
> +#define  GR32_DEFAULT			0
> +#define  MCSR0_PRID_WIDTH		32
> +#define  MCSR0_PRID			0x14C010
> +
> +#define LOONGARCH_CSR_MCSR1		0xc1	/* CPUCFG2 and CPUCFG3 */
> +#define  MCSR1_HPFOLD_SHIFT		43
> +#define  MCSR1_HPFOLD			(_ULCAST_(1) << MCSR1_HPFOLD_SHIFT)
> +#define  MCSR1_SPW_LVL_SHIFT		40
> +#define  MCSR1_SPW_LVL_WIDTH		3
> +#define  MCSR1_SPW_LVL			(_ULCAST_(7) << MCSR1_SPW_LVL_SHIFT)
> +#define  MCSR1_ICACHET_SHIFT		39
> +#define  MCSR1_ICACHET			(_ULCAST_(1) << MCSR1_ICACHET_SHIFT)
> +#define  MCSR1_ITLBT_SHIFT		38
> +#define  MCSR1_ITLBT			(_ULCAST_(1) << MCSR1_ITLBT_SHIFT)
> +#define  MCSR1_LLDBAR_SHIFT		37
> +#define  MCSR1_LLDBAR			(_ULCAST_(1) << MCSR1_LLDBAR_SHIFT)
> +#define  MCSR1_SCDLY_SHIFT		36
> +#define  MCSR1_SCDLY			(_ULCAST_(1) << MCSR1_SCDLY_SHIFT)
> +#define  MCSR1_LLEXC_SHIFT		35
> +#define  MCSR1_LLEXC			(_ULCAST_(1) << MCSR1_LLEXC_SHIFT)
> +#define  MCSR1_UCACC_SHIFT		34
> +#define  MCSR1_UCACC			(_ULCAST_(1) << MCSR1_UCACC_SHIFT)
> +#define  MCSR1_SFB_SHIFT		33
> +#define  MCSR1_SFB			(_ULCAST_(1) << MCSR1_SFB_SHIFT)
> +#define  MCSR1_CCDMA_SHIFT		32
> +#define  MCSR1_CCDMA			(_ULCAST_(1) << MCSR1_CCDMA_SHIFT)
> +#define  MCSR1_LAMO_SHIFT		22
> +#define  MCSR1_LAMO			(_ULCAST_(1) << MCSR1_LAMO_SHIFT)
> +#define  MCSR1_LSPW_SHIFT		21
> +#define  MCSR1_LSPW			(_ULCAST_(1) << MCSR1_LSPW_SHIFT)
> +#define  MCSR1_MIPSBT_SHIFT		20
> +#define  MCSR1_MIPSBT			(_ULCAST_(1) << MCSR1_MIPSBT_SHIFT)
> +#define  MCSR1_ARMBT_SHIFT		19
> +#define  MCSR1_ARMBT			(_ULCAST_(1) << MCSR1_ARMBT_SHIFT)
> +#define  MCSR1_X86BT_SHIFT		18
> +#define  MCSR1_X86BT			(_ULCAST_(1) << MCSR1_X86BT_SHIFT)
> +#define  MCSR1_LLFTPVERS_SHIFT		15
> +#define  MCSR1_LLFTPVERS_WIDTH		3
> +#define  MCSR1_LLFTPVERS		(_ULCAST_(7) << MCSR1_LLFTPVERS_SHIFT)
> +#define  MCSR1_LLFTP_SHIFT		14
> +#define  MCSR1_LLFTP			(_ULCAST_(1) << MCSR1_LLFTP_SHIFT)
> +#define  MCSR1_VZVERS_SHIFT		11
> +#define  MCSR1_VZVERS_WIDTH		3
> +#define  MCSR1_VZVERS			(_ULCAST_(7) << MCSR1_VZVERS_SHIFT)
> +#define  MCSR1_VZ_SHIFT			10
> +#define  MCSR1_VZ			(_ULCAST_(1) << MCSR1_VZ_SHIFT)
> +#define  MCSR1_CRYPTO_SHIFT		9
> +#define  MCSR1_CRYPTO			(_ULCAST_(1) << MCSR1_CRYPTO_SHIFT)
> +#define  MCSR1_COMPLEX_SHIFT		8
> +#define  MCSR1_COMPLEX			(_ULCAST_(1) << MCSR1_COMPLEX_SHIFT)
> +#define  MCSR1_LASX_SHIFT		7
> +#define  MCSR1_LASX			(_ULCAST_(1) << MCSR1_LASX_SHIFT)
> +#define  MCSR1_LSX_SHIFT		6
> +#define  MCSR1_LSX			(_ULCAST_(1) << MCSR1_LSX_SHIFT)
> +#define  MCSR1_FPVERS_SHIFT		3
> +#define  MCSR1_FPVERS_WIDTH		3
> +#define  MCSR1_FPVERS			(_ULCAST_(7) << MCSR1_FPVERS_SHIFT)
> +#define  MCSR1_FPDP_SHIFT		2
> +#define  MCSR1_FPDP			(_ULCAST_(1) << MCSR1_FPDP_SHIFT)
> +#define  MCSR1_FPSP_SHIFT		1
> +#define  MCSR1_FPSP			(_ULCAST_(1) << MCSR1_FPSP_SHIFT)
> +#define  MCSR1_FP_SHIFT			0
> +#define  MCSR1_FP			(_ULCAST_(1) << MCSR1_FP_SHIFT)
> +
> +#define LOONGARCH_CSR_MCSR2		0xc2	/* CPUCFG4 and CPUCFG5 */
> +#define  MCSR2_CCDIV_SHIFT		48
> +#define  MCSR2_CCDIV_WIDTH		16
> +#define  MCSR2_CCDIV			(_ULCAST_(0xffff) << MCSR2_CCDIV_SHIFT)
> +#define  MCSR2_CCMUL_SHIFT		32
> +#define  MCSR2_CCMUL_WIDTH		16
> +#define  MCSR2_CCMUL			(_ULCAST_(0xffff) << MCSR2_CCMUL_SHIFT)
> +#define  MCSR2_CCFREQ_WIDTH		32
> +#define  MCSR2_CCFREQ			(_ULCAST_(0xffffffff))
> +#define  CCFREQ_DEFAULT			0x5f5e100	/* 100MHz */
> +
> +#define LOONGARCH_CSR_MCSR3		0xc3	/* CPUCFG6 */
> +#define  MCSR3_UPM_SHIFT		14
> +#define  MCSR3_UPM			(_ULCAST_(1) << MCSR3_UPM_SHIFT)
> +#define  MCSR3_PMBITS_SHIFT		8
> +#define  MCSR3_PMBITS_WIDTH		6
> +#define  MCSR3_PMBITS			(_ULCAST_(0x3f) << MCSR3_PMBITS_SHIFT)
> +#define  PMBITS_DEFAULT			0x40
> +#define  MCSR3_PMNUM_SHIFT		4
> +#define  MCSR3_PMNUM_WIDTH		4
> +#define  MCSR3_PMNUM			(_ULCAST_(0xf) << MCSR3_PMNUM_SHIFT)
> +#define  MCSR3_PAMVER_SHIFT		1
> +#define  MCSR3_PAMVER_WIDTH		3
> +#define  MCSR3_PAMVER			(_ULCAST_(0x7) << MCSR3_PAMVER_SHIFT)
> +#define  MCSR3_PMP_SHIFT		0
> +#define  MCSR3_PMP			(_ULCAST_(1) << MCSR3_PMP_SHIFT)
> +
> +#define LOONGARCH_CSR_MCSR8		0xc8	/* CPUCFG16 and CPUCFG17 */
> +#define  MCSR8_L1I_SIZE_SHIFT		56
> +#define  MCSR8_L1I_SIZE_WIDTH		7
> +#define  MCSR8_L1I_SIZE			(_ULCAST_(0x7f) << MCSR8_L1I_SIZE_SHIFT)
> +#define  MCSR8_L1I_IDX_SHIFT		48
> +#define  MCSR8_L1I_IDX_WIDTH		8
> +#define  MCSR8_L1I_IDX			(_ULCAST_(0xff) << MCSR8_L1I_IDX_SHIFT)
> +#define  MCSR8_L1I_WAY_SHIFT		32
> +#define  MCSR8_L1I_WAY_WIDTH		16
> +#define  MCSR8_L1I_WAY			(_ULCAST_(0xffff) << MCSR8_L1I_WAY_SHIFT)
> +#define  MCSR8_L3DINCL_SHIFT		16
> +#define  MCSR8_L3DINCL			(_ULCAST_(1) << MCSR8_L3DINCL_SHIFT)
> +#define  MCSR8_L3DPRIV_SHIFT		15
> +#define  MCSR8_L3DPRIV			(_ULCAST_(1) << MCSR8_L3DPRIV_SHIFT)
> +#define  MCSR8_L3DPRE_SHIFT		14
> +#define  MCSR8_L3DPRE			(_ULCAST_(1) << MCSR8_L3DPRE_SHIFT)
> +#define  MCSR8_L3IUINCL_SHIFT		13
> +#define  MCSR8_L3IUINCL			(_ULCAST_(1) << MCSR8_L3IUINCL_SHIFT)
> +#define  MCSR8_L3IUPRIV_SHIFT		12
> +#define  MCSR8_L3IUPRIV			(_ULCAST_(1) << MCSR8_L3IUPRIV_SHIFT)
> +#define  MCSR8_L3IUUNIFY_SHIFT		11
> +#define  MCSR8_L3IUUNIFY		(_ULCAST_(1) << MCSR8_L3IUUNIFY_SHIFT)
> +#define  MCSR8_L3IUPRE_SHIFT		10
> +#define  MCSR8_L3IUPRE			(_ULCAST_(1) << MCSR8_L3IUPRE_SHIFT)
> +#define  MCSR8_L2DINCL_SHIFT		9
> +#define  MCSR8_L2DINCL			(_ULCAST_(1) << MCSR8_L2DINCL_SHIFT)
> +#define  MCSR8_L2DPRIV_SHIFT		8
> +#define  MCSR8_L2DPRIV			(_ULCAST_(1) << MCSR8_L2DPRIV_SHIFT)
> +#define  MCSR8_L2DPRE_SHIFT		7
> +#define  MCSR8_L2DPRE			(_ULCAST_(1) << MCSR8_L2DPRE_SHIFT)
> +#define  MCSR8_L2IUINCL_SHIFT		6
> +#define  MCSR8_L2IUINCL			(_ULCAST_(1) << MCSR8_L2IUINCL_SHIFT)
> +#define  MCSR8_L2IUPRIV_SHIFT		5
> +#define  MCSR8_L2IUPRIV			(_ULCAST_(1) << MCSR8_L2IUPRIV_SHIFT)
> +#define  MCSR8_L2IUUNIFY_SHIFT		4
> +#define  MCSR8_L2IUUNIFY		(_ULCAST_(1) << MCSR8_L2IUUNIFY_SHIFT)
> +#define  MCSR8_L2IUPRE_SHIFT		3
> +#define  MCSR8_L2IUPRE			(_ULCAST_(1) << MCSR8_L2IUPRE_SHIFT)
> +#define  MCSR8_L1DPRE_SHIFT		2
> +#define  MCSR8_L1DPRE			(_ULCAST_(1) << MCSR8_L1DPRE_SHIFT)
> +#define  MCSR8_L1IUUNIFY_SHIFT		1
> +#define  MCSR8_L1IUUNIFY		(_ULCAST_(1) << MCSR8_L1IUUNIFY_SHIFT)
> +#define  MCSR8_L1IUPRE_SHIFT		0
> +#define  MCSR8_L1IUPRE			(_ULCAST_(1) << MCSR8_L1IUPRE_SHIFT)
> +
> +#define LOONGARCH_CSR_MCSR9		0xc9	/* CPUCFG18 and CPUCFG19 */
> +#define  MCSR9_L2U_SIZE_SHIFT		56
> +#define  MCSR9_L2U_SIZE_WIDTH		7
> +#define  MCSR9_L2U_SIZE			(_ULCAST_(0x7f) << MCSR9_L2U_SIZE_SHIFT)
> +#define  MCSR9_L2U_IDX_SHIFT		48
> +#define  MCSR9_L2U_IDX_WIDTH		8
> +#define  MCSR9_L2U_IDX			(_ULCAST_(0xff) << MCSR9_L2U_IDX_SHIFT)
> +#define  MCSR9_L2U_WAY_SHIFT		32
> +#define  MCSR9_L2U_WAY_WIDTH		16
> +#define  MCSR9_L2U_WAY			(_ULCAST_(0xffff) << MCSR9_L2U_WAY_SHIFT)
> +#define  MCSR9_L1D_SIZE_SHIFT		24
> +#define  MCSR9_L1D_SIZE_WIDTH		7
> +#define  MCSR9_L1D_SIZE			(_ULCAST_(0x7f) << MCSR9_L1D_SIZE_SHIFT)
> +#define  MCSR9_L1D_IDX_SHIFT		16
> +#define  MCSR9_L1D_IDX_WIDTH		8
> +#define  MCSR9_L1D_IDX			(_ULCAST_(0xff) << MCSR9_L1D_IDX_SHIFT)
> +#define  MCSR9_L1D_WAY_SHIFT		0
> +#define  MCSR9_L1D_WAY_WIDTH		16
> +#define  MCSR9_L1D_WAY			(_ULCAST_(0xffff) << MCSR9_L1D_WAY_SHIFT)
> +
> +#define LOONGARCH_CSR_MCSR10		0xca	/* CPUCFG20 */
> +#define  MCSR10_L3U_SIZE_SHIFT		24
> +#define  MCSR10_L3U_SIZE_WIDTH		7
> +#define  MCSR10_L3U_SIZE		(_ULCAST_(0x7f) << MCSR10_L3U_SIZE_SHIFT)
> +#define  MCSR10_L3U_IDX_SHIFT		16
> +#define  MCSR10_L3U_IDX_WIDTH		8
> +#define  MCSR10_L3U_IDX			(_ULCAST_(0xff) << MCSR10_L3U_IDX_SHIFT)
> +#define  MCSR10_L3U_WAY_SHIFT		0
> +#define  MCSR10_L3U_WAY_WIDTH		16
> +#define  MCSR10_L3U_WAY			(_ULCAST_(0xffff) << MCSR10_L3U_WAY_SHIFT)
> +
> +#define LOONGARCH_CSR_MCSR24		0xf0	/* CPUCFG48 */
> +#define  MCSR24_RAMCG_SHIFT		3
> +#define  MCSR24_RAMCG			(_ULCAST_(1) << MCSR24_RAMCG_SHIFT)
> +#define  MCSR24_VFPUCG_SHIFT		2
> +#define  MCSR24_VFPUCG			(_ULCAST_(1) << MCSR24_VFPUCG_SHIFT)
> +#define  MCSR24_NAPEN_SHIFT		1
> +#define  MCSR24_NAPEN			(_ULCAST_(1) << MCSR24_NAPEN_SHIFT)
> +#define  MCSR24_MCSRLOCK_SHIFT		0
> +#define  MCSR24_MCSRLOCK		(_ULCAST_(1) << MCSR24_MCSRLOCK_SHIFT)
> +
> +/* Uncached accelerate windows registers */
> +#define LOONGARCH_CSR_UCAWIN		0x100
> +#define LOONGARCH_CSR_UCAWIN0_LO	0x102
> +#define LOONGARCH_CSR_UCAWIN0_HI	0x103
> +#define LOONGARCH_CSR_UCAWIN1_LO	0x104
> +#define LOONGARCH_CSR_UCAWIN1_HI	0x105
> +#define LOONGARCH_CSR_UCAWIN2_LO	0x106
> +#define LOONGARCH_CSR_UCAWIN2_HI	0x107
> +#define LOONGARCH_CSR_UCAWIN3_LO	0x108
> +#define LOONGARCH_CSR_UCAWIN3_HI	0x109
> +
> +/* Direct Map windows registers */
> +#define LOONGARCH_CSR_DMWIN0		0x180	/* 64 direct map win0: MEM & IF */
> +#define LOONGARCH_CSR_DMWIN1		0x181	/* 64 direct map win1: MEM & IF */
> +#define LOONGARCH_CSR_DMWIN2		0x182	/* 64 direct map win2: MEM */
> +#define LOONGARCH_CSR_DMWIN3		0x183	/* 64 direct map win3: MEM */
> +
> +/* Direct Map window 0/1 */
> +#define CSR_DMW0_PLV0		_CONST64_(1 << 0)
> +#define CSR_DMW0_VSEG		_CONST64_(0x8000)
> +#define CSR_DMW0_BASE		(CSR_DMW0_VSEG << DMW_PABITS)
> +#define CSR_DMW0_INIT		(CSR_DMW0_BASE | CSR_DMW0_PLV0)
> +
> +#define CSR_DMW1_PLV0		_CONST64_(1 << 0)
> +#define CSR_DMW1_MAT		_CONST64_(1 << 4)
> +#define CSR_DMW1_VSEG		_CONST64_(0x9000)
> +#define CSR_DMW1_BASE		(CSR_DMW1_VSEG << DMW_PABITS)
> +#define CSR_DMW1_INIT		(CSR_DMW1_BASE | CSR_DMW1_MAT | CSR_DMW1_PLV0)
> +
> +/* Performance Counter registers */
> +#define LOONGARCH_CSR_PERFCTRL0		0x200	/* 32 perf event 0 config */
> +#define LOONGARCH_CSR_PERFCNTR0		0x201	/* 64 perf event 0 count value */
> +#define LOONGARCH_CSR_PERFCTRL1		0x202	/* 32 perf event 1 config */
> +#define LOONGARCH_CSR_PERFCNTR1		0x203	/* 64 perf event 1 count value */
> +#define LOONGARCH_CSR_PERFCTRL2		0x204	/* 32 perf event 2 config */
> +#define LOONGARCH_CSR_PERFCNTR2		0x205	/* 64 perf event 2 count value */
> +#define LOONGARCH_CSR_PERFCTRL3		0x206	/* 32 perf event 3 config */
> +#define LOONGARCH_CSR_PERFCNTR3		0x207	/* 64 perf event 3 count value */
> +#define  CSR_PERFCTRL_PLV0		(_ULCAST_(1) << 16)
> +#define  CSR_PERFCTRL_PLV1		(_ULCAST_(1) << 17)
> +#define  CSR_PERFCTRL_PLV2		(_ULCAST_(1) << 18)
> +#define  CSR_PERFCTRL_PLV3		(_ULCAST_(1) << 19)
> +#define  CSR_PERFCTRL_IE		(_ULCAST_(1) << 20)
> +#define  CSR_PERFCTRL_EVENT		0x3ff
> +
> +/* Debug registers */
> +#define LOONGARCH_CSR_MWPC		0x300	/* data breakpoint config */
> +#define LOONGARCH_CSR_MWPS		0x301	/* data breakpoint status */
> +
> +#define LOONGARCH_CSR_DB0ADDR		0x310	/* data breakpoint 0 address */
> +#define LOONGARCH_CSR_DB0MASK		0x311	/* data breakpoint 0 mask */
> +#define LOONGARCH_CSR_DB0CTL		0x312	/* data breakpoint 0 control */
> +#define LOONGARCH_CSR_DB0ASID		0x313	/* data breakpoint 0 asid */
> +
> +#define LOONGARCH_CSR_DB1ADDR		0x318	/* data breakpoint 1 address */
> +#define LOONGARCH_CSR_DB1MASK		0x319	/* data breakpoint 1 mask */
> +#define LOONGARCH_CSR_DB1CTL		0x31a	/* data breakpoint 1 control */
> +#define LOONGARCH_CSR_DB1ASID		0x31b	/* data breakpoint 1 asid */
> +
> +#define LOONGARCH_CSR_DB2ADDR		0x320	/* data breakpoint 2 address */
> +#define LOONGARCH_CSR_DB2MASK		0x321	/* data breakpoint 2 mask */
> +#define LOONGARCH_CSR_DB2CTL		0x322	/* data breakpoint 2 control */
> +#define LOONGARCH_CSR_DB2ASID		0x323	/* data breakpoint 2 asid */
> +
> +#define LOONGARCH_CSR_DB3ADDR		0x328	/* data breakpoint 3 address */
> +#define LOONGARCH_CSR_DB3MASK		0x329	/* data breakpoint 3 mask */
> +#define LOONGARCH_CSR_DB3CTL		0x32a	/* data breakpoint 3 control */
> +#define LOONGARCH_CSR_DB3ASID		0x32b	/* data breakpoint 3 asid */
> +
> +#define LOONGARCH_CSR_DB4ADDR		0x330	/* data breakpoint 4 address */
> +#define LOONGARCH_CSR_DB4MASK		0x331	/* data breakpoint 4 mask */
> +#define LOONGARCH_CSR_DB4CTL		0x332	/* data breakpoint 4 control */
> +#define LOONGARCH_CSR_DB4ASID		0x333	/* data breakpoint 4 asid */
> +
> +#define LOONGARCH_CSR_DB5ADDR		0x338	/* data breakpoint 5 address */
> +#define LOONGARCH_CSR_DB5MASK		0x339	/* data breakpoint 5 mask */
> +#define LOONGARCH_CSR_DB5CTL		0x33a	/* data breakpoint 5 control */
> +#define LOONGARCH_CSR_DB5ASID		0x33b	/* data breakpoint 5 asid */
> +
> +#define LOONGARCH_CSR_DB6ADDR		0x340	/* data breakpoint 6 address */
> +#define LOONGARCH_CSR_DB6MASK		0x341	/* data breakpoint 6 mask */
> +#define LOONGARCH_CSR_DB6CTL		0x342	/* data breakpoint 6 control */
> +#define LOONGARCH_CSR_DB6ASID		0x343	/* data breakpoint 6 asid */
> +
> +#define LOONGARCH_CSR_DB7ADDR		0x348	/* data breakpoint 7 address */
> +#define LOONGARCH_CSR_DB7MASK		0x349	/* data breakpoint 7 mask */
> +#define LOONGARCH_CSR_DB7CTL		0x34a	/* data breakpoint 7 control */
> +#define LOONGARCH_CSR_DB7ASID		0x34b	/* data breakpoint 7 asid */
> +
> +#define LOONGARCH_CSR_FWPC		0x380	/* instruction breakpoint config */
> +#define LOONGARCH_CSR_FWPS		0x381	/* instruction breakpoint status */
> +
> +#define LOONGARCH_CSR_IB0ADDR		0x390	/* inst breakpoint 0 address */
> +#define LOONGARCH_CSR_IB0MASK		0x391	/* inst breakpoint 0 mask */
> +#define LOONGARCH_CSR_IB0CTL		0x392	/* inst breakpoint 0 control */
> +#define LOONGARCH_CSR_IB0ASID		0x393	/* inst breakpoint 0 asid */
> +
> +#define LOONGARCH_CSR_IB1ADDR		0x398	/* inst breakpoint 1 address */
> +#define LOONGARCH_CSR_IB1MASK		0x399	/* inst breakpoint 1 mask */
> +#define LOONGARCH_CSR_IB1CTL		0x39a	/* inst breakpoint 1 control */
> +#define LOONGARCH_CSR_IB1ASID		0x39b	/* inst breakpoint 1 asid */
> +
> +#define LOONGARCH_CSR_IB2ADDR		0x3a0	/* inst breakpoint 2 address */
> +#define LOONGARCH_CSR_IB2MASK		0x3a1	/* inst breakpoint 2 mask */
> +#define LOONGARCH_CSR_IB2CTL		0x3a2	/* inst breakpoint 2 control */
> +#define LOONGARCH_CSR_IB2ASID		0x3a3	/* inst breakpoint 2 asid */
> +
> +#define LOONGARCH_CSR_IB3ADDR		0x3a8	/* inst breakpoint 3 address */
> +#define LOONGARCH_CSR_IB3MASK		0x3a9	/* inst breakpoint 3 mask */
> +#define LOONGARCH_CSR_IB3CTL		0x3aa	/* inst breakpoint 3 control */
> +#define LOONGARCH_CSR_IB3ASID		0x3ab	/* inst breakpoint 3 asid */
> +
> +#define LOONGARCH_CSR_IB4ADDR		0x3b0	/* inst breakpoint 4 address */
> +#define LOONGARCH_CSR_IB4MASK		0x3b1	/* inst breakpoint 4 mask */
> +#define LOONGARCH_CSR_IB4CTL		0x3b2	/* inst breakpoint 4 control */
> +#define LOONGARCH_CSR_IB4ASID		0x3b3	/* inst breakpoint 4 asid */
> +
> +#define LOONGARCH_CSR_IB5ADDR		0x3b8	/* inst breakpoint 5 address */
> +#define LOONGARCH_CSR_IB5MASK		0x3b9	/* inst breakpoint 5 mask */
> +#define LOONGARCH_CSR_IB5CTL		0x3ba	/* inst breakpoint 5 control */
> +#define LOONGARCH_CSR_IB5ASID		0x3bb	/* inst breakpoint 5 asid */
> +
> +#define LOONGARCH_CSR_IB6ADDR		0x3c0	/* inst breakpoint 6 address */
> +#define LOONGARCH_CSR_IB6MASK		0x3c1	/* inst breakpoint 6 mask */
> +#define LOONGARCH_CSR_IB6CTL		0x3c2	/* inst breakpoint 6 control */
> +#define LOONGARCH_CSR_IB6ASID		0x3c3	/* inst breakpoint 6 asid */
> +
> +#define LOONGARCH_CSR_IB7ADDR		0x3c8	/* inst breakpoint 7 address */
> +#define LOONGARCH_CSR_IB7MASK		0x3c9	/* inst breakpoint 7 mask */
> +#define LOONGARCH_CSR_IB7CTL		0x3ca	/* inst breakpoint 7 control */
> +#define LOONGARCH_CSR_IB7ASID		0x3cb	/* inst breakpoint 7 asid */
> +
> +#define LOONGARCH_CSR_DEBUG		0x500	/* debug config */
> +#define LOONGARCH_CSR_DERA		0x501	/* debug era */
> +#define LOONGARCH_CSR_DESAVE		0x502	/* debug save */
> +
> +/*
> + * CSR_ECFG IM
> + */
> +#define ECFG0_IM		0x00001fff
> +#define ECFGB_SIP0		0
> +#define ECFGF_SIP0		(_ULCAST_(1) << ECFGB_SIP0)
> +#define ECFGB_SIP1		1
> +#define ECFGF_SIP1		(_ULCAST_(1) << ECFGB_SIP1)
> +#define ECFGB_IP0		2
> +#define ECFGF_IP0		(_ULCAST_(1) << ECFGB_IP0)
> +#define ECFGB_IP1		3
> +#define ECFGF_IP1		(_ULCAST_(1) << ECFGB_IP1)
> +#define ECFGB_IP2		4
> +#define ECFGF_IP2		(_ULCAST_(1) << ECFGB_IP2)
> +#define ECFGB_IP3		5
> +#define ECFGF_IP3		(_ULCAST_(1) << ECFGB_IP3)
> +#define ECFGB_IP4		6
> +#define ECFGF_IP4		(_ULCAST_(1) << ECFGB_IP4)
> +#define ECFGB_IP5		7
> +#define ECFGF_IP5		(_ULCAST_(1) << ECFGB_IP5)
> +#define ECFGB_IP6		8
> +#define ECFGF_IP6		(_ULCAST_(1) << ECFGB_IP6)
> +#define ECFGB_IP7		9
> +#define ECFGF_IP7		(_ULCAST_(1) << ECFGB_IP7)
> +#define ECFGB_PMC		10
> +#define ECFGF_PMC		(_ULCAST_(1) << ECFGB_PMC)
> +#define ECFGB_TIMER		11
> +#define ECFGF_TIMER		(_ULCAST_(1) << ECFGB_TIMER)
> +#define ECFGB_IPI		12
> +#define ECFGF_IPI		(_ULCAST_(1) << ECFGB_IPI)
> +#define ECFGF(hwirq)		(_ULCAST_(1) << hwirq)
> +
> +#define ESTATF_IP		0x00001fff
> +
> +#define LOONGARCH_IOCSR_FEATURES	0x8
> +#define  IOCSRF_TEMP			BIT_ULL(0)
> +#define  IOCSRF_NODECNT			BIT_ULL(1)
> +#define  IOCSRF_MSI			BIT_ULL(2)
> +#define  IOCSRF_EXTIOI			BIT_ULL(3)
> +#define  IOCSRF_CSRIPI			BIT_ULL(4)
> +#define  IOCSRF_FREQCSR			BIT_ULL(5)
> +#define  IOCSRF_FREQSCALE		BIT_ULL(6)
> +#define  IOCSRF_DVFSV1			BIT_ULL(7)
> +#define  IOCSRF_EIODECODE		BIT_ULL(9)
> +#define  IOCSRF_FLATMODE		BIT_ULL(10)
> +#define  IOCSRF_VM			BIT_ULL(11)
> +
> +#define LOONGARCH_IOCSR_VENDOR		0x10
> +
> +#define LOONGARCH_IOCSR_CPUNAME		0x20
> +
> +#define LOONGARCH_IOCSR_NODECNT		0x408
> +
> +#define LOONGARCH_IOCSR_MISC_FUNC	0x420
> +#define  IOCSR_MISC_FUNC_TIMER_RESET	BIT_ULL(21)
> +#define  IOCSR_MISC_FUNC_EXT_IOI_EN	BIT_ULL(48)
> +
> +#define LOONGARCH_IOCSR_CPUTEMP		0x428
> +
> +/* PerCore CSR, only accessible by local cores */
> +#define LOONGARCH_IOCSR_IPI_STATUS	0x1000
> +#define LOONGARCH_IOCSR_IPI_EN		0x1004
> +#define LOONGARCH_IOCSR_IPI_SET		0x1008
> +#define LOONGARCH_IOCSR_IPI_CLEAR	0x100c
> +#define LOONGARCH_IOCSR_MBUF0		0x1020
> +#define LOONGARCH_IOCSR_MBUF1		0x1028
> +#define LOONGARCH_IOCSR_MBUF2		0x1030
> +#define LOONGARCH_IOCSR_MBUF3		0x1038
> +
> +#define LOONGARCH_IOCSR_IPI_SEND	0x1040
> +#define  IOCSR_IPI_SEND_IP_SHIFT	0
> +#define  IOCSR_IPI_SEND_CPU_SHIFT	16
> +#define  IOCSR_IPI_SEND_BLOCKING	BIT(31)
> +
> +#define LOONGARCH_IOCSR_MBUF_SEND	0x1048
> +#define  IOCSR_MBUF_SEND_BLOCKING	BIT_ULL(31)
> +#define  IOCSR_MBUF_SEND_BOX_SHIFT	2
> +#define  IOCSR_MBUF_SEND_BOX_LO(box)	(box << 1)
> +#define  IOCSR_MBUF_SEND_BOX_HI(box)	((box << 1) + 1)
> +#define  IOCSR_MBUF_SEND_CPU_SHIFT	16
> +#define  IOCSR_MBUF_SEND_BUF_SHIFT	32
> +#define  IOCSR_MBUF_SEND_H32_MASK	0xFFFFFFFF00000000ULL
> +
> +#define LOONGARCH_IOCSR_ANY_SEND	0x1158
> +#define  IOCSR_ANY_SEND_BLOCKING	BIT_ULL(31)
> +#define  IOCSR_ANY_SEND_CPU_SHIFT	16
> +#define  IOCSR_ANY_SEND_MASK_SHIFT	27
> +#define  IOCSR_ANY_SEND_BUF_SHIFT	32
> +#define  IOCSR_ANY_SEND_H32_MASK	0xFFFFFFFF00000000ULL
> +
> +/* Register offset and bit definition for CSR access */
> +#define LOONGARCH_IOCSR_TIMER_CFG       0x1060
> +#define LOONGARCH_IOCSR_TIMER_TICK      0x1070
> +#define  IOCSR_TIMER_CFG_RESERVED       (_ULCAST_(1) << 63)
> +#define  IOCSR_TIMER_CFG_PERIODIC       (_ULCAST_(1) << 62)
> +#define  IOCSR_TIMER_CFG_EN             (_ULCAST_(1) << 61)
> +#define  IOCSR_TIMER_MASK		0x0ffffffffffffULL
> +#define  IOCSR_TIMER_INITVAL_RST        (_ULCAST_(0xffff) << 48)
> +
> +#define LOONGARCH_IOCSR_EXTIOI_NODEMAP_BASE	0x14a0
> +#define LOONGARCH_IOCSR_EXTIOI_IPMAP_BASE	0x14c0
> +#define LOONGARCH_IOCSR_EXTIOI_EN_BASE		0x1600
> +#define LOONGARCH_IOCSR_EXTIOI_BOUNCE_BASE	0x1680
> +#define LOONGARCH_IOCSR_EXTIOI_ISR_BASE		0x1800
> +#define LOONGARCH_IOCSR_EXTIOI_ROUTE_BASE	0x1c00
> +#define IOCSR_EXTIOI_VECTOR_NUM			256
> +
> +#ifndef __ASSEMBLY__
> +
> +static inline u64 drdtime(void)
> +{
> +	int rID = 0;
> +	u64 val = 0;
> +
> +	__asm__ __volatile__(
> +		"rdtime.d %0, %1 \n\t"
> +		: "=r"(val), "=r"(rID)
> +		:
> +		);
> +	return val;
> +}
> +
> +static inline unsigned int get_csr_cpuid(void)
> +{
> +	return csr_readl(LOONGARCH_CSR_CPUID);
> +}
> +
> +static inline void csr_any_send(unsigned int addr, unsigned int data,
> +				unsigned int data_mask, unsigned int cpu)
> +{
> +	uint64_t val = 0;
> +
> +	val = IOCSR_ANY_SEND_BLOCKING | addr;
> +	val |= (cpu << IOCSR_ANY_SEND_CPU_SHIFT);
> +	val |= (data_mask << IOCSR_ANY_SEND_MASK_SHIFT);
> +	val |= ((uint64_t)data << IOCSR_ANY_SEND_BUF_SHIFT);
> +	iocsr_writeq(val, LOONGARCH_IOCSR_ANY_SEND);
> +}
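To make the layout of the word written by csr_any_send() easier to
check, here is a small user-space sketch of the same packing as a pure
function. any_send_word() is a hypothetical name, and BIT_ULL(31) is
spelled out by hand. Unlike the quoted code, the sketch widens cpu and
data_mask to 64 bits before shifting; the two only differ if a shifted
value would overflow 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted hunk. */
#define IOCSR_ANY_SEND_BLOCKING		(UINT64_C(1) << 31)
#define IOCSR_ANY_SEND_CPU_SHIFT	16
#define IOCSR_ANY_SEND_MASK_SHIFT	27
#define IOCSR_ANY_SEND_BUF_SHIFT	32

/* Hypothetical pure function: the value csr_any_send() would write. */
static inline uint64_t any_send_word(unsigned int addr, unsigned int data,
				     unsigned int data_mask, unsigned int cpu)
{
	uint64_t val = IOCSR_ANY_SEND_BLOCKING | addr;

	val |= (uint64_t)cpu << IOCSR_ANY_SEND_CPU_SHIFT;
	val |= (uint64_t)data_mask << IOCSR_ANY_SEND_MASK_SHIFT;
	val |= (uint64_t)data << IOCSR_ANY_SEND_BUF_SHIFT;	/* upper 32 bits */

	return val;
}
```

With addr 0x1000, data 0xdeadbeef, an empty mask and cpu 1, this gives
0xdeadbeef80011000: data in the upper half, the blocking bit at 31, the
cpu number at bit 16 and the address in the low bits.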
> +
> +static inline unsigned int read_csr_excode(void)
> +{
> +	return (csr_readl(LOONGARCH_CSR_ESTAT) & CSR_ESTAT_EXC) >> CSR_ESTAT_EXC_SHIFT;
> +}
> +
> +static inline void write_csr_index(unsigned int idx)
> +{
> +	csr_xchgl(idx, CSR_TLBIDX_IDXM, LOONGARCH_CSR_TLBIDX);
> +}
> +
> +static inline unsigned int read_csr_pagesize(void)
> +{
> +	return (csr_readl(LOONGARCH_CSR_TLBIDX) & CSR_TLBIDX_SIZEM) >> CSR_TLBIDX_SIZE;
> +}
> +
> +static inline void write_csr_pagesize(unsigned int size)
> +{
> +	csr_xchgl(size << CSR_TLBIDX_SIZE, CSR_TLBIDX_SIZEM, LOONGARCH_CSR_TLBIDX);
> +}
> +
> +static inline unsigned int read_csr_tlbrefill_pagesize(void)
> +{
> +	return (csr_readq(LOONGARCH_CSR_TLBREHI) & CSR_TLBREHI_PS) >> CSR_TLBREHI_PS_SHIFT;
> +}
> +
> +static inline void write_csr_tlbrefill_pagesize(unsigned int size)
> +{
> +	csr_xchgq(size << CSR_TLBREHI_PS_SHIFT, CSR_TLBREHI_PS, LOONGARCH_CSR_TLBREHI);
> +}
> +
> +#define read_csr_asid()			csr_readl(LOONGARCH_CSR_ASID)
> +#define write_csr_asid(val)		csr_writel(val, LOONGARCH_CSR_ASID)
> +#define read_csr_entryhi()		csr_readq(LOONGARCH_CSR_TLBEHI)
> +#define write_csr_entryhi(val)		csr_writeq(val, LOONGARCH_CSR_TLBEHI)
> +#define read_csr_entrylo0()		csr_readq(LOONGARCH_CSR_TLBELO0)
> +#define write_csr_entrylo0(val)		csr_writeq(val, LOONGARCH_CSR_TLBELO0)
> +#define read_csr_entrylo1()		csr_readq(LOONGARCH_CSR_TLBELO1)
> +#define write_csr_entrylo1(val)		csr_writeq(val, LOONGARCH_CSR_TLBELO1)
> +#define read_csr_ecfg()			csr_readl(LOONGARCH_CSR_ECFG)
> +#define write_csr_ecfg(val)		csr_writel(val, LOONGARCH_CSR_ECFG)
> +#define read_csr_estat()		csr_readl(LOONGARCH_CSR_ESTAT)
> +#define write_csr_estat(val)		csr_writel(val, LOONGARCH_CSR_ESTAT)
> +#define read_csr_tlbidx()		csr_readl(LOONGARCH_CSR_TLBIDX)
> +#define write_csr_tlbidx(val)		csr_writel(val, LOONGARCH_CSR_TLBIDX)
> +#define read_csr_euen()			csr_readl(LOONGARCH_CSR_EUEN)
> +#define write_csr_euen(val)		csr_writel(val, LOONGARCH_CSR_EUEN)
> +#define read_csr_cpuid()		csr_readl(LOONGARCH_CSR_CPUID)
> +#define read_csr_prcfg1()		csr_readq(LOONGARCH_CSR_PRCFG1)
> +#define write_csr_prcfg1(val)		csr_writeq(val, LOONGARCH_CSR_PRCFG1)
> +#define read_csr_prcfg2()		csr_readq(LOONGARCH_CSR_PRCFG2)
> +#define write_csr_prcfg2(val)		csr_writeq(val, LOONGARCH_CSR_PRCFG2)
> +#define read_csr_prcfg3()		csr_readq(LOONGARCH_CSR_PRCFG3)
> +#define write_csr_prcfg3(val)		csr_writeq(val, LOONGARCH_CSR_PRCFG3)
> +#define read_csr_stlbpgsize()		csr_readl(LOONGARCH_CSR_STLBPGSIZE)
> +#define write_csr_stlbpgsize(val)	csr_writel(val, LOONGARCH_CSR_STLBPGSIZE)
> +#define read_csr_rvacfg()		csr_readl(LOONGARCH_CSR_RVACFG)
> +#define write_csr_rvacfg(val)		csr_writel(val, LOONGARCH_CSR_RVACFG)
> +#define write_csr_tintclear(val)	csr_writel(val, LOONGARCH_CSR_TINTCLR)
> +#define read_csr_impctl1()		csr_readq(LOONGARCH_CSR_IMPCTL1)
> +#define write_csr_impctl1(val)		csr_writeq(val, LOONGARCH_CSR_IMPCTL1)
> +#define write_csr_impctl2(val)		csr_writeq(val, LOONGARCH_CSR_IMPCTL2)
> +
> +#define read_csr_perfctrl0()		csr_readq(LOONGARCH_CSR_PERFCTRL0)
> +#define read_csr_perfcntr0()		csr_readq(LOONGARCH_CSR_PERFCNTR0)
> +#define read_csr_perfctrl1()		csr_readq(LOONGARCH_CSR_PERFCTRL1)
> +#define read_csr_perfcntr1()		csr_readq(LOONGARCH_CSR_PERFCNTR1)
> +#define read_csr_perfctrl2()		csr_readq(LOONGARCH_CSR_PERFCTRL2)
> +#define read_csr_perfcntr2()		csr_readq(LOONGARCH_CSR_PERFCNTR2)
> +#define read_csr_perfctrl3()		csr_readq(LOONGARCH_CSR_PERFCTRL3)
> +#define read_csr_perfcntr3()		csr_readq(LOONGARCH_CSR_PERFCNTR3)
> +#define write_csr_perfctrl0(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL0)
> +#define write_csr_perfcntr0(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR0)
> +#define write_csr_perfctrl1(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL1)
> +#define write_csr_perfcntr1(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR1)
> +#define write_csr_perfctrl2(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL2)
> +#define write_csr_perfcntr2(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR2)
> +#define write_csr_perfctrl3(val)	csr_writeq(val, LOONGARCH_CSR_PERFCTRL3)
> +#define write_csr_perfcntr3(val)	csr_writeq(val, LOONGARCH_CSR_PERFCNTR3)
> +
> +/*
> + * Manipulate bits in a register.
> + */
> +#define __BUILD_CSR_COMMON(name)				\
> +static inline unsigned long					\
> +set_##name(unsigned long set)					\
> +{								\
> +	unsigned long res, new;					\
> +								\
> +	res = read_##name();					\
> +	new = res | set;					\
> +	write_##name(new);					\
> +								\
> +	return res;						\
> +}								\
> +								\
> +static inline unsigned long					\
> +clear_##name(unsigned long clear)				\
> +{								\
> +	unsigned long res, new;					\
> +								\
> +	res = read_##name();					\
> +	new = res & ~clear;					\
> +	write_##name(new);					\
> +								\
> +	return res;						\
> +}								\
> +								\
> +static inline unsigned long					\
> +change_##name(unsigned long change, unsigned long val)		\
> +{								\
> +	unsigned long res, new;					\
> +								\
> +	res = read_##name();					\
> +	new = res & ~change;					\
> +	new |= (val & change);					\
> +	write_##name(new);					\
> +								\
> +	return res;						\
> +}
> +
> +#define __BUILD_CSR_OP(name)	__BUILD_CSR_COMMON(csr_##name)
> +
> +__BUILD_CSR_OP(euen)
> +__BUILD_CSR_OP(ecfg)
> +__BUILD_CSR_OP(tlbidx)
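For reference, the three helpers generated per register are plain
read-modify-write sequences that return the old value. A user-space
sketch with a variable standing in for the CSR is below; fake_csr and
the *_fake names are hypothetical, and "new" is renamed to "next" only
so the snippet also builds as C++.

```c
#include <assert.h>

/* Stand-in for a CSR: a plain variable instead of a csrrd/csrwr pair. */
static unsigned long fake_csr;

static inline unsigned long read_fake(void) { return fake_csr; }
static inline void write_fake(unsigned long v) { fake_csr = v; }

/* Same shape as set_##name() in the quoted macro. */
static inline unsigned long set_fake(unsigned long set)
{
	unsigned long res = read_fake();

	write_fake(res | set);
	return res;		/* old value, not the new one */
}

/* Same shape as clear_##name(). */
static inline unsigned long clear_fake(unsigned long clear)
{
	unsigned long res = read_fake();

	write_fake(res & ~clear);
	return res;
}

/* Same shape as change_##name(): only bits in 'change' are updated. */
static inline unsigned long change_fake(unsigned long change, unsigned long val)
{
	unsigned long res = read_fake();
	unsigned long next = (res & ~change) | (val & change);

	write_fake(next);
	return res;
}
```

Note that the read and the write are separate steps, so a caller that
can be interrupted between them has to provide its own protection.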
> +
> +#define set_csr_estat(val)	\
> +	csr_xchgl(val, val, LOONGARCH_CSR_ESTAT)
> +#define clear_csr_estat(val)	\
> +	csr_xchgl(~(val), val, LOONGARCH_CSR_ESTAT)
> +
> +#endif /* __ASSEMBLY__ */
> +
> +/* Generic EntryLo bit definitions */
> +#define ENTRYLO_V		(_ULCAST_(1) << 0)
> +#define ENTRYLO_D		(_ULCAST_(1) << 1)
> +#define ENTRYLO_PLV_SHIFT	2
> +#define ENTRYLO_PLV		(_ULCAST_(3) << ENTRYLO_PLV_SHIFT)
> +#define ENTRYLO_C_SHIFT		4
> +#define ENTRYLO_C		(_ULCAST_(3) << ENTRYLO_C_SHIFT)
> +#define ENTRYLO_G		(_ULCAST_(1) << 6)
> +#define ENTRYLO_NR		(_ULCAST_(1) << 61)
> +#define ENTRYLO_NX		(_ULCAST_(1) << 62)
> +
> +/* LoongArch GlobalNumber definitions */
> +#define LOONGARCH_GLOBALNUMBER_VP_SHF	0
> +#define LOONGARCH_GLOBALNUMBER_VP		(_ULCAST_(0xff) << LOONGARCH_GLOBALNUMBER_VP_SHF)
> +#define LOONGARCH_GLOBALNUMBER_CORE_SHF	8
> +#define LOONGARCH_GLOBALNUMBER_CORE		(_ULCAST_(0xff) << LOONGARCH_GLOBALNUMBER_CORE_SHF)
> +#define LOONGARCH_GLOBALNUMBER_CLUSTER_SHF	16
> +#define LOONGARCH_GLOBALNUMBER_CLUSTER	(_ULCAST_(0xf) << LOONGARCH_GLOBALNUMBER_CLUSTER_SHF)
> +
> +/* Values for PageSize register */
> +#define PS_4K		0x0000000c
> +#define PS_8K		0x0000000d
> +#define PS_16K		0x0000000e
> +#define PS_32K		0x0000000f
> +#define PS_64K		0x00000010
> +#define PS_128K		0x00000011
> +#define PS_256K		0x00000012
> +#define PS_512K		0x00000013
> +#define PS_1M		0x00000014
> +#define PS_2M		0x00000015
> +#define PS_4M		0x00000016
> +#define PS_8M		0x00000017
> +#define PS_16M		0x00000018
> +#define PS_32M		0x00000019
> +#define PS_64M		0x0000001a
> +#define PS_128M		0x0000001b
> +#define PS_256M		0x0000001c
> +#define PS_512M		0x0000001d
> +#define PS_1G		0x0000001e
> +
> +#define PS_MASK		0x3f000000
> +#define PS_SHIFT	24
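The PS_* values above are the base-2 logarithm of the page size in
bytes (PS_4K is 12, PS_1G is 30). A short sketch of the conversion in
both directions follows; ps_to_bytes() and bytes_to_ps() are
hypothetical helpers for illustration, not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* A few values copied from the quoted hunk. */
#define PS_4K	0x0000000c
#define PS_16K	0x0000000e
#define PS_64K	0x00000010
#define PS_1G	0x0000001e

/* Hypothetical helper: PS encoding -> size in bytes. */
static inline uint64_t ps_to_bytes(unsigned int ps)
{
	assert(ps < 64);
	return UINT64_C(1) << ps;
}

/* Hypothetical helper: power-of-two size in bytes -> PS encoding. */
static inline unsigned int bytes_to_ps(uint64_t bytes)
{
	unsigned int ps = 0;

	assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
	while ((bytes >>= 1) != 0)
		ps++;

	return ps;
}
```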
> +
> +/* Default page size for a given kernel configuration */
> +#ifdef CONFIG_PAGE_SIZE_4KB
> +#define PS_DEFAULT_SIZE PS_4K
> +#elif defined(CONFIG_PAGE_SIZE_16KB)
> +#define PS_DEFAULT_SIZE PS_16K
> +#elif defined(CONFIG_PAGE_SIZE_64KB)
> +#define PS_DEFAULT_SIZE PS_64K
> +#else
> +#error Bad page size configuration!
> +#endif
> +
> +/* Default huge tlb size for a given kernel configuration */
> +#ifdef CONFIG_PAGE_SIZE_4KB
> +#define PS_HUGE_SIZE   PS_1M
> +#elif defined(CONFIG_PAGE_SIZE_16KB)
> +#define PS_HUGE_SIZE   PS_16M
> +#elif defined(CONFIG_PAGE_SIZE_64KB)
> +#define PS_HUGE_SIZE   PS_256M
> +#else
> +#error Bad page size configuration for hugetlbfs!
> +#endif
> +
> +/* ExStatus.ExcCode */
> +#define EXCCODE_RSV		0	/* Reserved */
> +#define EXCCODE_TLBL		1	/* TLB miss on a load */
> +#define EXCCODE_TLBS		2	/* TLB miss on a store */
> +#define EXCCODE_TLBI		3	/* TLB miss on an ifetch */
> +#define EXCCODE_TLBM		4	/* TLB modified fault */
> +#define EXCCODE_TLBNR		5	/* TLB Read-Inhibit exception */
> +#define EXCCODE_TLBNX		6	/* TLB Execution-Inhibit exception */
> +#define EXCCODE_TLBPE		7	/* TLB Privilege Error */
> +#define EXCCODE_ADE		8	/* Address Error */
> +	#define EXSUBCODE_ADEF		0	/* Fetch Instruction */
> +	#define EXSUBCODE_ADEM		1	/* Access Memory */
> +#define EXCCODE_ALE		9	/* Unaligned Access */
> +#define EXCCODE_OOB		10	/* Out of bounds */
> +#define EXCCODE_SYS		11	/* System call */
> +#define EXCCODE_BP		12	/* Breakpoint */
> +#define EXCCODE_INE		13	/* Inst. Not Exist */
> +#define EXCCODE_IPE		14	/* Inst. Privileged Error */
"Privilege Error"?
> +#define EXCCODE_FPDIS		15	/* FPU Disabled */
> +#define EXCCODE_LSXDIS		16	/* LSX Disabled */
> +#define EXCCODE_LASXDIS		17	/* LASX Disabled */
> +#define EXCCODE_FPE		18	/* Floating Point Exception */
> +	#define EXCSUBCODE_FPE		0	/* Floating Point Exception */
> +	#define EXCSUBCODE_VFPE		1	/* Vector Exception */
> +#define EXCCODE_WATCH		19	/* Watch address reference */
> +#define EXCCODE_BTDIS		20	/* Binary Trans. Disabled */
> +#define EXCCODE_BTE		21	/* Binary Trans. Exception */
> +#define EXCCODE_PSI		22	/* Guest Privileged Error */
> +#define EXCCODE_HYP		23	/* Hypercall */
> +#define EXCCODE_GCM		24	/* Guest CSR modified */
> +	#define EXCSUBCODE_GCSC		0	/* Software caused */
> +	#define EXCSUBCODE_GCHC		1	/* Hardware caused */
> +#define EXCCODE_SE		25	/* Security */
> +
> +#define EXCCODE_INT_START   64
> +#define EXCCODE_SIP0        64
> +#define EXCCODE_SIP1        65
> +#define EXCCODE_IP0         66
> +#define EXCCODE_IP1         67
> +#define EXCCODE_IP2         68
> +#define EXCCODE_IP3         69
> +#define EXCCODE_IP4         70
> +#define EXCCODE_IP5         71
> +#define EXCCODE_IP6         72
> +#define EXCCODE_IP7         73
> +#define EXCCODE_PMC         74 /* Performance Counter */
> +#define EXCCODE_TIMER       75
> +#define EXCCODE_IPI         76
> +#define EXCCODE_NMI         77
> +#define EXCCODE_INT_END     78
> +#define EXCCODE_INT_NUM	    (EXCCODE_INT_END - EXCCODE_INT_START)
> +
> +/* FPU register names */
> +#define LOONGARCH_FCSR0	$r0
> +#define LOONGARCH_FCSR1	$r1
> +#define LOONGARCH_FCSR2	$r2
> +#define LOONGARCH_FCSR3	$r3
> +
> +/* FPU Status Register Values */
> +#define FPU_CSR_RSVD	0xe0e0fce0
> +
> +/*
> + * X the exception cause indicator
> + * E the exception enable
> + * S the sticky/flag bit
> + */
> +#define FPU_CSR_ALL_X	0x1f000000
> +#define FPU_CSR_INV_X	0x10000000
> +#define FPU_CSR_DIV_X	0x08000000
> +#define FPU_CSR_OVF_X	0x04000000
> +#define FPU_CSR_UDF_X	0x02000000
> +#define FPU_CSR_INE_X	0x01000000
> +
> +#define FPU_CSR_ALL_S	0x001f0000
> +#define FPU_CSR_INV_S	0x00100000
> +#define FPU_CSR_DIV_S	0x00080000
> +#define FPU_CSR_OVF_S	0x00040000
> +#define FPU_CSR_UDF_S	0x00020000
> +#define FPU_CSR_INE_S	0x00010000
> +
> +#define FPU_CSR_ALL_E	0x0000001f
> +#define FPU_CSR_INV_E	0x00000010
> +#define FPU_CSR_DIV_E	0x00000008
> +#define FPU_CSR_OVF_E	0x00000004
> +#define FPU_CSR_UDF_E	0x00000002
> +#define FPU_CSR_INE_E	0x00000001
> +
> +/* Bits 8 and 9 of FPU Status Register specify the rounding mode */
> +#define FPU_CSR_RM	0x300
> +#define FPU_CSR_RN	0x000	/* nearest */
> +#define FPU_CSR_RZ	0x100	/* towards zero */
> +#define FPU_CSR_RU	0x200	/* towards +Infinity */
> +#define FPU_CSR_RD	0x300	/* towards -Infinity */
> +
> +#define read_fcsr(source)	\
> +({	\
> +	unsigned int __res;	\
> +\
> +	__asm__ __volatile__(	\
> +	"	movfcsr2gr	%0, "STR(source)"	\n"	\
> +	: "=r" (__res));	\
> +	__res;	\
> +})
> +
> +#define write_fcsr(dest, val) \
> +do {	\
> +	__asm__ __volatile__(	\
> +	"	movgr2fcsr	%0, "STR(dest)"	\n"	\
> +	: : "r" (val));	\
> +} while (0)
> +
> +#endif /* _ASM_LOONGARCH_H */
> diff --git a/arch/loongarch/include/asm/loongson.h b/arch/loongarch/include/asm/loongson.h
> new file mode 100644
> index 000000000000..4cefd393fd5c
> --- /dev/null
> +++ b/arch/loongarch/include/asm/loongson.h
> @@ -0,0 +1,159 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
Better to explain this file's name and purpose. It mostly holds
definitions for chip models produced by Loongson the corporation, much
as MTI stands for MIPS the corporation while MIPS stands for the
architecture. Please don't overload the term "Loongson" any further; it
is ambiguous enough already.
> + */
> +
> +#ifndef __ASM_LOONGSON_H
> +#define __ASM_LOONGSON_H
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/pci.h>
> +#include <asm/addrspace.h>
> +#include <asm/boot_param.h>
> +
> +extern const struct plat_smp_ops loongson3_smp_ops;
> +
> +/* loongson-specific command line, env and memory initialization */
> +extern void __init fw_init_environ(void);
> +extern void __init fw_init_memory(void);
> +extern void __init fw_init_numa_memory(void);
> +
> +#define LOONGSON_REG(x) \
> +	(*(volatile u32 *)((char *)TO_UNCAC(LOONGSON_REG_BASE) + (x)))
> +
> +#define LOONGSON_LIO_BASE	0x18000000
> +#define LOONGSON_LIO_SIZE	0x00100000	/* 1M */
> +#define LOONGSON_LIO_TOP	(LOONGSON_LIO_BASE+LOONGSON_LIO_SIZE-1)
> +
> +#define LOONGSON_BOOT_BASE	0x1c000000
> +#define LOONGSON_BOOT_SIZE	0x02000000	/* 32M */
> +#define LOONGSON_BOOT_TOP	(LOONGSON_BOOT_BASE+LOONGSON_BOOT_SIZE-1)
> +
> +#define LOONGSON_REG_BASE	0x1fe00000
> +#define LOONGSON_REG_SIZE	0x00100000	/* 1M */
> +#define LOONGSON_REG_TOP	(LOONGSON_REG_BASE+LOONGSON_REG_SIZE-1)
> +
> +/* GPIO Regs - r/w */
> +
> +#define LOONGSON_GPIODATA		LOONGSON_REG(0x11c)
> +#define LOONGSON_GPIOIE			LOONGSON_REG(0x120)
> +#define LOONGSON_REG_GPIO_BASE          (LOONGSON_REG_BASE + 0x11c)
> +
> +#define MAX_PACKAGES 16
> +
> +/* Chip Config registor of each physical cpu package */
> +extern u64 loongson_chipcfg[MAX_PACKAGES];
> +#define LOONGSON_CHIPCFG(id) (*(volatile u32 *)(loongson_chipcfg[id]))
> +
> +/* Chip Temperature registor of each physical cpu package */
> +extern u64 loongson_chiptemp[MAX_PACKAGES];
> +#define LOONGSON_CHIPTEMP(id) (*(volatile u32 *)(loongson_chiptemp[id]))
> +
> +/* Freq Control register of each physical cpu package */
> +extern u64 loongson_freqctrl[MAX_PACKAGES];
> +#define LOONGSON_FREQCTRL(id) (*(volatile u32 *)(loongson_freqctrl[id]))
> +
> +#define xconf_readl(addr) readl(addr)
> +#define xconf_readq(addr) readq(addr)
> +
> +static inline void xconf_writel(u32 val, volatile void __iomem *addr)
> +{
> +	asm volatile (
> +	"	st.w	%[v], %[hw], 0	\n"
> +	"	ld.b	$r0, %[hw], 0	\n"
> +	:
> +	: [hw] "r" (addr), [v] "r" (val)
> +	);
> +}
> +
> +static inline void xconf_writeq(u64 val64, volatile void __iomem *addr)
> +{
> +	asm volatile (
> +	"	st.d	%[v], %[hw], 0	\n"
> +	"	ld.b	$r0, %[hw], 0	\n"
> +	:
> +	: [hw] "r" (addr),  [v] "r" (val64)
> +	);
> +}
> +
> +/* ============== LS7A registers =============== */
> +#define LS7A_PCH_REG_BASE		0x10000000UL
> +/* LPC regs */
> +#define LS7A_LPC_REG_BASE		(LS7A_PCH_REG_BASE + 0x00002000)
> +/* CHIPCFG regs */
> +#define LS7A_CHIPCFG_REG_BASE		(LS7A_PCH_REG_BASE + 0x00010000)
> +/* MISC reg base */
> +#define LS7A_MISC_REG_BASE		(LS7A_PCH_REG_BASE + 0x00080000)
> +/* ACPI regs */
> +#define LS7A_ACPI_REG_BASE		(LS7A_MISC_REG_BASE + 0x00050000)
> +/* RTC regs */
> +#define LS7A_RTC_REG_BASE		(LS7A_MISC_REG_BASE + 0x00050100)
> +
> +#define LS7A_DMA_CFG			(volatile void *)TO_UNCAC(LS7A_CHIPCFG_REG_BASE + 0x041c)
> +#define LS7A_DMA_NODE_SHF		8
> +#define LS7A_DMA_NODE_MASK		0x1F00
> +
> +#define LS7A_INT_MASK_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x020)
> +#define LS7A_INT_EDGE_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x060)
> +#define LS7A_INT_CLEAR_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x080)
> +#define LS7A_INT_HTMSI_EN_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x040)
> +#define LS7A_INT_ROUTE_ENTRY_REG	(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x100)
> +#define LS7A_INT_HTMSI_VEC_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x200)
> +#define LS7A_INT_STATUS_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x3a0)
> +#define LS7A_INT_POL_REG		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x3e0)
> +#define LS7A_LPC_INT_CTL		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2000)
> +#define LS7A_LPC_INT_ENA		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2004)
> +#define LS7A_LPC_INT_STS		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2008)
> +#define LS7A_LPC_INT_CLR		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x200c)
> +#define LS7A_LPC_INT_POL		(volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2010)
> +
> +#define LS7A_PMCON_SOC_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x000)
> +#define LS7A_PMCON_RESUME_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x004)
> +#define LS7A_PMCON_RTC_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x008)
> +#define LS7A_PM1_EVT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x00c)
> +#define LS7A_PM1_ENA_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x010)
> +#define LS7A_PM1_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x014)
> +#define LS7A_PM1_TMR_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x018)
> +#define LS7A_P_CNT_REG			(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x01c)
> +#define LS7A_GPE0_STS_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x028)
> +#define LS7A_GPE0_ENA_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x02c)
> +#define LS7A_RST_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x030)
> +#define LS7A_WD_SET_REG			(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x034)
> +#define LS7A_WD_TIMER_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x038)
> +#define LS7A_THSENS_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x04c)
> +#define LS7A_GEN_RTC_1_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x050)
> +#define LS7A_GEN_RTC_2_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x054)
> +#define LS7A_DPM_CFG_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x400)
> +#define LS7A_DPM_STS_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x404)
> +#define LS7A_DPM_CNT_REG		(volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x408)
> +
> +typedef enum {
> +	ACPI_PCI_HOTPLUG_STATUS	= 1 << 1,
> +	ACPI_CPU_HOTPLUG_STATUS	= 1 << 2,
> +	ACPI_MEM_HOTPLUG_STATUS	= 1 << 3,
> +	ACPI_POWERBUTTON_STATUS	= 1 << 8,
> +	ACPI_RTC_WAKE_STATUS	= 1 << 10,
> +	ACPI_PCI_WAKE_STATUS	= 1 << 14,
> +	ACPI_ANY_WAKE_STATUS	= 1 << 15,
> +} AcpiEventStatusBits;
> +
> +#define HT1LO_OFFSET		0xe0000000000UL
> +
> +/* PCI Configuration Space Base */
> +#define MCFG_EXT_PCICFG_BASE		0xefe00000000UL
> +
> +/* REG ACCESS*/
Do we really need this tiny comment? The code is pretty self-explanatory,
and the comment's end marker lacks a space before it.
> +#define ls7a_readb(addr)			  (*(volatile unsigned char  *)TO_UNCAC(addr))
> +#define ls7a_readw(addr)			  (*(volatile unsigned short *)TO_UNCAC(addr))
> +#define ls7a_readl(addr)			  (*(volatile unsigned int   *)TO_UNCAC(addr))
> +#define ls7a_readq(addr)			  (*(volatile unsigned long  *)TO_UNCAC(addr))
> +#define ls7a_writeb(val, addr)		*(volatile unsigned char  *)TO_UNCAC(addr) = (val)
> +#define ls7a_writew(val, addr)		*(volatile unsigned short *)TO_UNCAC(addr) = (val)
> +#define ls7a_writel(val, addr)		ls7a_write_type(val, addr, uint32_t)
> +#define ls7a_writeq(val, addr)		ls7a_write_type(val, addr, uint64_t)
> +#define ls7a_write(val, addr)		ls7a_write_type(val, addr, uint64_t)
> +
> +#endif /* __ASM_LOONGSON_H */
> diff --git a/arch/loongarch/include/asm/regdef.h b/arch/loongarch/include/asm/regdef.h
> new file mode 100644
> index 000000000000..9f24f0c05fe3
> --- /dev/null
> +++ b/arch/loongarch/include/asm/regdef.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_REGDEF_H
> +#define _ASM_REGDEF_H
> +
> +#define zero	$r0	/* wired zero */
> +#define ra	$r1	/* return address */
> +#define tp	$r2
> +#define sp	$r3	/* stack pointer */
> +#define v0	$r4	/* return value - caller saved */
> +#define v1	$r5
> +#define a0	$r4	/* argument registers */
> +#define a1	$r5
> +#define a2	$r6
> +#define a3	$r7
> +#define a4	$r8
> +#define a5	$r9
> +#define a6	$r10
> +#define a7	$r11
> +#define t0	$r12	/* caller saved */
> +#define t1	$r13
> +#define t2	$r14
> +#define t3	$r15
> +#define t4	$r16
> +#define t5	$r17
> +#define t6	$r18
> +#define t7	$r19
> +#define t8	$r20
> +#define u0	$r21
> +#define fp	$r22	/* frame pointer */
> +#define s0	$r23	/* callee saved */
> +#define s1	$r24
> +#define s2	$r25
> +#define s3	$r26
> +#define s4	$r27
> +#define s5	$r28
> +#define s6	$r29
> +#define s7	$r30
> +#define s8	$r31
> +
> +#endif /* _ASM_REGDEF_H */
Why can't this file be combined with the FP one (absorbing the FP
definitions into this file)? While at it, please remove the v0/v1
aliases and document u0 too.

* Re: [PATCH V9 07/24] LoongArch: Add atomic/locking headers
  2022-04-30  9:05 ` [PATCH V9 07/24] LoongArch: Add atomic/locking headers Huacai Chen
@ 2022-05-01 11:16   ` WANG Xuerui
  2022-05-01 13:16     ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: WANG Xuerui @ 2022-05-01 11:16 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang


On 4/30/22 17:05, Huacai Chen wrote:
> This patch adds common headers (atomic, bitops, barrier and locking)
> for basic LoongArch support.
>
> LoongArch has no native sub-word xchg/cmpxchg instructions now, but
> LoongArch-based CPUs support NUMA (e.g., quad-core Loongson-3A5000
> supports as many as 16 nodes, 64 cores in total). So, we emulate sub-
> word xchg/cmpxchg in software and use qspinlock/qrwlock rather than
> ticket locks.
I'll leave the details to reviewers more familiar with the intricate art
of locking; here are just a couple of minor suggestions.
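
For readers following along, the sub-word emulation mentioned in the
commit message can be sketched in portable C roughly as below. This is
only an illustration under stated assumptions: xchg_u8_emulated() is a
hypothetical name, it uses the GCC/Clang __atomic builtins instead of
LL/SC, and it is not the patch's __xchg_small().

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): exchange a single byte using
 * only a 32-bit compare-and-swap on the aligned word that contains it.
 */
static inline uint8_t xchg_u8_emulated(volatile uint8_t *ptr, uint8_t newval)
{
	uintptr_t addr = (uintptr_t)ptr;
	/* Round the address down to the containing 4-byte word. */
	volatile uint32_t *word = (volatile uint32_t *)(addr & ~(uintptr_t)3);
	/* Bit position of the target byte inside that word. */
	unsigned int shift = (unsigned int)(addr & 3) * 8;
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	shift = 24 - shift;
#endif
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old = __atomic_load_n(word, __ATOMIC_RELAXED);
	uint32_t new32;

	do {
		/* Keep the three neighbouring bytes, replace only ours. */
		new32 = (old & ~mask) | ((uint32_t)newval << shift);
		/* On failure 'old' is refreshed with the current word. */
	} while (!__atomic_compare_exchange_n(word, &old, new32, 0,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_RELAXED));

	return (uint8_t)((old & mask) >> shift);
}
```

The retry loop is what makes the neighbouring bytes safe: if another CPU
changes any byte of the word in the meantime, the compare fails and the
new word is rebuilt from the fresh value.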
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   arch/loongarch/include/asm/atomic.h         | 358 ++++++++++++++++++++
>   arch/loongarch/include/asm/barrier.h        |  51 +++
>   arch/loongarch/include/asm/bitops.h         |  33 ++
>   arch/loongarch/include/asm/bitrev.h         |  34 ++
>   arch/loongarch/include/asm/cmpxchg.h        | 135 ++++++++
>   arch/loongarch/include/asm/local.h          | 138 ++++++++
>   arch/loongarch/include/asm/percpu.h         |  20 ++
>   arch/loongarch/include/asm/spinlock.h       |  12 +
>   arch/loongarch/include/asm/spinlock_types.h |  11 +
>   9 files changed, 792 insertions(+)
>   create mode 100644 arch/loongarch/include/asm/atomic.h
>   create mode 100644 arch/loongarch/include/asm/barrier.h
>   create mode 100644 arch/loongarch/include/asm/bitops.h
>   create mode 100644 arch/loongarch/include/asm/bitrev.h
>   create mode 100644 arch/loongarch/include/asm/cmpxchg.h
>   create mode 100644 arch/loongarch/include/asm/local.h
>   create mode 100644 arch/loongarch/include/asm/percpu.h
>   create mode 100644 arch/loongarch/include/asm/spinlock.h
>   create mode 100644 arch/loongarch/include/asm/spinlock_types.h
>
> diff --git a/arch/loongarch/include/asm/atomic.h b/arch/loongarch/include/asm/atomic.h
> new file mode 100644
> index 000000000000..f0ed7f9c08c9
> --- /dev/null
> +++ b/arch/loongarch/include/asm/atomic.h
> @@ -0,0 +1,358 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Atomic operations.
> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_ATOMIC_H
> +#define _ASM_ATOMIC_H
> +
> +#include <linux/types.h>
> +#include <asm/barrier.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/compiler.h>
> +
> +#if _LOONGARCH_SZLONG == 32

Please don't use the MIPS-like macros, as they *may* go away (once my 
https://github.com/loongson/LoongArch-Documentation/pull/28 is merged); 
you may use the architecture-independent macro __SIZEOF_LONG__ instead 
(this would become "__SIZEOF_LONG__ == 4"). Or use 
__loongarch32/__loongarch64.
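
A minimal sketch of what the __SIZEOF_LONG__ variant might look like is
below. Only __LL and __SC are shown; the remaining macros would follow
the same pattern. The #error branch and the _Static_assert are
additions for illustration, not something the patch contains.

```c
/*
 * Select the word-sized LL/SC mnemonics from __SIZEOF_LONG__, which GCC
 * and Clang predefine on every target, instead of _LOONGARCH_SZLONG.
 */
#if __SIZEOF_LONG__ == 4
#define __LL		"ll.w	"
#define __SC		"sc.w	"
#elif __SIZEOF_LONG__ == 8
#define __LL		"ll.d	"
#define __SC		"sc.d	"
#else
#error "Unsupported sizeof(long)"
#endif

/* Compile-time check that the macro really tracks sizeof(long). */
_Static_assert(sizeof(long) == __SIZEOF_LONG__, "__SIZEOF_LONG__ mismatch");
```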

> +#define __LL		"ll.w	"
> +#define __SC		"sc.w	"
> +#define __AMADD		"amadd.w	"
> +#define __AMAND_SYNC	"amand_db.w	"
"__AMAND_DB" would better match the instruction mnemonic... IIRC 
"amand_sync" is the old LoongISA-era name!
> +#define __AMOR_SYNC	"amor_db.w	"
> +#define __AMXOR_SYNC	"amxor_db.w	"
> +#elif _LOONGARCH_SZLONG == 64
> +#define __LL		"ll.d	"
> +#define __SC		"sc.d	"
> +#define __AMADD		"amadd.d	"
> +#define __AMAND_SYNC	"amand_db.d	"
> +#define __AMOR_SYNC	"amor_db.d	"
> +#define __AMXOR_SYNC	"amxor_db.d	"
> +#endif
> +
> +#define ATOMIC_INIT(i)	  { (i) }
> +
> +/*
> + * arch_atomic_read - read atomic variable
> + * @v: pointer of type atomic_t
> + *
> + * Atomically reads the value of @v.
> + */
> +#define arch_atomic_read(v)	READ_ONCE((v)->counter)
> +
> +/*
> + * arch_atomic_set - set atomic variable
> + * @v: pointer of type atomic_t
> + * @i: required value
> + *
> + * Atomically sets the value of @v to @i.
> + */
> +#define arch_atomic_set(v, i)	WRITE_ONCE((v)->counter, (i))
> +
> +#define ATOMIC_OP(op, I, asm_op)					\
> +static inline void arch_atomic_##op(int i, atomic_t *v)			\
> +{									\
> +	__asm__ __volatile__(						\
> +	"am"#asm_op"_db.w" " $zero, %1, %0	\n"			\
> +	: "+ZB" (v->counter)						\
> +	: "r" (I)							\
> +	: "memory");							\
> +}
> +
> +#define ATOMIC_OP_RETURN(op, I, asm_op, c_op)				\
> +static inline int arch_atomic_##op##_return_relaxed(int i, atomic_t *v)	\
> +{									\
> +	int result;							\
> +									\
> +	__asm__ __volatile__(						\
> +	"am"#asm_op"_db.w" " %1, %2, %0		\n"			\
> +	: "+ZB" (v->counter), "=&r" (result)				\
> +	: "r" (I)							\
> +	: "memory");							\
> +									\
> +	return result c_op I;						\
> +}
> +
> +#define ATOMIC_FETCH_OP(op, I, asm_op)					\
> +static inline int arch_atomic_fetch_##op##_relaxed(int i, atomic_t *v)	\
> +{									\
> +	int result;							\
> +									\
> +	__asm__ __volatile__(						\
> +	"am"#asm_op"_db.w" " %1, %2, %0		\n"			\
> +	: "+ZB" (v->counter), "=&r" (result)				\
> +	: "r" (I)							\
> +	: "memory");							\
> +									\
> +	return result;							\
> +}
> +
> +#define ATOMIC_OPS(op, I, asm_op, c_op)					\
> +	ATOMIC_OP(op, I, asm_op)					\
> +	ATOMIC_OP_RETURN(op, I, asm_op, c_op)				\
> +	ATOMIC_FETCH_OP(op, I, asm_op)
> +
> +ATOMIC_OPS(add, i, add, +)
> +ATOMIC_OPS(sub, -i, add, +)
> +
> +#define arch_atomic_add_return_relaxed	arch_atomic_add_return_relaxed
> +#define arch_atomic_sub_return_relaxed	arch_atomic_sub_return_relaxed
> +#define arch_atomic_fetch_add_relaxed	arch_atomic_fetch_add_relaxed
> +#define arch_atomic_fetch_sub_relaxed	arch_atomic_fetch_sub_relaxed
> +
> +#undef ATOMIC_OPS
> +
> +#define ATOMIC_OPS(op, I, asm_op)					\
> +	ATOMIC_OP(op, I, asm_op)					\
> +	ATOMIC_FETCH_OP(op, I, asm_op)
> +
> +ATOMIC_OPS(and, i, and)
> +ATOMIC_OPS(or, i, or)
> +ATOMIC_OPS(xor, i, xor)
> +
> +#define arch_atomic_fetch_and_relaxed	arch_atomic_fetch_and_relaxed
> +#define arch_atomic_fetch_or_relaxed	arch_atomic_fetch_or_relaxed
> +#define arch_atomic_fetch_xor_relaxed	arch_atomic_fetch_xor_relaxed
> +
> +#undef ATOMIC_OPS
> +#undef ATOMIC_FETCH_OP
> +#undef ATOMIC_OP_RETURN
> +#undef ATOMIC_OP
> +
> +static inline int arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
> +{
> +       int prev, rc;
> +
> +	__asm__ __volatile__ (
> +		"0:	ll.w	%[p],  %[c]\n"
> +		"	beq	%[p],  %[u], 1f\n"
> +		"	add.w	%[rc], %[p], %[a]\n"
> +		"	sc.w	%[rc], %[c]\n"
> +		"	beqz	%[rc], 0b\n"
> +		"	b	2f\n"
> +		"1:\n"
> +		__WEAK_LLSC_MB
> +		"2:\n"
> +		: [p]"=&r" (prev), [rc]"=&r" (rc),
> +		  [c]"=ZB" (v->counter)
> +		: [a]"r" (a), [u]"r" (u)
> +		: "memory");
> +
> +	return prev;
> +}
> +#define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
> +
> +/*
> + * arch_atomic_sub_if_positive - conditionally subtract integer from atomic variable
> + * @i: integer value to subtract
> + * @v: pointer of type atomic_t
> + *
> + * Atomically test @v and subtract @i if @v is greater or equal than @i.
> + * The function returns the old value of @v minus @i.
> + */
> +static inline int arch_atomic_sub_if_positive(int i, atomic_t *v)
> +{
> +	int result;
> +	int temp;
> +
> +	if (__builtin_constant_p(i)) {
> +		__asm__ __volatile__(
> +		"1:	ll.w	%1, %2		# atomic_sub_if_positive\n"
> +		"	addi.w	%0, %1, %3				\n"
> +		"	or	%1, %0, $zero				\n"
> +		"	blt	%0, $zero, 2f				\n"
> +		"	sc.w	%1, %2					\n"
> +		"	beq	$zero, %1, 1b				\n"
> +		"2:							\n"
> +		: "=&r" (result), "=&r" (temp),
> +		  "+" GCC_OFF_SMALL_ASM() (v->counter)
> +		: "I" (-i));
> +	} else {
> +		__asm__ __volatile__(
> +		"1:	ll.w	%1, %2		# atomic_sub_if_positive\n"
> +		"	sub.w	%0, %1, %3				\n"
> +		"	or	%1, %0, $zero				\n"
> +		"	blt	%0, $zero, 2f				\n"
> +		"	sc.w	%1, %2					\n"
> +		"	beq	$zero, %1, 1b				\n"
> +		"2:							\n"
> +		: "=&r" (result), "=&r" (temp),
> +		  "+" GCC_OFF_SMALL_ASM() (v->counter)
> +		: "r" (i));
> +	}
> +
> +	return result;
> +}
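The semantics described in the comment above can be restated as a plain
compare-and-swap loop. The sketch below is an assumption-laden
illustration in C11 (sub_if_positive_sketch() is a hypothetical name and
uses <stdatomic.h>, not the LL/SC sequence from the patch); it ignores
signed overflow of old - i.

```c
#include <stdatomic.h>

/*
 * Hypothetical C11 restatement: store old - i only when that result is
 * not negative, and return old - i in either case.
 */
static inline int sub_if_positive_sketch(int i, atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);
	int result;

	do {
		result = old - i;
		if (result < 0)
			break;	/* would go negative: leave *v untouched */
		/* On failure 'old' is reloaded and the loop recomputes. */
	} while (!atomic_compare_exchange_weak(v, &old, result));

	return result;
}
```

A negative return value therefore tells the caller that nothing was
subtracted, which is how arch_atomic_dec_if_positive() style helpers are
normally consumed.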
> +
> +#define arch_atomic_cmpxchg(v, o, n) (arch_cmpxchg(&((v)->counter), (o), (n)))
> +#define arch_atomic_xchg(v, new) (arch_xchg(&((v)->counter), (new)))
> +
> +/*
> + * arch_atomic_dec_if_positive - decrement by 1 if old value positive
> + * @v: pointer of type atomic_t
> + */
> +#define arch_atomic_dec_if_positive(v)	arch_atomic_sub_if_positive(1, v)
> +
> +#ifdef CONFIG_64BIT
> +
> +#define ATOMIC64_INIT(i)    { (i) }
> +
> +/*
> + * arch_atomic64_read - read atomic variable
> + * @v: pointer of type atomic64_t
> + *
> + */
> +#define arch_atomic64_read(v)	READ_ONCE((v)->counter)
> +
> +/*
> + * arch_atomic64_set - set atomic variable
> + * @v: pointer of type atomic64_t
> + * @i: required value
> + */
> +#define arch_atomic64_set(v, i)	WRITE_ONCE((v)->counter, (i))
> +
> +#define ATOMIC64_OP(op, I, asm_op)					\
> +static inline void arch_atomic64_##op(long i, atomic64_t *v)		\
> +{									\
> +	__asm__ __volatile__(						\
> +	"am"#asm_op"_db.d " " $zero, %1, %0	\n"			\
> +	: "+ZB" (v->counter)						\
> +	: "r" (I)							\
> +	: "memory");							\
> +}
> +
> +#define ATOMIC64_OP_RETURN(op, I, asm_op, c_op)					\
> +static inline long arch_atomic64_##op##_return_relaxed(long i, atomic64_t *v)	\
> +{										\
> +	long result;								\
> +	__asm__ __volatile__(							\
> +	"am"#asm_op"_db.d " " %1, %2, %0		\n"			\
> +	: "+ZB" (v->counter), "=&r" (result)					\
> +	: "r" (I)								\
> +	: "memory");								\
> +										\
> +	return result c_op I;							\
> +}
> +
> +#define ATOMIC64_FETCH_OP(op, I, asm_op)					\
> +static inline long arch_atomic64_fetch_##op##_relaxed(long i, atomic64_t *v)	\
> +{										\
> +	long result;								\
> +										\
> +	__asm__ __volatile__(							\
> +	"am"#asm_op"_db.d " " %1, %2, %0		\n"			\
> +	: "+ZB" (v->counter), "=&r" (result)					\
> +	: "r" (I)								\
> +	: "memory");								\
> +										\
> +	return result;								\
> +}
> +
> +#define ATOMIC64_OPS(op, I, asm_op, c_op)				      \
> +	ATOMIC64_OP(op, I, asm_op)					      \
> +	ATOMIC64_OP_RETURN(op, I, asm_op, c_op)				      \
> +	ATOMIC64_FETCH_OP(op, I, asm_op)
> +
> +ATOMIC64_OPS(add, i, add, +)
> +ATOMIC64_OPS(sub, -i, add, +)
> +
> +#define arch_atomic64_add_return_relaxed	arch_atomic64_add_return_relaxed
> +#define arch_atomic64_sub_return_relaxed	arch_atomic64_sub_return_relaxed
> +#define arch_atomic64_fetch_add_relaxed		arch_atomic64_fetch_add_relaxed
> +#define arch_atomic64_fetch_sub_relaxed		arch_atomic64_fetch_sub_relaxed
> +
> +#undef ATOMIC64_OPS
> +
> +#define ATOMIC64_OPS(op, I, asm_op)					      \
> +	ATOMIC64_OP(op, I, asm_op)					      \
> +	ATOMIC64_FETCH_OP(op, I, asm_op)
> +
> +ATOMIC64_OPS(and, i, and)
> +ATOMIC64_OPS(or, i, or)
> +ATOMIC64_OPS(xor, i, xor)
> +
> +#define arch_atomic64_fetch_and_relaxed	arch_atomic64_fetch_and_relaxed
> +#define arch_atomic64_fetch_or_relaxed	arch_atomic64_fetch_or_relaxed
> +#define arch_atomic64_fetch_xor_relaxed	arch_atomic64_fetch_xor_relaxed
> +
> +#undef ATOMIC64_OPS
> +#undef ATOMIC64_FETCH_OP
> +#undef ATOMIC64_OP_RETURN
> +#undef ATOMIC64_OP
> +
> +static inline long arch_atomic64_fetch_add_unless(atomic64_t *v, long a, long u)
> +{
> +       long prev, rc;
> +
> +	__asm__ __volatile__ (
> +		"0:	ll.d	%[p],  %[c]\n"
> +		"	beq	%[p],  %[u], 1f\n"
> +		"	add.d	%[rc], %[p], %[a]\n"
> +		"	sc.d	%[rc], %[c]\n"
> +		"	beqz	%[rc], 0b\n"
> +		"	b	2f\n"
> +		"1:\n"
> +		__WEAK_LLSC_MB
> +		"2:\n"
> +		: [p]"=&r" (prev), [rc]"=&r" (rc),
> +		  [c] "=ZB" (v->counter)
> +		: [a]"r" (a), [u]"r" (u)
> +		: "memory");
> +
> +	return prev;
> +}
> +#define arch_atomic64_fetch_add_unless arch_atomic64_fetch_add_unless
> +
> +/*
> + * arch_atomic64_sub_if_positive - conditionally subtract integer from atomic variable
> + * @i: integer value to subtract
> + * @v: pointer of type atomic64_t
> + *
> + * Atomically test @v and subtract @i if @v is greater or equal than @i.
> + * The function returns the old value of @v minus @i.
> + */
> +static inline long arch_atomic64_sub_if_positive(long i, atomic64_t *v)
> +{
> +	long result;
> +	long temp;
> +
> +	if (__builtin_constant_p(i)) {
> +		__asm__ __volatile__(
> +		"1:	ll.d	%1, %2	# atomic64_sub_if_positive	\n"
> +		"	addi.d	%0, %1, %3				\n"
> +		"	or	%1, %0, $zero				\n"
> +		"	blt	%0, $zero, 2f				\n"
> +		"	sc.d	%1, %2					\n"
> +		"	beq	%1, $zero, 1b				\n"
> +		"2:							\n"
> +		: "=&r" (result), "=&r" (temp),
> +		  "+" GCC_OFF_SMALL_ASM() (v->counter)
> +		: "I" (-i));
> +	} else {
> +		__asm__ __volatile__(
> +		"1:	ll.d	%1, %2	# atomic64_sub_if_positive	\n"
> +		"	sub.d	%0, %1, %3				\n"
> +		"	or	%1, %0, $zero				\n"
> +		"	blt	%0, $zero, 2f				\n"
> +		"	sc.d	%1, %2					\n"
> +		"	beq	%1, $zero, 1b				\n"
> +		"2:							\n"
> +		: "=&r" (result), "=&r" (temp),
> +		  "+" GCC_OFF_SMALL_ASM() (v->counter)
> +		: "r" (i));
> +	}
> +
> +	return result;
> +}
> +
> +#define arch_atomic64_cmpxchg(v, o, n) \
> +	((__typeof__((v)->counter))arch_cmpxchg(&((v)->counter), (o), (n)))
> +#define arch_atomic64_xchg(v, new) (arch_xchg(&((v)->counter), (new)))
> +
> +/*
> + * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
> + * @v: pointer of type atomic64_t
> + */
> +#define arch_atomic64_dec_if_positive(v)	arch_atomic64_sub_if_positive(1, v)
> +
> +#endif /* CONFIG_64BIT */
> +
> +#endif /* _ASM_ATOMIC_H */
> diff --git a/arch/loongarch/include/asm/barrier.h b/arch/loongarch/include/asm/barrier.h
> new file mode 100644
> index 000000000000..cc6c7e3f5ce6
> --- /dev/null
> +++ b/arch/loongarch/include/asm/barrier.h
> @@ -0,0 +1,51 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __ASM_BARRIER_H
> +#define __ASM_BARRIER_H
> +
> +#define __sync()	__asm__ __volatile__("dbar 0" : : : "memory")
> +
> +#define fast_wmb()	__sync()
> +#define fast_rmb()	__sync()
> +#define fast_mb()	__sync()
> +#define fast_iob()	__sync()
> +#define wbflush()	__sync()
> +
> +#define wmb()		fast_wmb()
> +#define rmb()		fast_rmb()
> +#define mb()		fast_mb()
> +#define iob()		fast_iob()
> +
> +/**
> + * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
> + * @index: array element index
> + * @size: number of elements in array
> + *
> + * Returns:
> + *     0 - (@index < @size)
> + */
> +#define array_index_mask_nospec array_index_mask_nospec
> +static inline unsigned long array_index_mask_nospec(unsigned long index,
> +						    unsigned long size)
> +{
> +	unsigned long mask;
> +
> +	__asm__ __volatile__(
> +		"sltu	%0, %1, %2\n\t"
> +#if (_LOONGARCH_SZLONG == 32)
> +		"sub.w	%0, $r0, %0\n\t"
> +#elif (_LOONGARCH_SZLONG == 64)
> +		"sub.d	%0, $r0, %0\n\t"
> +#endif
> +		: "=r" (mask)
> +		: "r" (index), "r" (size)
> +		:);
> +
> +	return mask;
> +}
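For reference, the sltu + sub pair computes the value shown in the
sketch below. index_mask_sketch() is a hypothetical name, and this plain
C form only illustrates the arithmetic: a compiler is free to turn it
into a branch, which is presumably why the patch keeps it in inline asm.

```c
/*
 * Hypothetical portable equivalent of "sltu; sub": (index < size)
 * evaluates to 1 or 0, and 0UL - 1 wraps around to all ones.
 */
static inline unsigned long index_mask_sketch(unsigned long index,
					      unsigned long size)
{
	return 0UL - (unsigned long)(index < size);
}
```

An in-bounds index thus yields ~0UL, so "index & mask" leaves it
unchanged, while an out-of-bounds index is forced to 0.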
> +
> +#include <asm-generic/barrier.h>
> +
> +#endif /* __ASM_BARRIER_H */
> diff --git a/arch/loongarch/include/asm/bitops.h b/arch/loongarch/include/asm/bitops.h
> new file mode 100644
> index 000000000000..69e00f8d8034
> --- /dev/null
> +++ b/arch/loongarch/include/asm/bitops.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_BITOPS_H
> +#define _ASM_BITOPS_H
> +
> +#include <linux/compiler.h>
> +
> +#ifndef _LINUX_BITOPS_H
> +#error only <linux/bitops.h> can be included directly
> +#endif
> +
> +#include <asm/barrier.h>
> +
> +#include <asm-generic/bitops/builtin-ffs.h>
> +#include <asm-generic/bitops/builtin-fls.h>
> +#include <asm-generic/bitops/builtin-__ffs.h>
> +#include <asm-generic/bitops/builtin-__fls.h>
> +
> +#include <asm-generic/bitops/ffz.h>
> +#include <asm-generic/bitops/fls64.h>
> +
> +#include <asm-generic/bitops/sched.h>
> +#include <asm-generic/bitops/hweight.h>
> +
> +#include <asm-generic/bitops/atomic.h>
> +#include <asm-generic/bitops/non-atomic.h>
> +#include <asm-generic/bitops/lock.h>
> +#include <asm-generic/bitops/le.h>
> +#include <asm-generic/bitops/ext2-atomic.h>
> +
> +#endif /* _ASM_BITOPS_H */
> diff --git a/arch/loongarch/include/asm/bitrev.h b/arch/loongarch/include/asm/bitrev.h
> new file mode 100644
> index 000000000000..46f275b9cdf7
> --- /dev/null
> +++ b/arch/loongarch/include/asm/bitrev.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __LOONGARCH_ASM_BITREV_H__
> +#define __LOONGARCH_ASM_BITREV_H__
> +
> +#include <linux/swab.h>
> +
> +static __always_inline __attribute_const__ u32 __arch_bitrev32(u32 x)
> +{
> +	u32 ret;
> +
> +	asm("bitrev.4b	%0, %1" : "=r"(ret) : "r"(__swab32(x)));
> +	return ret;
> +}
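The reason a byte swap followed by bitrev.4b gives a full 32-bit
reversal can be checked with the portable sketch below. All three
function names are hypothetical stand-ins: bitrev_4b_sketch() models the
per-byte bit reversal of bitrev.4b and swab32_sketch() models
__swab32().

```c
#include <stdint.h>

/* Reverse the bits inside each of the four bytes independently. */
static inline uint32_t bitrev_4b_sketch(uint32_t x)
{
	x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
	x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
	x = ((x & 0x0f0f0f0fu) << 4) | ((x >> 4) & 0x0f0f0f0fu);
	return x;
}

/* Reverse the order of the four bytes. */
static inline uint32_t swab32_sketch(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Byte order reversed, then bits within each byte: full reversal. */
static inline uint32_t bitrev32_sketch(uint32_t x)
{
	return bitrev_4b_sketch(swab32_sketch(x));
}
```

For example, 0x12345678 is byte-swapped to 0x78563412, and reversing the
bits of each byte then gives 0x1e6a2c48.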
> +
> +static __always_inline __attribute_const__ u16 __arch_bitrev16(u16 x)
> +{
> +	u16 ret;
> +
> +	asm("bitrev.4b	%0, %1" : "=r"(ret) : "r"(__swab16(x)));
> +	return ret;
> +}
> +
> +static __always_inline __attribute_const__ u8 __arch_bitrev8(u8 x)
> +{
> +	u8 ret;
> +
> +	asm("bitrev.4b	%0, %1" : "=r"(ret) : "r"(x));
> +	return ret;
> +}
> +
> +#endif /* __LOONGARCH_ASM_BITREV_H__ */
> diff --git a/arch/loongarch/include/asm/cmpxchg.h b/arch/loongarch/include/asm/cmpxchg.h
> new file mode 100644
> index 000000000000..69c3e2b7827d
> --- /dev/null
> +++ b/arch/loongarch/include/asm/cmpxchg.h
> @@ -0,0 +1,135 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __ASM_CMPXCHG_H
> +#define __ASM_CMPXCHG_H
> +
> +#include <linux/build_bug.h>
> +
> +#define __xchg_asm(amswap_db, m, val)		\
> +({						\
> +		__typeof(val) __ret;		\
> +						\
> +		__asm__ __volatile__ (		\
> +		" "amswap_db" %1, %z2, %0 \n"	\
> +		: "+ZB" (*m), "=&r" (__ret)	\
> +		: "Jr" (val)			\
> +		: "memory");			\
> +						\
> +		__ret;				\
> +})
> +
> +extern unsigned long __xchg_small(volatile void *ptr, unsigned long x,
> +				  unsigned int size);
> +
> +static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
> +				   int size)
> +{
> +	switch (size) {
> +	case 1:
> +	case 2:
> +		return __xchg_small(ptr, x, size);
> +
> +	case 4:
> +		return __xchg_asm("amswap_db.w", (volatile u32 *)ptr, (u32)x);
> +
> +	case 8:
> +		return __xchg_asm("amswap_db.d", (volatile u64 *)ptr, (u64)x);
> +
> +	default:
> +		BUILD_BUG();
> +	}
> +
> +	return 0;
> +}
> +
> +#define arch_xchg(ptr, x)						\
> +({									\
> +	__typeof__(*(ptr)) __res;					\
> +									\
> +	__res = (__typeof__(*(ptr)))					\
> +		__xchg((ptr), (unsigned long)(x), sizeof(*(ptr)));	\
> +									\
> +	__res;								\
> +})
> +
> +#define __cmpxchg_asm(ld, st, m, old, new)				\
> +({									\
> +	__typeof(old) __ret;						\
> +									\
> +	__asm__ __volatile__(						\
> +	"1:	" ld "	%0, %2		# __cmpxchg_asm \n"		\
> +	"	bne	%0, %z3, 2f			\n"		\
> +	"	or	$t0, %z4, $zero			\n"		\
> +	"	" st "	$t0, %1				\n"		\
> +	"	beq	$zero, $t0, 1b			\n"		\
> +	"2:						\n"		\
> +	: "=&r" (__ret), "=ZB"(*m)					\
> +	: "ZB"(*m), "Jr" (old), "Jr" (new)				\
> +	: "t0", "memory");						\
> +									\
> +	__ret;								\
> +})
> +
> +extern unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
> +				     unsigned long new, unsigned int size);
> +
> +static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
> +				      unsigned long new, unsigned int size)
> +{
> +	switch (size) {
> +	case 1:
> +	case 2:
> +		return __cmpxchg_small(ptr, old, new, size);
> +
> +	case 4:
> +		return __cmpxchg_asm("ll.w", "sc.w", (volatile u32 *)ptr,
> +				     (u32)old, new);
> +
> +	case 8:
> +		return __cmpxchg_asm("ll.d", "sc.d", (volatile u64 *)ptr,
> +				     (u64)old, new);
> +
> +	default:
> +		BUILD_BUG();
> +	}
> +
> +	return 0;
> +}
> +
> +#define arch_cmpxchg_local(ptr, old, new)				\
> +	((__typeof__(*(ptr)))						\
> +		__cmpxchg((ptr),					\
> +			  (unsigned long)(__typeof__(*(ptr)))(old),	\
> +			  (unsigned long)(__typeof__(*(ptr)))(new),	\
> +			  sizeof(*(ptr))))
> +
> +#define arch_cmpxchg(ptr, old, new)					\
> +({									\
> +	__typeof__(*(ptr)) __res;					\
> +									\
> +	__res = arch_cmpxchg_local((ptr), (old), (new));		\
> +									\
> +	__res;								\
> +})
> +
> +#ifdef CONFIG_64BIT
> +#define arch_cmpxchg64_local(ptr, o, n)					\
> +  ({									\
> +	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
> +	arch_cmpxchg_local((ptr), (o), (n));				\
> +  })
> +
> +#define arch_cmpxchg64(ptr, o, n)					\
> +  ({									\
> +	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
> +	arch_cmpxchg((ptr), (o), (n));					\
> +  })
> +#else
> +#include <asm-generic/cmpxchg-local.h>
> +#define arch_cmpxchg64_local(ptr, o, n) __generic_cmpxchg64_local((ptr), (o), (n))
> +#define arch_cmpxchg64(ptr, o, n) arch_cmpxchg64_local((ptr), (o), (n))
> +#endif
> +
> +#endif /* __ASM_CMPXCHG_H */
> diff --git a/arch/loongarch/include/asm/local.h b/arch/loongarch/include/asm/local.h
> new file mode 100644
> index 000000000000..2052a2267337
> --- /dev/null
> +++ b/arch/loongarch/include/asm/local.h
> @@ -0,0 +1,138 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ARCH_LOONGARCH_LOCAL_H
> +#define _ARCH_LOONGARCH_LOCAL_H
> +
> +#include <linux/percpu.h>
> +#include <linux/bitops.h>
> +#include <linux/atomic.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/compiler.h>
> +
> +typedef struct {
> +	atomic_long_t a;
> +} local_t;
> +
> +#define LOCAL_INIT(i)	{ ATOMIC_LONG_INIT(i) }
> +
> +#define local_read(l)	atomic_long_read(&(l)->a)
> +#define local_set(l, i) atomic_long_set(&(l)->a, (i))
> +
> +#define local_add(i, l) atomic_long_add((i), (&(l)->a))
> +#define local_sub(i, l) atomic_long_sub((i), (&(l)->a))
> +#define local_inc(l)	atomic_long_inc(&(l)->a)
> +#define local_dec(l)	atomic_long_dec(&(l)->a)
> +
> +/*
> + * Same as above, but return the result value
> + */
> +static inline long local_add_return(long i, local_t *l)
> +{
> +	unsigned long result;
> +
> +	__asm__ __volatile__(
> +	"   " __AMADD " %1, %2, %0      \n"
> +	: "+ZB" (l->a.counter), "=&r" (result)
> +	: "r" (i)
> +	: "memory");
> +	result = result + i;
> +
> +	return result;
> +}
> +
> +static inline long local_sub_return(long i, local_t *l)
> +{
> +	unsigned long result;
> +
> +	__asm__ __volatile__(
> +	"   " __AMADD "%1, %2, %0       \n"
> +	: "+ZB" (l->a.counter), "=&r" (result)
> +	: "r" (-i)
> +	: "memory");
> +
> +	result = result - i;
> +
> +	return result;
> +}
> +
> +#define local_cmpxchg(l, o, n) \
> +	((long)cmpxchg_local(&((l)->a.counter), (o), (n)))
> +#define local_xchg(l, n) (atomic_long_xchg((&(l)->a), (n)))
> +
> +/**
> + * local_add_unless - add unless the number is a given value
> + * @l: pointer of type local_t
> + * @a: the amount to add to l...
> + * @u: ...unless l is equal to u.
> + *
> + * Atomically adds @a to @l, so long as it was not @u.
> + * Returns non-zero if @l was not @u, and zero otherwise.
> + */
> +#define local_add_unless(l, a, u)				\
> +({								\
> +	long c, old;						\
> +	c = local_read(l);					\
> +	while (c != (u) && (old = local_cmpxchg((l), c, c + (a))) != c) \
> +		c = old;					\
> +	c != (u);						\
> +})
> +#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
> +
> +#define local_dec_return(l) local_sub_return(1, (l))
> +#define local_inc_return(l) local_add_return(1, (l))
> +
> +/*
> + * local_sub_and_test - subtract value from variable and test result
> + * @i: integer value to subtract
> + * @l: pointer of type local_t
> + *
> + * Atomically subtracts @i from @l and returns
> + * true if the result is zero, or false for all
> + * other cases.
> + */
> +#define local_sub_and_test(i, l) (local_sub_return((i), (l)) == 0)
> +
> +/*
> + * local_inc_and_test - increment and test
> + * @l: pointer of type local_t
> + *
> + * Atomically increments @l by 1
> + * and returns true if the result is zero, or false for all
> + * other cases.
> + */
> +#define local_inc_and_test(l) (local_inc_return(l) == 0)
> +
> +/*
> + * local_dec_and_test - decrement by 1 and test
> + * @l: pointer of type local_t
> + *
> + * Atomically decrements @l by 1 and
> + * returns true if the result is 0, or false for all other
> + * cases.
> + */
> +#define local_dec_and_test(l) (local_sub_return(1, (l)) == 0)
> +
> +/*
> + * local_add_negative - add and test if negative
> + * @l: pointer of type local_t
> + * @i: integer value to add
> + *
> + * Atomically adds @i to @l and returns true
> + * if the result is negative, or false when
> + * result is greater than or equal to zero.
> + */
> +#define local_add_negative(i, l) (local_add_return(i, (l)) < 0)
> +
> +/* Use these for per-cpu local_t variables: on some archs they are
> + * much more efficient than these naive implementations.  Note they take
> + * a variable, not an address.
> + */
> +
> +#define __local_inc(l)		((l)->a.counter++)
> +#define __local_dec(l)		((l)->a.counter--)
> +#define __local_add(i, l)	((l)->a.counter += (i))
> +#define __local_sub(i, l)	((l)->a.counter -= (i))
> +
> +#endif /* _ARCH_LOONGARCH_LOCAL_H */
> diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
> new file mode 100644
> index 000000000000..7d5b22ebd834
> --- /dev/null
> +++ b/arch/loongarch/include/asm/percpu.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __ASM_PERCPU_H
> +#define __ASM_PERCPU_H
> +
> +/* Use r21 for fast access */
> +register unsigned long __my_cpu_offset __asm__("$r21");
> +
> +static inline void set_my_cpu_offset(unsigned long off)
> +{
> +	__my_cpu_offset = off;
> +	csr_writeq(off, PERCPU_BASE_KS);
> +}
> +#define __my_cpu_offset __my_cpu_offset
> +
> +#include <asm-generic/percpu.h>
> +
> +#endif /* __ASM_PERCPU_H */
> diff --git a/arch/loongarch/include/asm/spinlock.h b/arch/loongarch/include/asm/spinlock.h
> new file mode 100644
> index 000000000000..7cb3476999be
> --- /dev/null
> +++ b/arch/loongarch/include/asm/spinlock.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_SPINLOCK_H
> +#define _ASM_SPINLOCK_H
> +
> +#include <asm/processor.h>
> +#include <asm/qspinlock.h>
> +#include <asm/qrwlock.h>
> +
> +#endif /* _ASM_SPINLOCK_H */
> diff --git a/arch/loongarch/include/asm/spinlock_types.h b/arch/loongarch/include/asm/spinlock_types.h
> new file mode 100644
> index 000000000000..7458d036c161
> --- /dev/null
> +++ b/arch/loongarch/include/asm/spinlock_types.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_SPINLOCK_TYPES_H
> +#define _ASM_SPINLOCK_TYPES_H
> +
> +#include <asm-generic/qspinlock_types.h>
> +#include <asm-generic/qrwlock_types.h>
> +
> +#endif

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-05-01  8:46           ` Huacai Chen
  (?)
@ 2022-05-01 11:28             ` Russell King (Oracle)
  -1 siblings, 0 replies; 94+ messages in thread
From: Russell King (Oracle) @ 2022-05-01 11:28 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Linux ARM, Catalin Marinas,
	Will Deacon, linux-riscv, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Ard Biesheuvel, linux-efi

On Sun, May 01, 2022 at 04:46:50PM +0800, Huacai Chen wrote:
> Hi, Russell,
> 
> On Sun, May 1, 2022 at 2:35 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Sun, May 01, 2022 at 01:22:25PM +0800, Huacai Chen wrote:
> > > Hi, Arnd,
> > >
> > > On Sat, Apr 30, 2022 at 7:02 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > >
> > > > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > > > >
> > > > > This patch adds zboot (self-extracting compressed kernel) support, all
> > > > > existing in-kernel compressing algorithm and efistub are supported.
> > > > >
> > > > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > >
> > > > I have no objections to adding a decompressor in principle, and
> > > > the implementation seems reasonable. However, I think we should try to
> > > > be consistent between architectures. On both arm64 and riscv, the
> > > > maintainers decided to not include a decompressor and instead leave
> > > > it up to the boot loader to decompress the kernel and enter it from there.
> > > X86, ARM32 and MIPS already support self-extracting kernel, and in
> > > 5.17 we even support self-extracting modules. So I think a
> > > self-extracting kernel is better than a pure compressed kernel.
> >
> > FYI, kernel modules are not self-extracting. They don't contain the code
> > to do the decompression - that is contained within the kernel, and it is
> > the kernel that does the decompression. The userspace tooling tells the
> > kernel that the module is compressed.
> By "self-extracting" here I mean that we don't need out-of-kernel help:
> kernel decompression doesn't need the bootloader, and module
> decompression doesn't need kmod.

As I understand it, it does require out-of-kernel help. The module
loading program needs to pass in to the finit_module syscall a flag
to tell the kernel to decompress it. See the
MODULE_INIT_COMPRESSED_FILE flag.
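
To make the point concrete, here is a minimal userspace sketch of that
handshake. It is not kmod's actual code: module_load_flags() and
load_module() are hypothetical helpers, the suffix list is an assumption,
and the flag value is copied from include/uapi/linux/module.h in case the
installed headers predate 5.17.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value from include/uapi/linux/module.h (v5.17+); defined here in case
 * the installed headers are older. */
#ifndef MODULE_INIT_COMPRESSED_FILE
#define MODULE_INIT_COMPRESSED_FILE 4
#endif

/* Hypothetical helper: choose finit_module() flags from the file suffix. */
int module_load_flags(const char *path)
{
	static const char *const suffixes[] = { ".gz", ".xz", ".zst" };
	size_t len;

	assert(path != NULL);
	len = strlen(path);

	for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++) {
		size_t slen = strlen(suffixes[i]);

		if (len > slen && strcmp(path + len - slen, suffixes[i]) == 0)
			return MODULE_INIT_COMPRESSED_FILE;
	}
	return 0;
}

/* Hypothetical loader: userspace, not the module, requests decompression. */
long load_module(const char *path, const char *params)
{
	int fd = open(path, O_RDONLY | O_CLOEXEC);
	long ret;

	if (fd < 0)
		return -1;
	ret = syscall(SYS_finit_module, fd, params, module_load_flags(path));
	close(fd);
	return ret;
}
```

Without the flag the kernel treats the file as a plain ELF object, so the
decision to decompress is made outside the kernel.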

So it's definitely not "self-extracting" by any sense of "self". My
definition of "self-extracting" is where a program contains the
extractor inside the same image, and when the program is run, it
performs the extraction using code contained within the image itself.

Your definition would mean a gzipped kernel binary would be able to
be called "self-extracting" if the boot loader decompresses it. This
is definitely not "self-extracting" in my book.

Sorry to be such a pedant.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 08/24] LoongArch: Add other common headers
  2022-04-30  9:05 ` [PATCH V9 08/24] LoongArch: Add other common headers Huacai Chen
@ 2022-05-01 11:39   ` WANG Xuerui
  2022-05-01 14:26     ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: WANG Xuerui @ 2022-05-01 11:39 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang


On 4/30/22 17:05, Huacai Chen wrote:
> This patch adds some other common headers for basic LoongArch support.
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   arch/loongarch/include/asm/asm-prototypes.h   |   7 +
>   arch/loongarch/include/asm/asm.h              | 190 +++++++++++
>   arch/loongarch/include/asm/asmmacro.h         | 294 ++++++++++++++++++
>   arch/loongarch/include/asm/clocksource.h      |  12 +
>   arch/loongarch/include/asm/compiler.h         |  15 +
>   arch/loongarch/include/asm/inst.h             |  63 ++++
>   arch/loongarch/include/asm/linkage.h          |  36 +++
>   arch/loongarch/include/asm/perf_event.h       |  10 +
>   arch/loongarch/include/asm/prefetch.h         |  29 ++
>   arch/loongarch/include/asm/serial.h           |  11 +
>   arch/loongarch/include/asm/time.h             |  50 +++
>   arch/loongarch/include/asm/timex.h            |  31 ++
>   arch/loongarch/include/asm/topology.h         |  15 +
>   arch/loongarch/include/asm/types.h            |  33 ++
>   arch/loongarch/include/uapi/asm/bitfield.h    |  15 +
>   arch/loongarch/include/uapi/asm/bitsperlong.h |   9 +
>   arch/loongarch/include/uapi/asm/byteorder.h   |  13 +
>   arch/loongarch/include/uapi/asm/inst.h        |  57 ++++
>   arch/loongarch/include/uapi/asm/reg.h         |  59 ++++
>   tools/include/uapi/asm/bitsperlong.h          |   2 +
>   20 files changed, 951 insertions(+)
>   create mode 100644 arch/loongarch/include/asm/asm-prototypes.h
>   create mode 100644 arch/loongarch/include/asm/asm.h
>   create mode 100644 arch/loongarch/include/asm/asmmacro.h
>   create mode 100644 arch/loongarch/include/asm/clocksource.h
>   create mode 100644 arch/loongarch/include/asm/compiler.h
>   create mode 100644 arch/loongarch/include/asm/inst.h
>   create mode 100644 arch/loongarch/include/asm/linkage.h
>   create mode 100644 arch/loongarch/include/asm/perf_event.h
>   create mode 100644 arch/loongarch/include/asm/prefetch.h
>   create mode 100644 arch/loongarch/include/asm/serial.h
>   create mode 100644 arch/loongarch/include/asm/time.h
>   create mode 100644 arch/loongarch/include/asm/timex.h
>   create mode 100644 arch/loongarch/include/asm/topology.h
>   create mode 100644 arch/loongarch/include/asm/types.h
>   create mode 100644 arch/loongarch/include/uapi/asm/bitfield.h
>   create mode 100644 arch/loongarch/include/uapi/asm/bitsperlong.h
>   create mode 100644 arch/loongarch/include/uapi/asm/byteorder.h
>   create mode 100644 arch/loongarch/include/uapi/asm/inst.h
>   create mode 100644 arch/loongarch/include/uapi/asm/reg.h
>
> diff --git a/arch/loongarch/include/asm/asm-prototypes.h b/arch/loongarch/include/asm/asm-prototypes.h
> new file mode 100644
> index 000000000000..ed06d3997420
> --- /dev/null
> +++ b/arch/loongarch/include/asm/asm-prototypes.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#include <linux/uaccess.h>
> +#include <asm/fpu.h>
> +#include <asm/mmu_context.h>
> +#include <asm/page.h>
> +#include <asm/ftrace.h>
> +#include <asm-generic/asm-prototypes.h>
> diff --git a/arch/loongarch/include/asm/asm.h b/arch/loongarch/include/asm/asm.h
> new file mode 100644
> index 000000000000..6de8f9e6a21e
> --- /dev/null
> +++ b/arch/loongarch/include/asm/asm.h
> @@ -0,0 +1,190 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Some useful macros for LoongArch assembler code
> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + *
> + * Derived from MIPS:
> + * Copyright (C) 1995, 1996, 1997, 1999, 2001 by Ralf Baechle
> + * Copyright (C) 1999 by Silicon Graphics, Inc.
> + * Copyright (C) 2001 MIPS Technologies, Inc.
> + * Copyright (C) 2002  Maciej W. Rozycki
> + */
> +#ifndef __ASM_ASM_H
> +#define __ASM_ASM_H
> +
> +/* LoongArch pref instruction. */
> +#ifdef CONFIG_CPU_HAS_PREFETCH
> +
> +#define PREF(hint, addr, offs)				\
> +		preld	hint, addr, offs;		\
> +
> +#define PREFX(hint, addr, index)			\
> +		preldx	hint, addr, index;		\
> +
> +#else /* !CONFIG_CPU_HAS_PREFETCH */
> +
> +#define PREF(hint, addr, offs)
> +#define PREFX(hint, addr, index)
> +
> +#endif /* !CONFIG_CPU_HAS_PREFETCH */
> +
> +/*
> + * Stack alignment
> + */
> +#define ALSZ	0xf
> +#define ALMASK	~ALSZ
Name is too ugly... why not simply "STACK_ALIGNMENT"?
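
A rough sketch of what I mean, written as plain C so it can be checked on
any host. STACK_ALIGNMENT, STACK_ALIGN_MASK and stack_align_down() are
hypothetical names, not something from the patch; the mask has the same
value as ~ALSZ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names following the suggestion above. */
#define STACK_ALIGNMENT		16UL
#define STACK_ALIGN_MASK	(~(STACK_ALIGNMENT - 1))

/* Round a stack pointer down to the previous 16-byte boundary. */
uintptr_t stack_align_down(uintptr_t sp)
{
	return sp & STACK_ALIGN_MASK;
}
```

Parenthesizing the mask also avoids surprises when the macro is used inside
a larger expression, which the bare ~ALSZ does not guarantee.
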
> +
> +/*
> + * Macros to handle different pointer/register sizes for 32/64-bit code
> + */
> +
> +/*
> + * Size of a register
> + */
> +#ifndef __loongarch64
> +#define SZREG	4
> +#else
> +#define SZREG	8
> +#endif
Better to use something like __REG_SEL from arch/riscv here (and for all
the definitions below). That way each symbol name is written only once.
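
Something along these lines, as a sketch only. LA_REG_SEL and LA_STR are
hypothetical names modeled on the riscv macro, and this version covers just
the C (stringified) case, not the __ASSEMBLY__ one. On a host that does not
define __loongarch64 it selects the 32-bit forms.

```c
#include <assert.h>
#include <string.h>

#define LA_STR(x)	#x

/* Hypothetical selector modeled on riscv's __REG_SEL(a, b): the first
 * argument is the 64-bit form, the second the 32-bit form. */
#ifdef __loongarch64
#define LA_REG_SEL(a, b)	LA_STR(a)
#define SZREG			8
#else
#define LA_REG_SEL(a, b)	LA_STR(b)
#define SZREG			4
#endif

/* Each name now appears once, with both variants side by side. */
#define REG_L	LA_REG_SEL(ld.d, ld.w)
#define REG_S	LA_REG_SEL(st.d, st.w)
#define REG_ADD	LA_REG_SEL(add.d, add.w)
#define REG_SUB	LA_REG_SEL(sub.d, sub.w)

/* Returns the load mnemonic selected for this build. */
const char *reg_load_mnemonic(void)
{
	return REG_L;
}
```
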
> +
> +/*
> + * Use the following macros in assembler code to load/store registers,
> + * pointers etc.
> + */
> +#if (SZREG == 4)
> +#define REG_L		ld.w
> +#define REG_S		st.w
> +#define REG_ADDU	add.w
> +#define REG_SUBU	sub.w

Please don't use "ADDU"; just "ADD". In LoongArch instruction mnemonics
the U suffix clearly means "unsigned", while in MIPS "addu" the "u"
actually means "unchecked for overflow" (see the MIPS manual about this
misnomer).

Similarly for "SUBU".
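
For anyone not familiar with the MIPS history, a small C model of the
difference (my own illustration, not code from either architecture;
add_wrapping() and add_would_trap() are hypothetical names, and
__builtin_add_overflow is a GCC/Clang builtin):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models MIPS "addu" and LoongArch "add.w": the sum wraps modulo 2^32 and
 * no exception is raised. The operands may be signed or unsigned; the
 * resulting bit pattern is the same. */
int32_t add_wrapping(int32_t a, int32_t b)
{
	return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Models the condition under which MIPS "add" traps: signed overflow. */
bool add_would_trap(int32_t a, int32_t b)
{
	int32_t sum;

	return __builtin_add_overflow(a, b, &sum);
}
```

Since LoongArch only has the wrapping form, there is nothing for a "U"
suffix to distinguish.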

> +#else /* SZREG == 8 */
> +#define REG_L		ld.d
> +#define REG_S		st.d
> +#define REG_ADDU	add.d
> +#define REG_SUBU	sub.d
> +#endif
> +
> +/*
> + * How to add/sub/load/store/shift C int variables.
> + */
> +#if (_LOONGARCH_SZINT == 32)
> +#define INT_ADDU	add.w
> +#define INT_ADDIU	addi.w
> +#define INT_SUBU	sub.w
> +#define INT_L		ld.w
> +#define INT_S		st.w
> +#define INT_SLL		slli.w
> +#define INT_SLLV	sll.w
> +#define INT_SRL		srli.w
> +#define INT_SRLV	srl.w
> +#define INT_SRA		srai.w
> +#define INT_SRAV	sra.w
Again, please don't carry MIPS names over.
> +#endif
> +
> +#if (_LOONGARCH_SZINT == 64)
> +#define INT_ADDU	add.d
> +#define INT_ADDIU	addi.d
> +#define INT_SUBU	sub.d
> +#define INT_L		ld.d
> +#define INT_S		st.d
> +#define INT_SLL		slli.d
> +#define INT_SLLV	sll.d
> +#define INT_SRL		srli.d
> +#define INT_SRLV	srl.d
> +#define INT_SRA		srai.d
> +#define INT_SRAV	sra.d
> +#endif
> +
> +/*
> + * How to add/sub/load/store/shift C long variables.
> + */
> +#if (_LOONGARCH_SZLONG == 32)
> +#define LONG_ADDU	add.w
> +#define LONG_ADDIU	addi.w
> +#define LONG_SUBU	sub.w
> +#define LONG_L		ld.w
> +#define LONG_S		st.w
> +#define LONG_SP		swp
Is this a typo?
> +#define LONG_SLL	slli.w
> +#define LONG_SLLV	sll.w
> +#define LONG_SRL	srli.w
> +#define LONG_SRLV	srl.w
> +#define LONG_SRA	srai.w
> +#define LONG_SRAV	sra.w
> +
> +#ifdef __ASSEMBLY__
> +#define LONG		.word
> +#endif
> +#define LONGSIZE	4
> +#define LONGMASK	3
> +#define LONGLOG		2
> +#endif
> +
> +#if (_LOONGARCH_SZLONG == 64)
> +#define LONG_ADDU	add.d
> +#define LONG_ADDIU	addi.d
> +#define LONG_SUBU	sub.d
> +#define LONG_L		ld.d
> +#define LONG_S		st.d
> +#define LONG_SP		sdp
> +#define LONG_SLL	slli.d
> +#define LONG_SLLV	sll.d
> +#define LONG_SRL	srli.d
> +#define LONG_SRLV	srl.d
> +#define LONG_SRA	srai.d
> +#define LONG_SRAV	sra.d
> +
> +#ifdef __ASSEMBLY__
> +#define LONG		.dword
> +#endif
> +#define LONGSIZE	8
> +#define LONGMASK	7
> +#define LONGLOG		3
> +#endif
> +
> +/*
> + * How to add/sub/load/store/shift pointers.
> + */
> +#if (_LOONGARCH_SZPTR == 32)
> +#define PTR_ADDU	add.w
> +#define PTR_ADDIU	addi.w
> +#define PTR_SUBU	sub.w
> +#define PTR_L		ld.w
> +#define PTR_S		st.w
> +#define PTR_LI		li.w
> +#define PTR_SLL		slli.w
> +#define PTR_SLLV	sll.w
> +#define PTR_SRL		srli.w
> +#define PTR_SRLV	srl.w
> +#define PTR_SRA		srai.w
> +#define PTR_SRAV	sra.w
> +
> +#define PTR_SCALESHIFT	2
> +
> +#define PTR		.word
> +#define PTRSIZE		4
> +#define PTRLOG		2
> +#endif
> +
> +#if (_LOONGARCH_SZPTR == 64)
> +#define PTR_ADDU	add.d
> +#define PTR_ADDIU	addi.d
> +#define PTR_SUBU	sub.d
> +#define PTR_L		ld.d
> +#define PTR_S		st.d
> +#define PTR_LI		li.d
> +#define PTR_SLL		slli.d
> +#define PTR_SLLV	sll.d
> +#define PTR_SRL		srli.d
> +#define PTR_SRLV	srl.d
> +#define PTR_SRA		srai.d
> +#define PTR_SRAV	sra.d
> +
> +#define PTR_SCALESHIFT	3
> +
> +#define PTR		.dword
> +#define PTRSIZE		8
> +#define PTRLOG		3
> +#endif
> +
> +#endif /* __ASM_ASM_H */
> diff --git a/arch/loongarch/include/asm/asmmacro.h b/arch/loongarch/include/asm/asmmacro.h
> new file mode 100644
> index 000000000000..d7089fab00e1
> --- /dev/null
> +++ b/arch/loongarch/include/asm/asmmacro.h
> @@ -0,0 +1,294 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_ASMMACRO_H
> +#define _ASM_ASMMACRO_H
> +
> +#include <asm/asm-offsets.h>
> +#include <asm/regdef.h>
> +#include <asm/fpregdef.h>
> +#include <asm/loongarch.h>
> +
> +#undef v0
> +#undef v1
> +
> +	.macro	parse_v var val
> +	\var	= \val
> +	.endm
> +
> +	.macro	parse_r var r
> +	\var	= -1
> +	.ifc	\r, $r0
> +	\var	= 0
> +	.endif
> +	.ifc	\r, $r1
> +	\var	= 1
> +	.endif
> +	.ifc	\r, $r2
> +	\var	= 2
> +	.endif
> +	.ifc	\r, $r3
> +	\var	= 3
> +	.endif
> +	.ifc	\r, $r4
> +	\var	= 4
> +	.endif
> +	.ifc	\r, $r5
> +	\var	= 5
> +	.endif
> +	.ifc	\r, $r6
> +	\var	= 6
> +	.endif
> +	.ifc	\r, $r7
> +	\var	= 7
> +	.endif
> +	.ifc	\r, $r8
> +	\var	= 8
> +	.endif
> +	.ifc	\r, $r9
> +	\var	= 9
> +	.endif
> +	.ifc	\r, $r10
> +	\var	= 10
> +	.endif
> +	.ifc	\r, $r11
> +	\var	= 11
> +	.endif
> +	.ifc	\r, $r12
> +	\var	= 12
> +	.endif
> +	.ifc	\r, $r13
> +	\var	= 13
> +	.endif
> +	.ifc	\r, $r14
> +	\var	= 14
> +	.endif
> +	.ifc	\r, $r15
> +	\var	= 15
> +	.endif
> +	.ifc	\r, $r16
> +	\var	= 16
> +	.endif
> +	.ifc	\r, $r17
> +	\var	= 17
> +	.endif
> +	.ifc	\r, $r18
> +	\var	= 18
> +	.endif
> +	.ifc	\r, $r19
> +	\var	= 19
> +	.endif
> +	.ifc	\r, $r20
> +	\var	= 20
> +	.endif
> +	.ifc	\r, $r21
> +	\var	= 21
> +	.endif
> +	.ifc	\r, $r22
> +	\var	= 22
> +	.endif
> +	.ifc	\r, $r23
> +	\var	= 23
> +	.endif
> +	.ifc	\r, $r24
> +	\var	= 24
> +	.endif
> +	.ifc	\r, $r25
> +	\var	= 25
> +	.endif
> +	.ifc	\r, $r26
> +	\var	= 26
> +	.endif
> +	.ifc	\r, $r27
> +	\var	= 27
> +	.endif
> +	.ifc	\r, $r28
> +	\var	= 28
> +	.endif
> +	.ifc	\r, $r29
> +	\var	= 29
> +	.endif
> +	.ifc	\r, $r30
> +	\var	= 30
> +	.endif
> +	.ifc	\r, $r31
> +	\var	= 31
> +	.endif
> +	.iflt	\var
> +	.error	"Unable to parse register name \r"
> +	.endif
> +	.endm
> +
> +	.macro	cpu_save_nonscratch thread
> +	stptr.d	s0, \thread, THREAD_REG23
> +	stptr.d	s1, \thread, THREAD_REG24
> +	stptr.d	s2, \thread, THREAD_REG25
> +	stptr.d	s3, \thread, THREAD_REG26
> +	stptr.d	s4, \thread, THREAD_REG27
> +	stptr.d	s5, \thread, THREAD_REG28
> +	stptr.d	s6, \thread, THREAD_REG29
> +	stptr.d	s7, \thread, THREAD_REG30
> +	stptr.d	s8, \thread, THREAD_REG31
> +	stptr.d	sp, \thread, THREAD_REG03
> +	stptr.d	fp, \thread, THREAD_REG22
> +	.endm
> +
> +	.macro	cpu_restore_nonscratch thread
> +	ldptr.d	s0, \thread, THREAD_REG23
> +	ldptr.d	s1, \thread, THREAD_REG24
> +	ldptr.d	s2, \thread, THREAD_REG25
> +	ldptr.d	s3, \thread, THREAD_REG26
> +	ldptr.d	s4, \thread, THREAD_REG27
> +	ldptr.d	s5, \thread, THREAD_REG28
> +	ldptr.d	s6, \thread, THREAD_REG29
> +	ldptr.d	s7, \thread, THREAD_REG30
> +	ldptr.d	s8, \thread, THREAD_REG31
> +	ldptr.d	ra, \thread, THREAD_REG01
> +	ldptr.d	sp, \thread, THREAD_REG03
> +	ldptr.d	fp, \thread, THREAD_REG22
> +	.endm
> +
> +	.macro fpu_save_csr thread tmp
> +	movfcsr2gr	\tmp, fcsr0
> +	stptr.w	\tmp, \thread, THREAD_FCSR
> +	.endm
> +
> +	.macro fpu_restore_csr thread tmp
> +	ldptr.w	\tmp, \thread, THREAD_FCSR
> +	movgr2fcsr	fcsr0, \tmp
> +	.endm
> +
> +	.macro fpu_save_cc thread tmp0 tmp1
> +	movcf2gr	\tmp0, $fcc0
> +	move	\tmp1, \tmp0
> +	movcf2gr	\tmp0, $fcc1
> +	bstrins.d	\tmp1, \tmp0, 15, 8
> +	movcf2gr	\tmp0, $fcc2
> +	bstrins.d	\tmp1, \tmp0, 23, 16
> +	movcf2gr	\tmp0, $fcc3
> +	bstrins.d	\tmp1, \tmp0, 31, 24
> +	movcf2gr	\tmp0, $fcc4
> +	bstrins.d	\tmp1, \tmp0, 39, 32
> +	movcf2gr	\tmp0, $fcc5
> +	bstrins.d	\tmp1, \tmp0, 47, 40
> +	movcf2gr	\tmp0, $fcc6
> +	bstrins.d	\tmp1, \tmp0, 55, 48
> +	movcf2gr	\tmp0, $fcc7
> +	bstrins.d	\tmp1, \tmp0, 63, 56
> +	stptr.d		\tmp1, \thread, THREAD_FCC
> +	.endm
> +
> +	.macro fpu_restore_cc thread tmp0 tmp1
> +	ldptr.d	\tmp0, \thread, THREAD_FCC
> +	bstrpick.d	\tmp1, \tmp0, 7, 0
> +	movgr2cf	$fcc0, \tmp1
> +	bstrpick.d	\tmp1, \tmp0, 15, 8
> +	movgr2cf	$fcc1, \tmp1
> +	bstrpick.d	\tmp1, \tmp0, 23, 16
> +	movgr2cf	$fcc2, \tmp1
> +	bstrpick.d	\tmp1, \tmp0, 31, 24
> +	movgr2cf	$fcc3, \tmp1
> +	bstrpick.d	\tmp1, \tmp0, 39, 32
> +	movgr2cf	$fcc4, \tmp1
> +	bstrpick.d	\tmp1, \tmp0, 47, 40
> +	movgr2cf	$fcc5, \tmp1
> +	bstrpick.d	\tmp1, \tmp0, 55, 48
> +	movgr2cf	$fcc6, \tmp1
> +	bstrpick.d	\tmp1, \tmp0, 63, 56
> +	movgr2cf	$fcc7, \tmp1
> +	.endm
> +
> +	.macro	fpu_save_double thread tmp
> +	li.w	\tmp, THREAD_FPR0
> +	PTR_ADDU \tmp, \tmp, \thread
> +	fst.d	$f0, \tmp, THREAD_FPR0  - THREAD_FPR0
> +	fst.d	$f1, \tmp, THREAD_FPR1  - THREAD_FPR0
> +	fst.d	$f2, \tmp, THREAD_FPR2  - THREAD_FPR0
> +	fst.d	$f3, \tmp, THREAD_FPR3  - THREAD_FPR0
> +	fst.d	$f4, \tmp, THREAD_FPR4  - THREAD_FPR0
> +	fst.d	$f5, \tmp, THREAD_FPR5  - THREAD_FPR0
> +	fst.d	$f6, \tmp, THREAD_FPR6  - THREAD_FPR0
> +	fst.d	$f7, \tmp, THREAD_FPR7  - THREAD_FPR0
> +	fst.d	$f8, \tmp, THREAD_FPR8  - THREAD_FPR0
> +	fst.d	$f9, \tmp, THREAD_FPR9  - THREAD_FPR0
> +	fst.d	$f10, \tmp, THREAD_FPR10 - THREAD_FPR0
> +	fst.d	$f11, \tmp, THREAD_FPR11 - THREAD_FPR0
> +	fst.d	$f12, \tmp, THREAD_FPR12 - THREAD_FPR0
> +	fst.d	$f13, \tmp, THREAD_FPR13 - THREAD_FPR0
> +	fst.d	$f14, \tmp, THREAD_FPR14 - THREAD_FPR0
> +	fst.d	$f15, \tmp, THREAD_FPR15 - THREAD_FPR0
> +	fst.d	$f16, \tmp, THREAD_FPR16 - THREAD_FPR0
> +	fst.d	$f17, \tmp, THREAD_FPR17 - THREAD_FPR0
> +	fst.d	$f18, \tmp, THREAD_FPR18 - THREAD_FPR0
> +	fst.d	$f19, \tmp, THREAD_FPR19 - THREAD_FPR0
> +	fst.d	$f20, \tmp, THREAD_FPR20 - THREAD_FPR0
> +	fst.d	$f21, \tmp, THREAD_FPR21 - THREAD_FPR0
> +	fst.d	$f22, \tmp, THREAD_FPR22 - THREAD_FPR0
> +	fst.d	$f23, \tmp, THREAD_FPR23 - THREAD_FPR0
> +	fst.d	$f24, \tmp, THREAD_FPR24 - THREAD_FPR0
> +	fst.d	$f25, \tmp, THREAD_FPR25 - THREAD_FPR0
> +	fst.d	$f26, \tmp, THREAD_FPR26 - THREAD_FPR0
> +	fst.d	$f27, \tmp, THREAD_FPR27 - THREAD_FPR0
> +	fst.d	$f28, \tmp, THREAD_FPR28 - THREAD_FPR0
> +	fst.d	$f29, \tmp, THREAD_FPR29 - THREAD_FPR0
> +	fst.d	$f30, \tmp, THREAD_FPR30 - THREAD_FPR0
> +	fst.d	$f31, \tmp, THREAD_FPR31 - THREAD_FPR0
> +	.endm
> +
> +	.macro	fpu_restore_double thread tmp
> +	li.w	\tmp, THREAD_FPR0
> +	PTR_ADDU \tmp, \tmp, \thread
> +	fld.d	$f0, \tmp, THREAD_FPR0  - THREAD_FPR0
> +	fld.d	$f1, \tmp, THREAD_FPR1  - THREAD_FPR0
> +	fld.d	$f2, \tmp, THREAD_FPR2  - THREAD_FPR0
> +	fld.d	$f3, \tmp, THREAD_FPR3  - THREAD_FPR0
> +	fld.d	$f4, \tmp, THREAD_FPR4  - THREAD_FPR0
> +	fld.d	$f5, \tmp, THREAD_FPR5  - THREAD_FPR0
> +	fld.d	$f6, \tmp, THREAD_FPR6  - THREAD_FPR0
> +	fld.d	$f7, \tmp, THREAD_FPR7  - THREAD_FPR0
> +	fld.d	$f8, \tmp, THREAD_FPR8  - THREAD_FPR0
> +	fld.d	$f9, \tmp, THREAD_FPR9  - THREAD_FPR0
> +	fld.d	$f10, \tmp, THREAD_FPR10 - THREAD_FPR0
> +	fld.d	$f11, \tmp, THREAD_FPR11 - THREAD_FPR0
> +	fld.d	$f12, \tmp, THREAD_FPR12 - THREAD_FPR0
> +	fld.d	$f13, \tmp, THREAD_FPR13 - THREAD_FPR0
> +	fld.d	$f14, \tmp, THREAD_FPR14 - THREAD_FPR0
> +	fld.d	$f15, \tmp, THREAD_FPR15 - THREAD_FPR0
> +	fld.d	$f16, \tmp, THREAD_FPR16 - THREAD_FPR0
> +	fld.d	$f17, \tmp, THREAD_FPR17 - THREAD_FPR0
> +	fld.d	$f18, \tmp, THREAD_FPR18 - THREAD_FPR0
> +	fld.d	$f19, \tmp, THREAD_FPR19 - THREAD_FPR0
> +	fld.d	$f20, \tmp, THREAD_FPR20 - THREAD_FPR0
> +	fld.d	$f21, \tmp, THREAD_FPR21 - THREAD_FPR0
> +	fld.d	$f22, \tmp, THREAD_FPR22 - THREAD_FPR0
> +	fld.d	$f23, \tmp, THREAD_FPR23 - THREAD_FPR0
> +	fld.d	$f24, \tmp, THREAD_FPR24 - THREAD_FPR0
> +	fld.d	$f25, \tmp, THREAD_FPR25 - THREAD_FPR0
> +	fld.d	$f26, \tmp, THREAD_FPR26 - THREAD_FPR0
> +	fld.d	$f27, \tmp, THREAD_FPR27 - THREAD_FPR0
> +	fld.d	$f28, \tmp, THREAD_FPR28 - THREAD_FPR0
> +	fld.d	$f29, \tmp, THREAD_FPR29 - THREAD_FPR0
> +	fld.d	$f30, \tmp, THREAD_FPR30 - THREAD_FPR0
> +	fld.d	$f31, \tmp, THREAD_FPR31 - THREAD_FPR0
> +	.endm
> +
> +.macro not dst src
> +	nor	\dst, \src, zero
> +.endm
> +
> +.macro bgt r0 r1 label
> +	blt	\r1, \r0, \label
> +.endm
> +
> +.macro bltz r0 label
> +	blt	\r0, zero, \label
> +.endm
> +
> +.macro bgez r0 label
> +	bge	\r0, zero, \label
> +.endm
These are all supported in upstream binutils, so you can just remove them.
> +
> +#define v0 $r4
> +#define v1 $r5
If you removed every mention of v0 and v1, this would be unnecessary as 
well. ;-)
> +#endif /* _ASM_ASMMACRO_H */
> diff --git a/arch/loongarch/include/asm/clocksource.h b/arch/loongarch/include/asm/clocksource.h
> new file mode 100644
> index 000000000000..58e64aa05d26
> --- /dev/null
> +++ b/arch/loongarch/include/asm/clocksource.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#ifndef __ASM_CLOCKSOURCE_H
> +#define __ASM_CLOCKSOURCE_H
> +
> +#include <asm/vdso/clocksource.h>
> +
> +#endif /* __ASM_CLOCKSOURCE_H */
> diff --git a/arch/loongarch/include/asm/compiler.h b/arch/loongarch/include/asm/compiler.h
> new file mode 100644
> index 000000000000..657cebe70ace
> --- /dev/null
> +++ b/arch/loongarch/include/asm/compiler.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_COMPILER_H
> +#define _ASM_COMPILER_H
> +
> +#define GCC_OFF_SMALL_ASM() "ZC"
> +
> +#define LOONGARCH_ISA_LEVEL "loongarch"
> +#define LOONGARCH_ISA_ARCH_LEVEL "arch=loongarch"
> +#define LOONGARCH_ISA_LEVEL_RAW loongarch
Do these need updating? I remember "-march=loongarch" is an old-world thing.
> +#define LOONGARCH_ISA_ARCH_LEVEL_RAW LOONGARCH_ISA_LEVEL_RAW
> +
> +#endif /* _ASM_COMPILER_H */
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> new file mode 100644
> index 000000000000..46166ee1e33f
> --- /dev/null
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -0,0 +1,63 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_INST_H
> +#define _ASM_INST_H
> +
> +#include <linux/types.h>
> +#include <asm/asm.h>
> +#include <uapi/asm/inst.h>
> +
> +#define ADDR_IMMMASK_LU52ID	0xFFF0000000000000
> +#define ADDR_IMMMASK_LU32ID	0x000FFFFF00000000
> +#define ADDR_IMMMASK_ADDU16ID	0x00000000FFFF0000
> +
> +#define ADDR_IMMSHIFT_LU52ID	52
> +#define ADDR_IMMSHIFT_LU32ID	32
> +#define ADDR_IMMSHIFT_ADDU16ID	16
> +
> +#define ADDR_IMM(addr, INSN)	((addr & ADDR_IMMMASK_##INSN) >> ADDR_IMMSHIFT_##INSN)
> +
> +enum loongarch_gpr {
> +	LOONGARCH_GPR_ZERO = 0,
> +	LOONGARCH_GPR_RA = 1,
> +	LOONGARCH_GPR_TP = 2,
> +	LOONGARCH_GPR_SP = 3,
> +	LOONGARCH_GPR_A0 = 4,
> +	LOONGARCH_GPR_A1,
> +	LOONGARCH_GPR_A2,
> +	LOONGARCH_GPR_A3,
> +	LOONGARCH_GPR_A4,
> +	LOONGARCH_GPR_A5,
> +	LOONGARCH_GPR_A6,
> +	LOONGARCH_GPR_A7,
> +	LOONGARCH_GPR_V0 = 4,
> +	LOONGARCH_GPR_V1 = 5,
> +	LOONGARCH_GPR_T0 = 12,
> +	LOONGARCH_GPR_T1,
> +	LOONGARCH_GPR_T2,
> +	LOONGARCH_GPR_T3,
> +	LOONGARCH_GPR_T4,
> +	LOONGARCH_GPR_T5,
> +	LOONGARCH_GPR_T6,
> +	LOONGARCH_GPR_T7,
> +	LOONGARCH_GPR_T8,
> +	LOONGARCH_GPR_FP = 22,
> +	LOONGARCH_GPR_S0 = 23,
> +	LOONGARCH_GPR_S1,
> +	LOONGARCH_GPR_S2,
> +	LOONGARCH_GPR_S3,
> +	LOONGARCH_GPR_S4,
> +	LOONGARCH_GPR_S5,
> +	LOONGARCH_GPR_S6,
> +	LOONGARCH_GPR_S7,
> +	LOONGARCH_GPR_S8,
> +	LOONGARCH_GPR_MAX
> +};
> +
> +u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
> +u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> +u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, unsigned long pc, unsigned long dest);
> +
> +#endif /* _ASM_INST_H */
> diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h
> new file mode 100644
> index 000000000000..283b3389b561
> --- /dev/null
> +++ b/arch/loongarch/include/asm/linkage.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_LINKAGE_H
> +#define __ASM_LINKAGE_H
> +
> +#define __ALIGN		.align 2
> +#define __ALIGN_STR	".align 2"
Just use __stringify(__ALIGN) where needed.
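For illustration, here is a minimal userspace sketch of the idea. It
assumes a two-level macro that behaves like the kernel's __stringify()
from <linux/stringify.h>; SKETCH_STRINGIFY, SKETCH_ALIGN and
sketch_align_str() are made-up names for this example only, not part of
the patch:

```c
/*
 * Two-level expansion: the outer macro expands its argument first, the
 * inner one turns the result into a string literal. This is a
 * hypothetical local stand-in for the kernel's __stringify().
 */
#define SKETCH_STRINGIFY_1(x)	#x
#define SKETCH_STRINGIFY(x)	SKETCH_STRINGIFY_1(x)

/* Stand-in for __ALIGN from the patch. */
#define SKETCH_ALIGN		.align 2

/* Returns the directive as a string, derived from the single definition. */
static const char *sketch_align_str(void)
{
	return SKETCH_STRINGIFY(SKETCH_ALIGN);
}
```

With this, the string form cannot drift away from the directive itself,
because there is only one definition to update.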
> +
> +#define SYM_FUNC_START(name)				\
> +	SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)	\
> +	.cfi_startproc;
> +
> +#define SYM_FUNC_START_NOALIGN(name)			\
> +	SYM_START(name, SYM_L_GLOBAL, SYM_A_NONE)	\
> +	.cfi_startproc;
> +
> +#define SYM_FUNC_START_LOCAL(name)			\
> +	SYM_START(name, SYM_L_LOCAL, SYM_A_ALIGN)	\
> +	.cfi_startproc;
> +
> +#define SYM_FUNC_START_LOCAL_NOALIGN(name)		\
> +	SYM_START(name, SYM_L_LOCAL, SYM_A_NONE)	\
> +	.cfi_startproc;
> +
> +#define SYM_FUNC_START_WEAK(name)			\
> +	SYM_START(name, SYM_L_WEAK, SYM_A_ALIGN)	\
> +	.cfi_startproc;
> +
> +#define SYM_FUNC_START_WEAK_NOALIGN(name)		\
> +	SYM_START(name, SYM_L_WEAK, SYM_A_NONE)		\
> +	.cfi_startproc;
> +
> +#define SYM_FUNC_END(name)				\
> +	.cfi_endproc;					\
> +	SYM_END(name, SYM_T_FUNC)
> +
> +#endif
> diff --git a/arch/loongarch/include/asm/perf_event.h b/arch/loongarch/include/asm/perf_event.h
> new file mode 100644
> index 000000000000..44293ec8c153
> --- /dev/null
> +++ b/arch/loongarch/include/asm/perf_event.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Author: Huacai Chen <chenhuacai@loongson.cn>
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#ifndef __LOONGARCH_PERF_EVENT_H__
> +#define __LOONGARCH_PERF_EVENT_H__
> +/* Leave it empty here. The file is required by linux/perf_event.h */
> +#endif /* __LOONGARCH_PERF_EVENT_H__ */
> diff --git a/arch/loongarch/include/asm/prefetch.h b/arch/loongarch/include/asm/prefetch.h
> new file mode 100644
> index 000000000000..1672262a5e2e
> --- /dev/null
> +++ b/arch/loongarch/include/asm/prefetch.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __ASM_PREFETCH_H
> +#define __ASM_PREFETCH_H
> +
> +#define Pref_Load	0
> +#define Pref_Store	8
> +
> +#ifdef __ASSEMBLY__
> +
> +	.macro	__pref hint addr
> +#ifdef CONFIG_CPU_HAS_PREFETCH
> +	preld	\hint, \addr, 0
> +#endif
> +	.endm
> +
> +	.macro	pref_load addr
> +	__pref	Pref_Load, \addr
> +	.endm
> +
> +	.macro	pref_store addr
> +	__pref	Pref_Store, \addr
> +	.endm
> +
> +#endif
> +
> +#endif /* __ASM_PREFETCH_H */
> diff --git a/arch/loongarch/include/asm/serial.h b/arch/loongarch/include/asm/serial.h
> new file mode 100644
> index 000000000000..3fb550eb9115
> --- /dev/null
> +++ b/arch/loongarch/include/asm/serial.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __ASM__SERIAL_H
> +#define __ASM__SERIAL_H
> +
> +#define BASE_BAUD 0
> +#define STD_COM_FLAGS (ASYNC_BOOT_AUTOCONF | ASYNC_SKIP_TEST)
> +
> +#endif /* __ASM__SERIAL_H */
> diff --git a/arch/loongarch/include/asm/time.h b/arch/loongarch/include/asm/time.h
> new file mode 100644
> index 000000000000..ace1665695b8
> --- /dev/null
> +++ b/arch/loongarch/include/asm/time.h
> @@ -0,0 +1,50 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_TIME_H
> +#define _ASM_TIME_H
> +
> +#include <linux/clockchips.h>
> +#include <linux/clocksource.h>
> +#include <asm/loongarch.h>
> +
> +extern u64 cpu_clock_freq;
> +extern u64 const_clock_freq;
> +
> +extern void sync_counter(void);
> +
> +static inline unsigned int calc_const_freq(void)
> +{
> +	unsigned int res;
> +	unsigned int base_freq;
> +	unsigned int cfm, cfd;
> +
> +	res = read_cpucfg(LOONGARCH_CPUCFG2);
> +	if (!(res & CPUCFG2_LLFTP))
> +		return 0;
> +
> +	base_freq = read_cpucfg(LOONGARCH_CPUCFG4);
> +	res = read_cpucfg(LOONGARCH_CPUCFG5);
> +	cfm = res & 0xffff;
> +	cfd = (res >> 16) & 0xffff;
> +
> +	if (!base_freq || !cfm || !cfd)
> +		return 0;
> +	else
> +		return (base_freq * cfm / cfd);
No need for the "else" here.
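Roughly like this. It is only a sketch: the values the patch reads via
read_cpucfg() are passed in as parameters here so the control flow can
be shown on its own, and sketch_calc_const_freq() is a hypothetical
name:

```c
/*
 * Hypothetical standalone variant of calc_const_freq(): base_freq, cfm
 * and cfd stand in for the values read from CPUCFG4 and CPUCFG5.
 */
static unsigned int sketch_calc_const_freq(unsigned int base_freq,
					   unsigned int cfm, unsigned int cfd)
{
	if (!base_freq || !cfm || !cfd)
		return 0;

	/* The early return above makes an "else" branch redundant. */
	return base_freq * cfm / cfd;
}
```

The arithmetic is unchanged from the patch, including the fact that the
multiplication is done in unsigned int.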
> +}
> +
> +/*
> + * Initialize the calling CPU's timer interrupt as clockevent device
> + */
> +extern int constant_clockevent_init(void);
> +extern int constant_clocksource_init(void);
> +
> +static inline void clockevent_set_clock(struct clock_event_device *cd,
> +					unsigned int clock)
> +{
> +	clockevents_calc_mult_shift(cd, clock, 4);
> +}
> +
> +#endif /* _ASM_TIME_H */
> diff --git a/arch/loongarch/include/asm/timex.h b/arch/loongarch/include/asm/timex.h
> new file mode 100644
> index 000000000000..3f8db082f00d
> --- /dev/null
> +++ b/arch/loongarch/include/asm/timex.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_TIMEX_H
> +#define _ASM_TIMEX_H
> +
> +#ifdef __KERNEL__
> +
> +#include <linux/compiler.h>
> +
> +#include <asm/cpu.h>
> +#include <asm/cpu-features.h>
> +
> +/*
> + * Standard way to access the cycle counter.
> + * Currently only used on SMP for scheduling.
> + *
> + * We know that all SMP capable CPUs have cycle counters.
> + */
> +
> +typedef unsigned long cycles_t;
> +
> +static inline cycles_t get_cycles(void)
> +{
> +	return drdtime();
> +}
> +
> +#endif /* __KERNEL__ */
> +
> +#endif /*  _ASM_TIMEX_H */
> diff --git a/arch/loongarch/include/asm/topology.h b/arch/loongarch/include/asm/topology.h
> new file mode 100644
> index 000000000000..9ac71a25207a
> --- /dev/null
> +++ b/arch/loongarch/include/asm/topology.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __ASM_TOPOLOGY_H
> +#define __ASM_TOPOLOGY_H
> +
> +#include <linux/smp.h>
> +
> +#define cpu_logical_map(cpu)  0
> +
> +#include <asm-generic/topology.h>
> +
> +static inline void arch_fix_phys_package_id(int num, u32 slot) { }
> +#endif /* __ASM_TOPOLOGY_H */
> diff --git a/arch/loongarch/include/asm/types.h b/arch/loongarch/include/asm/types.h
> new file mode 100644
> index 000000000000..f783cf11ea52
> --- /dev/null
> +++ b/arch/loongarch/include/asm/types.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_TYPES_H
> +#define _ASM_TYPES_H
> +
> +#include <asm-generic/int-ll64.h>
> +#include <uapi/asm/types.h>
> +
> +/*
> + * The following macros are especially useful for __asm__
> + * inline assembler.
> + */
> +#ifndef __STR
> +#define __STR(x) #x
> +#endif
> +#ifndef STR
> +#define STR(x) __STR(x)
> +#endif
Again, just use __stringify from <linux/stringify.h> where appropriate.
> +
> +/*
> + *  Configure language
> + */
> +#ifdef __ASSEMBLY__
> +#define _ULCAST_
> +#define _U64CAST_
> +#else
> +#define _ULCAST_ (unsigned long)
> +#define _U64CAST_ (u64)
> +#endif
> +
> +#endif /* _ASM_TYPES_H */
> diff --git a/arch/loongarch/include/uapi/asm/bitfield.h b/arch/loongarch/include/uapi/asm/bitfield.h
> new file mode 100644
> index 000000000000..e31a719b7007
> --- /dev/null
> +++ b/arch/loongarch/include/uapi/asm/bitfield.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/*
> + * Author: Hanlu Li <lihanlu@loongson.cn>
> + *         Huacai Chen <chenhuacai@loongson.cn>
> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef __UAPI_ASM_BITFIELD_H
> +#define __UAPI_ASM_BITFIELD_H
> +
> +#define __BITFIELD_FIELD(field, more)					\
> +	more								\
> +	field;
> +
> +#endif /* __UAPI_ASM_BITFIELD_H */
> diff --git a/arch/loongarch/include/uapi/asm/bitsperlong.h b/arch/loongarch/include/uapi/asm/bitsperlong.h
> new file mode 100644
> index 000000000000..5c2c8779a695
> --- /dev/null
> +++ b/arch/loongarch/include/uapi/asm/bitsperlong.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +#ifndef __ASM_LOONGARCH_BITSPERLONG_H
> +#define __ASM_LOONGARCH_BITSPERLONG_H
> +
> +#define __BITS_PER_LONG _LOONGARCH_SZLONG
Use __loongarch_grlen instead of the MIPS-like symbol, or 
__SIZEOF_LONG__ * 8.
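A quick sketch of the second option, assuming a GCC- or Clang-compatible
compiler (both predefine __SIZEOF_LONG__). SKETCH_BITS_PER_LONG and
sketch_bits_per_long() are made-up names for this example; the real
header would define __BITS_PER_LONG directly:

```c
#include <limits.h>

/* Hypothetical name; the header would define __BITS_PER_LONG this way. */
#define SKETCH_BITS_PER_LONG	(__SIZEOF_LONG__ * 8)

/* Compile-time check that the macro agrees with the real width of long. */
_Static_assert(SKETCH_BITS_PER_LONG == sizeof(long) * CHAR_BIT,
	       "SKETCH_BITS_PER_LONG must match the width of long");

static int sketch_bits_per_long(void)
{
	return SKETCH_BITS_PER_LONG;
}
```

Since the expression consists only of integer constants, it can also be
used in preprocessor conditionals.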
> +
> +#include <asm-generic/bitsperlong.h>
> +
> +#endif /* __ASM_LOONGARCH_BITSPERLONG_H */
> diff --git a/arch/loongarch/include/uapi/asm/byteorder.h b/arch/loongarch/include/uapi/asm/byteorder.h
> new file mode 100644
> index 000000000000..b1722d890deb
> --- /dev/null
> +++ b/arch/loongarch/include/uapi/asm/byteorder.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/*
> + * Author: Hanlu Li <lihanlu@loongson.cn>
> + *         Huacai Chen <chenhuacai@loongson.cn>
> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _ASM_BYTEORDER_H
> +#define _ASM_BYTEORDER_H
> +
> +#include <linux/byteorder/little_endian.h>
> +
> +#endif /* _ASM_BYTEORDER_H */
> diff --git a/arch/loongarch/include/uapi/asm/inst.h b/arch/loongarch/include/uapi/asm/inst.h
The file is named "inst" while its contents talk all about "insn"... and
do we even need this in the UAPI? We have to act quickly before merging,
or this unfortunate legacy from MIPS will continue to plague us.
> new file mode 100644
> index 000000000000..fa00cc5ede9d
> --- /dev/null
> +++ b/arch/loongarch/include/uapi/asm/inst.h
> @@ -0,0 +1,57 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Format of an instruction in memory.
> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +#ifndef _UAPI_ASM_INST_H
> +#define _UAPI_ASM_INST_H
> +
> +#include <asm/bitfield.h>
> +
> +enum reg1i20_op {
> +	lu12iw_op	= 0x0a,
> +	lu32id_op	= 0x0b,
> +};
> +
> +enum reg2i12_op {
> +	lu52id_op	= 0x0c,
> +};
> +
> +enum reg2i16_op {
> +	jirl_op		= 0x13,
> +};
> +
> +struct reg1i20_format {
> +	__BITFIELD_FIELD(unsigned int opcode : 7,
> +	__BITFIELD_FIELD(unsigned int simmediate : 20,
> +	__BITFIELD_FIELD(unsigned int rd : 5,
> +	;)))
> +};
> +
> +struct reg2i12_format {
> +	__BITFIELD_FIELD(unsigned int opcode : 10,
> +	__BITFIELD_FIELD(signed int simmediate : 12,
> +	__BITFIELD_FIELD(unsigned int rj : 5,
> +	__BITFIELD_FIELD(unsigned int rd : 5,
> +	;))))
> +};
> +
> +struct reg2i16_format {
> +	__BITFIELD_FIELD(unsigned int opcode : 6,
> +	__BITFIELD_FIELD(unsigned int simmediate : 16,
> +	__BITFIELD_FIELD(unsigned int rj : 5,
> +	__BITFIELD_FIELD(unsigned int rd : 5,
> +	;))))
> +};
> +
> +union loongarch_instruction {
> +	unsigned int word;
> +	struct reg1i20_format reg1i20_format;
> +	struct reg2i12_format reg2i12_format;
> +	struct reg2i16_format reg2i16_format;

So the official names for instruction formats are like "2RI12" and 
"2RI16", while "1RI20" is not even one of the 9 "basic" formats... You 
may call the formats "fmt_XriXX", or better yet, use the 
loongarch-opcodes scheme [1], which is better defined than the official 
one (that would give something like "struct fmt_djsk12 djsk12;").

Also, while at it, you could shorten "instruction" to "insn" like almost 
everywhere else (e.g. a few lines below).

[1]: https://github.com/loongson-community/loongarch-opcodes
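
To make the naming suggestion concrete, here is a rough sketch of how
the declarations might look. The format names (dsj20, djsk12, djsk16)
are my reading of the loongarch-opcodes notation for these three
formats, and sketch_encode_lu12iw() is a hypothetical helper. The
fields are written out from bit 0 upwards instead of going through
__BITFIELD_FIELD; that matches GCC's layout on little-endian targets,
which is an assumption of this sketch, since bitfield layout is
implementation-defined:

```c
#include <stdint.h>

/* Hypothetical renamed formats, fields listed from bit 0 upwards. */
struct fmt_dsj20 {		/* 7-bit opcode, e.g. lu12i.w */
	unsigned int rd : 5;
	unsigned int imm : 20;
	unsigned int opcode : 7;
};

struct fmt_djsk12 {		/* 10-bit opcode, e.g. lu52i.d */
	unsigned int rd : 5;
	unsigned int rj : 5;
	unsigned int imm : 12;
	unsigned int opcode : 10;
};

struct fmt_djsk16 {		/* 6-bit opcode, e.g. jirl */
	unsigned int rd : 5;
	unsigned int rj : 5;
	unsigned int imm : 16;
	unsigned int opcode : 6;
};

union loongarch_insn {
	uint32_t word;
	struct fmt_dsj20 dsj20;
	struct fmt_djsk12 djsk12;
	struct fmt_djsk16 djsk16;
};

/* Hypothetical helper: build "lu12i.w rd, imm20" as a 32-bit word. */
static uint32_t sketch_encode_lu12iw(unsigned int rd, unsigned int imm20)
{
	union loongarch_insn insn = { .word = 0 };

	insn.dsj20.opcode = 0x0a;	/* lu12iw_op from the patch */
	insn.dsj20.imm = imm20;		/* truncated to 20 bits */
	insn.dsj20.rd = rd;		/* truncated to 5 bits */

	return insn.word;
}
```

The immediates are all declared unsigned here for simplicity; whether
each one should be signed is a separate question from the naming.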

> +};
> +
> +#define LOONGARCH_INSN_SIZE	sizeof(union loongarch_instruction)
> +
> +#endif /* _UAPI_ASM_INST_H */
> diff --git a/arch/loongarch/include/uapi/asm/reg.h b/arch/loongarch/include/uapi/asm/reg.h
> new file mode 100644
> index 000000000000..90ad910c60eb
> --- /dev/null
> +++ b/arch/loongarch/include/uapi/asm/reg.h
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Various register offset definitions for debuggers, core file
> + * examiners and whatnot.
> + *
> + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> + */
> +
> +#ifndef __UAPI_ASM_LOONGARCH_REG_H
> +#define __UAPI_ASM_LOONGARCH_REG_H
> +
> +#define LOONGARCH_EF_R0		0
> +#define LOONGARCH_EF_R1		1
> +#define LOONGARCH_EF_R2		2
> +#define LOONGARCH_EF_R3		3
> +#define LOONGARCH_EF_R4		4
> +#define LOONGARCH_EF_R5		5
> +#define LOONGARCH_EF_R6		6
> +#define LOONGARCH_EF_R7		7
> +#define LOONGARCH_EF_R8		8
> +#define LOONGARCH_EF_R9		9
> +#define LOONGARCH_EF_R10	10
> +#define LOONGARCH_EF_R11	11
> +#define LOONGARCH_EF_R12	12
> +#define LOONGARCH_EF_R13	13
> +#define LOONGARCH_EF_R14	14
> +#define LOONGARCH_EF_R15	15
> +#define LOONGARCH_EF_R16	16
> +#define LOONGARCH_EF_R17	17
> +#define LOONGARCH_EF_R18	18
> +#define LOONGARCH_EF_R19	19
> +#define LOONGARCH_EF_R20	20
> +#define LOONGARCH_EF_R21	21
> +#define LOONGARCH_EF_R22	22
> +#define LOONGARCH_EF_R23	23
> +#define LOONGARCH_EF_R24	24
> +#define LOONGARCH_EF_R25	25
> +#define LOONGARCH_EF_R26	26
> +#define LOONGARCH_EF_R27	27
> +#define LOONGARCH_EF_R28	28
> +#define LOONGARCH_EF_R29	29
> +#define LOONGARCH_EF_R30	30
> +#define LOONGARCH_EF_R31	31
> +
> +/*
> + * Saved special registers
> + */
> +#define LOONGARCH_EF_ORIG_A0	32
> +#define LOONGARCH_EF_CSR_ERA	33
> +#define LOONGARCH_EF_CSR_BADV	34
> +#define LOONGARCH_EF_CSR_CRMD	35
> +#define LOONGARCH_EF_CSR_PRMD	36
> +#define LOONGARCH_EF_CSR_EUEN	37
> +#define LOONGARCH_EF_CSR_ECFG	38
> +#define LOONGARCH_EF_CSR_ESTAT	39
> +
> +#define LOONGARCH_EF_SIZE	320	/* size in bytes */
> +
> +#endif /* __UAPI_ASM_LOONGARCH_REG_H */
> diff --git a/tools/include/uapi/asm/bitsperlong.h b/tools/include/uapi/asm/bitsperlong.h
> index edba4d93e9e6..da5206517158 100644
> --- a/tools/include/uapi/asm/bitsperlong.h
> +++ b/tools/include/uapi/asm/bitsperlong.h
> @@ -17,6 +17,8 @@
>   #include "../../../arch/riscv/include/uapi/asm/bitsperlong.h"
>   #elif defined(__alpha__)
>   #include "../../../arch/alpha/include/uapi/asm/bitsperlong.h"
> +#elif defined(__loongarch__)
> +#include "../../../arch/loongarch/include/uapi/asm/bitsperlong.h"
>   #else
>   #include <asm-generic/bitsperlong.h>
>   #endif


* Re: [PATCH V9 17/24] LoongArch: Add some library functions
  2022-05-01 10:55   ` Guo Ren
@ 2022-05-01 12:18     ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01 12:18 UTC (permalink / raw)
  To: Guo Ren
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, Linux Doc Mailing List,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Xuerui Wang,
	Jiaxun Yang

Hi, Ren,

On Sun, May 1, 2022 at 6:56 PM Guo Ren <guoren@kernel.org> wrote:
>
> On Sat, Apr 30, 2022 at 5:23 PM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds some library functions for LoongArch, including: delay,
> > memset, memcpy, memmove, copy_user, strncpy_user, strnlen_user and tlb
> > dump functions.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  arch/loongarch/include/asm/delay.h  |  26 +++++++
> >  arch/loongarch/include/asm/string.h |  17 +++++
> >  arch/loongarch/lib/clear_user.S     |  43 +++++++++++
> >  arch/loongarch/lib/copy_user.S      |  47 ++++++++++++
> >  arch/loongarch/lib/delay.c          |  43 +++++++++++
> >  arch/loongarch/lib/dump_tlb.c       | 111 ++++++++++++++++++++++++++++
> >  arch/loongarch/lib/memcpy.S         |  32 ++++++++
> >  arch/loongarch/lib/memmove.S        |  45 +++++++++++
> >  arch/loongarch/lib/memset.S         |  30 ++++++++
> >  9 files changed, 394 insertions(+)
> >  create mode 100644 arch/loongarch/include/asm/delay.h
> >  create mode 100644 arch/loongarch/include/asm/string.h
> >  create mode 100644 arch/loongarch/lib/clear_user.S
> >  create mode 100644 arch/loongarch/lib/copy_user.S
> >  create mode 100644 arch/loongarch/lib/delay.c
> >  create mode 100644 arch/loongarch/lib/dump_tlb.c
> >  create mode 100644 arch/loongarch/lib/memcpy.S
> >  create mode 100644 arch/loongarch/lib/memmove.S
> >  create mode 100644 arch/loongarch/lib/memset.S
> >
> > diff --git a/arch/loongarch/include/asm/delay.h b/arch/loongarch/include/asm/delay.h
> > new file mode 100644
> > index 000000000000..016b3aca65cb
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/delay.h
> > @@ -0,0 +1,26 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_DELAY_H
> > +#define _ASM_DELAY_H
> > +
> > +#include <linux/param.h>
> > +
> > +extern void __delay(unsigned long loops);
> > +extern void __ndelay(unsigned long ns);
> > +extern void __udelay(unsigned long us);
> > +
> > +#define ndelay(ns) __ndelay(ns)
> > +#define udelay(us) __udelay(us)
> > +
> > +/* make sure "usecs *= ..." in udelay do not overflow. */
> > +#if HZ >= 1000
> > +#define MAX_UDELAY_MS  1
> > +#elif HZ <= 200
> > +#define MAX_UDELAY_MS  5
> > +#else
> > +#define MAX_UDELAY_MS  (1000 / HZ)
> > +#endif
> > +
> > +#endif /* _ASM_DELAY_H */
> > diff --git a/arch/loongarch/include/asm/string.h b/arch/loongarch/include/asm/string.h
> > new file mode 100644
> > index 000000000000..7b29cc9c70aa
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/string.h
> > @@ -0,0 +1,17 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_STRING_H
> > +#define _ASM_STRING_H
> > +
> > +#define __HAVE_ARCH_MEMSET
> > +extern void *memset(void *__s, int __c, size_t __count);
> > +
> > +#define __HAVE_ARCH_MEMCPY
> > +extern void *memcpy(void *__to, __const__ void *__from, size_t __n);
> > +
> > +#define __HAVE_ARCH_MEMMOVE
> > +extern void *memmove(void *__dest, __const__ void *__src, size_t __n);
> > +
> > +#endif /* _ASM_STRING_H */
> > diff --git a/arch/loongarch/lib/clear_user.S b/arch/loongarch/lib/clear_user.S
> > new file mode 100644
> > index 000000000000..b8168d22ac80
> > --- /dev/null
> > +++ b/arch/loongarch/lib/clear_user.S
> > @@ -0,0 +1,43 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <asm/asm.h>
> > +#include <asm/asmmacro.h>
> > +#include <asm/export.h>
> > +#include <asm/regdef.h>
> > +
> > +.macro fixup_ex from, to, offset, fix
> > +.if \fix
> > +       .section .fixup, "ax"
> > +\to:   addi.d  v0, a1, \offset
> > +       jr      ra
> > +       .previous
> > +.endif
> > +       .section __ex_table, "a"
> > +       PTR     \from\()b, \to\()b
> > +       .previous
> > +.endm
> > +
> > +/*
> > + * unsigned long __clear_user(void *addr, size_t size)
> > + *
> > + * a0: addr
> > + * a1: size
> > + */
> > +SYM_FUNC_START(__clear_user)
> > +       beqz    a1, 2f
> > +
> > +1:     st.b    zero, a0, 0
> > +       addi.d  a0, a0, 1
> > +       addi.d  a1, a1, -1
> > +       bgt     a1, zero, 1b
> > +
> > +2:     move    v0, a1
> > +       jr      ra
> > +
> > +       fixup_ex 1, 3, 0, 1
> > +SYM_FUNC_END(__clear_user)
> > +
> > +EXPORT_SYMBOL(__clear_user)
> > diff --git a/arch/loongarch/lib/copy_user.S b/arch/loongarch/lib/copy_user.S
> > new file mode 100644
> > index 000000000000..43ed26304954
> > --- /dev/null
> > +++ b/arch/loongarch/lib/copy_user.S
> > @@ -0,0 +1,47 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <asm/asm.h>
> > +#include <asm/asmmacro.h>
> > +#include <asm/export.h>
> > +#include <asm/regdef.h>
> > +
> > +.macro fixup_ex from, to, offset, fix
> > +.if \fix
> > +       .section .fixup, "ax"
> > +\to:   addi.d  v0, a2, \offset
> > +       jr      ra
> > +       .previous
> > +.endif
> > +       .section __ex_table, "a"
> > +       PTR     \from\()b, \to\()b
> > +       .previous
> > +.endm
> > +
> > +/*
> > + * unsigned long __copy_user(void *to, const void *from, size_t n)
> > + *
> > + * a0: to
> > + * a1: from
> > + * a2: n
> > + */
> > +SYM_FUNC_START(__copy_user)
> > +       beqz    a2, 3f
> > +
> > +1:     ld.b    t0, a1, 0
> > +2:     st.b    t0, a0, 0
> > +       addi.d  a0, a0, 1
> > +       addi.d  a1, a1, 1
> > +       addi.d  a2, a2, -1
> > +       bgt     a2, zero, 1b
> > +
> > +3:     move    v0, a2
> > +       jr      ra
> > +
> > +       fixup_ex 1, 4, 0, 1
> > +       fixup_ex 2, 4, 0, 0
> > +SYM_FUNC_END(__copy_user)
> > +
> > +EXPORT_SYMBOL(__copy_user)
> > diff --git a/arch/loongarch/lib/delay.c b/arch/loongarch/lib/delay.c
> > new file mode 100644
> > index 000000000000..5d856694fcfe
> > --- /dev/null
> > +++ b/arch/loongarch/lib/delay.c
> > @@ -0,0 +1,43 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#include <linux/delay.h>
> > +#include <linux/export.h>
> > +#include <linux/smp.h>
> > +#include <linux/timex.h>
> > +
> > +#include <asm/compiler.h>
> > +#include <asm/processor.h>
> > +
> > +void __delay(unsigned long cycles)
> > +{
> > +       u64 t0 = get_cycles();
> > +
> > +       while ((unsigned long)(get_cycles() - t0) < cycles)
> > +               cpu_relax();
> > +}
> > +EXPORT_SYMBOL(__delay);
> > +
> > +/*
> > + * Division by multiplication: you don't have to worry about
> > + * loss of precision.
> > + *
> > + * Use only for very small delays ( < 1 msec). Should probably use a
> > + * lookup table, really, as the multiplications take much too long with
> > + * short delays.  This is a "reasonable" implementation, though (and the
> > + * first constant multiplications gets optimized away if the delay is
> > + * a constant)
> > + */
> > +
> > +void __udelay(unsigned long us)
> > +{
> > +       __delay((us * 0x000010c7ull * HZ * lpj_fine) >> 32);
> > +}
> > +EXPORT_SYMBOL(__udelay);
> > +
> > +void __ndelay(unsigned long ns)
> > +{
> > +       __delay((ns * 0x00000005ull * HZ * lpj_fine) >> 32);
> > +}
> > +EXPORT_SYMBOL(__ndelay);
> > diff --git a/arch/loongarch/lib/dump_tlb.c b/arch/loongarch/lib/dump_tlb.c
> > new file mode 100644
> > index 000000000000..cda2c6bc7f09
> > --- /dev/null
> > +++ b/arch/loongarch/lib/dump_tlb.c
> > @@ -0,0 +1,111 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + *
> > + * Derived from MIPS:
> > + * Copyright (C) 1994, 1995 by Waldorf Electronics, written by Ralf Baechle.
> > + * Copyright (C) 1999 by Silicon Graphics, Inc.
> > + */
> > +#include <linux/kernel.h>
> > +#include <linux/mm.h>
> > +
> > +#include <asm/loongarch.h>
> > +#include <asm/page.h>
> > +#include <asm/pgtable.h>
> > +#include <asm/tlb.h>
> > +
> > +void dump_tlb_regs(void)
> > +{
> > +       const int field = 2 * sizeof(unsigned long);
> > +
> > +       pr_info("Index    : %0x\n", read_csr_tlbidx());
> > +       pr_info("PageSize : %0x\n", read_csr_pagesize());
> > +       pr_info("EntryHi  : %0*llx\n", field, read_csr_entryhi());
> > +       pr_info("EntryLo0 : %0*llx\n", field, read_csr_entrylo0());
> > +       pr_info("EntryLo1 : %0*llx\n", field, read_csr_entrylo1());
> > +}
> > +
> > +static void dump_tlb(int first, int last)
> > +{
> > +       unsigned long s_entryhi, entryhi, asid;
> > +       unsigned long long entrylo0, entrylo1, pa;
> > +       unsigned int index;
> > +       unsigned int s_index, s_asid;
> > +       unsigned int pagesize, c0, c1, i;
> > +       unsigned long asidmask = cpu_asid_mask(&current_cpu_data);
> > +       int pwidth = 11;
> > +       int vwidth = 11;
> > +       int asidwidth = DIV_ROUND_UP(ilog2(asidmask) + 1, 4);
> > +
> > +       s_entryhi = read_csr_entryhi();
> > +       s_index = read_csr_tlbidx();
> > +       s_asid = read_csr_asid();
> > +
> > +       for (i = first; i <= last; i++) {
> > +               write_csr_index(i);
> > +               tlb_read();
> > +               pagesize = read_csr_pagesize();
> > +               entryhi  = read_csr_entryhi();
> > +               entrylo0 = read_csr_entrylo0();
> > +               entrylo1 = read_csr_entrylo1();
> > +               index = read_csr_tlbidx();
> > +               asid = read_csr_asid();
> > +
> > +               /* EHINV bit marks entire entry as invalid */
> > +               if (index & CSR_TLBIDX_EHINV)
> > +                       continue;
> > +               /*
> > +                * ASID takes effect in absence of G (global) bit.
> > +                */
> > +               if (!((entrylo0 | entrylo1) & ENTRYLO_G) &&
> > +                   asid != s_asid)
> > +                       continue;
> > +
> > +               /*
> > +                * Only print entries in use
> > +                */
> > +               pr_info("Index: %2d pgsize=%x ", i, (1 << pagesize));
> > +
> > +               c0 = (entrylo0 & ENTRYLO_C) >> ENTRYLO_C_SHIFT;
> > +               c1 = (entrylo1 & ENTRYLO_C) >> ENTRYLO_C_SHIFT;
> > +
> > +               pr_cont("va=%0*lx asid=%0*lx",
> > +                       vwidth, (entryhi & ~0x1fffUL), asidwidth, asid & asidmask);
> > +
> > +               /* NR/NX are in awkward places, so mask them off separately */
> > +               pa = entrylo0 & ~(ENTRYLO_NR | ENTRYLO_NX);
> > +               pa = pa & PAGE_MASK;
> > +               pr_cont("\n\t[");
> > +               pr_cont("ri=%d xi=%d ",
> > +                       (entrylo0 & ENTRYLO_NR) ? 1 : 0,
> > +                       (entrylo0 & ENTRYLO_NX) ? 1 : 0);
> > +               pr_cont("pa=%0*llx c=%d d=%d v=%d g=%d plv=%lld] [",
> > +                       pwidth, pa, c0,
> > +                       (entrylo0 & ENTRYLO_D) ? 1 : 0,
> > +                       (entrylo0 & ENTRYLO_V) ? 1 : 0,
> > +                       (entrylo0 & ENTRYLO_G) ? 1 : 0,
> > +                       (entrylo0 & ENTRYLO_PLV) >> ENTRYLO_PLV_SHIFT);
> > +               /* NR/NX are in awkward places, so mask them off separately */
> > +               pa = entrylo1 & ~(ENTRYLO_NR | ENTRYLO_NX);
> > +               pa = pa & PAGE_MASK;
> > +               pr_cont("ri=%d xi=%d ",
> > +                       (entrylo1 & ENTRYLO_NR) ? 1 : 0,
> > +                       (entrylo1 & ENTRYLO_NX) ? 1 : 0);
> > +               pr_cont("pa=%0*llx c=%d d=%d v=%d g=%d plv=%lld]\n",
> > +                       pwidth, pa, c1,
> > +                       (entrylo1 & ENTRYLO_D) ? 1 : 0,
> > +                       (entrylo1 & ENTRYLO_V) ? 1 : 0,
> > +                       (entrylo1 & ENTRYLO_G) ? 1 : 0,
> > +                       (entrylo1 & ENTRYLO_PLV) >> ENTRYLO_PLV_SHIFT);
> > +       }
> > +       pr_info("\n");
> > +
> > +       write_csr_entryhi(s_entryhi);
> > +       write_csr_tlbidx(s_index);
> > +       write_csr_asid(s_asid);
> > +}
> > +
> > +void dump_tlb_all(void)
> > +{
> > +       dump_tlb(0, current_cpu_data.tlbsize - 1);
> > +}
> > diff --git a/arch/loongarch/lib/memcpy.S b/arch/loongarch/lib/memcpy.S
> > new file mode 100644
> > index 000000000000..d53f1148d26b
> > --- /dev/null
> > +++ b/arch/loongarch/lib/memcpy.S
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <asm/asmmacro.h>
> > +#include <asm/export.h>
> > +#include <asm/regdef.h>
> > +
> > +/*
> > + * void *memcpy(void *dst, const void *src, size_t n)
> > + *
> > + * a0: dst
> > + * a1: src
> > + * a2: n
> > + */
> > +SYM_FUNC_START(memcpy)
> > +       move    a3, a0
> > +       beqz    a2, 2f
> > +
> > +1:     ld.b    t0, a1, 0
> > +       st.b    t0, a0, 0
> > +       addi.d  a0, a0, 1
> > +       addi.d  a1, a1, 1
> > +       addi.d  a2, a2, -1
> > +       bgt     a2, zero, 1b
> > +
> > +2:     move    v0, a3
> > +       jr      ra
> > +SYM_FUNC_END(memcpy)
> > +
> > +EXPORT_SYMBOL(memcpy)
> > diff --git a/arch/loongarch/lib/memmove.S b/arch/loongarch/lib/memmove.S
> > new file mode 100644
> > index 000000000000..18907d83a83b
> > --- /dev/null
> > +++ b/arch/loongarch/lib/memmove.S
> > @@ -0,0 +1,45 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <asm/asmmacro.h>
> > +#include <asm/export.h>
> > +#include <asm/regdef.h>
> > +
> > +/*
> > + * void *rmemcpy(void *dst, const void *src, size_t n)
> > + *
> > + * a0: dst
> > + * a1: src
> > + * a2: n
> > + */
> > +SYM_FUNC_START(rmemcpy)
> > +       move    a3, a0
> > +       beqz    a2, 2f
> > +
> > +       add.d   a0, a0, a2
> > +       add.d   a1, a1, a2
> > +
> > +1:     ld.b    t0, a1, -1
> > +       st.b    t0, a0, -1
> > +       addi.d  a0, a0, -1
> > +       addi.d  a1, a1, -1
> > +       addi.d  a2, a2, -1
> > +       bgt     a2, zero, 1b
> > +
> > +2:     move    v0, a3
> > +       jr      ra
> > +SYM_FUNC_END(rmemcpy)
> Why not directly use:
I wanted to use "alternative" to provide multiple versions, but for now
I can use the generic implementation.

Huacai
>
> lib/string.c:
> #ifndef __HAVE_ARCH_MEMCPY
> /**
>  * memcpy - Copy one area of memory to another
>  * @dest: Where to copy to
>  * @src: Where to copy from
>  * @count: The size of the area.
>  *
>  * You should not use this function to access IO space, use memcpy_toio()
>  * or memcpy_fromio() instead.
>  */
> void *memcpy(void *dest, const void *src, size_t count)
> {
>         char *tmp = dest;
>         const char *s = src;
>
>         while (count--)
>                 *tmp++ = *s++;
>         return dest;
> }
> EXPORT_SYMBOL(memcpy);
> #endif
>
> Do you want to try a C's string implementation?
> https://lore.kernel.org/linux-csky/202204051450.UN2k1raL-lkp@intel.com/T/#Z2e.:..:20220404142354.2792428-1-guoren::40kernel.org:1arch:csky:lib:string.c
>
> > +
> > +SYM_FUNC_START(memmove)
> > +       blt     a0, a1, 1f      /* dst < src, memcpy */
> > +       blt     a1, a0, 2f      /* src < dst, rmemcpy */
> > +       jr      ra              /* dst == src, return */
> > +
> > +1:     b       memcpy
> > +
> > +2:     b       rmemcpy
> > +SYM_FUNC_END(memmove)
> > +
> > +EXPORT_SYMBOL(memmove)
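
For readers following the direction logic in the quoted memmove.S, here is a minimal C sketch of the same idea: copy forward when dst is below src, copy backward from the end when dst is above src, and do nothing when they are equal. The name sketch_memmove is hypothetical (picked only to avoid clashing with libc), and this is not the code from the patch or from lib/string.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name, chosen to avoid clashing with libc's memmove(). */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if ((uintptr_t)d < (uintptr_t)s) {
		/* dst below src: copy forward, like the memcpy path */
		while (n--)
			*d++ = *s++;
	} else if ((uintptr_t)d > (uintptr_t)s) {
		/* dst above src: copy backward from the end, like rmemcpy */
		d += n;
		s += n;
		while (n--)
			*--d = *--s;
	}
	/* dst == src: nothing to copy */
	return dst;
}
```

Choosing the direction this way means an overlapping source byte is always read before it is overwritten.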
> > diff --git a/arch/loongarch/lib/memset.S b/arch/loongarch/lib/memset.S
> > new file mode 100644
> > index 000000000000..3fc3e7da5263
> > --- /dev/null
> > +++ b/arch/loongarch/lib/memset.S
> > @@ -0,0 +1,30 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <asm/asmmacro.h>
> > +#include <asm/export.h>
> > +#include <asm/regdef.h>
> > +
> > +/*
> > + * void *memset(void *s, int c, size_t n)
> > + *
> > + * a0: s
> > + * a1: c
> > + * a2: n
> > + */
> > +SYM_FUNC_START(memset)
> > +       move    a3, a0
> > +       beqz    a2, 2f
> > +
> > +1:     st.b    a1, a0, 0
> > +       addi.d  a0, a0, 1
> > +       addi.d  a2, a2, -1
> > +       bgt     a2, zero, 1b
> > +
> > +2:     move    v0, a3
> > +       jr      ra
> > +SYM_FUNC_END(memset)
> > +
> > +EXPORT_SYMBOL(memset)
> > --
> > 2.27.0
> >
>
>
> --
> Best Regards
>  Guo Ren
>
> ML: https://lore.kernel.org/linux-csky/

* Re: [PATCH V9 05/24] LoongArch: Add build infrastructure
  2022-05-01 10:09   ` WANG Xuerui
@ 2022-05-01 12:41     ` Huacai Chen
  2022-05-01 15:43     ` Xi Ruoyao
  1 sibling, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01 12:41 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Jiaxun Yang

Hi, Xuerui,

On Sun, May 1, 2022 at 6:09 PM WANG Xuerui <kernel@xen0n.name> wrote:
>
>
> On 4/30/22 17:04, Huacai Chen wrote:
> > This patch adds Kbuild, Makefile, Kconfig and link script for LoongArch
> > build infrastructure.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >   arch/loongarch/.gitignore              |   9 +
> >   arch/loongarch/Kbuild                  |   3 +
> >   arch/loongarch/Kconfig                 | 351 +++++++++++++++++++++++++
> >   arch/loongarch/Kconfig.debug           |   0
> >   arch/loongarch/Makefile                |  99 +++++++
> >   arch/loongarch/include/asm/Kbuild      |  29 ++
> >   arch/loongarch/include/uapi/asm/Kbuild |   2 +
> >   arch/loongarch/kernel/Makefile         |  22 ++
> >   arch/loongarch/kernel/vmlinux.lds.S    | 100 +++++++
> >   arch/loongarch/lib/Makefile            |   7 +
> >   arch/loongarch/mm/Makefile             |   9 +
> >   arch/loongarch/pci/Makefile            |   7 +
> >   scripts/subarch.include                |   2 +-
> >   13 files changed, 639 insertions(+), 1 deletion(-)
> >   create mode 100644 arch/loongarch/.gitignore
> >   create mode 100644 arch/loongarch/Kbuild
> >   create mode 100644 arch/loongarch/Kconfig
> >   create mode 100644 arch/loongarch/Kconfig.debug
> >   create mode 100644 arch/loongarch/Makefile
> >   create mode 100644 arch/loongarch/include/asm/Kbuild
> >   create mode 100644 arch/loongarch/include/uapi/asm/Kbuild
> >   create mode 100644 arch/loongarch/kernel/Makefile
> >   create mode 100644 arch/loongarch/kernel/vmlinux.lds.S
> >   create mode 100644 arch/loongarch/lib/Makefile
> >   create mode 100644 arch/loongarch/mm/Makefile
> >   create mode 100644 arch/loongarch/pci/Makefile
> >
> > diff --git a/arch/loongarch/.gitignore b/arch/loongarch/.gitignore
> > new file mode 100644
> > index 000000000000..fd88d21e7172
> > --- /dev/null
> > +++ b/arch/loongarch/.gitignore
> > @@ -0,0 +1,9 @@
> > +*.lds
> > +*.raw
> > +calc_vmlinuz_load_addr
> > +elf-entry
> > +relocs
> > +vmlinux*
> > +vmlinuz*
> > +
> > +!kernel/vmlinux.lds.S
> This exclude entry is unnecessary?
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > new file mode 100644
> > index 000000000000..1ad35aabdd16
> > --- /dev/null
> > +++ b/arch/loongarch/Kbuild
> > @@ -0,0 +1,3 @@
> > +obj-y += kernel/
> > +obj-y += mm/
> > +obj-y += vdso/
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > new file mode 100644
> > index 000000000000..44b763046893
> > --- /dev/null
> > +++ b/arch/loongarch/Kconfig
> > @@ -0,0 +1,351 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +config LOONGARCH
> > +     bool
> > +     default y
> > +     select ACPI_MCFG if ACPI
> > +     select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
> > +     select ARCH_BINFMT_ELF_STATE
> > +     select ARCH_ENABLE_MEMORY_HOTPLUG
> > +     select ARCH_ENABLE_MEMORY_HOTREMOVE
> > +     select ARCH_HAS_ACPI_TABLE_UPGRADE      if ACPI
> > +     select ARCH_HAS_PTE_SPECIAL
> > +     select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
> > +     select ARCH_INLINE_READ_LOCK if !PREEMPTION
> > +     select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION
> > +     select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION
> > +     select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION
> > +     select ARCH_INLINE_READ_UNLOCK if !PREEMPTION
> > +     select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION
> > +     select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION
> > +     select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_LOCK if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION
> > +     select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_LOCK if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION
> > +     select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION
> > +     select ARCH_MIGHT_HAVE_PC_PARPORT
> > +     select ARCH_MIGHT_HAVE_PC_SERIO
> > +     select ARCH_SPARSEMEM_ENABLE
> > +     select ARCH_SUPPORTS_ACPI
> > +     select ARCH_SUPPORTS_ATOMIC_RMW
> > +     select ARCH_SUPPORTS_HUGETLBFS
> > +     select ARCH_USE_BUILTIN_BSWAP
> > +     select ARCH_USE_CMPXCHG_LOCKREF
> > +     select ARCH_USE_QUEUED_RWLOCKS
> > +     select ARCH_USE_QUEUED_SPINLOCKS
> > +     select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
> > +     select ARCH_WANTS_NO_INSTR
> > +     select BUILDTIME_TABLE_SORT
> > +     select COMMON_CLK
> > +     select GENERIC_CLOCKEVENTS
> > +     select GENERIC_CMOS_UPDATE
> > +     select GENERIC_CPU_AUTOPROBE
> > +     select GENERIC_ENTRY
> > +     select GENERIC_FIND_FIRST_BIT
> > +     select GENERIC_GETTIMEOFDAY
> > +     select GENERIC_IRQ_MULTI_HANDLER
> > +     select GENERIC_IRQ_PROBE
> > +     select GENERIC_IRQ_SHOW
> > +     select GENERIC_LIB_ASHLDI3
> > +     select GENERIC_LIB_ASHRDI3
> > +     select GENERIC_LIB_CMPDI2
> > +     select GENERIC_LIB_LSHRDI3
> > +     select GENERIC_LIB_UCMPDI2
> > +     select GENERIC_PCI_IOMAP
> > +     select GENERIC_SCHED_CLOCK
> > +     select GENERIC_TIME_VSYSCALL
> > +     select GPIOLIB
> > +     select HAVE_ARCH_AUDITSYSCALL
> > +     select HAVE_ARCH_COMPILER_H
> > +     select HAVE_ARCH_MMAP_RND_BITS if MMU
> > +     select HAVE_ARCH_SECCOMP_FILTER
> > +     select HAVE_ARCH_TRACEHOOK
> > +     select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> > +     select HAVE_ASM_MODVERSIONS
> > +     select HAVE_CONTEXT_TRACKING
> > +     select HAVE_COPY_THREAD_TLS
> > +     select HAVE_DEBUG_KMEMLEAK
> > +     select HAVE_DEBUG_STACKOVERFLOW
> > +     select HAVE_DMA_CONTIGUOUS
> > +     select HAVE_EXIT_THREAD
> > +     select HAVE_FAST_GUP
> > +     select HAVE_GENERIC_VDSO
> > +     select HAVE_IOREMAP_PROT
> > +     select HAVE_IRQ_EXIT_ON_IRQ_STACK
> > +     select HAVE_IRQ_TIME_ACCOUNTING
> > +     select HAVE_MEMBLOCK
> > +     select HAVE_MEMBLOCK_NODE_MAP
> > +     select HAVE_MOD_ARCH_SPECIFIC
> > +     select HAVE_NMI
> > +     select HAVE_PCI
> > +     select HAVE_PERF_EVENTS
> > +     select HAVE_REGS_AND_STACK_ACCESS_API
> > +     select HAVE_RSEQ
> > +     select HAVE_SYSCALL_TRACEPOINTS
> > +     select HAVE_TIF_NOHZ
> > +     select HAVE_VIRT_CPU_ACCOUNTING_GEN
> > +     select IRQ_FORCED_THREADING
> > +     select IRQ_LOONGARCH_CPU
> > +     select MODULES_USE_ELF_RELA if MODULES
> > +     select PCI
> > +     select PCI_DOMAINS_GENERIC
> > +     select PCI_ECAM if ACPI
> > +     select PCI_MSI_ARCH_FALLBACKS
> > +     select PERF_USE_VMALLOC
> > +     select RTC_LIB
> > +     select SPARSE_IRQ
> > +     select SYSCTL_EXCEPTION_TRACE
> > +     select SWIOTLB
> > +     select TRACE_IRQFLAGS_SUPPORT
> > +     select ZONE_DMA32
> > +
> > +config 32BIT
> > +     bool
> > +
> > +config 64BIT
> > +     def_bool y
> > +
> > +config CPU_HAS_FPU
> > +     bool
> > +     default y
> > +
> > +config CPU_HAS_PREFETCH
> > +     bool
> > +     default y
> > +
> > +config GENERIC_CALIBRATE_DELAY
> > +     def_bool y
> > +
> > +config GENERIC_CSUM
> > +     def_bool y
> > +
> > +config GENERIC_HWEIGHT
> > +     def_bool y
> > +
> > +config L1_CACHE_SHIFT
> > +     int
> > +     default "6"
> > +
> > +config LOCKDEP_SUPPORT
> > +     bool
> > +     default y
> > +
> > +config MACH_LOONGSON32
> > +     def_bool 32BIT
> > +
> > +config MACH_LOONGSON64
> > +     def_bool 64BIT
> These two config symbols are not used anywhere in arch/loongarch, but
> from a quick grep it seems they're sharing the names of the MIPS config
> symbols, on purpose, maybe for sharing code between the MIPS-era
> Loongson models and the LoongArch models. If so, a comment explaining
> this could be beneficial.
> > +
> > +config PAGE_SIZE_4KB
> > +     bool
> > +
> > +config PAGE_SIZE_16KB
> > +     bool
> > +
> > +config PAGE_SIZE_64KB
> > +     bool
> > +
> > +config PGTABLE_2LEVEL
> > +     bool
> > +
> > +config PGTABLE_3LEVEL
> > +     bool
> > +
> > +config PGTABLE_4LEVEL
> > +     bool
> > +
> > +config PGTABLE_LEVELS
> > +     int
> > +     default 2 if PGTABLE_2LEVEL
> > +     default 3 if PGTABLE_3LEVEL
> > +     default 4 if PGTABLE_4LEVEL
> > +
> > +config SCHED_OMIT_FRAME_POINTER
> > +     bool
> > +     default y
> > +
> > +menu "Kernel type"
> > +
> > +source "kernel/Kconfig.hz"
> > +
> > +choice
> > +     prompt "Page Table Layout"
> > +     default 16KB_2LEVEL if 32BIT
> > +     default 16KB_3LEVEL if 64BIT
> > +     help
> > +       Allows choosing the page table layout, which is a combination
> > +       of page size and page table levels. The virtual memory address
> > +       space bits are determined by the page table layout.
> "The size of virtual memory address space"?
> > +
> > +config 4KB_3LEVEL
> > +     bool "4KB with 3 levels"
> > +     select PAGE_SIZE_4KB
> > +     select PGTABLE_3LEVEL
> > +     help
> > +       This option selects 4KB page size with 3 level page tables, which
> > +       support a maximum 39 bits of application virtual memory.
> "a maximum of XX bits" -- similarly for all occurrences below.
OK, most of your suggestions will be taken.

Huacai
> > +
> > +config 4KB_4LEVEL
> > +     bool "4KB with 4 levels"
> > +     select PAGE_SIZE_4KB
> > +     select PGTABLE_4LEVEL
> > +     help
> > +       This option selects 4KB page size with 4 level page tables, which
> > +       support a maximum 48 bits of application virtual memory.
> > +
> > +config 16KB_2LEVEL
> > +     bool "16KB with 2 levels"
> > +     select PAGE_SIZE_16KB
> > +     select PGTABLE_2LEVEL
> > +     help
> > +       This option selects 16KB page size with 2 level page tables, which
> > +       support a maximum 36 bits of application virtual memory.
> > +
> > +config 16KB_3LEVEL
> > +     bool "16KB with 3 levels"
> > +     select PAGE_SIZE_16KB
> > +     select PGTABLE_3LEVEL
> > +     help
> > +       This option selects 16KB page size with 3 level page tables, which
> > +       support a maximum 47 bits of application virtual memory.
> > +
> > +config 64KB_2LEVEL
> > +     bool "64KB with 2 levels"
> > +     select PAGE_SIZE_64KB
> > +     select PGTABLE_2LEVEL
> > +     help
> > +       This option selects 64KB page size with 2 level page tables, which
> > +       support a maximum 42 bits of application virtual memory.
> > +
> > +config 64KB_3LEVEL
> > +     bool "64KB with 3 levels"
> > +     select PAGE_SIZE_64KB
> > +     select PGTABLE_3LEVEL
> > +     help
> > +       This option selects 64KB page size with 3 level page tables, which
> > +       support a maximum 55 bits of application virtual memory.
> > +
> > +endchoice
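
As a cross-check of the bit counts in the help texts above, here is a small sketch. It assumes 8-byte page table entries and one full page per table level, so each level resolves (page_shift - 3) bits; that assumption is mine and is not stated in the patch, although it reproduces all six numbers. The function name va_bits is hypothetical.

```c
/*
 * page_shift: log2 of the page size (12 for 4KB, 14 for 16KB, 16 for 64KB).
 * levels:     number of page table levels.
 *
 * Assumption: 8-byte entries, so one table page indexes (page_shift - 3)
 * bits of the virtual address.
 */
static unsigned int va_bits(unsigned int page_shift, unsigned int levels)
{
	return page_shift + levels * (page_shift - 3);
}
```

For example, va_bits(14, 3) gives 47, which matches the default 16KB with 3 levels layout for 64BIT.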
> > +
> > +config DMI
> > +     bool "Enable DMI scanning"
> > +     select DMI_SCAN_MACHINE_NON_EFI_FALLBACK
> > +     default y
> > +     help
> > +       Enabled scanning of DMI to identify machine quirks. Say Y
> Should be "Enable scanning ..." but the arch/x86 and arch/mips versions
> of this text all have this typo. Might be wise to fix here... then fix
> the other two later.
> > +       here unless you have verified that your setup is not
> > +       affected by entries in the DMI blacklist. Required by PNP
> > +       BIOS code.
> Do we have a "PNP BIOS"? I know this is also copied text, but we may
> tweak it to suit our platform.
> > +
> > +config EFI
> > +     bool "EFI runtime service support"
> > +     select UCS2_STRING
> > +     select EFI_RUNTIME_WRAPPERS
> > +     help
> > +       This enables the kernel to use EFI runtime services that are
> > +       available (such as the EFI variable services).
> > +
> > +       This option is only useful on systems that have EFI firmware.
> > +       In addition, you should use the latest ELILO loader available
> > +       at <http://elilo.sourceforge.net> in order to take advantage
> > +       of EFI runtime services. However, even with this option, the
> Remove mention of ELILO?
> > +       resultant kernel should continue to boot on existing non-EFI
> > +       platforms.
> > +
> > +config FORCE_MAX_ZONEORDER
> > +     int "Maximum zone order"
> > +     range 14 64 if PAGE_SIZE_64KB
> > +     default "14" if PAGE_SIZE_64KB
> > +     range 12 64 if PAGE_SIZE_16KB
> > +     default "12" if PAGE_SIZE_16KB
> > +     range 11 64
> > +     default "11"
> > +     help
> > +       The kernel memory allocator divides physically contiguous memory
> > +       blocks into "zones", where each zone is a power of two number of
> > +       pages.  This option selects the largest power of two that the kernel
> > +       keeps in the memory allocator.  If you need to allocate very large
> > +       blocks of physically contiguous memory, then you may need to
> > +       increase this value.
> > +
> > +       This config option is actually maximum order plus one. For example,
> > +       a value of 11 means that the largest free memory block is 2^10 pages.
> > +
> > +       The page size is not necessarily 4KB.  Keep this in mind
> > +       when choosing a value for this option.
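
To make the "maximum order plus one" remark concrete, here is a short sketch that turns a FORCE_MAX_ZONEORDER value and a page size into the largest block size in bytes. The helper name max_block_bytes is hypothetical; the arithmetic follows the help text (a value of 11 means 2^10 pages).

```c
/*
 * page_shift:    log2 of the page size (12, 14 or 16).
 * max_zoneorder: the FORCE_MAX_ZONEORDER value, i.e. maximum order + 1.
 *
 * The largest block is 2^(max_zoneorder - 1) pages of 2^page_shift bytes.
 */
static unsigned long long max_block_bytes(unsigned int page_shift,
					  unsigned int max_zoneorder)
{
	return 1ULL << (page_shift + max_zoneorder - 1);
}
```

With the defaults quoted above this works out to 4MB for 4KB pages (11), 32MB for 16KB pages (12) and 512MB for 64KB pages (14).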
> > +
> > +config SECCOMP
> > +     bool "Enable seccomp to safely compute untrusted bytecode"
> > +     depends on PROC_FS
> > +     default y
> > +     help
> > +       This kernel feature is useful for number crunching applications
> > +       that may need to compute untrusted bytecode during their
> > +       execution. By using pipes or other transports made available to
> > +       the process as file descriptors supporting the read/write
> > +       syscalls, it's possible to isolate those applications in
> > +       their own address space using seccomp. Once seccomp is
> > +       enabled via /proc/<pid>/seccomp, it cannot be disabled
> > +       and the task is only allowed to execute a few safe syscalls
> > +       defined by each seccomp mode.
> > +
> > +       If unsure, say Y. Only embedded should say N here.
> > +
> > +endmenu
> > +
> > +config ARCH_SELECT_MEMORY_MODEL
> > +     def_bool y
> > +
> > +config ARCH_FLATMEM_ENABLE
> > +     def_bool y
> > +
> > +config ARCH_SPARSEMEM_ENABLE
> > +     def_bool y
> > +     help
> > +       Say Y to support efficient handling of sparse physical memory,
> > +       for architectures which are either NUMA (Non-Uniform Memory Access)
> > +       or have huge holes in the physical address space for other reasons.
> > +       See <file:Documentation/vm/numa.rst> for more.
> > +
> > +config ARCH_ENABLE_THP_MIGRATION
> > +     def_bool y
> > +     depends on TRANSPARENT_HUGEPAGE
> > +
> > +config ARCH_MEMORY_PROBE
> > +     def_bool y
> > +     depends on MEMORY_HOTPLUG
> > +
> > +config MMU
> > +     bool
> > +     default y
> > +
> > +config ARCH_MMAP_RND_BITS_MIN
> > +     default 12
> > +
> > +config ARCH_MMAP_RND_BITS_MAX
> > +     default 18
> > +
> > +menu "Bus options"
> > +
> > +endmenu
> > +
> > +menu "Power management options"
> > +
> > +source "drivers/acpi/Kconfig"
> > +
> > +endmenu
> > +
> > +source "drivers/firmware/Kconfig"
> > diff --git a/arch/loongarch/Kconfig.debug b/arch/loongarch/Kconfig.debug
> > new file mode 100644
> > index 000000000000..e69de29bb2d1
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > new file mode 100644
> > index 000000000000..0a40e79b3265
> > --- /dev/null
> > +++ b/arch/loongarch/Makefile
> > @@ -0,0 +1,99 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Author: Huacai Chen <chenhuacai@loongson.cn>
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +
> > +#
> > +# Select the object file format to substitute into the linker script.
> > +#
> > +64bit-tool-archpref  = loongarch64
> > +32bit-bfd            = elf32-loongarch
> > +64bit-bfd            = elf64-loongarch
> > +32bit-emul           = elf32loongarch
> > +64bit-emul           = elf64loongarch
> > +
> > +ifdef CONFIG_64BIT
> > +tool-archpref                = $(64bit-tool-archpref)
> > +UTS_MACHINE          := loongarch64
> > +endif
> > +
> > +ifneq ($(SUBARCH),$(ARCH))
> > +  ifeq ($(CROSS_COMPILE),)
> > +    CROSS_COMPILE := $(call cc-cross-prefix, $(tool-archpref)-linux-  $(tool-archpref)-linux-gnu-  $(tool-archpref)-unknown-linux-gnu-)
> > +  endif
> > +endif
> > +
> > +cflags-y += $(call cc-option, -mno-check-zero-division)
> > +
> > +ifdef CONFIG_64BIT
> > +ld-emul                      = $(64bit-emul)
> > +cflags-y             += -mabi=lp64s
> > +endif
> > +
> > +all-y                        := vmlinux
> > +
> > +#
> > +# GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
> > +# code since it only slows down the whole thing.  At some point we might make
> > +# use of global pointer optimizations but their use of $r2 conflicts with
> > +# the current pointer optimization.
> LoongArch doesn't have any notion of "abicalls", please remove the whole
> MIPS legacy... or at least replace with something suitable for LoongArch.
> > +#
> > +cflags-y                     += -G0 -pipe
> > +cflags-y                     += -msoft-float
> > +LDFLAGS_vmlinux                      += -G0 -static -n -nostdlib
> > +KBUILD_AFLAGS_KERNEL         += -Wa,-mla-global-with-pcrel
> > +KBUILD_CFLAGS_KERNEL         += -Wa,-mla-global-with-pcrel
> > +KBUILD_AFLAGS_MODULE         += -Wa,-mla-global-with-abs
> > +KBUILD_CFLAGS_MODULE         += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
> These switches are the ones that should receive more love via
> comments... they are here to tell the assembler to emit the "la.global"
> and "la.local" pseudo-insns in a particular "flavor". Why not simply use
> the default? This needs explanation!
> > +
> > +cflags-y += -ffreestanding
> > +cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> Unfortunately we're still working around the LL/SC hardware issue even
> after migrating to LoongArch... might be better to add a comment too.
> (something along the lines of "we work around the issue manually in the
> handwritten assembly, so no automatic workarounds should kick in")
> > +
> > +load-y               = 0x9000000000200000
> > +bootvars-y   = VMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +drivers-$(CONFIG_PCI)                += arch/loongarch/pci/
> > +
> > +KBUILD_AFLAGS        += $(cflags-y)
> > +KBUILD_CFLAGS        += $(cflags-y)
> > +KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +# This is required to get dwarf unwinding tables into .debug_frame
> > +# instead of .eh_frame so we don't discard them.
> > +KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
> > +KBUILD_CFLAGS += -isystem $(shell $(CC) -print-file-name=include)
> > +KBUILD_CFLAGS += $(call cc-option,-mstrict-align)
> Explain the reason for this -mstrict-align request: not all LoongArch
> cores support unaligned accesses, and as the kernel we can't rely on
> others to provide emulation for these accesses.
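
To illustrate what strict alignment means for C code, here is a minimal sketch of an alignment-safe load: the value is copied out with memcpy() rather than read through a cast pointer, so the compiler is free to use byte accesses where a word access could fault. The helper name load_u32_unaligned is hypothetical; this is not the kernel's get_unaligned() implementation.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned. */
static uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	/* memcpy makes no alignment assumption about p */
	memcpy(&v, p, sizeof(v));
	return v;
}
```

A direct *(const uint32_t *)p on a misaligned p is undefined behaviour in C, which is why the copy is used here.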
> > +
> > +KBUILD_LDFLAGS       += -m $(ld-emul)
> > +
> > +ifdef CONFIG_LOONGARCH
> > +CHECKFLAGS += $(shell $(CC) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
> > +     egrep -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \
> > +     sed -e "s/^\#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/\$$/&&/g')
> > +endif
> > +
> > +head-y := arch/loongarch/kernel/head.o
> > +
> > +libs-y += arch/loongarch/lib/
> > +
> > +prepare: vdso_prepare
> > +vdso_prepare: prepare0
> > +     $(Q)$(MAKE) $(build)=arch/loongarch/vdso include/generated/vdso-offsets.h
> > +
> > +PHONY += vdso_install
> > +vdso_install:
> > +     $(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
> > +
> > +all: $(all-y)
> > +
> > +CLEAN_FILES += vmlinux
> > +
> > +install:
> > +     $(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +     $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> > +     $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> > +
> > +define archhelp
> > +     echo '  install              - install kernel into $(INSTALL_PATH)'
> > +     echo
> > +endef
> > diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
> > new file mode 100644
> > index 000000000000..a0eed6076c79
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/Kbuild
> > @@ -0,0 +1,29 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +generic-y += dma-contiguous.h
> > +generic-y += export.h
> > +generic-y += mcs_spinlock.h
> > +generic-y += parport.h
> > +generic-y += early_ioremap.h
> > +generic-y += qrwlock.h
> > +generic-y += qspinlock.h
> > +generic-y += rwsem.h
> > +generic-y += segment.h
> > +generic-y += user.h
> > +generic-y += stat.h
> > +generic-y += fcntl.h
> > +generic-y += ioctl.h
> > +generic-y += ioctls.h
> > +generic-y += mman.h
> > +generic-y += msgbuf.h
> > +generic-y += sembuf.h
> > +generic-y += shmbuf.h
> > +generic-y += statfs.h
> > +generic-y += socket.h
> > +generic-y += sockios.h
> > +generic-y += termios.h
> > +generic-y += termbits.h
> > +generic-y += poll.h
> > +generic-y += param.h
> > +generic-y += posix_types.h
> > +generic-y += resource.h
> > +generic-y += kvm_para.h
> > diff --git a/arch/loongarch/include/uapi/asm/Kbuild b/arch/loongarch/include/uapi/asm/Kbuild
> > new file mode 100644
> > index 000000000000..4aa680ca2e5f
> > --- /dev/null
> > +++ b/arch/loongarch/include/uapi/asm/Kbuild
> > @@ -0,0 +1,2 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +generic-y += kvm_para.h
> > diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
> > new file mode 100644
> > index 000000000000..ead27a11e8e0
> > --- /dev/null
> > +++ b/arch/loongarch/kernel/Makefile
> > @@ -0,0 +1,22 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for the Linux/LoongArch kernel.
> > +#
> > +
> > +extra-y              := head.o vmlinux.lds
> > +
> > +obj-y                += cpu-probe.o cacheinfo.o cmdline.o env.o setup.o entry.o genex.o \
> > +                traps.o irq.o idle.o process.o dma.o mem.o io.o reset.o switch.o \
> > +                elf.o rtc.o syscall.o signal.o time.o topology.o cmpxchg.o \
> > +                inst.o ptrace.o vdso.o
> > +
> > +obj-$(CONFIG_ACPI)           += acpi.o
> > +obj-$(CONFIG_EFI)            += efi.o
> > +
> > +obj-$(CONFIG_CPU_HAS_FPU)    += fpu.o
> > +
> > +obj-$(CONFIG_MODULES)                += module.o module-sections.o
> > +
> > +obj-$(CONFIG_PROC_FS)                += proc.o
> > +
> > +CPPFLAGS_vmlinux.lds         := $(KBUILD_CFLAGS)
> > diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> > new file mode 100644
> > index 000000000000..02abfaaa4892
> > --- /dev/null
> > +++ b/arch/loongarch/kernel/vmlinux.lds.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#include <linux/sizes.h>
> > +#include <asm/asm-offsets.h>
> > +#include <asm/thread_info.h>
> > +
> > +#define PAGE_SIZE _PAGE_SIZE
> > +
> > +/*
> > + * Put .bss..swapper_pg_dir as the first thing in .bss. This will
> > + * ensure that it has .bss alignment (64K).
> > + */
> > +#define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
> > +
> > +#include <asm-generic/vmlinux.lds.h>
> > +
> > +OUTPUT_ARCH(loongarch)
> > +ENTRY(kernel_entry)
> > +PHDRS {
> > +     text PT_LOAD FLAGS(7);  /* RWX */
> > +     note PT_NOTE FLAGS(4);  /* R__ */
> > +}
> > +
> > +jiffies       = jiffies_64;
> > +
> > +SECTIONS
> > +{
> > +     . = VMLINUX_LOAD_ADDRESS;
> > +
> > +     _text = .;
> > +     .text : {
> > +             TEXT_TEXT
> > +             SCHED_TEXT
> > +             CPUIDLE_TEXT
> > +             LOCK_TEXT
> > +             KPROBES_TEXT
> > +             IRQENTRY_TEXT
> > +             SOFTIRQENTRY_TEXT
> > +             *(.fixup)
> > +             *(.gnu.warning)
> > +     } :text = 0
> > +     _etext = .;
> > +
> > +     EXCEPTION_TABLE(16)
> > +
> > +     . = ALIGN(PAGE_SIZE);
> > +     __init_begin = .;
> > +     __inittext_begin = .;
> > +
> > +     INIT_TEXT_SECTION(PAGE_SIZE)
> > +     .exit.text : {
> > +             EXIT_TEXT
> > +     }
> > +
> > +     __inittext_end = .;
> > +
> > +     __initdata_begin = .;
> > +
> > +     INIT_DATA_SECTION(16)
> > +     .exit.data : {
> > +             EXIT_DATA
> > +     }
> > +
> > +     __initdata_end = .;
> > +
> > +     __init_end = .;
> > +
> > +     _sdata = .;
> > +     RO_DATA(4096)
> > +     RW_DATA(1 << CONFIG_L1_CACHE_SHIFT, PAGE_SIZE, THREAD_SIZE)
> > +
> > +     .sdata : {
> > +             *(.sdata)
> > +     }
> > +
> > +     . = ALIGN(SZ_64K);
> > +     _edata =  .;
> > +
> > +     BSS_SECTION(0, SZ_64K, 8)
> > +
> > +     _end = .;
> > +
> > +     STABS_DEBUG
> > +     DWARF_DEBUG
> > +
> > +     .gptab.sdata : {
> > +             *(.gptab.data)
> > +             *(.gptab.sdata)
> > +     }
> > +     .gptab.sbss : {
> > +             *(.gptab.bss)
> > +             *(.gptab.sbss)
> > +     }
> > +
> > +     DISCARDS
> > +     /DISCARD/ : {
> > +             *(.gnu.attributes)
> > +             *(.options)
> > +             *(.eh_frame)
> > +     }
> > +}
> > diff --git a/arch/loongarch/lib/Makefile b/arch/loongarch/lib/Makefile
> > new file mode 100644
> > index 000000000000..7f32f3e4a6ec
> > --- /dev/null
> > +++ b/arch/loongarch/lib/Makefile
> > @@ -0,0 +1,7 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for LoongArch-specific library files..
> One extra period at end of line.
> > +#
> > +
> > +lib-y        += delay.o memset.o memcpy.o memmove.o \
> > +        clear_user.o copy_user.o dump_tlb.o
> > diff --git a/arch/loongarch/mm/Makefile b/arch/loongarch/mm/Makefile
> > new file mode 100644
> > index 000000000000..8ffc6383f836
> > --- /dev/null
> > +++ b/arch/loongarch/mm/Makefile
> > @@ -0,0 +1,9 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for the Linux/LoongArch-specific parts of the memory manager.
> > +#
> > +
> > +obj-y                                += init.o cache.o tlb.o tlbex.o extable.o \
> > +                                fault.o ioremap.o maccess.o mmap.o pgtable.o page.o
> > +
> > +obj-$(CONFIG_HUGETLB_PAGE)   += hugetlbpage.o
> > diff --git a/arch/loongarch/pci/Makefile b/arch/loongarch/pci/Makefile
> > new file mode 100644
> > index 000000000000..8101ef3df71c
> > --- /dev/null
> > +++ b/arch/loongarch/pci/Makefile
> > @@ -0,0 +1,7 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for the PCI specific kernel interface routines under Linux.
> > +#
> > +
> > +obj-y                                += pci.o
> > +obj-$(CONFIG_ACPI)           += acpi.o
> > diff --git a/scripts/subarch.include b/scripts/subarch.include
> > index 776849a3c500..4bd327d0ae42 100644
> > --- a/scripts/subarch.include
> > +++ b/scripts/subarch.include
> > @@ -10,4 +10,4 @@ SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
> >                                 -e s/s390x/s390/ \
> >                                 -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
> >                                 -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ \
> > -                               -e s/riscv.*/riscv/)
> > +                               -e s/riscv.*/riscv/ -e s/loongarch.*/loongarch/)

* Re: [PATCH V9 07/24] LoongArch: Add atomic/locking headers
  2022-05-01 11:16   ` WANG Xuerui
@ 2022-05-01 13:16     ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01 13:16 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Jiaxun Yang

Hi, Xuerui,

On Sun, May 1, 2022 at 7:16 PM WANG Xuerui <kernel@xen0n.name> wrote:
>
>
> On 4/30/22 17:05, Huacai Chen wrote:
> > This patch adds common headers (atomic, bitops, barrier and locking)
> > for basic LoongArch support.
> >
> > LoongArch has no native sub-word xchg/cmpxchg instructions now, but
> > LoongArch-based CPUs support NUMA (e.g., quad-core Loongson-3A5000
> > supports as many as 16 nodes, 64 cores in total). So, we emulate sub-
> > word xchg/cmpxchg in software and use qspinlock/qrwlock rather than
> > ticket locks.
> I'd leave the details for others more familiar with the intricate art of
> locking to review; here are only a couple of minor suggestions.
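
For illustration only (not part of the patch): the sub-word xchg emulation mentioned in the commit message above can be sketched in portable C11 as a compare-and-swap loop on the containing 32-bit word. The helper name ex_xchg_u8 and the little-endian byte numbering are assumptions made for this sketch; the patch's own __xchg_small() is not shown in this message and may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically exchange one byte inside an aligned
 * 32-bit word. byte_idx counts from the least significant byte, which
 * matches a little-endian memory layout.
 */
static inline uint8_t ex_xchg_u8(_Atomic uint32_t *word, unsigned int byte_idx,
				 uint8_t newval)
{
	assert(byte_idx < 4);

	unsigned int shift = byte_idx * 8;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old = atomic_load(word);
	uint32_t repl;

	do {
		/* Keep the other three bytes, splice in the new byte. */
		repl = (old & ~mask) | ((uint32_t)newval << shift);
		/* On failure, old is refreshed with the current word. */
	} while (!atomic_compare_exchange_weak(word, &old, repl));

	/* old now holds the word that was actually replaced. */
	return (uint8_t)((old & mask) >> shift);
}
```

Because the loop only ever needs a word-sized compare-and-swap, it works on hardware that lacks byte or halfword atomic instructions.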
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >   arch/loongarch/include/asm/atomic.h         | 358 ++++++++++++++++++++
> >   arch/loongarch/include/asm/barrier.h        |  51 +++
> >   arch/loongarch/include/asm/bitops.h         |  33 ++
> >   arch/loongarch/include/asm/bitrev.h         |  34 ++
> >   arch/loongarch/include/asm/cmpxchg.h        | 135 ++++++++
> >   arch/loongarch/include/asm/local.h          | 138 ++++++++
> >   arch/loongarch/include/asm/percpu.h         |  20 ++
> >   arch/loongarch/include/asm/spinlock.h       |  12 +
> >   arch/loongarch/include/asm/spinlock_types.h |  11 +
> >   9 files changed, 792 insertions(+)
> >   create mode 100644 arch/loongarch/include/asm/atomic.h
> >   create mode 100644 arch/loongarch/include/asm/barrier.h
> >   create mode 100644 arch/loongarch/include/asm/bitops.h
> >   create mode 100644 arch/loongarch/include/asm/bitrev.h
> >   create mode 100644 arch/loongarch/include/asm/cmpxchg.h
> >   create mode 100644 arch/loongarch/include/asm/local.h
> >   create mode 100644 arch/loongarch/include/asm/percpu.h
> >   create mode 100644 arch/loongarch/include/asm/spinlock.h
> >   create mode 100644 arch/loongarch/include/asm/spinlock_types.h
> >
> > diff --git a/arch/loongarch/include/asm/atomic.h b/arch/loongarch/include/asm/atomic.h
> > new file mode 100644
> > index 000000000000..f0ed7f9c08c9
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/atomic.h
> > @@ -0,0 +1,358 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Atomic operations.
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_ATOMIC_H
> > +#define _ASM_ATOMIC_H
> > +
> > +#include <linux/types.h>
> > +#include <asm/barrier.h>
> > +#include <asm/cmpxchg.h>
> > +#include <asm/compiler.h>
> > +
> > +#if _LOONGARCH_SZLONG == 32
>
> Please don't use the MIPS-like macros, as they *may* go away (once my
> https://github.com/loongson/LoongArch-Documentation/pull/28 is merged);
> you may use the architecture-independent macro __SIZEOF_LONG__ instead
> (this would become "__SIZEOF_LONG__ == 4"). Or use
> __loongarch32/__loongarch64.
OK, thanks.
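
A minimal sketch of what the suggested change could look like, assuming the compiler predefines __SIZEOF_LONG__ (GCC and Clang do). The EX_ macro names are placeholders chosen for this example, not the names used in the patch.

```c
#include <assert.h>

/* Select word-sized mnemonics from the architecture-independent macro. */
#if __SIZEOF_LONG__ == 4
#define EX_LL	"ll.w"
#define EX_SC	"sc.w"
#elif __SIZEOF_LONG__ == 8
#define EX_LL	"ll.d"
#define EX_SC	"sc.d"
#else
#error "unsupported size of long"
#endif

/* Sanity check: the predefined macro agrees with sizeof(long). */
static_assert(sizeof(long) == __SIZEOF_LONG__, "__SIZEOF_LONG__ mismatch");

/* Hypothetical accessors, only here so the selection can be inspected. */
static inline const char *ex_ll_mnemonic(void) { return EX_LL; }
static inline const char *ex_sc_mnemonic(void) { return EX_SC; }
```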

Huacai
>
> > +#define __LL         "ll.w   "
> > +#define __SC         "sc.w   "
> > +#define __AMADD              "amadd.w        "
> > +#define __AMAND_SYNC "amand_db.w     "
> "__AMADD_DB" would better match the instruction mnemonic... IIRC
> "amadd_sync" is the old LoongISA-era name!
> > +#define __AMOR_SYNC  "amor_db.w      "
> > +#define __AMXOR_SYNC "amxor_db.w     "
> > +#elif _LOONGARCH_SZLONG == 64
> > +#define __LL         "ll.d   "
> > +#define __SC         "sc.d   "
> > +#define __AMADD              "amadd.d        "
> > +#define __AMAND_SYNC "amand_db.d     "
> > +#define __AMOR_SYNC  "amor_db.d      "
> > +#define __AMXOR_SYNC "amxor_db.d     "
> > +#endif
> > +
> > +#define ATOMIC_INIT(i)         { (i) }
> > +
> > +/*
> > + * arch_atomic_read - read atomic variable
> > + * @v: pointer of type atomic_t
> > + *
> > + * Atomically reads the value of @v.
> > + */
> > +#define arch_atomic_read(v)  READ_ONCE((v)->counter)
> > +
> > +/*
> > + * arch_atomic_set - set atomic variable
> > + * @v: pointer of type atomic_t
> > + * @i: required value
> > + *
> > + * Atomically sets the value of @v to @i.
> > + */
> > +#define arch_atomic_set(v, i)        WRITE_ONCE((v)->counter, (i))
> > +
> > +#define ATOMIC_OP(op, I, asm_op)                                     \
> > +static inline void arch_atomic_##op(int i, atomic_t *v)                      \
> > +{                                                                    \
> > +     __asm__ __volatile__(                                           \
> > +     "am"#asm_op"_db.w" " $zero, %1, %0      \n"                     \
> > +     : "+ZB" (v->counter)                                            \
> > +     : "r" (I)                                                       \
> > +     : "memory");                                                    \
> > +}
> > +
> > +#define ATOMIC_OP_RETURN(op, I, asm_op, c_op)                                \
> > +static inline int arch_atomic_##op##_return_relaxed(int i, atomic_t *v)      \
> > +{                                                                    \
> > +     int result;                                                     \
> > +                                                                     \
> > +     __asm__ __volatile__(                                           \
> > +     "am"#asm_op"_db.w" " %1, %2, %0         \n"                     \
> > +     : "+ZB" (v->counter), "=&r" (result)                            \
> > +     : "r" (I)                                                       \
> > +     : "memory");                                                    \
> > +                                                                     \
> > +     return result c_op I;                                           \
> > +}
> > +
> > +#define ATOMIC_FETCH_OP(op, I, asm_op)                                       \
> > +static inline int arch_atomic_fetch_##op##_relaxed(int i, atomic_t *v)       \
> > +{                                                                    \
> > +     int result;                                                     \
> > +                                                                     \
> > +     __asm__ __volatile__(                                           \
> > +     "am"#asm_op"_db.w" " %1, %2, %0         \n"                     \
> > +     : "+ZB" (v->counter), "=&r" (result)                            \
> > +     : "r" (I)                                                       \
> > +     : "memory");                                                    \
> > +                                                                     \
> > +     return result;                                                  \
> > +}
> > +
> > +#define ATOMIC_OPS(op, I, asm_op, c_op)                                      \
> > +     ATOMIC_OP(op, I, asm_op)                                        \
> > +     ATOMIC_OP_RETURN(op, I, asm_op, c_op)                           \
> > +     ATOMIC_FETCH_OP(op, I, asm_op)
> > +
> > +ATOMIC_OPS(add, i, add, +)
> > +ATOMIC_OPS(sub, -i, add, +)
> > +
> > +#define arch_atomic_add_return_relaxed       arch_atomic_add_return_relaxed
> > +#define arch_atomic_sub_return_relaxed       arch_atomic_sub_return_relaxed
> > +#define arch_atomic_fetch_add_relaxed        arch_atomic_fetch_add_relaxed
> > +#define arch_atomic_fetch_sub_relaxed        arch_atomic_fetch_sub_relaxed
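
For readers following the macros above, here is a portable C11 sketch of the same structure (an illustration, not the patch's code): the AM* instruction yields the old value, the *_return variant recomputes the new value as "result c_op I", and sub is expressed as an add of -i. The ex_ names are hypothetical, and memory ordering is not modeled (the default sequentially consistent operations are used).

```c
#include <assert.h>
#include <stdatomic.h>

/* fetch_add: returns the value held before the addition. */
static inline int ex_fetch_add(int i, atomic_int *v)
{
	assert(v);
	return atomic_fetch_add(v, i);
}

/* add_return: old value plus i, mirroring "return result c_op I". */
static inline int ex_add_return(int i, atomic_int *v)
{
	return ex_fetch_add(i, v) + i;
}

/* sub is built from add with a negated operand, as in ATOMIC_OPS(sub, -i, add, +). */
static inline int ex_fetch_sub(int i, atomic_int *v)
{
	return ex_fetch_add(-i, v);
}

static inline int ex_sub_return(int i, atomic_int *v)
{
	return ex_fetch_add(-i, v) + (-i);
}
```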
> > +
> > +#undef ATOMIC_OPS
> > +
> > +#define ATOMIC_OPS(op, I, asm_op)                                    \
> > +     ATOMIC_OP(op, I, asm_op)                                        \
> > +     ATOMIC_FETCH_OP(op, I, asm_op)
> > +
> > +ATOMIC_OPS(and, i, and)
> > +ATOMIC_OPS(or, i, or)
> > +ATOMIC_OPS(xor, i, xor)
> > +
> > +#define arch_atomic_fetch_and_relaxed        arch_atomic_fetch_and_relaxed
> > +#define arch_atomic_fetch_or_relaxed arch_atomic_fetch_or_relaxed
> > +#define arch_atomic_fetch_xor_relaxed        arch_atomic_fetch_xor_relaxed
> > +
> > +#undef ATOMIC_OPS
> > +#undef ATOMIC_FETCH_OP
> > +#undef ATOMIC_OP_RETURN
> > +#undef ATOMIC_OP
> > +
> > +static inline int arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
> > +{
> > +       int prev, rc;
> > +
> > +     __asm__ __volatile__ (
> > +             "0:     ll.w    %[p],  %[c]\n"
> > +             "       beq     %[p],  %[u], 1f\n"
> > +             "       add.w   %[rc], %[p], %[a]\n"
> > +             "       sc.w    %[rc], %[c]\n"
> > +             "       beqz    %[rc], 0b\n"
> > +             "       b       2f\n"
> > +             "1:\n"
> > +             __WEAK_LLSC_MB
> > +             "2:\n"
> > +             : [p]"=&r" (prev), [rc]"=&r" (rc),
> > +               [c]"=ZB" (v->counter)
> > +             : [a]"r" (a), [u]"r" (u)
> > +             : "memory");
> > +
> > +     return prev;
> > +}
> > +#define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
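
The LL/SC loop above has the following semantics, sketched here with C11 atomics under the assumption that a compare-and-swap retry loop is an acceptable stand-in for ll.w/sc.w. ex_fetch_add_unless is a name made up for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Add a to *v unless *v equals u. Returns the value observed before
 * the (possibly skipped) addition, like the "prev" register above.
 */
static inline int ex_fetch_add_unless(atomic_int *v, int a, int u)
{
	assert(v);

	int prev = atomic_load(v);

	/* Retry until the store succeeds or the forbidden value is seen. */
	while (prev != u &&
	       !atomic_compare_exchange_weak(v, &prev, prev + a))
		;	/* prev was reloaded by the failed exchange */

	return prev;
}
```

A caller can tell whether the addition happened by comparing the return value with u.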
> > +
> > +/*
> > + * arch_atomic_sub_if_positive - conditionally subtract integer from atomic variable
> > + * @i: integer value to subtract
> > + * @v: pointer of type atomic_t
> > + *
> > + * Atomically test @v and subtract @i if @v is greater than or equal to @i.
> > + * The function returns the old value of @v minus @i.
> > + */
> > +static inline int arch_atomic_sub_if_positive(int i, atomic_t *v)
> > +{
> > +     int result;
> > +     int temp;
> > +
> > +     if (__builtin_constant_p(i)) {
> > +             __asm__ __volatile__(
> > +             "1:     ll.w    %1, %2          # atomic_sub_if_positive\n"
> > +             "       addi.w  %0, %1, %3                              \n"
> > +             "       or      %1, %0, $zero                           \n"
> > +             "       blt     %0, $zero, 2f                           \n"
> > +             "       sc.w    %1, %2                                  \n"
> > +             "       beq     $zero, %1, 1b                           \n"
> > +             "2:                                                     \n"
> > +             : "=&r" (result), "=&r" (temp),
> > +               "+" GCC_OFF_SMALL_ASM() (v->counter)
> > +             : "I" (-i));
> > +     } else {
> > +             __asm__ __volatile__(
> > +             "1:     ll.w    %1, %2          # atomic_sub_if_positive\n"
> > +             "       sub.w   %0, %1, %3                              \n"
> > +             "       or      %1, %0, $zero                           \n"
> > +             "       blt     %0, $zero, 2f                           \n"
> > +             "       sc.w    %1, %2                                  \n"
> > +             "       beq     $zero, %1, 1b                           \n"
> > +             "2:                                                     \n"
> > +             : "=&r" (result), "=&r" (temp),
> > +               "+" GCC_OFF_SMALL_ASM() (v->counter)
> > +             : "r" (i));
> > +     }
> > +
> > +     return result;
> > +}
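
As a reading aid, the behaviour of the function above can be expressed in portable C11 (a sketch under the assumption that a compare-and-swap loop models the ll.w/sc.w sequence; ex_sub_if_positive is a hypothetical name): compute old - i, store it only when it is not negative, and return it either way.

```c
#include <assert.h>
#include <stdatomic.h>

static inline int ex_sub_if_positive(int i, atomic_int *v)
{
	assert(v);

	int old = atomic_load(v);
	int result;

	do {
		result = old - i;
		if (result < 0)
			break;	/* would go negative: leave *v untouched */
	} while (!atomic_compare_exchange_weak(v, &old, result));

	/* Old value minus i, whether or not the store happened. */
	return result;
}
```

A negative return value therefore signals that the counter was left unchanged.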
> > +
> > +#define arch_atomic_cmpxchg(v, o, n) (arch_cmpxchg(&((v)->counter), (o), (n)))
> > +#define arch_atomic_xchg(v, new) (arch_xchg(&((v)->counter), (new)))
> > +
> > +/*
> > + * arch_atomic_dec_if_positive - decrement by 1 if old value positive
> > + * @v: pointer of type atomic_t
> > + */
> > +#define arch_atomic_dec_if_positive(v)       arch_atomic_sub_if_positive(1, v)
> > +
> > +#ifdef CONFIG_64BIT
> > +
> > +#define ATOMIC64_INIT(i)    { (i) }
> > +
> > +/*
> > + * arch_atomic64_read - read atomic variable
> > + * @v: pointer of type atomic64_t
> > + *
> > + */
> > +#define arch_atomic64_read(v)        READ_ONCE((v)->counter)
> > +
> > +/*
> > + * arch_atomic64_set - set atomic variable
> > + * @v: pointer of type atomic64_t
> > + * @i: required value
> > + */
> > +#define arch_atomic64_set(v, i)      WRITE_ONCE((v)->counter, (i))
> > +
> > +#define ATOMIC64_OP(op, I, asm_op)                                   \
> > +static inline void arch_atomic64_##op(long i, atomic64_t *v)         \
> > +{                                                                    \
> > +     __asm__ __volatile__(                                           \
> > +     "am"#asm_op"_db.d " " $zero, %1, %0     \n"                     \
> > +     : "+ZB" (v->counter)                                            \
> > +     : "r" (I)                                                       \
> > +     : "memory");                                                    \
> > +}
> > +
> > +#define ATOMIC64_OP_RETURN(op, I, asm_op, c_op)                                      \
> > +static inline long arch_atomic64_##op##_return_relaxed(long i, atomic64_t *v)        \
> > +{                                                                            \
> > +     long result;                                                            \
> > +     __asm__ __volatile__(                                                   \
> > +     "am"#asm_op"_db.d " " %1, %2, %0                \n"                     \
> > +     : "+ZB" (v->counter), "=&r" (result)                                    \
> > +     : "r" (I)                                                               \
> > +     : "memory");                                                            \
> > +                                                                             \
> > +     return result c_op I;                                                   \
> > +}
> > +
> > +#define ATOMIC64_FETCH_OP(op, I, asm_op)                                     \
> > +static inline long arch_atomic64_fetch_##op##_relaxed(long i, atomic64_t *v) \
> > +{                                                                            \
> > +     long result;                                                            \
> > +                                                                             \
> > +     __asm__ __volatile__(                                                   \
> > +     "am"#asm_op"_db.d " " %1, %2, %0                \n"                     \
> > +     : "+ZB" (v->counter), "=&r" (result)                                    \
> > +     : "r" (I)                                                               \
> > +     : "memory");                                                            \
> > +                                                                             \
> > +     return result;                                                          \
> > +}
> > +
> > +#define ATOMIC64_OPS(op, I, asm_op, c_op)                                  \
> > +     ATOMIC64_OP(op, I, asm_op)                                            \
> > +     ATOMIC64_OP_RETURN(op, I, asm_op, c_op)                               \
> > +     ATOMIC64_FETCH_OP(op, I, asm_op)
> > +
> > +ATOMIC64_OPS(add, i, add, +)
> > +ATOMIC64_OPS(sub, -i, add, +)
> > +
> > +#define arch_atomic64_add_return_relaxed     arch_atomic64_add_return_relaxed
> > +#define arch_atomic64_sub_return_relaxed     arch_atomic64_sub_return_relaxed
> > +#define arch_atomic64_fetch_add_relaxed              arch_atomic64_fetch_add_relaxed
> > +#define arch_atomic64_fetch_sub_relaxed              arch_atomic64_fetch_sub_relaxed
> > +
> > +#undef ATOMIC64_OPS
> > +
> > +#define ATOMIC64_OPS(op, I, asm_op)                                        \
> > +     ATOMIC64_OP(op, I, asm_op)                                            \
> > +     ATOMIC64_FETCH_OP(op, I, asm_op)
> > +
> > +ATOMIC64_OPS(and, i, and)
> > +ATOMIC64_OPS(or, i, or)
> > +ATOMIC64_OPS(xor, i, xor)
> > +
> > +#define arch_atomic64_fetch_and_relaxed      arch_atomic64_fetch_and_relaxed
> > +#define arch_atomic64_fetch_or_relaxed       arch_atomic64_fetch_or_relaxed
> > +#define arch_atomic64_fetch_xor_relaxed      arch_atomic64_fetch_xor_relaxed
> > +
> > +#undef ATOMIC64_OPS
> > +#undef ATOMIC64_FETCH_OP
> > +#undef ATOMIC64_OP_RETURN
> > +#undef ATOMIC64_OP
> > +
> > +static inline long arch_atomic64_fetch_add_unless(atomic64_t *v, long a, long u)
> > +{
> > +       long prev, rc;
> > +
> > +     __asm__ __volatile__ (
> > +             "0:     ll.d    %[p],  %[c]\n"
> > +             "       beq     %[p],  %[u], 1f\n"
> > +             "       add.d   %[rc], %[p], %[a]\n"
> > +             "       sc.d    %[rc], %[c]\n"
> > +             "       beqz    %[rc], 0b\n"
> > +             "       b       2f\n"
> > +             "1:\n"
> > +             __WEAK_LLSC_MB
> > +             "2:\n"
> > +             : [p]"=&r" (prev), [rc]"=&r" (rc),
> > +               [c] "=ZB" (v->counter)
> > +             : [a]"r" (a), [u]"r" (u)
> > +             : "memory");
> > +
> > +     return prev;
> > +}
> > +#define arch_atomic64_fetch_add_unless arch_atomic64_fetch_add_unless
> > +
> > +/*
> > + * arch_atomic64_sub_if_positive - conditionally subtract integer from atomic variable
> > + * @i: integer value to subtract
> > + * @v: pointer of type atomic64_t
> > + *
> > + * Atomically test @v and subtract @i if @v is greater than or equal to @i.
> > + * The function returns the old value of @v minus @i.
> > + */
> > +static inline long arch_atomic64_sub_if_positive(long i, atomic64_t *v)
> > +{
> > +     long result;
> > +     long temp;
> > +
> > +     if (__builtin_constant_p(i)) {
> > +             __asm__ __volatile__(
> > +             "1:     ll.d    %1, %2  # atomic64_sub_if_positive      \n"
> > +             "       addi.d  %0, %1, %3                              \n"
> > +             "       or      %1, %0, $zero                           \n"
> > +             "       blt     %0, $zero, 2f                           \n"
> > +             "       sc.d    %1, %2                                  \n"
> > +             "       beq     %1, $zero, 1b                           \n"
> > +             "2:                                                     \n"
> > +             : "=&r" (result), "=&r" (temp),
> > +               "+" GCC_OFF_SMALL_ASM() (v->counter)
> > +             : "I" (-i));
> > +     } else {
> > +             __asm__ __volatile__(
> > +             "1:     ll.d    %1, %2  # atomic64_sub_if_positive      \n"
> > +             "       sub.d   %0, %1, %3                              \n"
> > +             "       or      %1, %0, $zero                           \n"
> > +             "       blt     %0, $zero, 2f                           \n"
> > +             "       sc.d    %1, %2                                  \n"
> > +             "       beq     %1, $zero, 1b                           \n"
> > +             "2:                                                     \n"
> > +             : "=&r" (result), "=&r" (temp),
> > +               "+" GCC_OFF_SMALL_ASM() (v->counter)
> > +             : "r" (i));
> > +     }
> > +
> > +     return result;
> > +}
> > +
> > +#define arch_atomic64_cmpxchg(v, o, n) \
> > +     ((__typeof__((v)->counter))arch_cmpxchg(&((v)->counter), (o), (n)))
> > +#define arch_atomic64_xchg(v, new) (arch_xchg(&((v)->counter), (new)))
> > +
> > +/*
> > + * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
> > + * @v: pointer of type atomic64_t
> > + */
> > +#define arch_atomic64_dec_if_positive(v)     arch_atomic64_sub_if_positive(1, v)
> > +
> > +#endif /* CONFIG_64BIT */
> > +
> > +#endif /* _ASM_ATOMIC_H */
> > diff --git a/arch/loongarch/include/asm/barrier.h b/arch/loongarch/include/asm/barrier.h
> > new file mode 100644
> > index 000000000000..cc6c7e3f5ce6
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/barrier.h
> > @@ -0,0 +1,51 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __ASM_BARRIER_H
> > +#define __ASM_BARRIER_H
> > +
> > +#define __sync()     __asm__ __volatile__("dbar 0" : : : "memory")
> > +
> > +#define fast_wmb()   __sync()
> > +#define fast_rmb()   __sync()
> > +#define fast_mb()    __sync()
> > +#define fast_iob()   __sync()
> > +#define wbflush()    __sync()
> > +
> > +#define wmb()                fast_wmb()
> > +#define rmb()                fast_rmb()
> > +#define mb()         fast_mb()
> > +#define iob()                fast_iob()
> > +
> > +/**
> > + * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
> > + * @index: array element index
> > + * @size: number of elements in array
> > + *
> > + * Returns:
> > + *     0 - (@index < @size)
> > + */
> > +#define array_index_mask_nospec array_index_mask_nospec
> > +static inline unsigned long array_index_mask_nospec(unsigned long index,
> > +                                                 unsigned long size)
> > +{
> > +     unsigned long mask;
> > +
> > +     __asm__ __volatile__(
> > +             "sltu   %0, %1, %2\n\t"
> > +#if (_LOONGARCH_SZLONG == 32)
> > +             "sub.w  %0, $r0, %0\n\t"
> > +#elif (_LOONGARCH_SZLONG == 64)
> > +             "sub.d  %0, $r0, %0\n\t"
> > +#endif
> > +             : "=r" (mask)
> > +             : "r" (index), "r" (size)
> > +             :);
> > +
> > +     return mask;
> > +}
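
The sltu/sub pair above computes 0 - (index < size). A plain C rendering of that arithmetic is shown below for illustration; the ex_ names are hypothetical. Note the caveat: a compiler is free to turn the C comparison into a branch, which is presumably why the patch uses inline assembly, so this sketch demonstrates the mask value only and is not a speculation barrier.

```c
#include <assert.h>

/* All-ones when index < size, zero otherwise. */
static inline unsigned long ex_index_mask(unsigned long index,
					  unsigned long size)
{
	return 0UL - (unsigned long)(index < size);
}

/* Typical use: bounds check first, then mask the index before the load. */
static inline int ex_load_clamped(const int *arr, unsigned long index,
				  unsigned long size)
{
	assert(size > 0);

	if (index >= size)
		return -1;

	return arr[index & ex_index_mask(index, size)];
}
```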
> > +
> > +#include <asm-generic/barrier.h>
> > +
> > +#endif /* __ASM_BARRIER_H */
> > diff --git a/arch/loongarch/include/asm/bitops.h b/arch/loongarch/include/asm/bitops.h
> > new file mode 100644
> > index 000000000000..69e00f8d8034
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/bitops.h
> > @@ -0,0 +1,33 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_BITOPS_H
> > +#define _ASM_BITOPS_H
> > +
> > +#include <linux/compiler.h>
> > +
> > +#ifndef _LINUX_BITOPS_H
> > +#error only <linux/bitops.h> can be included directly
> > +#endif
> > +
> > +#include <asm/barrier.h>
> > +
> > +#include <asm-generic/bitops/builtin-ffs.h>
> > +#include <asm-generic/bitops/builtin-fls.h>
> > +#include <asm-generic/bitops/builtin-__ffs.h>
> > +#include <asm-generic/bitops/builtin-__fls.h>
> > +
> > +#include <asm-generic/bitops/ffz.h>
> > +#include <asm-generic/bitops/fls64.h>
> > +
> > +#include <asm-generic/bitops/sched.h>
> > +#include <asm-generic/bitops/hweight.h>
> > +
> > +#include <asm-generic/bitops/atomic.h>
> > +#include <asm-generic/bitops/non-atomic.h>
> > +#include <asm-generic/bitops/lock.h>
> > +#include <asm-generic/bitops/le.h>
> > +#include <asm-generic/bitops/ext2-atomic.h>
> > +
> > +#endif /* _ASM_BITOPS_H */
> > diff --git a/arch/loongarch/include/asm/bitrev.h b/arch/loongarch/include/asm/bitrev.h
> > new file mode 100644
> > index 000000000000..46f275b9cdf7
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/bitrev.h
> > @@ -0,0 +1,34 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __LOONGARCH_ASM_BITREV_H__
> > +#define __LOONGARCH_ASM_BITREV_H__
> > +
> > +#include <linux/swab.h>
> > +
> > +static __always_inline __attribute_const__ u32 __arch_bitrev32(u32 x)
> > +{
> > +     u32 ret;
> > +
> > +     asm("bitrev.4b  %0, %1" : "=r"(ret) : "r"(__swab32(x)));
> > +     return ret;
> > +}
> > +
> > +static __always_inline __attribute_const__ u16 __arch_bitrev16(u16 x)
> > +{
> > +     u16 ret;
> > +
> > +     asm("bitrev.4b  %0, %1" : "=r"(ret) : "r"(__swab16(x)));
> > +     return ret;
> > +}
> > +
> > +static __always_inline __attribute_const__ u8 __arch_bitrev8(u8 x)
> > +{
> > +     u8 ret;
> > +
> > +     asm("bitrev.4b  %0, %1" : "=r"(ret) : "r"(x));
> > +     return ret;
> > +}
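
To see why a byte swap followed by bitrev.4b gives a full bit reversal: bitrev.4b reverses the bits within each byte, and the swap reverses the byte order, so together every bit lands in its mirrored position. A portable C model of that composition (illustrative only, with hypothetical ex_ names) is:

```c
#include <assert.h>
#include <stdint.h>

/* Model of bitrev.4b: reverse the bits inside each of the four bytes. */
static inline uint32_t ex_bitrev_4b(uint32_t x)
{
	x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
	x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
	x = ((x & 0x0f0f0f0fu) << 4) | ((x >> 4) & 0x0f0f0f0fu);
	return x;
}

/* Model of __swab32(): reverse the byte order. */
static inline uint32_t ex_swab32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Full 32-bit reversal, composed as in __arch_bitrev32(). */
static inline uint32_t ex_bitrev32(uint32_t x)
{
	uint32_t r = ex_bitrev_4b(ex_swab32(x));

	/* Reversing twice must give back the input. */
	assert(ex_bitrev_4b(ex_swab32(r)) == x);
	return r;
}
```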
> > +
> > +#endif /* __LOONGARCH_ASM_BITREV_H__ */
> > diff --git a/arch/loongarch/include/asm/cmpxchg.h b/arch/loongarch/include/asm/cmpxchg.h
> > new file mode 100644
> > index 000000000000..69c3e2b7827d
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/cmpxchg.h
> > @@ -0,0 +1,135 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __ASM_CMPXCHG_H
> > +#define __ASM_CMPXCHG_H
> > +
> > +#include <linux/build_bug.h>
> > +
> > +#define __xchg_asm(amswap_db, m, val)                \
> > +({                                           \
> > +             __typeof(val) __ret;            \
> > +                                             \
> > +             __asm__ __volatile__ (          \
> > +             " "amswap_db" %1, %z2, %0 \n"   \
> > +             : "+ZB" (*m), "=&r" (__ret)     \
> > +             : "Jr" (val)                    \
> > +             : "memory");                    \
> > +                                             \
> > +             __ret;                          \
> > +})
> > +
> > +extern unsigned long __xchg_small(volatile void *ptr, unsigned long x,
> > +                               unsigned int size);
> > +
> > +static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
> > +                                int size)
> > +{
> > +     switch (size) {
> > +     case 1:
> > +     case 2:
> > +             return __xchg_small(ptr, x, size);
> > +
> > +     case 4:
> > +             return __xchg_asm("amswap_db.w", (volatile u32 *)ptr, (u32)x);
> > +
> > +     case 8:
> > +             return __xchg_asm("amswap_db.d", (volatile u64 *)ptr, (u64)x);
> > +
> > +     default:
> > +             BUILD_BUG();
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +#define arch_xchg(ptr, x)                                            \
> > +({                                                                   \
> > +     __typeof__(*(ptr)) __res;                                       \
> > +                                                                     \
> > +     __res = (__typeof__(*(ptr)))                                    \
> > +             __xchg((ptr), (unsigned long)(x), sizeof(*(ptr)));      \
> > +                                                                     \
> > +     __res;                                                          \
> > +})
> > +
> > +#define __cmpxchg_asm(ld, st, m, old, new)                           \
> > +({                                                                   \
> > +     __typeof(old) __ret;                                            \
> > +                                                                     \
> > +     __asm__ __volatile__(                                           \
> > +     "1:     " ld "  %0, %2          # __cmpxchg_asm \n"             \
> > +     "       bne     %0, %z3, 2f                     \n"             \
> > +     "       or      $t0, %z4, $zero                 \n"             \
> > +     "       " st "  $t0, %1                         \n"             \
> > +     "       beq     $zero, $t0, 1b                  \n"             \
> > +     "2:                                             \n"             \
> > +     : "=&r" (__ret), "=ZB"(*m)                                      \
> > +     : "ZB"(*m), "Jr" (old), "Jr" (new)                              \
> > +     : "t0", "memory");                                              \
> > +                                                                     \
> > +     __ret;                                                          \
> > +})
> > +
> > +extern unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
> > +                                  unsigned long new, unsigned int size);
> > +
> > +static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
> > +                                   unsigned long new, unsigned int size)
> > +{
> > +     switch (size) {
> > +     case 1:
> > +     case 2:
> > +             return __cmpxchg_small(ptr, old, new, size);
> > +
> > +     case 4:
> > +             return __cmpxchg_asm("ll.w", "sc.w", (volatile u32 *)ptr,
> > +                                  (u32)old, new);
> > +
> > +     case 8:
> > +             return __cmpxchg_asm("ll.d", "sc.d", (volatile u64 *)ptr,
> > +                                  (u64)old, new);
> > +
> > +     default:
> > +             BUILD_BUG();
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +#define arch_cmpxchg_local(ptr, old, new)                            \
> > +     ((__typeof__(*(ptr)))                                           \
> > +             __cmpxchg((ptr),                                        \
> > +                       (unsigned long)(__typeof__(*(ptr)))(old),     \
> > +                       (unsigned long)(__typeof__(*(ptr)))(new),     \
> > +                       sizeof(*(ptr))))
> > +
> > +#define arch_cmpxchg(ptr, old, new)                                  \
> > +({                                                                   \
> > +     __typeof__(*(ptr)) __res;                                       \
> > +                                                                     \
> > +     __res = arch_cmpxchg_local((ptr), (old), (new));                \
> > +                                                                     \
> > +     __res;                                                          \
> > +})
> > +
> > +#ifdef CONFIG_64BIT
> > +#define arch_cmpxchg64_local(ptr, o, n)                                      \
> > +  ({                                                                 \
> > +     BUILD_BUG_ON(sizeof(*(ptr)) != 8);                              \
> > +     arch_cmpxchg_local((ptr), (o), (n));                            \
> > +  })
> > +
> > +#define arch_cmpxchg64(ptr, o, n)                                    \
> > +  ({                                                                 \
> > +     BUILD_BUG_ON(sizeof(*(ptr)) != 8);                              \
> > +     arch_cmpxchg((ptr), (o), (n));                                  \
> > +  })
> > +#else
> > +#include <asm-generic/cmpxchg-local.h>
> > +#define arch_cmpxchg64_local(ptr, o, n) __generic_cmpxchg64_local((ptr), (o), (n))
> > +#define arch_cmpxchg64(ptr, o, n) arch_cmpxchg64_local((ptr), (o), (n))
> > +#endif
> > +
> > +#endif /* __ASM_CMPXCHG_H */
> > diff --git a/arch/loongarch/include/asm/local.h b/arch/loongarch/include/asm/local.h
> > new file mode 100644
> > index 000000000000..2052a2267337
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/local.h
> > @@ -0,0 +1,138 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ARCH_LOONGARCH_LOCAL_H
> > +#define _ARCH_LOONGARCH_LOCAL_H
> > +
> > +#include <linux/percpu.h>
> > +#include <linux/bitops.h>
> > +#include <linux/atomic.h>
> > +#include <asm/cmpxchg.h>
> > +#include <asm/compiler.h>
> > +
> > +typedef struct {
> > +     atomic_long_t a;
> > +} local_t;
> > +
> > +#define LOCAL_INIT(i)        { ATOMIC_LONG_INIT(i) }
> > +
> > +#define local_read(l)        atomic_long_read(&(l)->a)
> > +#define local_set(l, i) atomic_long_set(&(l)->a, (i))
> > +
> > +#define local_add(i, l) atomic_long_add((i), (&(l)->a))
> > +#define local_sub(i, l) atomic_long_sub((i), (&(l)->a))
> > +#define local_inc(l) atomic_long_inc(&(l)->a)
> > +#define local_dec(l) atomic_long_dec(&(l)->a)
> > +
> > +/*
> > + * Same as above, but return the result value
> > + */
> > +static inline long local_add_return(long i, local_t *l)
> > +{
> > +     unsigned long result;
> > +
> > +     __asm__ __volatile__(
> > +     "   " __AMADD " %1, %2, %0      \n"
> > +     : "+ZB" (l->a.counter), "=&r" (result)
> > +     : "r" (i)
> > +     : "memory");
> > +     result = result + i;
> > +
> > +     return result;
> > +}
> > +
> > +static inline long local_sub_return(long i, local_t *l)
> > +{
> > +     unsigned long result;
> > +
> > +     __asm__ __volatile__(
> > +     "   " __AMADD " %1, %2, %0      \n"
> > +     : "+ZB" (l->a.counter), "=&r" (result)
> > +     : "r" (-i)
> > +     : "memory");
> > +
> > +     result = result - i;
> > +
> > +     return result;
> > +}
> > +
> > +#define local_cmpxchg(l, o, n) \
> > +     ((long)cmpxchg_local(&((l)->a.counter), (o), (n)))
> > +#define local_xchg(l, n) (atomic_long_xchg((&(l)->a), (n)))
> > +
> > +/**
> > + * local_add_unless - add unless the number is a given value
> > + * @l: pointer of type local_t
> > + * @a: the amount to add to l...
> > + * @u: ...unless l is equal to u.
> > + *
> > + * Atomically adds @a to @l, so long as it was not @u.
> > + * Returns non-zero if @l was not @u, and zero otherwise.
> > + */
> > +#define local_add_unless(l, a, u)                            \
> > +({                                                           \
> > +     long c, old;                                            \
> > +     c = local_read(l);                                      \
> > +     while (c != (u) && (old = local_cmpxchg((l), c, c + (a))) != c) \
> > +             c = old;                                        \
> > +     c != (u);                                               \
> > +})
> > +#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
> > +
> > +#define local_dec_return(l) local_sub_return(1, (l))
> > +#define local_inc_return(l) local_add_return(1, (l))
> > +
> > +/*
> > + * local_sub_and_test - subtract value from variable and test result
> > + * @i: integer value to subtract
> > + * @l: pointer of type local_t
> > + *
> > + * Atomically subtracts @i from @l and returns
> > + * true if the result is zero, or false for all
> > + * other cases.
> > + */
> > +#define local_sub_and_test(i, l) (local_sub_return((i), (l)) == 0)
> > +
> > +/*
> > + * local_inc_and_test - increment and test
> > + * @l: pointer of type local_t
> > + *
> > + * Atomically increments @l by 1
> > + * and returns true if the result is zero, or false for all
> > + * other cases.
> > + */
> > +#define local_inc_and_test(l) (local_inc_return(l) == 0)
> > +
> > +/*
> > + * local_dec_and_test - decrement by 1 and test
> > + * @l: pointer of type local_t
> > + *
> > + * Atomically decrements @l by 1 and
> > + * returns true if the result is 0, or false for all other
> > + * cases.
> > + */
> > +#define local_dec_and_test(l) (local_sub_return(1, (l)) == 0)
> > +
> > +/*
> > + * local_add_negative - add and test if negative
> > + * @l: pointer of type local_t
> > + * @i: integer value to add
> > + *
> > + * Atomically adds @i to @l and returns true
> > + * if the result is negative, or false when
> > + * result is greater than or equal to zero.
> > + */
> > +#define local_add_negative(i, l) (local_add_return(i, (l)) < 0)
> > +
> > +/* Use these for per-cpu local_t variables: on some archs they are
> > + * much more efficient than these naive implementations.  Note they take
> > + * a variable, not an address.
> > + */
> > +
> > +#define __local_inc(l)               ((l)->a.counter++)
> > +#define __local_dec(l)               ((l)->a.counter--)
> > +#define __local_add(i, l)    ((l)->a.counter += (i))
> > +#define __local_sub(i, l)    ((l)->a.counter -= (i))
> > +
> > +#endif /* _ARCH_LOONGARCH_LOCAL_H */
> > diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
> > new file mode 100644
> > index 000000000000..7d5b22ebd834
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/percpu.h
> > @@ -0,0 +1,20 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __ASM_PERCPU_H
> > +#define __ASM_PERCPU_H
> > +
> > +/* Use r21 for fast access */
> > +register unsigned long __my_cpu_offset __asm__("$r21");
> > +
> > +static inline void set_my_cpu_offset(unsigned long off)
> > +{
> > +     __my_cpu_offset = off;
> > +     csr_writeq(off, PERCPU_BASE_KS);
> > +}
> > +#define __my_cpu_offset __my_cpu_offset
> > +
> > +#include <asm-generic/percpu.h>
> > +
> > +#endif /* __ASM_PERCPU_H */
> > diff --git a/arch/loongarch/include/asm/spinlock.h b/arch/loongarch/include/asm/spinlock.h
> > new file mode 100644
> > index 000000000000..7cb3476999be
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/spinlock.h
> > @@ -0,0 +1,12 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_SPINLOCK_H
> > +#define _ASM_SPINLOCK_H
> > +
> > +#include <asm/processor.h>
> > +#include <asm/qspinlock.h>
> > +#include <asm/qrwlock.h>
> > +
> > +#endif /* _ASM_SPINLOCK_H */
> > diff --git a/arch/loongarch/include/asm/spinlock_types.h b/arch/loongarch/include/asm/spinlock_types.h
> > new file mode 100644
> > index 000000000000..7458d036c161
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/spinlock_types.h
> > @@ -0,0 +1,11 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_SPINLOCK_TYPES_H
> > +#define _ASM_SPINLOCK_TYPES_H
> > +
> > +#include <asm-generic/qspinlock_types.h>
> > +#include <asm-generic/qrwlock_types.h>
> > +
> > +#endif
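
For readers following the local_t helpers quoted above: the AM*
instruction returns the old memory value, so the *_return variants
compute "old + i" afterwards, and local_add_unless() is a
compare-and-exchange retry loop. A minimal user-space sketch of the same
two ideas, assuming C11 <stdatomic.h> in place of the kernel primitives
(model_local_t, model_add_return and model_add_unless are hypothetical
names, not part of the patch):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for local_t; not the kernel type. */
typedef struct {
	atomic_long a;
} model_local_t;

/* fetch_add hands back the old value, so the new value is old + i. */
static inline long model_add_return(long i, model_local_t *l)
{
	long old = atomic_fetch_add(&l->a, i);

	return old + i;
}

/* Add a to l unless l equals u; returns true if the add happened. */
static inline bool model_add_unless(model_local_t *l, long a, long u)
{
	long c = atomic_load(&l->a);

	while (c != u) {
		/* On failure, c is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&l->a, &c, c + a))
			return true;
	}
	return false;
}
```

This only models the semantics; it says nothing about the ordering or
per-CPU guarantees the kernel versions rely on.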

* Re: [PATCH V9 08/24] LoongArch: Add other common headers
  2022-05-01 11:39   ` WANG Xuerui
@ 2022-05-01 14:26     ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01 14:26 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Jiaxun Yang

Hi, Xuerui,

On Sun, May 1, 2022 at 7:39 PM WANG Xuerui <kernel@xen0n.name> wrote:
>
>
> On 4/30/22 17:05, Huacai Chen wrote:
> > This patch adds some other common headers for basic LoongArch support.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >   arch/loongarch/include/asm/asm-prototypes.h   |   7 +
> >   arch/loongarch/include/asm/asm.h              | 190 +++++++++++
> >   arch/loongarch/include/asm/asmmacro.h         | 294 ++++++++++++++++++
> >   arch/loongarch/include/asm/clocksource.h      |  12 +
> >   arch/loongarch/include/asm/compiler.h         |  15 +
> >   arch/loongarch/include/asm/inst.h             |  63 ++++
> >   arch/loongarch/include/asm/linkage.h          |  36 +++
> >   arch/loongarch/include/asm/perf_event.h       |  10 +
> >   arch/loongarch/include/asm/prefetch.h         |  29 ++
> >   arch/loongarch/include/asm/serial.h           |  11 +
> >   arch/loongarch/include/asm/time.h             |  50 +++
> >   arch/loongarch/include/asm/timex.h            |  31 ++
> >   arch/loongarch/include/asm/topology.h         |  15 +
> >   arch/loongarch/include/asm/types.h            |  33 ++
> >   arch/loongarch/include/uapi/asm/bitfield.h    |  15 +
> >   arch/loongarch/include/uapi/asm/bitsperlong.h |   9 +
> >   arch/loongarch/include/uapi/asm/byteorder.h   |  13 +
> >   arch/loongarch/include/uapi/asm/inst.h        |  57 ++++
> >   arch/loongarch/include/uapi/asm/reg.h         |  59 ++++
> >   tools/include/uapi/asm/bitsperlong.h          |   2 +
> >   20 files changed, 951 insertions(+)
> >   create mode 100644 arch/loongarch/include/asm/asm-prototypes.h
> >   create mode 100644 arch/loongarch/include/asm/asm.h
> >   create mode 100644 arch/loongarch/include/asm/asmmacro.h
> >   create mode 100644 arch/loongarch/include/asm/clocksource.h
> >   create mode 100644 arch/loongarch/include/asm/compiler.h
> >   create mode 100644 arch/loongarch/include/asm/inst.h
> >   create mode 100644 arch/loongarch/include/asm/linkage.h
> >   create mode 100644 arch/loongarch/include/asm/perf_event.h
> >   create mode 100644 arch/loongarch/include/asm/prefetch.h
> >   create mode 100644 arch/loongarch/include/asm/serial.h
> >   create mode 100644 arch/loongarch/include/asm/time.h
> >   create mode 100644 arch/loongarch/include/asm/timex.h
> >   create mode 100644 arch/loongarch/include/asm/topology.h
> >   create mode 100644 arch/loongarch/include/asm/types.h
> >   create mode 100644 arch/loongarch/include/uapi/asm/bitfield.h
> >   create mode 100644 arch/loongarch/include/uapi/asm/bitsperlong.h
> >   create mode 100644 arch/loongarch/include/uapi/asm/byteorder.h
> >   create mode 100644 arch/loongarch/include/uapi/asm/inst.h
> >   create mode 100644 arch/loongarch/include/uapi/asm/reg.h
> >
> > diff --git a/arch/loongarch/include/asm/asm-prototypes.h b/arch/loongarch/include/asm/asm-prototypes.h
> > new file mode 100644
> > index 000000000000..ed06d3997420
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/asm-prototypes.h
> > @@ -0,0 +1,7 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#include <linux/uaccess.h>
> > +#include <asm/fpu.h>
> > +#include <asm/mmu_context.h>
> > +#include <asm/page.h>
> > +#include <asm/ftrace.h>
> > +#include <asm-generic/asm-prototypes.h>
> > diff --git a/arch/loongarch/include/asm/asm.h b/arch/loongarch/include/asm/asm.h
> > new file mode 100644
> > index 000000000000..6de8f9e6a21e
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/asm.h
> > @@ -0,0 +1,190 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Some useful macros for LoongArch assembler code
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + *
> > + * Derived from MIPS:
> > + * Copyright (C) 1995, 1996, 1997, 1999, 2001 by Ralf Baechle
> > + * Copyright (C) 1999 by Silicon Graphics, Inc.
> > + * Copyright (C) 2001 MIPS Technologies, Inc.
> > + * Copyright (C) 2002  Maciej W. Rozycki
> > + */
> > +#ifndef __ASM_ASM_H
> > +#define __ASM_ASM_H
> > +
> > +/* LoongArch pref instruction. */
> > +#ifdef CONFIG_CPU_HAS_PREFETCH
> > +
> > +#define PREF(hint, addr, offs)                               \
> > +             preld   hint, addr, offs;               \
> > +
> > +#define PREFX(hint, addr, index)                     \
> > +             preldx  hint, addr, index;              \
> > +
> > +#else /* !CONFIG_CPU_HAS_PREFETCH */
> > +
> > +#define PREF(hint, addr, offs)
> > +#define PREFX(hint, addr, index)
> > +
> > +#endif /* !CONFIG_CPU_HAS_PREFETCH */
> > +
> > +/*
> > + * Stack alignment
> > + */
> > +#define ALSZ 0xf
> > +#define ALMASK       ~ALSZ
> Name is too ugly... why not simply "STACK_ALIGNMENT"?
OK, thanks.
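
As a rough illustration of what the renamed constant could look like
when used from C, here is a small sketch. The names STACK_ALIGNMENT,
STACK_ALIGN_MASK and stack_align_down() are assumptions for this
example only; the 16-byte value is taken from the 0xf mask in the
quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the patch under review calls these ALSZ and ALMASK. */
#define STACK_ALIGNMENT		16UL
#define STACK_ALIGN_MASK	(~(STACK_ALIGNMENT - 1))

/* Round a stack pointer value down to the alignment boundary. */
static inline uintptr_t stack_align_down(uintptr_t sp)
{
	return sp & (uintptr_t)STACK_ALIGN_MASK;
}
```

Expressing the mask in terms of the alignment keeps the two from
drifting apart if the alignment ever changes.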

> > +
> > +/*
> > + * Macros to handle different pointer/register sizes for 32/64-bit code
> > + */
> > +
> > +/*
> > + * Size of a register
> > + */
> > +#ifndef __loongarch64
> > +#define SZREG        4
> > +#else
> > +#define SZREG        8
> > +#endif
> Better use something like the __REG_SEL in arch/riscv (and for all
> definitions below). This way we don't have to repeat the symbol name twice.
> > +
> > +/*
> > + * Use the following macros in assembler code to load/store registers,
> > + * pointers etc.
> > + */
> > +#if (SZREG == 4)
> > +#define REG_L                ld.w
> > +#define REG_S                st.w
> > +#define REG_ADDU     add.w
> > +#define REG_SUBU     sub.w
>
> Please don't use "ADDU"; just "ADD". The U suffix clearly means "unsigned"
> in LoongArch instruction mnemonics, while in MIPS "addu" the "u"
> actually means "unchecked for overflow" (see the MIPS manual about this
> misnomer).
>
> Similarly for "SUBU".
OK, thanks.
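
For context on why the suffix carries no information here: a wrapping
(non-trapping) add produces the same bit pattern whether the operands
are read as signed or unsigned. A small C sketch of that point, with
add_wrap32() being a hypothetical helper for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Add two 32-bit patterns with wraparound, as a non-trapping add does. */
static inline uint32_t add_wrap32(uint32_t a, uint32_t b)
{
	return a + b;	/* unsigned arithmetic wraps modulo 2^32 */
}
```

Feeding it the two's-complement patterns for -5 and 3 yields the pattern
for -2, so one "add" covers both interpretations.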

>
> > +#else /* SZREG == 8 */
> > +#define REG_L                ld.d
> > +#define REG_S                st.d
> > +#define REG_ADDU     add.d
> > +#define REG_SUBU     sub.d
> > +#endif
> > +
> > +/*
> > + * How to add/sub/load/store/shift C int variables.
> > + */
> > +#if (_LOONGARCH_SZINT == 32)
> > +#define INT_ADDU     add.w
> > +#define INT_ADDIU    addi.w
> > +#define INT_SUBU     sub.w
> > +#define INT_L                ld.w
> > +#define INT_S                st.w
> > +#define INT_SLL              slli.w
> > +#define INT_SLLV     sll.w
> > +#define INT_SRL              srli.w
> > +#define INT_SRLV     srl.w
> > +#define INT_SRA              srai.w
> > +#define INT_SRAV     sra.w
> Again, please don't carry MIPS names over.
> > +#endif
> > +
> > +#if (_LOONGARCH_SZINT == 64)
> > +#define INT_ADDU     add.d
> > +#define INT_ADDIU    addi.d
> > +#define INT_SUBU     sub.d
> > +#define INT_L                ld.d
> > +#define INT_S                st.d
> > +#define INT_SLL              slli.d
> > +#define INT_SLLV     sll.d
> > +#define INT_SRL              srli.d
> > +#define INT_SRLV     srl.d
> > +#define INT_SRA              srai.d
> > +#define INT_SRAV     sra.d
> > +#endif
> > +
> > +/*
> > + * How to add/sub/load/store/shift C long variables.
> > + */
> > +#if (_LOONGARCH_SZLONG == 32)
> > +#define LONG_ADDU    add.w
> > +#define LONG_ADDIU   addi.w
> > +#define LONG_SUBU    sub.w
> > +#define LONG_L               ld.w
> > +#define LONG_S               st.w
> > +#define LONG_SP              swp
> Is this a typo?
OK, this can be removed.

> > +#define LONG_SLL     slli.w
> > +#define LONG_SLLV    sll.w
> > +#define LONG_SRL     srli.w
> > +#define LONG_SRLV    srl.w
> > +#define LONG_SRA     srai.w
> > +#define LONG_SRAV    sra.w
> > +
> > +#ifdef __ASSEMBLY__
> > +#define LONG         .word
> > +#endif
> > +#define LONGSIZE     4
> > +#define LONGMASK     3
> > +#define LONGLOG              2
> > +#endif
> > +
> > +#if (_LOONGARCH_SZLONG == 64)
> > +#define LONG_ADDU    add.d
> > +#define LONG_ADDIU   addi.d
> > +#define LONG_SUBU    sub.d
> > +#define LONG_L               ld.d
> > +#define LONG_S               st.d
> > +#define LONG_SP              sdp
> > +#define LONG_SLL     slli.d
> > +#define LONG_SLLV    sll.d
> > +#define LONG_SRL     srli.d
> > +#define LONG_SRLV    srl.d
> > +#define LONG_SRA     srai.d
> > +#define LONG_SRAV    sra.d
> > +
> > +#ifdef __ASSEMBLY__
> > +#define LONG         .dword
> > +#endif
> > +#define LONGSIZE     8
> > +#define LONGMASK     7
> > +#define LONGLOG              3
> > +#endif
> > +
> > +/*
> > + * How to add/sub/load/store/shift pointers.
> > + */
> > +#if (_LOONGARCH_SZPTR == 32)
> > +#define PTR_ADDU     add.w
> > +#define PTR_ADDIU    addi.w
> > +#define PTR_SUBU     sub.w
> > +#define PTR_L                ld.w
> > +#define PTR_S                st.w
> > +#define PTR_LI               li.w
> > +#define PTR_SLL              slli.w
> > +#define PTR_SLLV     sll.w
> > +#define PTR_SRL              srli.w
> > +#define PTR_SRLV     srl.w
> > +#define PTR_SRA              srai.w
> > +#define PTR_SRAV     sra.w
> > +
> > +#define PTR_SCALESHIFT       2
> > +
> > +#define PTR          .word
> > +#define PTRSIZE              4
> > +#define PTRLOG               2
> > +#endif
> > +
> > +#if (_LOONGARCH_SZPTR == 64)
> > +#define PTR_ADDU     add.d
> > +#define PTR_ADDIU    addi.d
> > +#define PTR_SUBU     sub.d
> > +#define PTR_L                ld.d
> > +#define PTR_S                st.d
> > +#define PTR_LI               li.d
> > +#define PTR_SLL              slli.d
> > +#define PTR_SLLV     sll.d
> > +#define PTR_SRL              srli.d
> > +#define PTR_SRLV     srl.d
> > +#define PTR_SRA              srai.d
> > +#define PTR_SRAV     sra.d
> > +
> > +#define PTR_SCALESHIFT       3
> > +
> > +#define PTR          .dword
> > +#define PTRSIZE              8
> > +#define PTRLOG               3
> > +#endif
> > +
> > +#endif /* __ASM_ASM_H */
> > diff --git a/arch/loongarch/include/asm/asmmacro.h b/arch/loongarch/include/asm/asmmacro.h
> > new file mode 100644
> > index 000000000000..d7089fab00e1
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/asmmacro.h
> > @@ -0,0 +1,294 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_ASMMACRO_H
> > +#define _ASM_ASMMACRO_H
> > +
> > +#include <asm/asm-offsets.h>
> > +#include <asm/regdef.h>
> > +#include <asm/fpregdef.h>
> > +#include <asm/loongarch.h>
> > +
> > +#undef v0
> > +#undef v1
> > +
> > +     .macro  parse_v var val
> > +     \var    = \val
> > +     .endm
> > +
> > +     .macro  parse_r var r
> > +     \var    = -1
> > +     .ifc    \r, $r0
> > +     \var    = 0
> > +     .endif
> > +     .ifc    \r, $r1
> > +     \var    = 1
> > +     .endif
> > +     .ifc    \r, $r2
> > +     \var    = 2
> > +     .endif
> > +     .ifc    \r, $r3
> > +     \var    = 3
> > +     .endif
> > +     .ifc    \r, $r4
> > +     \var    = 4
> > +     .endif
> > +     .ifc    \r, $r5
> > +     \var    = 5
> > +     .endif
> > +     .ifc    \r, $r6
> > +     \var    = 6
> > +     .endif
> > +     .ifc    \r, $r7
> > +     \var    = 7
> > +     .endif
> > +     .ifc    \r, $r8
> > +     \var    = 8
> > +     .endif
> > +     .ifc    \r, $r9
> > +     \var    = 9
> > +     .endif
> > +     .ifc    \r, $r10
> > +     \var    = 10
> > +     .endif
> > +     .ifc    \r, $r11
> > +     \var    = 11
> > +     .endif
> > +     .ifc    \r, $r12
> > +     \var    = 12
> > +     .endif
> > +     .ifc    \r, $r13
> > +     \var    = 13
> > +     .endif
> > +     .ifc    \r, $r14
> > +     \var    = 14
> > +     .endif
> > +     .ifc    \r, $r15
> > +     \var    = 15
> > +     .endif
> > +     .ifc    \r, $r16
> > +     \var    = 16
> > +     .endif
> > +     .ifc    \r, $r17
> > +     \var    = 17
> > +     .endif
> > +     .ifc    \r, $r18
> > +     \var    = 18
> > +     .endif
> > +     .ifc    \r, $r19
> > +     \var    = 19
> > +     .endif
> > +     .ifc    \r, $r20
> > +     \var    = 20
> > +     .endif
> > +     .ifc    \r, $r21
> > +     \var    = 21
> > +     .endif
> > +     .ifc    \r, $r22
> > +     \var    = 22
> > +     .endif
> > +     .ifc    \r, $r23
> > +     \var    = 23
> > +     .endif
> > +     .ifc    \r, $r24
> > +     \var    = 24
> > +     .endif
> > +     .ifc    \r, $r25
> > +     \var    = 25
> > +     .endif
> > +     .ifc    \r, $r26
> > +     \var    = 26
> > +     .endif
> > +     .ifc    \r, $r27
> > +     \var    = 27
> > +     .endif
> > +     .ifc    \r, $r28
> > +     \var    = 28
> > +     .endif
> > +     .ifc    \r, $r29
> > +     \var    = 29
> > +     .endif
> > +     .ifc    \r, $r30
> > +     \var    = 30
> > +     .endif
> > +     .ifc    \r, $r31
> > +     \var    = 31
> > +     .endif
> > +     .iflt   \var
> > +     .error  "Unable to parse register name \r"
> > +     .endif
> > +     .endm
> > +
> > +     .macro  cpu_save_nonscratch thread
> > +     stptr.d s0, \thread, THREAD_REG23
> > +     stptr.d s1, \thread, THREAD_REG24
> > +     stptr.d s2, \thread, THREAD_REG25
> > +     stptr.d s3, \thread, THREAD_REG26
> > +     stptr.d s4, \thread, THREAD_REG27
> > +     stptr.d s5, \thread, THREAD_REG28
> > +     stptr.d s6, \thread, THREAD_REG29
> > +     stptr.d s7, \thread, THREAD_REG30
> > +     stptr.d s8, \thread, THREAD_REG31
> > +     stptr.d sp, \thread, THREAD_REG03
> > +     stptr.d fp, \thread, THREAD_REG22
> > +     .endm
> > +
> > +     .macro  cpu_restore_nonscratch thread
> > +     ldptr.d s0, \thread, THREAD_REG23
> > +     ldptr.d s1, \thread, THREAD_REG24
> > +     ldptr.d s2, \thread, THREAD_REG25
> > +     ldptr.d s3, \thread, THREAD_REG26
> > +     ldptr.d s4, \thread, THREAD_REG27
> > +     ldptr.d s5, \thread, THREAD_REG28
> > +     ldptr.d s6, \thread, THREAD_REG29
> > +     ldptr.d s7, \thread, THREAD_REG30
> > +     ldptr.d s8, \thread, THREAD_REG31
> > +     ldptr.d ra, \thread, THREAD_REG01
> > +     ldptr.d sp, \thread, THREAD_REG03
> > +     ldptr.d fp, \thread, THREAD_REG22
> > +     .endm
> > +
> > +     .macro fpu_save_csr thread tmp
> > +     movfcsr2gr      \tmp, fcsr0
> > +     stptr.w \tmp, \thread, THREAD_FCSR
> > +     .endm
> > +
> > +     .macro fpu_restore_csr thread tmp
> > +     ldptr.w \tmp, \thread, THREAD_FCSR
> > +     movgr2fcsr      fcsr0, \tmp
> > +     .endm
> > +
> > +     .macro fpu_save_cc thread tmp0 tmp1
> > +     movcf2gr        \tmp0, $fcc0
> > +     move    \tmp1, \tmp0
> > +     movcf2gr        \tmp0, $fcc1
> > +     bstrins.d       \tmp1, \tmp0, 15, 8
> > +     movcf2gr        \tmp0, $fcc2
> > +     bstrins.d       \tmp1, \tmp0, 23, 16
> > +     movcf2gr        \tmp0, $fcc3
> > +     bstrins.d       \tmp1, \tmp0, 31, 24
> > +     movcf2gr        \tmp0, $fcc4
> > +     bstrins.d       \tmp1, \tmp0, 39, 32
> > +     movcf2gr        \tmp0, $fcc5
> > +     bstrins.d       \tmp1, \tmp0, 47, 40
> > +     movcf2gr        \tmp0, $fcc6
> > +     bstrins.d       \tmp1, \tmp0, 55, 48
> > +     movcf2gr        \tmp0, $fcc7
> > +     bstrins.d       \tmp1, \tmp0, 63, 56
> > +     stptr.d         \tmp1, \thread, THREAD_FCC
> > +     .endm
> > +
> > +     .macro fpu_restore_cc thread tmp0 tmp1
> > +     ldptr.d \tmp0, \thread, THREAD_FCC
> > +     bstrpick.d      \tmp1, \tmp0, 7, 0
> > +     movgr2cf        $fcc0, \tmp1
> > +     bstrpick.d      \tmp1, \tmp0, 15, 8
> > +     movgr2cf        $fcc1, \tmp1
> > +     bstrpick.d      \tmp1, \tmp0, 23, 16
> > +     movgr2cf        $fcc2, \tmp1
> > +     bstrpick.d      \tmp1, \tmp0, 31, 24
> > +     movgr2cf        $fcc3, \tmp1
> > +     bstrpick.d      \tmp1, \tmp0, 39, 32
> > +     movgr2cf        $fcc4, \tmp1
> > +     bstrpick.d      \tmp1, \tmp0, 47, 40
> > +     movgr2cf        $fcc5, \tmp1
> > +     bstrpick.d      \tmp1, \tmp0, 55, 48
> > +     movgr2cf        $fcc6, \tmp1
> > +     bstrpick.d      \tmp1, \tmp0, 63, 56
> > +     movgr2cf        $fcc7, \tmp1
> > +     .endm
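
The two macros above pack the eight condition flags into one 64-bit
word, one byte per flag, and unpack them again. A C model of that
layout, assuming bstrins.d/bstrpick.d behave as plain bit-field insert
and extract (the function names below are hypothetical and exist only
for this sketch):

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set; width is 1..64. */
static inline uint64_t low_mask(unsigned width)
{
	return width == 64 ? ~0ULL : (1ULL << width) - 1;
}

/* Model of bstrins.d: insert the low bits of rj into rd[msb:lsb]. */
static inline uint64_t bstrins64(uint64_t rd, uint64_t rj,
				 unsigned msb, unsigned lsb)
{
	uint64_t mask = low_mask(msb - lsb + 1);

	return (rd & ~(mask << lsb)) | ((rj & mask) << lsb);
}

/* Model of bstrpick.d: extract rj[msb:lsb], zero-extended. */
static inline uint64_t bstrpick64(uint64_t rj, unsigned msb, unsigned lsb)
{
	return (rj >> lsb) & low_mask(msb - lsb + 1);
}

/* Pack eight flags, flag i landing in byte i of the result. */
static inline uint64_t pack_fcc(const uint8_t fcc[8])
{
	uint64_t w = fcc[0];

	for (unsigned i = 1; i < 8; i++)
		w = bstrins64(w, fcc[i], 8 * i + 7, 8 * i);
	return w;
}

/* Inverse of pack_fcc(). */
static inline void unpack_fcc(uint64_t w, uint8_t fcc[8])
{
	for (unsigned i = 0; i < 8; i++)
		fcc[i] = (uint8_t)bstrpick64(w, 8 * i + 7, 8 * i);
}
```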
> > +
> > +     .macro  fpu_save_double thread tmp
> > +     li.w    \tmp, THREAD_FPR0
> > +     PTR_ADDU \tmp, \tmp, \thread
> > +     fst.d   $f0, \tmp, THREAD_FPR0  - THREAD_FPR0
> > +     fst.d   $f1, \tmp, THREAD_FPR1  - THREAD_FPR0
> > +     fst.d   $f2, \tmp, THREAD_FPR2  - THREAD_FPR0
> > +     fst.d   $f3, \tmp, THREAD_FPR3  - THREAD_FPR0
> > +     fst.d   $f4, \tmp, THREAD_FPR4  - THREAD_FPR0
> > +     fst.d   $f5, \tmp, THREAD_FPR5  - THREAD_FPR0
> > +     fst.d   $f6, \tmp, THREAD_FPR6  - THREAD_FPR0
> > +     fst.d   $f7, \tmp, THREAD_FPR7  - THREAD_FPR0
> > +     fst.d   $f8, \tmp, THREAD_FPR8  - THREAD_FPR0
> > +     fst.d   $f9, \tmp, THREAD_FPR9  - THREAD_FPR0
> > +     fst.d   $f10, \tmp, THREAD_FPR10 - THREAD_FPR0
> > +     fst.d   $f11, \tmp, THREAD_FPR11 - THREAD_FPR0
> > +     fst.d   $f12, \tmp, THREAD_FPR12 - THREAD_FPR0
> > +     fst.d   $f13, \tmp, THREAD_FPR13 - THREAD_FPR0
> > +     fst.d   $f14, \tmp, THREAD_FPR14 - THREAD_FPR0
> > +     fst.d   $f15, \tmp, THREAD_FPR15 - THREAD_FPR0
> > +     fst.d   $f16, \tmp, THREAD_FPR16 - THREAD_FPR0
> > +     fst.d   $f17, \tmp, THREAD_FPR17 - THREAD_FPR0
> > +     fst.d   $f18, \tmp, THREAD_FPR18 - THREAD_FPR0
> > +     fst.d   $f19, \tmp, THREAD_FPR19 - THREAD_FPR0
> > +     fst.d   $f20, \tmp, THREAD_FPR20 - THREAD_FPR0
> > +     fst.d   $f21, \tmp, THREAD_FPR21 - THREAD_FPR0
> > +     fst.d   $f22, \tmp, THREAD_FPR22 - THREAD_FPR0
> > +     fst.d   $f23, \tmp, THREAD_FPR23 - THREAD_FPR0
> > +     fst.d   $f24, \tmp, THREAD_FPR24 - THREAD_FPR0
> > +     fst.d   $f25, \tmp, THREAD_FPR25 - THREAD_FPR0
> > +     fst.d   $f26, \tmp, THREAD_FPR26 - THREAD_FPR0
> > +     fst.d   $f27, \tmp, THREAD_FPR27 - THREAD_FPR0
> > +     fst.d   $f28, \tmp, THREAD_FPR28 - THREAD_FPR0
> > +     fst.d   $f29, \tmp, THREAD_FPR29 - THREAD_FPR0
> > +     fst.d   $f30, \tmp, THREAD_FPR30 - THREAD_FPR0
> > +     fst.d   $f31, \tmp, THREAD_FPR31 - THREAD_FPR0
> > +     .endm
> > +
> > +     .macro  fpu_restore_double thread tmp
> > +     li.w    \tmp, THREAD_FPR0
> > +     PTR_ADDU \tmp, \tmp, \thread
> > +     fld.d   $f0, \tmp, THREAD_FPR0  - THREAD_FPR0
> > +     fld.d   $f1, \tmp, THREAD_FPR1  - THREAD_FPR0
> > +     fld.d   $f2, \tmp, THREAD_FPR2  - THREAD_FPR0
> > +     fld.d   $f3, \tmp, THREAD_FPR3  - THREAD_FPR0
> > +     fld.d   $f4, \tmp, THREAD_FPR4  - THREAD_FPR0
> > +     fld.d   $f5, \tmp, THREAD_FPR5  - THREAD_FPR0
> > +     fld.d   $f6, \tmp, THREAD_FPR6  - THREAD_FPR0
> > +     fld.d   $f7, \tmp, THREAD_FPR7  - THREAD_FPR0
> > +     fld.d   $f8, \tmp, THREAD_FPR8  - THREAD_FPR0
> > +     fld.d   $f9, \tmp, THREAD_FPR9  - THREAD_FPR0
> > +     fld.d   $f10, \tmp, THREAD_FPR10 - THREAD_FPR0
> > +     fld.d   $f11, \tmp, THREAD_FPR11 - THREAD_FPR0
> > +     fld.d   $f12, \tmp, THREAD_FPR12 - THREAD_FPR0
> > +     fld.d   $f13, \tmp, THREAD_FPR13 - THREAD_FPR0
> > +     fld.d   $f14, \tmp, THREAD_FPR14 - THREAD_FPR0
> > +     fld.d   $f15, \tmp, THREAD_FPR15 - THREAD_FPR0
> > +     fld.d   $f16, \tmp, THREAD_FPR16 - THREAD_FPR0
> > +     fld.d   $f17, \tmp, THREAD_FPR17 - THREAD_FPR0
> > +     fld.d   $f18, \tmp, THREAD_FPR18 - THREAD_FPR0
> > +     fld.d   $f19, \tmp, THREAD_FPR19 - THREAD_FPR0
> > +     fld.d   $f20, \tmp, THREAD_FPR20 - THREAD_FPR0
> > +     fld.d   $f21, \tmp, THREAD_FPR21 - THREAD_FPR0
> > +     fld.d   $f22, \tmp, THREAD_FPR22 - THREAD_FPR0
> > +     fld.d   $f23, \tmp, THREAD_FPR23 - THREAD_FPR0
> > +     fld.d   $f24, \tmp, THREAD_FPR24 - THREAD_FPR0
> > +     fld.d   $f25, \tmp, THREAD_FPR25 - THREAD_FPR0
> > +     fld.d   $f26, \tmp, THREAD_FPR26 - THREAD_FPR0
> > +     fld.d   $f27, \tmp, THREAD_FPR27 - THREAD_FPR0
> > +     fld.d   $f28, \tmp, THREAD_FPR28 - THREAD_FPR0
> > +     fld.d   $f29, \tmp, THREAD_FPR29 - THREAD_FPR0
> > +     fld.d   $f30, \tmp, THREAD_FPR30 - THREAD_FPR0
> > +     fld.d   $f31, \tmp, THREAD_FPR31 - THREAD_FPR0
> > +     .endm
> > +
> > +.macro not dst src
> > +     nor     \dst, \src, zero
> > +.endm
> > +
> > +.macro bgt r0 r1 label
> > +     blt     \r1, \r0, \label
> > +.endm
> > +
> > +.macro bltz r0 label
> > +     blt     \r0, zero, \label
> > +.endm
> > +
> > +.macro bgez r0 label
> > +     bge     \r0, zero, \label
> > +.endm
> These are all supported in upstream binutils, so you can just remove them.
> > +
> > +#define v0 $r4
> > +#define v1 $r5
> If you removed every mention of v0 and v1, this would be unnecessary as
> well. ;-)
> > +#endif /* _ASM_ASMMACRO_H */
> > diff --git a/arch/loongarch/include/asm/clocksource.h b/arch/loongarch/include/asm/clocksource.h
> > new file mode 100644
> > index 000000000000..58e64aa05d26
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/clocksource.h
> > @@ -0,0 +1,12 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#ifndef __ASM_CLOCKSOURCE_H
> > +#define __ASM_CLOCKSOURCE_H
> > +
> > +#include <asm/vdso/clocksource.h>
> > +
> > +#endif /* __ASM_CLOCKSOURCE_H */
> > diff --git a/arch/loongarch/include/asm/compiler.h b/arch/loongarch/include/asm/compiler.h
> > new file mode 100644
> > index 000000000000..657cebe70ace
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/compiler.h
> > @@ -0,0 +1,15 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_COMPILER_H
> > +#define _ASM_COMPILER_H
> > +
> > +#define GCC_OFF_SMALL_ASM() "ZC"
> > +
> > +#define LOONGARCH_ISA_LEVEL "loongarch"
> > +#define LOONGARCH_ISA_ARCH_LEVEL "arch=loongarch"
> > +#define LOONGARCH_ISA_LEVEL_RAW loongarch
> Do these need updating? I remember "-march=loongarch" is an old-world thing.
> > +#define LOONGARCH_ISA_ARCH_LEVEL_RAW LOONGARCH_ISA_LEVEL_RAW
> > +
> > +#endif /* _ASM_COMPILER_H */
> > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > new file mode 100644
> > index 000000000000..46166ee1e33f
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/inst.h
> > @@ -0,0 +1,63 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_INST_H
> > +#define _ASM_INST_H
> > +
> > +#include <linux/types.h>
> > +#include <asm/asm.h>
> > +#include <uapi/asm/inst.h>
> > +
> > +#define ADDR_IMMMASK_LU52ID  0xFFF0000000000000
> > +#define ADDR_IMMMASK_LU32ID  0x000FFFFF00000000
> > +#define ADDR_IMMMASK_ADDU16ID        0x00000000FFFF0000
> > +
> > +#define ADDR_IMMSHIFT_LU52ID 52
> > +#define ADDR_IMMSHIFT_LU32ID 32
> > +#define ADDR_IMMSHIFT_ADDU16ID       16
> > +
> > +#define ADDR_IMM(addr, INSN) ((addr & ADDR_IMMMASK_##INSN) >> ADDR_IMMSHIFT_##INSN)
> > +
> > +enum loongarch_gpr {
> > +     LOONGARCH_GPR_ZERO = 0,
> > +     LOONGARCH_GPR_RA = 1,
> > +     LOONGARCH_GPR_TP = 2,
> > +     LOONGARCH_GPR_SP = 3,
> > +     LOONGARCH_GPR_A0 = 4,
> > +     LOONGARCH_GPR_A1,
> > +     LOONGARCH_GPR_A2,
> > +     LOONGARCH_GPR_A3,
> > +     LOONGARCH_GPR_A4,
> > +     LOONGARCH_GPR_A5,
> > +     LOONGARCH_GPR_A6,
> > +     LOONGARCH_GPR_A7,
> > +     LOONGARCH_GPR_V0 = 4,
> > +     LOONGARCH_GPR_V1 = 5,
> > +     LOONGARCH_GPR_T0 = 12,
> > +     LOONGARCH_GPR_T1,
> > +     LOONGARCH_GPR_T2,
> > +     LOONGARCH_GPR_T3,
> > +     LOONGARCH_GPR_T4,
> > +     LOONGARCH_GPR_T5,
> > +     LOONGARCH_GPR_T6,
> > +     LOONGARCH_GPR_T7,
> > +     LOONGARCH_GPR_T8,
> > +     LOONGARCH_GPR_FP = 22,
> > +     LOONGARCH_GPR_S0 = 23,
> > +     LOONGARCH_GPR_S1,
> > +     LOONGARCH_GPR_S2,
> > +     LOONGARCH_GPR_S3,
> > +     LOONGARCH_GPR_S4,
> > +     LOONGARCH_GPR_S5,
> > +     LOONGARCH_GPR_S6,
> > +     LOONGARCH_GPR_S7,
> > +     LOONGARCH_GPR_S8,
> > +     LOONGARCH_GPR_MAX
> > +};
> > +
> > +u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
> > +u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> > +u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, unsigned long pc, unsigned long dest);
> > +
> > +#endif /* _ASM_INST_H */
> > diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h
> > new file mode 100644
> > index 000000000000..283b3389b561
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/linkage.h
> > @@ -0,0 +1,36 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef __ASM_LINKAGE_H
> > +#define __ASM_LINKAGE_H
> > +
> > +#define __ALIGN              .align 2
> > +#define __ALIGN_STR  ".align 2"
> Just use __stringify(__ALIGN) where needed.
OK, thanks.
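
A sketch of how the string form can be derived instead of being spelled
out a second time. The MODEL_* macros are a local re-implementation for
this example; in the kernel, __stringify() comes from
<linux/stringify.h>.

```c
#include <assert.h>
#include <string.h>

/* Two levels are needed so the argument is expanded before '#' applies. */
#define MODEL_STRINGIFY_1(x)	#x
#define MODEL_STRINGIFY(x)	MODEL_STRINGIFY_1(x)

#define MODEL_ALIGN		.align 2
#define MODEL_ALIGN_STR		MODEL_STRINGIFY(MODEL_ALIGN)
```

With this, changing the alignment directive in one place updates both
the assembler and the string form.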

> > +
> > +#define SYM_FUNC_START(name)                         \
> > +     SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)      \
> > +     .cfi_startproc;
> > +
> > +#define SYM_FUNC_START_NOALIGN(name)                 \
> > +     SYM_START(name, SYM_L_GLOBAL, SYM_A_NONE)       \
> > +     .cfi_startproc;
> > +
> > +#define SYM_FUNC_START_LOCAL(name)                   \
> > +     SYM_START(name, SYM_L_LOCAL, SYM_A_ALIGN)       \
> > +     .cfi_startproc;
> > +
> > +#define SYM_FUNC_START_LOCAL_NOALIGN(name)           \
> > +     SYM_START(name, SYM_L_LOCAL, SYM_A_NONE)        \
> > +     .cfi_startproc;
> > +
> > +#define SYM_FUNC_START_WEAK(name)                    \
> > +     SYM_START(name, SYM_L_WEAK, SYM_A_ALIGN)        \
> > +     .cfi_startproc;
> > +
> > +#define SYM_FUNC_START_WEAK_NOALIGN(name)            \
> > +     SYM_START(name, SYM_L_WEAK, SYM_A_NONE)         \
> > +     .cfi_startproc;
> > +
> > +#define SYM_FUNC_END(name)                           \
> > +     .cfi_endproc;                                   \
> > +     SYM_END(name, SYM_T_FUNC)
> > +
> > +#endif
> > diff --git a/arch/loongarch/include/asm/perf_event.h b/arch/loongarch/include/asm/perf_event.h
> > new file mode 100644
> > index 000000000000..44293ec8c153
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/perf_event.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#ifndef __LOONGARCH_PERF_EVENT_H__
> > +#define __LOONGARCH_PERF_EVENT_H__
> > +/* Leave it empty here. The file is required by linux/perf_event.h */
> > +#endif /* __LOONGARCH_PERF_EVENT_H__ */
> > diff --git a/arch/loongarch/include/asm/prefetch.h b/arch/loongarch/include/asm/prefetch.h
> > new file mode 100644
> > index 000000000000..1672262a5e2e
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/prefetch.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __ASM_PREFETCH_H
> > +#define __ASM_PREFETCH_H
> > +
> > +#define Pref_Load    0
> > +#define Pref_Store   8
> > +
> > +#ifdef __ASSEMBLY__
> > +
> > +     .macro  __pref hint addr
> > +#ifdef CONFIG_CPU_HAS_PREFETCH
> > +     preld   \hint, \addr, 0
> > +#endif
> > +     .endm
> > +
> > +     .macro  pref_load addr
> > +     __pref  Pref_Load, \addr
> > +     .endm
> > +
> > +     .macro  pref_store addr
> > +     __pref  Pref_Store, \addr
> > +     .endm
> > +
> > +#endif
> > +
> > +#endif /* __ASM_PREFETCH_H */
> > diff --git a/arch/loongarch/include/asm/serial.h b/arch/loongarch/include/asm/serial.h
> > new file mode 100644
> > index 000000000000..3fb550eb9115
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/serial.h
> > @@ -0,0 +1,11 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __ASM__SERIAL_H
> > +#define __ASM__SERIAL_H
> > +
> > +#define BASE_BAUD 0
> > +#define STD_COM_FLAGS (ASYNC_BOOT_AUTOCONF | ASYNC_SKIP_TEST)
> > +
> > +#endif /* __ASM__SERIAL_H */
> > diff --git a/arch/loongarch/include/asm/time.h b/arch/loongarch/include/asm/time.h
> > new file mode 100644
> > index 000000000000..ace1665695b8
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/time.h
> > @@ -0,0 +1,50 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_TIME_H
> > +#define _ASM_TIME_H
> > +
> > +#include <linux/clockchips.h>
> > +#include <linux/clocksource.h>
> > +#include <asm/loongarch.h>
> > +
> > +extern u64 cpu_clock_freq;
> > +extern u64 const_clock_freq;
> > +
> > +extern void sync_counter(void);
> > +
> > +static inline unsigned int calc_const_freq(void)
> > +{
> > +     unsigned int res;
> > +     unsigned int base_freq;
> > +     unsigned int cfm, cfd;
> > +
> > +     res = read_cpucfg(LOONGARCH_CPUCFG2);
> > +     if (!(res & CPUCFG2_LLFTP))
> > +             return 0;
> > +
> > +     base_freq = read_cpucfg(LOONGARCH_CPUCFG4);
> > +     res = read_cpucfg(LOONGARCH_CPUCFG5);
> > +     cfm = res & 0xffff;
> > +     cfd = (res >> 16) & 0xffff;
> > +
> > +     if (!base_freq || !cfm || !cfd)
> > +             return 0;
> > +     else
> > +             return (base_freq * cfm / cfd);
> No need for the "else" here.
> > +}
> > +
> > +/*
> > + * Initialize the calling CPU's timer interrupt as clockevent device
> > + */
> > +extern int constant_clockevent_init(void);
> > +extern int constant_clocksource_init(void);
> > +
> > +static inline void clockevent_set_clock(struct clock_event_device *cd,
> > +                                     unsigned int clock)
> > +{
> > +     clockevents_calc_mult_shift(cd, clock, 4);
> > +}
> > +
> > +#endif /* _ASM_TIME_H */
> > diff --git a/arch/loongarch/include/asm/timex.h b/arch/loongarch/include/asm/timex.h
> > new file mode 100644
> > index 000000000000..3f8db082f00d
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/timex.h
> > @@ -0,0 +1,31 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_TIMEX_H
> > +#define _ASM_TIMEX_H
> > +
> > +#ifdef __KERNEL__
> > +
> > +#include <linux/compiler.h>
> > +
> > +#include <asm/cpu.h>
> > +#include <asm/cpu-features.h>
> > +
> > +/*
> > + * Standard way to access the cycle counter.
> > + * Currently only used on SMP for scheduling.
> > + *
> > + * We know that all SMP capable CPUs have cycle counters.
> > + */
> > +
> > +typedef unsigned long cycles_t;
> > +
> > +static inline cycles_t get_cycles(void)
> > +{
> > +     return drdtime();
> > +}
> > +
> > +#endif /* __KERNEL__ */
> > +
> > +#endif /*  _ASM_TIMEX_H */
> > diff --git a/arch/loongarch/include/asm/topology.h b/arch/loongarch/include/asm/topology.h
> > new file mode 100644
> > index 000000000000..9ac71a25207a
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/topology.h
> > @@ -0,0 +1,15 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __ASM_TOPOLOGY_H
> > +#define __ASM_TOPOLOGY_H
> > +
> > +#include <linux/smp.h>
> > +
> > +#define cpu_logical_map(cpu)  0
> > +
> > +#include <asm-generic/topology.h>
> > +
> > +static inline void arch_fix_phys_package_id(int num, u32 slot) { }
> > +#endif /* __ASM_TOPOLOGY_H */
> > diff --git a/arch/loongarch/include/asm/types.h b/arch/loongarch/include/asm/types.h
> > new file mode 100644
> > index 000000000000..f783cf11ea52
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/types.h
> > @@ -0,0 +1,33 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_TYPES_H
> > +#define _ASM_TYPES_H
> > +
> > +#include <asm-generic/int-ll64.h>
> > +#include <uapi/asm/types.h>
> > +
> > +/*
> > + * The following macros are especially useful for __asm__
> > + * inline assembler.
> > + */
> > +#ifndef __STR
> > +#define __STR(x) #x
> > +#endif
> > +#ifndef STR
> > +#define STR(x) __STR(x)
> > +#endif
> Again, just use __stringify from <linux/stringify.h> where appropriate.
> > +
> > +/*
> > + *  Configure language
> > + */
> > +#ifdef __ASSEMBLY__
> > +#define _ULCAST_
> > +#define _U64CAST_
> > +#else
> > +#define _ULCAST_ (unsigned long)
> > +#define _U64CAST_ (u64)
> > +#endif
> > +
> > +#endif /* _ASM_TYPES_H */
> > diff --git a/arch/loongarch/include/uapi/asm/bitfield.h b/arch/loongarch/include/uapi/asm/bitfield.h
> > new file mode 100644
> > index 000000000000..e31a719b7007
> > --- /dev/null
> > +++ b/arch/loongarch/include/uapi/asm/bitfield.h
> > @@ -0,0 +1,15 @@
> > +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> > +/*
> > + * Author: Hanlu Li <lihanlu@loongson.cn>
> > + *         Huacai Chen <chenhuacai@loongson.cn>
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __UAPI_ASM_BITFIELD_H
> > +#define __UAPI_ASM_BITFIELD_H
> > +
> > +#define __BITFIELD_FIELD(field, more)                                        \
> > +     more                                                            \
> > +     field;
> > +
> > +#endif /* __UAPI_ASM_BITFIELD_H */
> > diff --git a/arch/loongarch/include/uapi/asm/bitsperlong.h b/arch/loongarch/include/uapi/asm/bitsperlong.h
> > new file mode 100644
> > index 000000000000..5c2c8779a695
> > --- /dev/null
> > +++ b/arch/loongarch/include/uapi/asm/bitsperlong.h
> > @@ -0,0 +1,9 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +#ifndef __ASM_LOONGARCH_BITSPERLONG_H
> > +#define __ASM_LOONGARCH_BITSPERLONG_H
> > +
> > +#define __BITS_PER_LONG _LOONGARCH_SZLONG
> Use __loongarch_grlen instead of the MIPS-like symbol, or
> __SIZEOF_LONG__ * 8.
OK, thanks.
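
A small sketch of the second option, assuming a GCC- or Clang-compatible
compiler that predefines __SIZEOF_LONG__ and a target with 8-bit bytes.
EXAMPLE_BITS_PER_LONG and bits_per_long_is_consistent() are hypothetical
names for illustration; the header itself would keep defining
__BITS_PER_LONG.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for __BITS_PER_LONG, derived from the compiler. */
#define EXAMPLE_BITS_PER_LONG	(__SIZEOF_LONG__ * 8)

/* Cross-check the predefined macro against sizeof(long) at run time. */
static int bits_per_long_is_consistent(void)
{
	return EXAMPLE_BITS_PER_LONG == (int)(sizeof(long) * CHAR_BIT);
}
```

Either spelling avoids depending on the MIPS-style _LOONGARCH_SZLONG symbol.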

> > +
> > +#include <asm-generic/bitsperlong.h>
> > +
> > +#endif /* __ASM_LOONGARCH_BITSPERLONG_H */
> > diff --git a/arch/loongarch/include/uapi/asm/byteorder.h b/arch/loongarch/include/uapi/asm/byteorder.h
> > new file mode 100644
> > index 000000000000..b1722d890deb
> > --- /dev/null
> > +++ b/arch/loongarch/include/uapi/asm/byteorder.h
> > @@ -0,0 +1,13 @@
> > +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> > +/*
> > + * Author: Hanlu Li <lihanlu@loongson.cn>
> > + *         Huacai Chen <chenhuacai@loongson.cn>
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_BYTEORDER_H
> > +#define _ASM_BYTEORDER_H
> > +
> > +#include <linux/byteorder/little_endian.h>
> > +
> > +#endif /* _ASM_BYTEORDER_H */
> > diff --git a/arch/loongarch/include/uapi/asm/inst.h b/arch/loongarch/include/uapi/asm/inst.h
> File is named "inst" while talking all about "insn"... and do we even
> need this in the UAPI? We have to act quick before merging, or this
> unfortunate legacy from MIPS would continue to plague us.
If possible, I will remove the UAPI file.

Huacai
> > new file mode 100644
> > index 000000000000..fa00cc5ede9d
> > --- /dev/null
> > +++ b/arch/loongarch/include/uapi/asm/inst.h
> > @@ -0,0 +1,57 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + * Format of an instruction in memory.
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _UAPI_ASM_INST_H
> > +#define _UAPI_ASM_INST_H
> > +
> > +#include <asm/bitfield.h>
> > +
> > +enum reg1i20_op {
> > +     lu12iw_op       = 0x0a,
> > +     lu32id_op       = 0x0b,
> > +};
> > +
> > +enum reg2i12_op {
> > +     lu52id_op       = 0x0c,
> > +};
> > +
> > +enum reg2i16_op {
> > +     jirl_op         = 0x13,
> > +};
> > +
> > +struct reg1i20_format {
> > +     __BITFIELD_FIELD(unsigned int opcode : 7,
> > +     __BITFIELD_FIELD(unsigned int simmediate : 20,
> > +     __BITFIELD_FIELD(unsigned int rd : 5,
> > +     ;)))
> > +};
> > +
> > +struct reg2i12_format {
> > +     __BITFIELD_FIELD(unsigned int opcode : 10,
> > +     __BITFIELD_FIELD(signed int simmediate : 12,
> > +     __BITFIELD_FIELD(unsigned int rj : 5,
> > +     __BITFIELD_FIELD(unsigned int rd : 5,
> > +     ;))))
> > +};
> > +
> > +struct reg2i16_format {
> > +     __BITFIELD_FIELD(unsigned int opcode : 6,
> > +     __BITFIELD_FIELD(unsigned int simmediate : 16,
> > +     __BITFIELD_FIELD(unsigned int rj : 5,
> > +     __BITFIELD_FIELD(unsigned int rd : 5,
> > +     ;))))
> > +};
> > +
> > +union loongarch_instruction {
> > +     unsigned int word;
> > +     struct reg1i20_format reg1i20_format;
> > +     struct reg2i12_format reg2i12_format;
> > +     struct reg2i16_format reg2i16_format;
>
> So the official names for instruction formats are like "2RI12" and
> "2RI16", while "1RI20" is not even one of the 9 "basic" formats... You
> may call the formats "fmt_XriXX", or better yet, use the
> loongarch-opcodes scheme [1], which is better defined than the official
> one (the result would be something like "struct fmt_djsk12 djsk12;").
>
> Also while at it you could shorten "instruction" into "insn" like almost
> everywhere else (e.g. a few lines below).
>
> [1]: https://github.com/loongson-community/loongarch-opcodes
>
> > +};
> > +
> > +#define LOONGARCH_INSN_SIZE  sizeof(union loongarch_instruction)
> > +
> > +#endif /* _UAPI_ASM_INST_H */
> > diff --git a/arch/loongarch/include/uapi/asm/reg.h b/arch/loongarch/include/uapi/asm/reg.h
> > new file mode 100644
> > index 000000000000..90ad910c60eb
> > --- /dev/null
> > +++ b/arch/loongarch/include/uapi/asm/reg.h
> > @@ -0,0 +1,59 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + * Various register offset definitions for debuggers, core file
> > + * examiners and whatnot.
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#ifndef __UAPI_ASM_LOONGARCH_REG_H
> > +#define __UAPI_ASM_LOONGARCH_REG_H
> > +
> > +#define LOONGARCH_EF_R0              0
> > +#define LOONGARCH_EF_R1              1
> > +#define LOONGARCH_EF_R2              2
> > +#define LOONGARCH_EF_R3              3
> > +#define LOONGARCH_EF_R4              4
> > +#define LOONGARCH_EF_R5              5
> > +#define LOONGARCH_EF_R6              6
> > +#define LOONGARCH_EF_R7              7
> > +#define LOONGARCH_EF_R8              8
> > +#define LOONGARCH_EF_R9              9
> > +#define LOONGARCH_EF_R10     10
> > +#define LOONGARCH_EF_R11     11
> > +#define LOONGARCH_EF_R12     12
> > +#define LOONGARCH_EF_R13     13
> > +#define LOONGARCH_EF_R14     14
> > +#define LOONGARCH_EF_R15     15
> > +#define LOONGARCH_EF_R16     16
> > +#define LOONGARCH_EF_R17     17
> > +#define LOONGARCH_EF_R18     18
> > +#define LOONGARCH_EF_R19     19
> > +#define LOONGARCH_EF_R20     20
> > +#define LOONGARCH_EF_R21     21
> > +#define LOONGARCH_EF_R22     22
> > +#define LOONGARCH_EF_R23     23
> > +#define LOONGARCH_EF_R24     24
> > +#define LOONGARCH_EF_R25     25
> > +#define LOONGARCH_EF_R26     26
> > +#define LOONGARCH_EF_R27     27
> > +#define LOONGARCH_EF_R28     28
> > +#define LOONGARCH_EF_R29     29
> > +#define LOONGARCH_EF_R30     30
> > +#define LOONGARCH_EF_R31     31
> > +
> > +/*
> > + * Saved special registers
> > + */
> > +#define LOONGARCH_EF_ORIG_A0 32
> > +#define LOONGARCH_EF_CSR_ERA 33
> > +#define LOONGARCH_EF_CSR_BADV        34
> > +#define LOONGARCH_EF_CSR_CRMD        35
> > +#define LOONGARCH_EF_CSR_PRMD        36
> > +#define LOONGARCH_EF_CSR_EUEN        37
> > +#define LOONGARCH_EF_CSR_ECFG        38
> > +#define LOONGARCH_EF_CSR_ESTAT       39
> > +
> > +#define LOONGARCH_EF_SIZE    320     /* size in bytes */
> > +
> > +#endif /* __UAPI_ASM_LOONGARCH_REG_H */
> > diff --git a/tools/include/uapi/asm/bitsperlong.h b/tools/include/uapi/asm/bitsperlong.h
> > index edba4d93e9e6..da5206517158 100644
> > --- a/tools/include/uapi/asm/bitsperlong.h
> > +++ b/tools/include/uapi/asm/bitsperlong.h
> > @@ -17,6 +17,8 @@
> >   #include "../../../arch/riscv/include/uapi/asm/bitsperlong.h"
> >   #elif defined(__alpha__)
> >   #include "../../../arch/alpha/include/uapi/asm/bitsperlong.h"
> > +#elif defined(__loongarch__)
> > +#include "../../../arch/loongarch/include/uapi/asm/bitsperlong.h"
> >   #else
> >   #include <asm-generic/bitsperlong.h>
> >   #endif


* Re: [PATCH V9 03/24] LoongArch: Add elf-related definitions
  2022-05-01  9:41   ` WANG Xuerui
@ 2022-05-01 14:27     ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01 14:27 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Jiaxun Yang

Hi, Xuerui,

On Sun, May 1, 2022 at 5:41 PM WANG Xuerui <kernel@xen0n.name> wrote:
>
> Hi,
>
> Commit message title could be "ELF" -- proper capitalization.
OK, thanks.

Huacai
>
> On 4/30/22 17:04, Huacai Chen wrote:
> > Add elf-related definitions for LoongArch, including: EM_LOONGARCH,
> > KEXEC_ARCH_LOONGARCH, AUDIT_ARCH_LOONGARCH32, AUDIT_ARCH_LOONGARCH64
> > and NT_LOONGARCH_*.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >   include/uapi/linux/audit.h  | 2 ++
> >   include/uapi/linux/elf-em.h | 1 +
> >   include/uapi/linux/elf.h    | 5 +++++
> >   include/uapi/linux/kexec.h  | 1 +
> >   scripts/sorttable.c         | 5 +++++
> >   5 files changed, 14 insertions(+)
> >
> > diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> > index 8eda133ca4c1..7c1dc818b1d5 100644
> > --- a/include/uapi/linux/audit.h
> > +++ b/include/uapi/linux/audit.h
> > @@ -439,6 +439,8 @@ enum {
> >   #define AUDIT_ARCH_UNICORE  (EM_UNICORE|__AUDIT_ARCH_LE)
> >   #define AUDIT_ARCH_X86_64   (EM_X86_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
> >   #define AUDIT_ARCH_XTENSA   (EM_XTENSA)
> > +#define AUDIT_ARCH_LOONGARCH32       (EM_LOONGARCH|__AUDIT_ARCH_LE)
> > +#define AUDIT_ARCH_LOONGARCH64       (EM_LOONGARCH|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
> >
> >   #define AUDIT_PERM_EXEC             1
> >   #define AUDIT_PERM_WRITE    2
> > diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h
> > index f47e853546fa..ef38c2bc5ab7 100644
> > --- a/include/uapi/linux/elf-em.h
> > +++ b/include/uapi/linux/elf-em.h
> > @@ -51,6 +51,7 @@
> >   #define EM_RISCV    243     /* RISC-V */
> >   #define EM_BPF              247     /* Linux BPF - in-kernel virtual machine */
> >   #define EM_CSKY             252     /* C-SKY */
> > +#define EM_LOONGARCH 258     /* LoongArch */
> >   #define EM_FRV              0x5441  /* Fujitsu FR-V */
> >
> >   /*
> > diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
> > index 7ce993e6786c..1e0ae3f554f6 100644
> > --- a/include/uapi/linux/elf.h
> > +++ b/include/uapi/linux/elf.h
> > @@ -436,6 +436,11 @@ typedef struct elf64_shdr {
> >   #define NT_MIPS_DSP 0x800           /* MIPS DSP ASE registers */
> >   #define NT_MIPS_FP_MODE     0x801           /* MIPS floating-point mode */
> >   #define NT_MIPS_MSA 0x802           /* MIPS SIMD registers */
> > +#define NT_LOONGARCH_CPUCFG  0xa00   /* LoongArch CPU config registers */
> > +#define NT_LOONGARCH_CSR     0xa01   /* LoongArch control and status registers */
> > +#define NT_LOONGARCH_LSX     0xa02   /* LoongArch Loongson SIMD Extension registers */
> > +#define NT_LOONGARCH_LASX    0xa03   /* LoongArch Loongson Advanced SIMD Extension registers */
> > +#define NT_LOONGARCH_LBT     0xa04   /* LoongArch Loongson Binary Translation registers */
> These are named NT_LARCH_* in binutils source, better keep consistent?
> >
> >   /* Note types with note name "GNU" */
> >   #define NT_GNU_PROPERTY_TYPE_0      5
> > diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> > index fb7e2ef60825..981016e05cfa 100644
> > --- a/include/uapi/linux/kexec.h
> > +++ b/include/uapi/linux/kexec.h
> > @@ -43,6 +43,7 @@
> >   #define KEXEC_ARCH_MIPS    ( 8 << 16)
> >   #define KEXEC_ARCH_AARCH64 (183 << 16)
> >   #define KEXEC_ARCH_RISCV   (243 << 16)
> > +#define KEXEC_ARCH_LOONGARCH (258 << 16)
> >
> >   /* The artificial cap on the number of segments passed to kexec_load. */
> >   #define KEXEC_SEGMENT_MAX 16
> > diff --git a/scripts/sorttable.c b/scripts/sorttable.c
> > index d00504c5f530..fba40e99f354 100644
> > --- a/scripts/sorttable.c
> > +++ b/scripts/sorttable.c
> > @@ -60,6 +60,10 @@
> >   #define EM_RISCV    243
> >   #endif
> >
> > +#ifndef EM_LOONGARCH
> > +#define EM_LOONGARCH 258
> > +#endif
> > +
> >   static uint32_t (*r)(const uint32_t *);
> >   static uint16_t (*r2)(const uint16_t *);
> >   static uint64_t (*r8)(const uint64_t *);
> > @@ -313,6 +317,7 @@ static int do_file(char const *const fname, void *addr)
> >       case EM_ARCOMPACT:
> >       case EM_ARCV2:
> >       case EM_ARM:
> > +     case EM_LOONGARCH:
> >       case EM_MICROBLAZE:
> >       case EM_MIPS:
> >       case EM_XTENSA:


* Re: [PATCH V9 06/24] LoongArch: Add CPU definition headers
  2022-05-01 11:05   ` WANG Xuerui
@ 2022-05-01 15:07     ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-01 15:07 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Jiaxun Yang

Hi, Xuerui,

On Sun, May 1, 2022 at 7:05 PM WANG Xuerui <kernel@xen0n.name> wrote:
>
>
> On 4/30/22 17:05, Huacai Chen wrote:
> > This patch adds common headers (CPU definition and address space layout)
> > for basic LoongArch support.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >   arch/loongarch/include/asm/addrspace.h    |  110 ++
> >   arch/loongarch/include/asm/cpu-features.h |   69 +
> >   arch/loongarch/include/asm/cpu-info.h     |  136 ++
> >   arch/loongarch/include/asm/cpu.h          |  127 ++
> >   arch/loongarch/include/asm/fpregdef.h     |   49 +
> >   arch/loongarch/include/asm/loongarch.h    | 1528 +++++++++++++++++++++
> >   arch/loongarch/include/asm/loongson.h     |  159 +++
> >   arch/loongarch/include/asm/regdef.h       |   43 +
> >   8 files changed, 2221 insertions(+)
> >   create mode 100644 arch/loongarch/include/asm/addrspace.h
> >   create mode 100644 arch/loongarch/include/asm/cpu-features.h
> >   create mode 100644 arch/loongarch/include/asm/cpu-info.h
> >   create mode 100644 arch/loongarch/include/asm/cpu.h
> >   create mode 100644 arch/loongarch/include/asm/fpregdef.h
> >   create mode 100644 arch/loongarch/include/asm/loongarch.h
> >   create mode 100644 arch/loongarch/include/asm/loongson.h
> >   create mode 100644 arch/loongarch/include/asm/regdef.h
> >
> > diff --git a/arch/loongarch/include/asm/addrspace.h b/arch/loongarch/include/asm/addrspace.h
> > new file mode 100644
> > index 000000000000..e92541629d25
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/addrspace.h
> > @@ -0,0 +1,110 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> This file obviously comes from the MIPS asm/addrspace.h, with visible
> similarities, so you should add attribution here.
> > + */
> > +#ifndef _ASM_ADDRSPACE_H
> > +#define _ASM_ADDRSPACE_H
> > +
> > +#include <linux/const.h>
> > +
> > +#include <asm/loongarch.h>
> > +
> > +/*
> > + * This gives the physical RAM offset.
> > + */
> > +#ifndef __ASSEMBLY__
> > +#ifndef PHYS_OFFSET
> > +#define PHYS_OFFSET  _AC(0, UL)
> > +#endif
> > +extern unsigned long vm_map_base;
> > +#endif /* __ASSEMBLY__ */
> > +
> > +#ifndef IO_BASE
> > +#define IO_BASE                      CSR_DMW0_BASE
> > +#endif
> > +
> > +#ifndef CAC_BASE
> > +#define CAC_BASE             CSR_DMW1_BASE
> Could use something less terse than the MIPS name... "CACHED_BASE"
> sounds a lot better while only costing a few more keystrokes.
I will use CACHE_BASE and UNCACHE_BASE.
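
To make the renamed macros concrete, here is a rough user-space sketch of
the address arithmetic in the quoted header. The two window bases are
assumed values chosen for illustration (the patch takes them from
CSR_DMW0_BASE and CSR_DMW1_BASE in asm/loongarch.h), and the EX_* names and
helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Taken from the quoted addrspace.h. */
#define DMW_PABITS	48
#define TO_PHYS_MASK	((1ULL << DMW_PABITS) - 1)

/* Assumed direct-mapped window bases, for illustration only. */
#define EX_UNCACHE_BASE	0x8000000000000000ULL
#define EX_CACHE_BASE	0x9000000000000000ULL

/* Strip the window bits, leaving the low 48-bit physical address. */
static uint64_t to_phys(uint64_t x)
{
	return x & TO_PHYS_MASK;
}

/* Map an address into the cached window. */
static uint64_t to_cache(uint64_t x)
{
	return EX_CACHE_BASE | (x & TO_PHYS_MASK);
}

/* Map an address into the uncached window. */
static uint64_t to_uncache(uint64_t x)
{
	return EX_UNCACHE_BASE | (x & TO_PHYS_MASK);
}
```

Because every helper masks with TO_PHYS_MASK first, converting between the
two windows is idempotent and never leaks the old window bits.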

> > +#endif
> > +
> > +#ifndef UNCAC_BASE
> > +#define UNCAC_BASE           CSR_DMW0_BASE
> > +#endif
> > +
> > +#define DMW_PABITS   48
> > +#define TO_PHYS_MASK ((1ULL << DMW_PABITS) - 1)
> > +
> > +/*
> > + * Memory above this physical address will be considered highmem.
> > + */
> > +#ifndef HIGHMEM_START
> > +#define HIGHMEM_START                (_AC(1, UL) << _AC(DMW_PABITS, UL))
> > +#endif
> > +
> > +#define TO_PHYS(x)           (             ((x) & TO_PHYS_MASK))
> > +#define TO_CAC(x)            (CAC_BASE   | ((x) & TO_PHYS_MASK))
> > +#define TO_UNCAC(x)          (UNCAC_BASE | ((x) & TO_PHYS_MASK))
> > +
> > +/*
> > + * This handles the memory map.
> > + */
> > +#ifndef PAGE_OFFSET
> > +#define PAGE_OFFSET          (CAC_BASE + PHYS_OFFSET)
> > +#endif
> > +
> > +#ifndef FIXADDR_TOP
> > +#define FIXADDR_TOP          ((unsigned long)(long)(int)0xfffe0000)
> > +#endif
> > +
> > +/*
> > + *  Configure language
> What's a "configure language"? This seems to be carried over from MIPS
> too, better clarify a bit...
> > + */
> > +#ifdef __ASSEMBLY__
> > +#define _ATYPE_
> > +#define _ATYPE32_
> > +#define _ATYPE64_
> > +#define _CONST64_(x) x
> > +#else
> > +#define _ATYPE_              __PTRDIFF_TYPE__
> > +#define _ATYPE32_    int
> > +#define _ATYPE64_    __s64
> > +#ifdef CONFIG_64BIT
> > +#define _CONST64_(x) x ## L
> > +#else
> > +#define _CONST64_(x) x ## LL
> > +#endif
> > +#endif
> > +
> > +/*
> > + *  32/64-bit LoongArch address spaces
> > + */
> > +#ifdef __ASSEMBLY__
> > +#define _ACAST32_
> > +#define _ACAST64_
> > +#else
> > +#define _ACAST32_            (_ATYPE_)(_ATYPE32_)    /* widen if necessary */
> > +#define _ACAST64_            (_ATYPE64_)             /* do _not_ narrow */
> > +#endif
> > +
> > +#ifdef CONFIG_32BIT
> > +
> > +#define UVRANGE                      0x00000000
> > +#define KPRANGE0             0x80000000
> > +#define KPRANGE1             0xa0000000
> > +#define KVRANGE                      0xc0000000
> > +
> > +#else
> > +
> > +#define XUVRANGE             _CONST64_(0x0000000000000000)
> > +#define XSPRANGE             _CONST64_(0x4000000000000000)
> > +#define XKPRANGE             _CONST64_(0x8000000000000000)
> > +#define XKVRANGE             _CONST64_(0xc000000000000000)
> > +
> > +#endif
> > +
> > +/*
> > + * Returns the physical address of a KPRANGEx / XKPRANGE address
> > + */
> > +#define PHYSADDR(a)          ((_ACAST64_(a)) & TO_PHYS_MASK)
> > +
> > +#endif /* _ASM_ADDRSPACE_H */
> > diff --git a/arch/loongarch/include/asm/cpu-features.h b/arch/loongarch/include/asm/cpu-features.h
> > new file mode 100644
> > index 000000000000..e29d446112e8
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/cpu-features.h
> > @@ -0,0 +1,69 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> This is also awfully similar to the MIPS header with the same name. Add
> attribution... I won't repeat myself for the other files out there, but
> make sure all derivative works are appropriately marked!
> > + */
> > +#ifndef __ASM_CPU_FEATURES_H
> > +#define __ASM_CPU_FEATURES_H
> > +
> > +#include <asm/cpu.h>
> > +#include <asm/cpu-info.h>
> > +
> > +#define cpu_opt(opt)                 (cpu_data[0].options & (opt))
> > +#define cpu_has(feat)                        (cpu_data[0].options & BIT_ULL(feat))
> > +
> > +#define cpu_has_loongarch            (cpu_has_loongarch32 | cpu_has_loongarch64)
> > +#define cpu_has_loongarch32          (cpu_data[0].isa_level & LOONGARCH_CPU_ISA_32BIT)
> > +#define cpu_has_loongarch64          (cpu_data[0].isa_level & LOONGARCH_CPU_ISA_64BIT)
> > +
> > +#define cpu_icache_line_size()               cpu_data[0].icache.linesz
> > +#define cpu_dcache_line_size()               cpu_data[0].dcache.linesz
> > +#define cpu_vcache_line_size()               cpu_data[0].vcache.linesz
> > +#define cpu_scache_line_size()               cpu_data[0].scache.linesz
> > +
> > +#ifdef CONFIG_32BIT
> > +# define cpu_has_64bits                      (cpu_data[0].isa_level & LOONGARCH_CPU_ISA_64BIT)
> > +# define cpu_vabits                  31
> > +# define cpu_pabits                  31
> > +#endif
> > +
> > +#ifdef CONFIG_64BIT
> > +# define cpu_has_64bits                      1
> > +# define cpu_vabits                  cpu_data[0].vabits
> > +# define cpu_pabits                  cpu_data[0].pabits
> > +# define __NEED_ADDRBITS_PROBE
> > +#endif
> > +
> > +/*
> > + * SMP assumption: Options of CPU 0 are a superset of all processors.
> > + * This is true for all known LoongArch systems.
> > + */
> > +#define cpu_has_cpucfg               cpu_opt(LOONGARCH_CPU_CPUCFG)
> > +#define cpu_has_lam          cpu_opt(LOONGARCH_CPU_LAM)
> > +#define cpu_has_ual          cpu_opt(LOONGARCH_CPU_UAL)
> > +#define cpu_has_fpu          cpu_opt(LOONGARCH_CPU_FPU)
> > +#define cpu_has_lsx          cpu_opt(LOONGARCH_CPU_LSX)
> > +#define cpu_has_lasx         cpu_opt(LOONGARCH_CPU_LASX)
> > +#define cpu_has_complex              cpu_opt(LOONGARCH_CPU_COMPLEX)
> > +#define cpu_has_crypto               cpu_opt(LOONGARCH_CPU_CRYPTO)
> > +#define cpu_has_lvz          cpu_opt(LOONGARCH_CPU_LVZ)
> > +#define cpu_has_lbt_x86              cpu_opt(LOONGARCH_CPU_LBT_X86)
> > +#define cpu_has_lbt_arm              cpu_opt(LOONGARCH_CPU_LBT_ARM)
> > +#define cpu_has_lbt_mips     cpu_opt(LOONGARCH_CPU_LBT_MIPS)
> > +#define cpu_has_lbt          (cpu_has_lbt_x86|cpu_has_lbt_arm|cpu_has_lbt_mips)
> > +#define cpu_has_csr          cpu_opt(LOONGARCH_CPU_CSR)
> > +#define cpu_has_tlb          cpu_opt(LOONGARCH_CPU_TLB)
> > +#define cpu_has_watch                cpu_opt(LOONGARCH_CPU_WATCH)
> > +#define cpu_has_vint         cpu_opt(LOONGARCH_CPU_VINT)
> > +#define cpu_has_csripi               cpu_opt(LOONGARCH_CPU_CSRIPI)
> > +#define cpu_has_extioi               cpu_opt(LOONGARCH_CPU_EXTIOI)
> > +#define cpu_has_prefetch     cpu_opt(LOONGARCH_CPU_PREFETCH)
> > +#define cpu_has_pmp          cpu_opt(LOONGARCH_CPU_PMP)
> > +#define cpu_has_perf         cpu_opt(LOONGARCH_CPU_PMP)
> > +#define cpu_has_scalefreq    cpu_opt(LOONGARCH_CPU_SCALEFREQ)
> > +#define cpu_has_flatmode     cpu_opt(LOONGARCH_CPU_FLATMODE)
> > +#define cpu_has_eiodecode    cpu_opt(LOONGARCH_CPU_EIODECODE)
> > +#define cpu_has_guestid              cpu_opt(LOONGARCH_CPU_GUESTID)
> > +#define cpu_has_hypervisor   cpu_opt(LOONGARCH_CPU_HYPERVISOR)
> These are all dynamic, according to these definitions, unlike the MIPS
> asm/cpu-features.h where features can be statically overridden by
> individual mach. So we can drop most of these and just write
> cpu_opt(XXX) inline everywhere?
> > +
> > +
> > +#endif /* __ASM_CPU_FEATURES_H */
> > diff --git a/arch/loongarch/include/asm/cpu-info.h b/arch/loongarch/include/asm/cpu-info.h
> > new file mode 100644
> > index 000000000000..8c173ee5650b
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/cpu-info.h
> > @@ -0,0 +1,136 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __ASM_CPU_INFO_H
> > +#define __ASM_CPU_INFO_H
> > +
> > +#include <linux/cache.h>
> > +#include <linux/types.h>
> > +
> > +#include <asm/loongarch.h>
> > +
> > +/*
> > + * Descriptor for a cache
> > + */
> > +struct cache_desc {
> > +     unsigned int waysize;   /* Bytes per way */
> > +     unsigned short sets;    /* Number of lines per set */
> > +     unsigned char ways;     /* Number of ways */
> > +     unsigned char linesz;   /* Size of line in bytes */
> > +     unsigned char waybit;   /* Bits to select in a cache set */
> > +     unsigned char flags;    /* Flags describing cache properties */
> > +};
> > +
> > +struct cpuinfo_loongarch {
> > +     u64                     asid_cache;
> > +     unsigned long           asid_mask;
> > +
> > +     /*
> > +      * Capability and feature descriptor structure for LoongArch CPU
> > +      */
> > +     unsigned long           ases;
> Please don't use MIPS acronyms, especially NOT this one. ASE means
> Application-Specific Extension, but LoongArch has no
> "application-specific" adaptations at the moment...
OK, ases should be removed.

> > +     unsigned long long      options;
> > +     unsigned int            processor_id;
> > +     unsigned int            fpu_vers;
> > +     unsigned int            fpu_csr0;
> > +     unsigned int            fpu_mask;
> > +     unsigned int            cputype;
> > +     int                     isa_level;
> > +     int                     tlbsize;
> > +     int                     tlbsizemtlb;
> > +     int                     tlbsizestlbsets;
> > +     int                     tlbsizestlbways;
> > +     struct cache_desc       icache; /* Primary I-cache */
> > +     struct cache_desc       dcache; /* Primary D or combined I/D cache */
> > +     struct cache_desc       vcache; /* Victim cache, between pcache and scache */
> > +     struct cache_desc       scache; /* Secondary cache */
> > +     struct cache_desc       tcache; /* Tertiary/split secondary cache */
> > +     int                     package;/* physical package number */
> > +     unsigned int            globalnumber;
> > +     int                     vabits; /* Virtual Address size in bits */
> > +     int                     pabits; /* Physical Address size in bits */
> > +     void                    *data;  /* Additional data */
> > +     unsigned int            watch_dreg_count;   /* Number data breakpoints */
> > +     unsigned int            watch_ireg_count;   /* Number instruction breakpoints */
> > +     unsigned int            watch_reg_use_cnt; /* min(NUM_WATCH_REGS, watch_dreg_count + watch_ireg_count), Usable by ptrace */
> > +     unsigned int            kscratch_mask; /* Usable KScratch mask. */
> > +} __aligned(SMP_CACHE_BYTES);
> > +
> > +extern struct cpuinfo_loongarch cpu_data[];
> > +#define boot_cpu_data cpu_data[0]
> > +#define current_cpu_data cpu_data[smp_processor_id()]
> > +#define raw_current_cpu_data cpu_data[raw_smp_processor_id()]
> > +
> > +extern void cpu_probe(void);
> > +
> > +extern const char *__cpu_family[];
> > +extern const char *__cpu_full_name[];
> > +#define cpu_family_string()  __cpu_family[raw_smp_processor_id()]
> > +#define cpu_full_name_string()       __cpu_full_name[raw_smp_processor_id()]
> > +
> > +struct seq_file;
> > +struct notifier_block;
> > +
> > +extern int register_proc_cpuinfo_notifier(struct notifier_block *nb);
> > +extern int proc_cpuinfo_notifier_call_chain(unsigned long val, void *v);
> > +
> > +#define proc_cpuinfo_notifier(fn, pri)                                       \
> > +({                                                                   \
> > +     static struct notifier_block fn##_nb = {                        \
> > +             .notifier_call = fn,                                    \
> > +             .priority = pri                                         \
> > +     };                                                              \
> > +                                                                     \
> > +     register_proc_cpuinfo_notifier(&fn##_nb);                       \
> > +})
> > +
> > +struct proc_cpuinfo_notifier_args {
> > +     struct seq_file *m;
> > +     unsigned long n;
> > +};
> > +
> > +static inline unsigned int cpu_cluster(struct cpuinfo_loongarch *cpuinfo)
> > +{
> > +     return (cpuinfo->globalnumber & LOONGARCH_GLOBALNUMBER_CLUSTER) >>
> > +             LOONGARCH_GLOBALNUMBER_CLUSTER_SHF;
> > +}
> > +
> > +static inline unsigned int cpu_core(struct cpuinfo_loongarch *cpuinfo)
> > +{
> > +     return (cpuinfo->globalnumber & LOONGARCH_GLOBALNUMBER_CORE) >>
> > +             LOONGARCH_GLOBALNUMBER_CORE_SHF;
> > +}
> > +
> > +extern void cpu_set_cluster(struct cpuinfo_loongarch *cpuinfo, unsigned int cluster);
> > +extern void cpu_set_core(struct cpuinfo_loongarch *cpuinfo, unsigned int core);
> > +
> > +static inline bool cpus_are_siblings(int cpua, int cpub)
> > +{
> > +     struct cpuinfo_loongarch *infoa = &cpu_data[cpua];
> > +     struct cpuinfo_loongarch *infob = &cpu_data[cpub];
> > +     unsigned int gnuma, gnumb;
> > +
> > +     if (infoa->package != infob->package)
> > +             return false;
> > +
> > +     gnuma = infoa->globalnumber & ~LOONGARCH_GLOBALNUMBER_VP;
> > +     gnumb = infob->globalnumber & ~LOONGARCH_GLOBALNUMBER_VP;
> > +     if (gnuma != gnumb)
> > +             return false;
> > +
> > +     return true;
> > +}
>
> Please don't use the "global number" expression anywhere in the port;
> come up with another suitable name.
>
> I was initially confused by all the "GLOBALNUMBER" and "VP" things, only
> knowing they might be related to something MIPS-specific, because the
> "global numbers" of powerpc and microblaze obviously stand for the PCI
> domain number, as explained by their comments. After some further
> digging it became obvious that GlobalNumber is actually a MIPSr6+
> configuration register, introduced by the MT ASE and conveying
> information regarding Virtual Processors. Of course LoongArch doesn't
> have anything similar to that...
Globalnumber can encode the cluster number, core number and thread
number in a single word. Since we have no SMT now, I will remove it.
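
For readers unfamiliar with the scheme, here is a minimal sketch of
packing the three numbers into one word. The field widths and every
TOPO_* / topo_* name are hypothetical, chosen only for illustration;
they are not the layout used by the patch or by any hardware register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: thread in bits 7:0, core in 15:8, cluster in 19:16. */
#define TOPO_THREAD_SHIFT	0
#define TOPO_THREAD_MASK	(0xffu << TOPO_THREAD_SHIFT)
#define TOPO_CORE_SHIFT		8
#define TOPO_CORE_MASK		(0xffu << TOPO_CORE_SHIFT)
#define TOPO_CLUSTER_SHIFT	16
#define TOPO_CLUSTER_MASK	(0xfu << TOPO_CLUSTER_SHIFT)

/* Combine cluster, core and thread numbers into a single 32-bit word. */
static inline uint32_t topo_pack(unsigned int cluster, unsigned int core,
				 unsigned int thread)
{
	assert(cluster <= 0xf && core <= 0xff && thread <= 0xff);
	return ((uint32_t)cluster << TOPO_CLUSTER_SHIFT) |
	       ((uint32_t)core << TOPO_CORE_SHIFT) |
	       ((uint32_t)thread << TOPO_THREAD_SHIFT);
}

static inline unsigned int topo_cluster(uint32_t id)
{
	return (id & TOPO_CLUSTER_MASK) >> TOPO_CLUSTER_SHIFT;
}

static inline unsigned int topo_core(uint32_t id)
{
	return (id & TOPO_CORE_MASK) >> TOPO_CORE_SHIFT;
}

static inline unsigned int topo_thread(uint32_t id)
{
	return (id & TOPO_THREAD_MASK) >> TOPO_THREAD_SHIFT;
}

/* Two CPUs are siblings when their words differ only in the thread field. */
static inline int topo_siblings(uint32_t a, uint32_t b)
{
	return (a & ~TOPO_THREAD_MASK) == (b & ~TOPO_THREAD_MASK);
}
```

The sibling test has the same shape as cpus_are_siblings() in the quoted
hunk, which masks off the VP bits before comparing.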

>
> > +
> > +static inline unsigned long cpu_asid_mask(struct cpuinfo_loongarch *cpuinfo)
> > +{
> > +     return cpuinfo->asid_mask;
> > +}
> > +
> > +static inline void set_cpu_asid_mask(struct cpuinfo_loongarch *cpuinfo,
> > +                                  unsigned long asid_mask)
> > +{
> > +     cpuinfo->asid_mask = asid_mask;
> > +}
> Why keep the accessors when you can just inline the expression at call
> site? MIPS does this because they have to differentiate based on
> CONFIG_MIPS_ASID_BITS_VARIABLE, but LoongArch doesn't behave the same,
> so the 2 functions here should be removed.
> > +
> > +#endif /* __ASM_CPU_INFO_H */
> > diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
> > new file mode 100644
> > index 000000000000..62e9cb6520a9
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/cpu.h
> > @@ -0,0 +1,127 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * cpu.h: Values of the PRId register used to match up
> > + *     various LoongArch cpu types.
>
> "PRID"; "CPU".
>
> Similarly please change all other "PRId" to "PRID" to match the
> reference manual...
>
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_CPU_H
> > +#define _ASM_CPU_H
> > +
> > +/*
> > + * As of the LoongArch specs from Loongson Technology, the PRId register
> > + * (CPUCFG.00) is defined in this (backwards compatible) way:
> > + *
> > + * +----------------+----------------+----------------+----------------+
> > + * | Reserved       | Company ID     | Processor ID   | Revision       |
> > + * +----------------+----------------+----------------+----------------+
> > + *  31                24 23            16 15             8 7              0
>
> I can't find the relevant spec... at least not in the Loongson
> 3A5000/3B5000 Processor Reference Manual nor the LoongArch reference
> manual. The former only gives the default value of 0x14c010 while the
> latter only mentions the field's name and meaning.
>
> Also, "as of the specs" is broken; you should say "As described in the ...".
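
As a cross-check of the layout in the quoted comment, the sketch below
applies the three PRID_*_MASK values from this patch to the default
value 0x14c010 mentioned above. The struct and function names are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Masks copied from the quoted patch. */
#define PRID_COMP_MASK	0xff0000
#define PRID_IMP_MASK	0xff00
#define PRID_REV_MASK	0x00ff

/* The three fields must tile bits 23:0 without gaps. */
static_assert((PRID_COMP_MASK | PRID_IMP_MASK | PRID_REV_MASK) == 0xffffff,
	      "PRID fields must cover bits 23:0");

/* Hypothetical container for the decoded (still in-place) field values. */
struct prid_fields {
	uint32_t company;	/* bits 23:16 */
	uint32_t processor;	/* bits 15:8 */
	uint32_t revision;	/* bits 7:0 */
};

static inline struct prid_fields prid_decode(uint32_t prid)
{
	struct prid_fields f = {
		.company   = prid & PRID_COMP_MASK,
		.processor = prid & PRID_IMP_MASK,
		.revision  = prid & PRID_REV_MASK,
	};
	return f;
}
```

With these masks, 0x14c010 splits into 0x140000 (PRID_COMP_LOONGSON),
0xc000 (PRID_IMP_LOONGSON_64G) and revision 0x10, so the default value is
at least consistent with the layout drawn in the comment.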
>
> > + *
> > + */
> > +
> > +/*
> > + * Assigned Company values for bits 23:16 of the PRId register.
> > + */
> > +
> > +#define PRID_COMP_MASK               0xff0000
> > +
> > +#define PRID_COMP_LOONGSON   0x140000
> > +
> > +/*
> > + * Assigned Processor ID (implementation) values for bits 15:8 of the PRId
> > + * register.  In order to detect a certain CPU type exactly eventually
> > + * additional registers may need to be examined.
> > + */
> > +
> > +#define PRID_IMP_MASK                0xff00
> > +
> > +#define PRID_IMP_LOONGSON_32 0x4200  /* Loongson 32bit */
> > +#define PRID_IMP_LOONGSON_64R        0x6100  /* Reduced Loongson 64bit */
> > +#define PRID_IMP_LOONGSON_64C        0x6300  /* Classic Loongson 64bit */
> Do we even have "classic" LoongArch cores? This scheme surely is carried
> over from the MIPS era, but I don't think any of the "classic"
> Loongson/MIPS cores is getting a LoongArch refresh...
You should consider 2K500 and so on.

> > +#define PRID_IMP_LOONGSON_64G        0xc000  /* Generic Loongson 64bit */
> > +#define PRID_IMP_UNKNOWN     0xff00
> > +
> > +/*
> > + * Particular Revision values for bits 7:0 of the PRId register.
> > + */
> > +
> > +#define PRID_REV_MASK                0x00ff
> > +
> > +#if !defined(__ASSEMBLY__)
> > +
> > +enum cpu_type_enum {
> > +     CPU_UNKNOWN,
> > +     CPU_LOONGSON32,
> > +     CPU_LOONGSON64,
> > +     CPU_LAST
> > +};
> > +
> > +#endif /* !__ASSEMBLY__ */
> > +
> > +/*
> > + * ISA Level encodings
> > + *
> > + */
> > +
> > +#define LOONGARCH_CPU_ISA_LA32R 0x00000001
> > +#define LOONGARCH_CPU_ISA_LA32S 0x00000002
> > +#define LOONGARCH_CPU_ISA_LA64  0x00000004
> > +
> > +#define LOONGARCH_CPU_ISA_32BIT (LOONGARCH_CPU_ISA_LA32R | LOONGARCH_CPU_ISA_LA32S)
> > +#define LOONGARCH_CPU_ISA_64BIT LOONGARCH_CPU_ISA_LA64
> > +
> > +/*
> > + * CPU Option encodings
> > + */
> > +#define CPU_FEATURE_CPUCFG           0       /* CPU has CPUCFG */
> > +#define CPU_FEATURE_LAM                      1       /* CPU has Atomic instructions */
> > +#define CPU_FEATURE_UAL                      2       /* CPU has Unaligned Access support */
> > +#define CPU_FEATURE_FPU                      3       /* CPU has FPU */
> > +#define CPU_FEATURE_LSX                      4       /* CPU has 128bit SIMD instructions */
> > +#define CPU_FEATURE_LASX             5       /* CPU has 256bit SIMD instructions */
> > +#define CPU_FEATURE_COMPLEX          6       /* CPU has Complex instructions */
> > +#define CPU_FEATURE_CRYPTO           7       /* CPU has Crypto instructions */
> > +#define CPU_FEATURE_LVZ                      8       /* CPU has Virtualization extension */
> > +#define CPU_FEATURE_LBT_X86          9       /* CPU has X86 Binary Translation */
> > +#define CPU_FEATURE_LBT_ARM          10      /* CPU has ARM Binary Translation */
> > +#define CPU_FEATURE_LBT_MIPS         11      /* CPU has MIPS Binary Translation */
> > +#define CPU_FEATURE_TLB                      12      /* CPU has TLB */
> > +#define CPU_FEATURE_CSR                      13      /* CPU has CSR feature */
> > +#define CPU_FEATURE_WATCH            14      /* CPU has watchpoint registers */
> > +#define CPU_FEATURE_VINT             15      /* CPU has vectored interrupts */
> > +#define CPU_FEATURE_CSRIPI           16      /* CPU has CSR-IPI */
> > +#define CPU_FEATURE_EXTIOI           17      /* CPU has EXT-IOI */
> > +#define CPU_FEATURE_PREFETCH         18      /* CPU has prefetch instructions */
> > +#define CPU_FEATURE_PMP                      19      /* CPU has performance counter */
> > +#define CPU_FEATURE_SCALEFREQ                20      /* CPU supports cpufreq scaling */
> > +#define CPU_FEATURE_FLATMODE         21      /* CPU has flatmode */
> > +#define CPU_FEATURE_EIODECODE                22      /* CPU has extioi int pin decode mode */
> "EXTIOI interrupt pin decoding mode"?
> > +#define CPU_FEATURE_GUESTID          23      /* CPU has GuestID feature */
> > +#define CPU_FEATURE_HYPERVISOR               24      /* CPU has hypervisor (run in VM) */
> "CPU is virtualized (under a hypervisor)"?
> > +
> > +#define LOONGARCH_CPU_CPUCFG         BIT_ULL(CPU_FEATURE_CPUCFG)
> > +#define LOONGARCH_CPU_LAM            BIT_ULL(CPU_FEATURE_LAM)
> > +#define LOONGARCH_CPU_UAL            BIT_ULL(CPU_FEATURE_UAL)
> > +#define LOONGARCH_CPU_FPU            BIT_ULL(CPU_FEATURE_FPU)
> > +#define LOONGARCH_CPU_LSX            BIT_ULL(CPU_FEATURE_LSX)
> > +#define LOONGARCH_CPU_LASX           BIT_ULL(CPU_FEATURE_LASX)
> > +#define LOONGARCH_CPU_COMPLEX                BIT_ULL(CPU_FEATURE_COMPLEX)
> > +#define LOONGARCH_CPU_CRYPTO         BIT_ULL(CPU_FEATURE_CRYPTO)
> > +#define LOONGARCH_CPU_LVZ            BIT_ULL(CPU_FEATURE_LVZ)
> > +#define LOONGARCH_CPU_LBT_X86                BIT_ULL(CPU_FEATURE_LBT_X86)
> > +#define LOONGARCH_CPU_LBT_ARM                BIT_ULL(CPU_FEATURE_LBT_ARM)
> > +#define LOONGARCH_CPU_LBT_MIPS               BIT_ULL(CPU_FEATURE_LBT_MIPS)
> > +#define LOONGARCH_CPU_TLB            BIT_ULL(CPU_FEATURE_TLB)
> > +#define LOONGARCH_CPU_CSR            BIT_ULL(CPU_FEATURE_CSR)
> > +#define LOONGARCH_CPU_WATCH          BIT_ULL(CPU_FEATURE_WATCH)
> > +#define LOONGARCH_CPU_VINT           BIT_ULL(CPU_FEATURE_VINT)
> > +#define LOONGARCH_CPU_CSRIPI         BIT_ULL(CPU_FEATURE_CSRIPI)
> > +#define LOONGARCH_CPU_EXTIOI         BIT_ULL(CPU_FEATURE_EXTIOI)
> > +#define LOONGARCH_CPU_PREFETCH               BIT_ULL(CPU_FEATURE_PREFETCH)
> > +#define LOONGARCH_CPU_PMP            BIT_ULL(CPU_FEATURE_PMP)
> > +#define LOONGARCH_CPU_SCALEFREQ              BIT_ULL(CPU_FEATURE_SCALEFREQ)
> > +#define LOONGARCH_CPU_FLATMODE               BIT_ULL(CPU_FEATURE_FLATMODE)
> > +#define LOONGARCH_CPU_EIODECODE              BIT_ULL(CPU_FEATURE_EIODECODE)
> > +#define LOONGARCH_CPU_GUESTID                BIT_ULL(CPU_FEATURE_GUESTID)
> > +#define LOONGARCH_CPU_HYPERVISOR     BIT_ULL(CPU_FEATURE_HYPERVISOR)
> > +#endif /* _ASM_CPU_H */
> > diff --git a/arch/loongarch/include/asm/fpregdef.h b/arch/loongarch/include/asm/fpregdef.h
> > new file mode 100644
> > index 000000000000..151dc9aee1c6
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/fpregdef.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Definitions for the FPU register names
> > + *
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_FPREGDEF_H
> > +#define _ASM_FPREGDEF_H
> > +
> > +#define fv0  $f0     /* return value */
> > +#define fv1  $f2
> > +#define fa0  $f12    /* argument registers */
> > +#define fa1  $f13
> > +#define fa2  $f14
> > +#define fa3  $f15
> > +#define fa4  $f16
> > +#define fa5  $f17
> > +#define fa6  $f18
> > +#define fa7  $f19
> > +#define ft0  $f4     /* caller saved */
> > +#define ft1  $f5
> > +#define ft2  $f6
> > +#define ft3  $f7
> > +#define ft4  $f8
> > +#define ft5  $f9
> > +#define ft6  $f10
> > +#define ft7  $f11
> > +#define ft8  $f20
> > +#define ft9  $f21
> > +#define ft10 $f22
> > +#define ft11 $f23
> > +#define ft12 $f1
> > +#define ft13 $f3
> > +#define fs0  $f24    /* callee saved */
> > +#define fs1  $f25
> > +#define fs2  $f26
> > +#define fs3  $f27
> > +#define fs4  $f28
> > +#define fs5  $f29
> > +#define fs6  $f30
> > +#define fs7  $f31
> This doesn't agree with the current ABI spec, and may need further
> checking. There's no way $f1 could be $ft12, or that the return values
> would not share storage with the first two arguments.
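
For comparison, here is a sketch of the floating-point register naming
as I understand it from the published LoongArch psABI: $f0-$f7 are
fa0-fa7 (with fa0/fa1 doubling as the return values), $f8-$f23 are
ft0-ft15, and $f24-$f31 are fs0-fs7. Treat the table as an assumption to
be checked against the spec; the fpr_* helper names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* ABI names indexed by FP register number (assumed from the psABI). */
static const char *const fpr_abi_names[32] = {
	"fa0",  "fa1",  "fa2",  "fa3",  "fa4",  "fa5",  "fa6",  "fa7",
	"ft0",  "ft1",  "ft2",  "ft3",  "ft4",  "ft5",  "ft6",  "ft7",
	"ft8",  "ft9",  "ft10", "ft11", "ft12", "ft13", "ft14", "ft15",
	"fs0",  "fs1",  "fs2",  "fs3",  "fs4",  "fs5",  "fs6",  "fs7",
};

/* Map a register number ($fN) to its ABI name. */
static inline const char *fpr_abi_name(unsigned int n)
{
	assert(n < 32);
	return fpr_abi_names[n];
}

/* Map an ABI name back to its register number, or -1 if unknown. */
static inline int fpr_from_abi_name(const char *name)
{
	for (int i = 0; i < 32; i++)
		if (strcmp(fpr_abi_names[i], name) == 0)
			return i;
	return -1;
}

/* Return values share storage with the first two argument registers. */
static inline int fpr_is_return_reg(unsigned int n)
{
	return n < 2;
}
```

Under this table $f1 is fa1 rather than ft12, and there are no separate
fv0/fv1 registers, which is the mismatch pointed out above.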
> > +
> > +#define fcsr0        $r0
> > +#define fcsr1        $r1
> > +#define fcsr2        $r2
> > +#define fcsr3        $r3
> > +#define vcsr16       $r16
> > +
> > +#endif /* _ASM_FPREGDEF_H */
> > diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
> > new file mode 100644
> > index 000000000000..083e6726d4cb
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/loongarch.h
> > @@ -0,0 +1,1528 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_LOONGARCH_H
> > +#define _ASM_LOONGARCH_H
> > +
> > +#include <linux/bits.h>
> > +#include <linux/linkage.h>
> > +#include <linux/types.h>
> > +
> > +#ifndef __ASSEMBLY__
> > +#include <larchintrin.h>
> > +
> > +/*
> > + * parse_r var, r - Helper assembler macro for parsing register names.
> > + *
> > + * This converts the register name in $n form provided in \r to the
> > + * corresponding register number, which is assigned to the variable \var. It is
> > + * needed to allow explicit encoding of instructions in inline assembly where
> > + * registers are chosen by the compiler in $n form, allowing us to avoid using
> > + * fixed register numbers.
> > + *
> > + * It also allows newer instructions (not implemented by the assembler) to be
> > + * transparently implemented using assembler macros, instead of needing separate
> > + * cases depending on toolchain support.
> > + *
> > + * Simple usage example:
> > + * __asm__ __volatile__("parse_r addr, %0\n\t"
> > + *                   "#invtlb op, 0, %0\n\t"
> > + *                   ".word ((0x6498000) | (addr << 10) | (0 << 5) | op)"
> > + *                   : "=r" (status));
> > + */
> > +
> > +/* Match an individual register number and assign to \var */
> > +#define _IFC_REG(n)                          \
> > +     ".ifc   \\r, $r" #n "\n\t"              \
> > +     "\\var  = " #n "\n\t"                   \
> > +     ".endif\n\t"
> > +
> > +__asm__(".macro      parse_r var r\n\t"
> > +     "\\var  = -1\n\t"
> > +     _IFC_REG(0)  _IFC_REG(1)  _IFC_REG(2)  _IFC_REG(3)
> > +     _IFC_REG(4)  _IFC_REG(5)  _IFC_REG(6)  _IFC_REG(7)
> > +     _IFC_REG(8)  _IFC_REG(9)  _IFC_REG(10) _IFC_REG(11)
> > +     _IFC_REG(12) _IFC_REG(13) _IFC_REG(14) _IFC_REG(15)
> > +     _IFC_REG(16) _IFC_REG(17) _IFC_REG(18) _IFC_REG(19)
> > +     _IFC_REG(20) _IFC_REG(21) _IFC_REG(22) _IFC_REG(23)
> > +     _IFC_REG(24) _IFC_REG(25) _IFC_REG(26) _IFC_REG(27)
> > +     _IFC_REG(28) _IFC_REG(29) _IFC_REG(30) _IFC_REG(31)
> > +     ".iflt  \\var\n\t"
> > +     ".error \"Unable to parse register name \\r\"\n\t"
> > +     ".endif\n\t"
> > +     ".endm");
> > +
> > +#undef _IFC_REG
> > +
> > +/* CPUCFG */
> > +static inline u32 read_cpucfg(u32 reg)
> > +{
> > +     return __cpucfg(reg);
> > +}
> > +
> > +#endif /* !__ASSEMBLY__ */
> > +
> > +#ifdef __ASSEMBLY__
> > +
> > +/* LoongArch Registers */
> > +#define REG_RA       0x1
> > +#define REG_TP       0x2
> > +#define REG_SP       0x3
> > +#define REG_A0       0x4
> > +#define REG_A1       0x5
> > +#define REG_A2       0x6
> > +#define REG_A3       0x7
> > +#define REG_A4       0x8
> > +#define REG_A5       0x9
> > +#define REG_A6       0xa
> > +#define REG_A7       0xb
> > +#define REG_V0       REG_A0
> > +#define REG_V1       REG_A1
> Remove V* aliases per spec.
> > +#define REG_T0       0xc
> > +#define REG_T1       0xd
> > +#define REG_T2       0xe
> > +#define REG_T3       0xf
> > +#define REG_T4       0x10
> > +#define REG_T5       0x11
> > +#define REG_T6       0x12
> > +#define REG_T7       0x13
> > +#define REG_T8       0x14
> > +#define REG_U0       0x15
> And document this somewhere.
OK, thanks.

> > +#define REG_FP       0x16
> > +#define REG_S0       0x17
> > +#define REG_S1       0x18
> > +#define REG_S2       0x19
> > +#define REG_S3       0x1a
> > +#define REG_S4       0x1b
> > +#define REG_S5       0x1c
> > +#define REG_S6       0x1d
> > +#define REG_S7       0x1e
> > +#define REG_S8       0x1f
> > +
> > +#endif /* __ASSEMBLY__ */
> > +
> > +/* Bit Domains for CPUCFG registers */
> "bit fields"?
OK, thanks.

Huacai
> > +#define LOONGARCH_CPUCFG0            0x0
> > +#define  CPUCFG0_PRID                        GENMASK(31, 0)
> > +
> > +#define LOONGARCH_CPUCFG1            0x1
> > +#define  CPUCFG1_ISGR32                      BIT(0)
> > +#define  CPUCFG1_ISGR64                      BIT(1)
> > +#define  CPUCFG1_PAGING                      BIT(2)
> > +#define  CPUCFG1_IOCSR                       BIT(3)
> > +#define  CPUCFG1_PABITS                      GENMASK(11, 4)
> > +#define  CPUCFG1_VABITS                      GENMASK(19, 12)
> > +#define  CPUCFG1_UAL                 BIT(20)
> > +#define  CPUCFG1_RI                  BIT(21)
> > +#define  CPUCFG1_EP                  BIT(22)
> > +#define  CPUCFG1_RPLV                        BIT(23)
> > +#define  CPUCFG1_HUGEPG                      BIT(24)
> > +#define  CPUCFG1_IOCSRBRD            BIT(25)
> > +#define  CPUCFG1_MSGINT                      BIT(26)
> These names are not consistent with the reference manual. For example,
> bits 0-1 are actually an enum, and bit 2 is called PGMMU in the
> LoongArch reference manual.
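
To make the point concrete, here is a sketch of decoding CPUCFG word 1
with bits 1:0 treated as a single enumerated field rather than two
independent flags. The field names (ARCH, PGMMU, PALEN, VALEN) and the
enum encodings are my reading of the LoongArch reference manual and
should be verified against it; the cpucfg1_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encodings of the 2-bit ARCH field (bits 1:0). */
enum cpucfg1_arch {
	CPUCFG1_ARCH_LA32R = 0,	/* reduced 32-bit */
	CPUCFG1_ARCH_LA32S = 1,	/* standard 32-bit */
	CPUCFG1_ARCH_LA64  = 2,	/* 64-bit */
};

#define CPUCFG1_ARCH_SHIFT	0
#define CPUCFG1_ARCH_MASK	(0x3u << CPUCFG1_ARCH_SHIFT)
#define CPUCFG1_PGMMU		(0x1u << 2)
#define CPUCFG1_PALEN_SHIFT	4
#define CPUCFG1_PALEN_MASK	(0xffu << CPUCFG1_PALEN_SHIFT)
#define CPUCFG1_VALEN_SHIFT	12
#define CPUCFG1_VALEN_MASK	(0xffu << CPUCFG1_VALEN_SHIFT)

static inline enum cpucfg1_arch cpucfg1_arch(uint32_t cfg)
{
	unsigned int arch = (cfg & CPUCFG1_ARCH_MASK) >> CPUCFG1_ARCH_SHIFT;

	assert(arch <= CPUCFG1_ARCH_LA64);	/* value 3 is treated as reserved here */
	return (enum cpucfg1_arch)arch;
}

static inline int cpucfg1_has_pgmmu(uint32_t cfg)
{
	return (cfg & CPUCFG1_PGMMU) != 0;
}

/* Raw field values; the manual may define them as the width minus one. */
static inline unsigned int cpucfg1_palen(uint32_t cfg)
{
	return (cfg & CPUCFG1_PALEN_MASK) >> CPUCFG1_PALEN_SHIFT;
}

static inline unsigned int cpucfg1_valen(uint32_t cfg)
{
	return (cfg & CPUCFG1_VALEN_MASK) >> CPUCFG1_VALEN_SHIFT;
}
```

Testing BIT(0) and BIT(1) separately, as the CPUCFG1_ISGR32/ISGR64 names
suggest, would misread the value 1 (LA32S) and 2 (LA64) as independent
capabilities, which is why an enum is the better fit.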
> > +
> > +#define LOONGARCH_CPUCFG2            0x2
> > +#define  CPUCFG2_FP                  BIT(0)
> > +#define  CPUCFG2_FPSP                        BIT(1)
> > +#define  CPUCFG2_FPDP                        BIT(2)
> > +#define  CPUCFG2_FPVERS                      GENMASK(5, 3)
> > +#define  CPUCFG2_LSX                 BIT(6)
> > +#define  CPUCFG2_LASX                        BIT(7)
> > +#define  CPUCFG2_COMPLEX             BIT(8)
> > +#define  CPUCFG2_CRYPTO                      BIT(9)
> > +#define  CPUCFG2_LVZP                        BIT(10)
> > +#define  CPUCFG2_LVZVER                      GENMASK(13, 11)
> > +#define  CPUCFG2_LLFTP                       BIT(14)
> > +#define  CPUCFG2_LLFTPREV            GENMASK(17, 15)
> > +#define  CPUCFG2_X86BT                       BIT(18)
> > +#define  CPUCFG2_ARMBT                       BIT(19)
> > +#define  CPUCFG2_MIPSBT                      BIT(20)
> > +#define  CPUCFG2_LSPW                        BIT(21)
> > +#define  CPUCFG2_LAM                 BIT(22)
> > +
> > +#define LOONGARCH_CPUCFG3            0x3
> > +#define  CPUCFG3_CCDMA                       BIT(0)
> > +#define  CPUCFG3_SFB                 BIT(1)
> > +#define  CPUCFG3_UCACC                       BIT(2)
> > +#define  CPUCFG3_LLEXC                       BIT(3)
> > +#define  CPUCFG3_SCDLY                       BIT(4)
> > +#define  CPUCFG3_LLDBAR                      BIT(5)
> > +#define  CPUCFG3_ITLBT                       BIT(6)
> > +#define  CPUCFG3_ICACHET             BIT(7)
> > +#define  CPUCFG3_SPW_LVL             GENMASK(10, 8)
> > +#define  CPUCFG3_SPW_HG_HF           BIT(11)
> > +#define  CPUCFG3_RVA                 BIT(12)
> > +#define  CPUCFG3_RVAMAX                      GENMASK(16, 13)
> > +
> > +#define LOONGARCH_CPUCFG4            0x4
> > +#define  CPUCFG4_CCFREQ                      GENMASK(31, 0)
> > +
> > +#define LOONGARCH_CPUCFG5            0x5
> > +#define  CPUCFG5_CCMUL                       GENMASK(15, 0)
> > +#define  CPUCFG5_CCDIV                       GENMASK(31, 16)
> > +
> > +#define LOONGARCH_CPUCFG6            0x6
> > +#define  CPUCFG6_PMP                 BIT(0)
> > +#define  CPUCFG6_PAMVER                      GENMASK(3, 1)
> > +#define  CPUCFG6_PMNUM                       GENMASK(7, 4)
> > +#define  CPUCFG6_PMBITS                      GENMASK(13, 8)
> > +#define  CPUCFG6_UPM                 BIT(14)
> > +
> > +#define LOONGARCH_CPUCFG16           0x10
> > +#define  CPUCFG16_L1_IUPRE           BIT(0)
> > +#define  CPUCFG16_L1_IUUNIFY         BIT(1)
> > +#define  CPUCFG16_L1_DPRE            BIT(2)
> > +#define  CPUCFG16_L2_IUPRE           BIT(3)
> > +#define  CPUCFG16_L2_IUUNIFY         BIT(4)
> > +#define  CPUCFG16_L2_IUPRIV          BIT(5)
> > +#define  CPUCFG16_L2_IUINCL          BIT(6)
> > +#define  CPUCFG16_L2_DPRE            BIT(7)
> > +#define  CPUCFG16_L2_DPRIV           BIT(8)
> > +#define  CPUCFG16_L2_DINCL           BIT(9)
> > +#define  CPUCFG16_L3_IUPRE           BIT(10)
> > +#define  CPUCFG16_L3_IUUNIFY         BIT(11)
> > +#define  CPUCFG16_L3_IUPRIV          BIT(12)
> > +#define  CPUCFG16_L3_IUINCL          BIT(13)
> > +#define  CPUCFG16_L3_DPRE            BIT(14)
> > +#define  CPUCFG16_L3_DPRIV           BIT(15)
> > +#define  CPUCFG16_L3_DINCL           BIT(16)
> > +
> > +#define LOONGARCH_CPUCFG17           0x11
> > +#define  CPUCFG17_L1I_WAYS_M         GENMASK(15, 0)
> > +#define  CPUCFG17_L1I_SETS_M         GENMASK(23, 16)
> > +#define  CPUCFG17_L1I_SIZE_M         GENMASK(30, 24)
> > +#define  CPUCFG17_L1I_WAYS           0
> > +#define  CPUCFG17_L1I_SETS           16
> > +#define  CPUCFG17_L1I_SIZE           24
> > +
> > +#define LOONGARCH_CPUCFG18           0x12
> > +#define  CPUCFG18_L1D_WAYS_M         GENMASK(15, 0)
> > +#define  CPUCFG18_L1D_SETS_M         GENMASK(23, 16)
> > +#define  CPUCFG18_L1D_SIZE_M         GENMASK(30, 24)
> > +#define  CPUCFG18_L1D_WAYS           0
> > +#define  CPUCFG18_L1D_SETS           16
> > +#define  CPUCFG18_L1D_SIZE           24
> > +
> > +#define LOONGARCH_CPUCFG19           0x13
> > +#define  CPUCFG19_L2_WAYS_M          GENMASK(15, 0)
> > +#define  CPUCFG19_L2_SETS_M          GENMASK(23, 16)
> > +#define  CPUCFG19_L2_SIZE_M          GENMASK(30, 24)
> > +#define  CPUCFG19_L2_WAYS            0
> > +#define  CPUCFG19_L2_SETS            16
> > +#define  CPUCFG19_L2_SIZE            24
> > +
> > +#define LOONGARCH_CPUCFG20           0x14
> > +#define  CPUCFG20_L3_WAYS_M          GENMASK(15, 0)
> > +#define  CPUCFG20_L3_SETS_M          GENMASK(23, 16)
> > +#define  CPUCFG20_L3_SIZE_M          GENMASK(30, 24)
> > +#define  CPUCFG20_L3_WAYS            0
> > +#define  CPUCFG20_L3_SETS            16
> > +#define  CPUCFG20_L3_SIZE            24
> > +
> > +#define LOONGARCH_CPUCFG48           0x30
> > +#define  CPUCFG48_MCSR_LCK           BIT(0)
> > +#define  CPUCFG48_NAP_EN             BIT(1)
> > +#define  CPUCFG48_VFPU_CG            BIT(2)
> > +#define  CPUCFG48_RAM_CG             BIT(3)
> > +
> > +#ifndef __ASSEMBLY__
> > +
> > +/* CSR */
> > +static __always_inline u32 csr_readl(u32 reg)
> > +{
> > +     return __csrrd_w(reg);
> > +}
> > +
> > +static __always_inline u64 csr_readq(u32 reg)
> > +{
> > +     return __csrrd_d(reg);
> > +}
> > +
> > +static __always_inline void csr_writel(u32 val, u32 reg)
> > +{
> > +     __csrwr_w(val, reg);
> > +}
> > +
> > +static __always_inline void csr_writeq(u64 val, u32 reg)
> > +{
> > +     __csrwr_d(val, reg);
> > +}
> > +
> > +static __always_inline u32 csr_xchgl(u32 val, u32 mask, u32 reg)
> > +{
> > +     return __csrxchg_w(val, mask, reg);
> > +}
> > +
> > +static __always_inline u64 csr_xchgq(u64 val, u64 mask, u32 reg)
> > +{
> > +     return __csrxchg_d(val, mask, reg);
> > +}
> > +
> > +/* IOCSR */
> > +static __always_inline u32 iocsr_readl(u32 reg)
> > +{
> > +     return __iocsrrd_w(reg);
> > +}
> > +
> > +static __always_inline u64 iocsr_readq(u32 reg)
> > +{
> > +     return __iocsrrd_d(reg);
> > +}
> > +
> > +static __always_inline void iocsr_writel(u32 val, u32 reg)
> > +{
> > +     __iocsrwr_w(val, reg);
> > +}
> > +
> > +static __always_inline void iocsr_writeq(u64 val, u32 reg)
> > +{
> > +     __iocsrwr_d(val, reg);
> > +}
> > +
> > +#endif /* !__ASSEMBLY__ */
> > +
> > +/* CSR register number */
> > +
> > +/* Basic CSR registers */
> > +#define LOONGARCH_CSR_CRMD           0x0     /* Current mode info */
> > +#define  CSR_CRMD_WE_SHIFT           9
> > +#define  CSR_CRMD_WE                 (_ULCAST_(0x1) << CSR_CRMD_WE_SHIFT)
> > +#define  CSR_CRMD_DACM_SHIFT         7
> > +#define  CSR_CRMD_DACM_WIDTH         2
> > +#define  CSR_CRMD_DACM                       (_ULCAST_(0x3) << CSR_CRMD_DACM_SHIFT)
> > +#define  CSR_CRMD_DACF_SHIFT         5
> > +#define  CSR_CRMD_DACF_WIDTH         2
> > +#define  CSR_CRMD_DACF                       (_ULCAST_(0x3) << CSR_CRMD_DACF_SHIFT)
> > +#define  CSR_CRMD_PG_SHIFT           4
> > +#define  CSR_CRMD_PG                 (_ULCAST_(0x1) << CSR_CRMD_PG_SHIFT)
> > +#define  CSR_CRMD_DA_SHIFT           3
> > +#define  CSR_CRMD_DA                 (_ULCAST_(0x1) << CSR_CRMD_DA_SHIFT)
> > +#define  CSR_CRMD_IE_SHIFT           2
> > +#define  CSR_CRMD_IE                 (_ULCAST_(0x1) << CSR_CRMD_IE_SHIFT)
> > +#define  CSR_CRMD_PLV_SHIFT          0
> > +#define  CSR_CRMD_PLV_WIDTH          2
> > +#define  CSR_CRMD_PLV                        (_ULCAST_(0x3) << CSR_CRMD_PLV_SHIFT)
> > +
> > +#define PLV_KERN                     0
> > +#define PLV_USER                     3
> > +#define PLV_MASK                     0x3
> > +
> > +#define LOONGARCH_CSR_PRMD           0x1     /* Prev-exception mode info */
> > +#define  CSR_PRMD_PWE_SHIFT          3
> > +#define  CSR_PRMD_PWE                        (_ULCAST_(0x1) << CSR_PRMD_PWE_SHIFT)
> > +#define  CSR_PRMD_PIE_SHIFT          2
> > +#define  CSR_PRMD_PIE                        (_ULCAST_(0x1) << CSR_PRMD_PIE_SHIFT)
> > +#define  CSR_PRMD_PPLV_SHIFT         0
> > +#define  CSR_PRMD_PPLV_WIDTH         2
> > +#define  CSR_PRMD_PPLV                       (_ULCAST_(0x3) << CSR_PRMD_PPLV_SHIFT)
> > +
> > +#define LOONGARCH_CSR_EUEN           0x2     /* Extended unit enable */
> > +#define  CSR_EUEN_LBTEN_SHIFT                3
> > +#define  CSR_EUEN_LBTEN                      (_ULCAST_(0x1) << CSR_EUEN_LBTEN_SHIFT)
> > +#define  CSR_EUEN_LASXEN_SHIFT               2
> > +#define  CSR_EUEN_LASXEN             (_ULCAST_(0x1) << CSR_EUEN_LASXEN_SHIFT)
> > +#define  CSR_EUEN_LSXEN_SHIFT                1
> > +#define  CSR_EUEN_LSXEN                      (_ULCAST_(0x1) << CSR_EUEN_LSXEN_SHIFT)
> > +#define  CSR_EUEN_FPEN_SHIFT         0
> > +#define  CSR_EUEN_FPEN                       (_ULCAST_(0x1) << CSR_EUEN_FPEN_SHIFT)
> > +
> > +#define LOONGARCH_CSR_MISC           0x3     /* Misc config */
> > +
> > +#define LOONGARCH_CSR_ECFG           0x4     /* Exception config */
> > +#define  CSR_ECFG_VS_SHIFT           16
> > +#define  CSR_ECFG_VS_WIDTH           3
> > +#define  CSR_ECFG_VS                 (_ULCAST_(0x7) << CSR_ECFG_VS_SHIFT)
> > +#define  CSR_ECFG_IM_SHIFT           0
> > +#define  CSR_ECFG_IM_WIDTH           13
> > +#define  CSR_ECFG_IM                 (_ULCAST_(0x1fff) << CSR_ECFG_IM_SHIFT)
> > +
> > +#define LOONGARCH_CSR_ESTAT          0x5     /* Exception status */
> > +#define  CSR_ESTAT_ESUBCODE_SHIFT    22
> > +#define  CSR_ESTAT_ESUBCODE_WIDTH    9
> > +#define  CSR_ESTAT_ESUBCODE          (_ULCAST_(0x1ff) << CSR_ESTAT_ESUBCODE_SHIFT)
> > +#define  CSR_ESTAT_EXC_SHIFT         16
> > +#define  CSR_ESTAT_EXC_WIDTH         6
> > +#define  CSR_ESTAT_EXC                       (_ULCAST_(0x3f) << CSR_ESTAT_EXC_SHIFT)
> > +#define  CSR_ESTAT_IS_SHIFT          0
> > +#define  CSR_ESTAT_IS_WIDTH          15
> > +#define  CSR_ESTAT_IS                        (_ULCAST_(0x7fff) << CSR_ESTAT_IS_SHIFT)
> > +
> > +#define LOONGARCH_CSR_ERA            0x6     /* ERA */
> > +
> > +#define LOONGARCH_CSR_BADV           0x7     /* Bad virtual address */
> > +
> > +#define LOONGARCH_CSR_BADI           0x8     /* Bad instruction */
> > +
> > +#define LOONGARCH_CSR_EENTRY         0xc     /* Exception entry */
> > +
> > +/* TLB related CSR registers */
> > +#define LOONGARCH_CSR_TLBIDX         0x10    /* TLB Index, EHINV, PageSize, NP */
> > +#define  CSR_TLBIDX_EHINV_SHIFT              31
> > +#define  CSR_TLBIDX_EHINV            (_ULCAST_(1) << CSR_TLBIDX_EHINV_SHIFT)
> > +#define  CSR_TLBIDX_PS_SHIFT         24
> > +#define  CSR_TLBIDX_PS_WIDTH         6
> > +#define  CSR_TLBIDX_PS                       (_ULCAST_(0x3f) << CSR_TLBIDX_PS_SHIFT)
> > +#define  CSR_TLBIDX_IDX_SHIFT                0
> > +#define  CSR_TLBIDX_IDX_WIDTH                12
> > +#define  CSR_TLBIDX_IDX                      (_ULCAST_(0xfff) << CSR_TLBIDX_IDX_SHIFT)
> > +#define  CSR_TLBIDX_SIZEM            0x3f000000
> > +#define  CSR_TLBIDX_SIZE             CSR_TLBIDX_PS_SHIFT
> > +#define  CSR_TLBIDX_IDXM             0xfff
> > +#define  CSR_INVALID_ENTRY(e)                (CSR_TLBIDX_EHINV | e)
> > +
> > +#define LOONGARCH_CSR_TLBEHI         0x11    /* TLB EntryHi */
> > +
> > +#define LOONGARCH_CSR_TLBELO0                0x12    /* TLB EntryLo0 */
> > +#define  CSR_TLBLO0_RPLV_SHIFT               63
> > +#define  CSR_TLBLO0_RPLV             (_ULCAST_(0x1) << CSR_TLBLO0_RPLV_SHIFT)
> > +#define  CSR_TLBLO0_NX_SHIFT         62
> > +#define  CSR_TLBLO0_NX                       (_ULCAST_(0x1) << CSR_TLBLO0_NX_SHIFT)
> > +#define  CSR_TLBLO0_NR_SHIFT         61
> > +#define  CSR_TLBLO0_NR                       (_ULCAST_(0x1) << CSR_TLBLO0_NR_SHIFT)
> > +#define  CSR_TLBLO0_PFN_SHIFT                12
> > +#define  CSR_TLBLO0_PFN_WIDTH                36
> > +#define  CSR_TLBLO0_PFN                      (_ULCAST_(0xfffffffff) << CSR_TLBLO0_PFN_SHIFT)
> > +#define  CSR_TLBLO0_GLOBAL_SHIFT     6
> > +#define  CSR_TLBLO0_GLOBAL           (_ULCAST_(0x1) << CSR_TLBLO0_GLOBAL_SHIFT)
> > +#define  CSR_TLBLO0_CCA_SHIFT                4
> > +#define  CSR_TLBLO0_CCA_WIDTH                2
> > +#define  CSR_TLBLO0_CCA                      (_ULCAST_(0x3) << CSR_TLBLO0_CCA_SHIFT)
> > +#define  CSR_TLBLO0_PLV_SHIFT                2
> > +#define  CSR_TLBLO0_PLV_WIDTH                2
> > +#define  CSR_TLBLO0_PLV                      (_ULCAST_(0x3) << CSR_TLBLO0_PLV_SHIFT)
> > +#define  CSR_TLBLO0_WE_SHIFT         1
> > +#define  CSR_TLBLO0_WE                       (_ULCAST_(0x1) << CSR_TLBLO0_WE_SHIFT)
> > +#define  CSR_TLBLO0_V_SHIFT          0
> > +#define  CSR_TLBLO0_V                        (_ULCAST_(0x1) << CSR_TLBLO0_V_SHIFT)
> > +
> > +#define LOONGARCH_CSR_TLBELO1                0x13    /* TLB EntryLo1 */
> > +#define  CSR_TLBLO1_RPLV_SHIFT               63
> > +#define  CSR_TLBLO1_RPLV             (_ULCAST_(0x1) << CSR_TLBLO1_RPLV_SHIFT)
> > +#define  CSR_TLBLO1_NX_SHIFT         62
> > +#define  CSR_TLBLO1_NX                       (_ULCAST_(0x1) << CSR_TLBLO1_NX_SHIFT)
> > +#define  CSR_TLBLO1_NR_SHIFT         61
> > +#define  CSR_TLBLO1_NR                       (_ULCAST_(0x1) << CSR_TLBLO1_NR_SHIFT)
> > +#define  CSR_TLBLO1_PFN_SHIFT                12
> > +#define  CSR_TLBLO1_PFN_WIDTH                36
> > +#define  CSR_TLBLO1_PFN                      (_ULCAST_(0xfffffffff) << CSR_TLBLO1_PFN_SHIFT)
> > +#define  CSR_TLBLO1_GLOBAL_SHIFT     6
> > +#define  CSR_TLBLO1_GLOBAL           (_ULCAST_(0x1) << CSR_TLBLO1_GLOBAL_SHIFT)
> > +#define  CSR_TLBLO1_CCA_SHIFT                4
> > +#define  CSR_TLBLO1_CCA_WIDTH                2
> > +#define  CSR_TLBLO1_CCA                      (_ULCAST_(0x3) << CSR_TLBLO1_CCA_SHIFT)
> > +#define  CSR_TLBLO1_PLV_SHIFT                2
> > +#define  CSR_TLBLO1_PLV_WIDTH                2
> > +#define  CSR_TLBLO1_PLV                      (_ULCAST_(0x3) << CSR_TLBLO1_PLV_SHIFT)
> > +#define  CSR_TLBLO1_WE_SHIFT         1
> > +#define  CSR_TLBLO1_WE                       (_ULCAST_(0x1) << CSR_TLBLO1_WE_SHIFT)
> > +#define  CSR_TLBLO1_V_SHIFT          0
> > +#define  CSR_TLBLO1_V                        (_ULCAST_(0x1) << CSR_TLBLO1_V_SHIFT)
> > +
> > +#define LOONGARCH_CSR_GTLBC          0x15    /* Guest TLB control */
> > +#define  CSR_GTLBC_RID_SHIFT         16
> > +#define  CSR_GTLBC_RID_WIDTH         8
> > +#define  CSR_GTLBC_RID                       (_ULCAST_(0xff) << CSR_GTLBC_RID_SHIFT)
> > +#define  CSR_GTLBC_TOTI_SHIFT                13
> > +#define  CSR_GTLBC_TOTI                      (_ULCAST_(0x1) << CSR_GTLBC_TOTI_SHIFT)
> > +#define  CSR_GTLBC_USERID_SHIFT              12
> > +#define  CSR_GTLBC_USERID            (_ULCAST_(0x1) << CSR_GTLBC_USERID_SHIFT)
> > +#define  CSR_GTLBC_GMTLBSZ_SHIFT     0
> > +#define  CSR_GTLBC_GMTLBSZ_WIDTH     6
> > +#define  CSR_GTLBC_GMTLBSZ           (_ULCAST_(0x3f) << CSR_GTLBC_GMTLBSZ_SHIFT)
> > +
> > +#define LOONGARCH_CSR_TRGP           0x16    /* TLBR read guest info */
> > +#define  CSR_TRGP_RID_SHIFT          16
> > +#define  CSR_TRGP_RID_WIDTH          8
> > +#define  CSR_TRGP_RID                        (_ULCAST_(0xff) << CSR_TRGP_RID_SHIFT)
> > +#define  CSR_TRGP_GTLB_SHIFT         0
> > +#define  CSR_TRGP_GTLB                       (1 << CSR_TRGP_GTLB_SHIFT)
> > +
> > +#define LOONGARCH_CSR_ASID           0x18    /* ASID */
> > +#define  CSR_ASID_BIT_SHIFT          16      /* ASIDBits */
> > +#define  CSR_ASID_BIT_WIDTH          8
> > +#define  CSR_ASID_BIT                        (_ULCAST_(0xff) << CSR_ASID_BIT_SHIFT)
> > +#define  CSR_ASID_ASID_SHIFT         0
> > +#define  CSR_ASID_ASID_WIDTH         10
> > +#define  CSR_ASID_ASID                       (_ULCAST_(0x3ff) << CSR_ASID_ASID_SHIFT)
> > +
> > +#define LOONGARCH_CSR_PGDL           0x19    /* Page table base address when VA[47] = 0 */
> > +
> > +#define LOONGARCH_CSR_PGDH           0x1a    /* Page table base address when VA[47] = 1 */
> > +
> > +#define LOONGARCH_CSR_PGD            0x1b    /* Page table base */
> > +
> > +#define LOONGARCH_CSR_PWCTL0         0x1c    /* PWCtl0 */
> > +#define  CSR_PWCTL0_PTEW_SHIFT               30
> > +#define  CSR_PWCTL0_PTEW_WIDTH               2
> > +#define  CSR_PWCTL0_PTEW             (_ULCAST_(0x3) << CSR_PWCTL0_PTEW_SHIFT)
> > +#define  CSR_PWCTL0_DIR1WIDTH_SHIFT  25
> > +#define  CSR_PWCTL0_DIR1WIDTH_WIDTH  5
> > +#define  CSR_PWCTL0_DIR1WIDTH                (_ULCAST_(0x1f) << CSR_PWCTL0_DIR1WIDTH_SHIFT)
> > +#define  CSR_PWCTL0_DIR1BASE_SHIFT   20
> > +#define  CSR_PWCTL0_DIR1BASE_WIDTH   5
> > +#define  CSR_PWCTL0_DIR1BASE         (_ULCAST_(0x1f) << CSR_PWCTL0_DIR1BASE_SHIFT)
> > +#define  CSR_PWCTL0_DIR0WIDTH_SHIFT  15
> > +#define  CSR_PWCTL0_DIR0WIDTH_WIDTH  5
> > +#define  CSR_PWCTL0_DIR0WIDTH                (_ULCAST_(0x1f) << CSR_PWCTL0_DIR0WIDTH_SHIFT)
> > +#define  CSR_PWCTL0_DIR0BASE_SHIFT   10
> > +#define  CSR_PWCTL0_DIR0BASE_WIDTH   5
> > +#define  CSR_PWCTL0_DIR0BASE         (_ULCAST_(0x1f) << CSR_PWCTL0_DIR0BASE_SHIFT)
> > +#define  CSR_PWCTL0_PTWIDTH_SHIFT    5
> > +#define  CSR_PWCTL0_PTWIDTH_WIDTH    5
> > +#define  CSR_PWCTL0_PTWIDTH          (_ULCAST_(0x1f) << CSR_PWCTL0_PTWIDTH_SHIFT)
> > +#define  CSR_PWCTL0_PTBASE_SHIFT     0
> > +#define  CSR_PWCTL0_PTBASE_WIDTH     5
> > +#define  CSR_PWCTL0_PTBASE           (_ULCAST_(0x1f) << CSR_PWCTL0_PTBASE_SHIFT)
> > +
> > +#define LOONGARCH_CSR_PWCTL1         0x1d    /* PWCtl1 */
> > +#define  CSR_PWCTL1_DIR3WIDTH_SHIFT  18
> > +#define  CSR_PWCTL1_DIR3WIDTH_WIDTH  5
> > +#define  CSR_PWCTL1_DIR3WIDTH                (_ULCAST_(0x1f) << CSR_PWCTL1_DIR3WIDTH_SHIFT)
> > +#define  CSR_PWCTL1_DIR3BASE_SHIFT   12
> > +#define  CSR_PWCTL1_DIR3BASE_WIDTH   5
> > +#define  CSR_PWCTL1_DIR3BASE         (_ULCAST_(0x1f) << CSR_PWCTL1_DIR3BASE_SHIFT)
> > +#define  CSR_PWCTL1_DIR2WIDTH_SHIFT  6
> > +#define  CSR_PWCTL1_DIR2WIDTH_WIDTH  5
> > +#define  CSR_PWCTL1_DIR2WIDTH                (_ULCAST_(0x1f) << CSR_PWCTL1_DIR2WIDTH_SHIFT)
> > +#define  CSR_PWCTL1_DIR2BASE_SHIFT   0
> > +#define  CSR_PWCTL1_DIR2BASE_WIDTH   5
> > +#define  CSR_PWCTL1_DIR2BASE         (_ULCAST_(0x1f) << CSR_PWCTL1_DIR2BASE_SHIFT)
> > +
> > +#define LOONGARCH_CSR_STLBPGSIZE     0x1e
> > +#define  CSR_STLBPGSIZE_PS_WIDTH     6
> > +#define  CSR_STLBPGSIZE_PS           (_ULCAST_(0x3f))
> > +
> > +#define LOONGARCH_CSR_RVACFG         0x1f
> > +#define  CSR_RVACFG_RDVA_WIDTH               4
> > +#define  CSR_RVACFG_RDVA             (_ULCAST_(0xf))
> > +
> > +/* Config CSR registers */
> > +#define LOONGARCH_CSR_CPUID          0x20    /* CPU core id */
> > +#define  CSR_CPUID_COREID_WIDTH              9
> > +#define  CSR_CPUID_COREID            _ULCAST_(0x1ff)
> > +
> > +#define LOONGARCH_CSR_PRCFG1         0x21    /* Config1 */
> > +#define  CSR_CONF1_VSMAX_SHIFT               12
> > +#define  CSR_CONF1_VSMAX_WIDTH               3
> > +#define  CSR_CONF1_VSMAX             (_ULCAST_(7) << CSR_CONF1_VSMAX_SHIFT)
> > +#define  CSR_CONF1_TMRBITS_SHIFT     4
> > +#define  CSR_CONF1_TMRBITS_WIDTH     8
> > +#define  CSR_CONF1_TMRBITS           (_ULCAST_(0xff) << CSR_CONF1_TMRBITS_SHIFT)
> > +#define  CSR_CONF1_KSNUM_WIDTH               4
> > +#define  CSR_CONF1_KSNUM             _ULCAST_(0xf)
> > +
> > +#define LOONGARCH_CSR_PRCFG2         0x22    /* Config2 */
> > +#define  CSR_CONF2_PGMASK_SUPP               0x3ffff000
> > +
> > +#define LOONGARCH_CSR_PRCFG3         0x23    /* Config3 */
> > +#define  CSR_CONF3_STLBIDX_SHIFT     20
> > +#define  CSR_CONF3_STLBIDX_WIDTH     6
> > +#define  CSR_CONF3_STLBIDX           (_ULCAST_(0x3f) << CSR_CONF3_STLBIDX_SHIFT)
> > +#define  CSR_CONF3_STLBWAYS_SHIFT    12
> > +#define  CSR_CONF3_STLBWAYS_WIDTH    8
> > +#define  CSR_CONF3_STLBWAYS          (_ULCAST_(0xff) << CSR_CONF3_STLBWAYS_SHIFT)
> > +#define  CSR_CONF3_MTLBSIZE_SHIFT    4
> > +#define  CSR_CONF3_MTLBSIZE_WIDTH    8
> > +#define  CSR_CONF3_MTLBSIZE          (_ULCAST_(0xff) << CSR_CONF3_MTLBSIZE_SHIFT)
> > +#define  CSR_CONF3_TLBTYPE_SHIFT     0
> > +#define  CSR_CONF3_TLBTYPE_WIDTH     4
> > +#define  CSR_CONF3_TLBTYPE           (_ULCAST_(0xf) << CSR_CONF3_TLBTYPE_SHIFT)
> > +
> > +/* Kscratch registers */
> > +#define LOONGARCH_CSR_KS0            0x30
> > +#define LOONGARCH_CSR_KS1            0x31
> > +#define LOONGARCH_CSR_KS2            0x32
> > +#define LOONGARCH_CSR_KS3            0x33
> > +#define LOONGARCH_CSR_KS4            0x34
> > +#define LOONGARCH_CSR_KS5            0x35
> > +#define LOONGARCH_CSR_KS6            0x36
> > +#define LOONGARCH_CSR_KS7            0x37
> > +#define LOONGARCH_CSR_KS8            0x38
> > +
> > +/* KS0, KS1 and KS2 are statically allocated to exception handling */
> > +#define EXCEPTION_KS0                        LOONGARCH_CSR_KS0
> > +#define EXCEPTION_KS1                        LOONGARCH_CSR_KS1
> > +#define EXCEPTION_KS2                        LOONGARCH_CSR_KS2
> > +#define EXC_KSCRATCH_MASK            (1 << 0 | 1 << 1 | 1 << 2)
> > +
> > +/* KS3 is statically allocated to the percpu-data base */
> > +#define PERCPU_BASE_KS                       LOONGARCH_CSR_KS3
> > +#define PERCPU_KSCRATCH_MASK         (1 << 3)
> > +
> > +/* KS4 and KS5 are statically allocated to KVM */
> > +#define KVM_VCPU_KS                  LOONGARCH_CSR_KS4
> > +#define KVM_TEMP_KS                  LOONGARCH_CSR_KS5
> > +#define KVM_KSCRATCH_MASK            (1 << 4 | 1 << 5)
> > +
> > +/* Timer registers */
> > +#define LOONGARCH_CSR_TMID           0x40    /* Timer ID */
> > +
> > +#define LOONGARCH_CSR_TCFG           0x41    /* Timer config */
> > +#define  CSR_TCFG_VAL_SHIFT          2
> > +#define  CSR_TCFG_VAL_WIDTH          48
> > +#define  CSR_TCFG_VAL                        (_ULCAST_(0x3fffffffffff) << CSR_TCFG_VAL_SHIFT)
> > +#define  CSR_TCFG_PERIOD_SHIFT               1
> > +#define  CSR_TCFG_PERIOD             (_ULCAST_(0x1) << CSR_TCFG_PERIOD_SHIFT)
> > +#define  CSR_TCFG_EN                 (_ULCAST_(0x1))
> > +
> > +#define LOONGARCH_CSR_TVAL           0x42    /* Timer value */
> > +
> > +#define LOONGARCH_CSR_CNTC           0x43    /* Timer offset */
> > +
> > +#define LOONGARCH_CSR_TINTCLR                0x44    /* Timer interrupt clear */
> > +#define  CSR_TINTCLR_TI_SHIFT                0
> > +#define  CSR_TINTCLR_TI                      (1 << CSR_TINTCLR_TI_SHIFT)
> > +
> > +/* Guest registers */
> > +#define LOONGARCH_CSR_GSTAT          0x50    /* Guest status */
> > +#define  CSR_GSTAT_GID_SHIFT         16
> > +#define  CSR_GSTAT_GID_WIDTH         8
> > +#define  CSR_GSTAT_GID                       (_ULCAST_(0xff) << CSR_GSTAT_GID_SHIFT)
> > +#define  CSR_GSTAT_GIDBIT_SHIFT              4
> > +#define  CSR_GSTAT_GIDBIT_WIDTH              6
> > +#define  CSR_GSTAT_GIDBIT            (_ULCAST_(0x3f) << CSR_GSTAT_GIDBIT_SHIFT)
> > +#define  CSR_GSTAT_PVM_SHIFT         1
> > +#define  CSR_GSTAT_PVM                       (_ULCAST_(0x1) << CSR_GSTAT_PVM_SHIFT)
> > +#define  CSR_GSTAT_VM_SHIFT          0
> > +#define  CSR_GSTAT_VM                        (_ULCAST_(0x1) << CSR_GSTAT_VM_SHIFT)
> > +
> > +#define LOONGARCH_CSR_GCFG           0x51    /* Guest config */
> > +#define  CSR_GCFG_GPERF_SHIFT                24
> > +#define  CSR_GCFG_GPERF_WIDTH                3
> > +#define  CSR_GCFG_GPERF                      (_ULCAST_(0x7) << CSR_GCFG_GPERF_SHIFT)
> > +#define  CSR_GCFG_GCI_SHIFT          20
> > +#define  CSR_GCFG_GCI_WIDTH          2
> > +#define  CSR_GCFG_GCI                        (_ULCAST_(0x3) << CSR_GCFG_GCI_SHIFT)
> > +#define  CSR_GCFG_GCI_ALL            (_ULCAST_(0x0) << CSR_GCFG_GCI_SHIFT)
> > +#define  CSR_GCFG_GCI_HIT            (_ULCAST_(0x1) << CSR_GCFG_GCI_SHIFT)
> > +#define  CSR_GCFG_GCI_SECURE         (_ULCAST_(0x2) << CSR_GCFG_GCI_SHIFT)
> > +#define  CSR_GCFG_GCIP_SHIFT         16
> > +#define  CSR_GCFG_GCIP                       (_ULCAST_(0xf) << CSR_GCFG_GCIP_SHIFT)
> > +#define  CSR_GCFG_GCIP_ALL           (_ULCAST_(0x1) << CSR_GCFG_GCIP_SHIFT)
> > +#define  CSR_GCFG_GCIP_HIT           (_ULCAST_(0x1) << (CSR_GCFG_GCIP_SHIFT + 1))
> > +#define  CSR_GCFG_GCIP_SECURE                (_ULCAST_(0x1) << (CSR_GCFG_GCIP_SHIFT + 2))
> > +#define  CSR_GCFG_TORU_SHIFT         15
> > +#define  CSR_GCFG_TORU                       (_ULCAST_(0x1) << CSR_GCFG_TORU_SHIFT)
> > +#define  CSR_GCFG_TORUP_SHIFT                14
> > +#define  CSR_GCFG_TORUP                      (_ULCAST_(0x1) << CSR_GCFG_TORUP_SHIFT)
> > +#define  CSR_GCFG_TOP_SHIFT          13
> > +#define  CSR_GCFG_TOP                        (_ULCAST_(0x1) << CSR_GCFG_TOP_SHIFT)
> > +#define  CSR_GCFG_TOPP_SHIFT         12
> > +#define  CSR_GCFG_TOPP                       (_ULCAST_(0x1) << CSR_GCFG_TOPP_SHIFT)
> > +#define  CSR_GCFG_TOE_SHIFT          11
> > +#define  CSR_GCFG_TOE                        (_ULCAST_(0x1) << CSR_GCFG_TOE_SHIFT)
> > +#define  CSR_GCFG_TOEP_SHIFT         10
> > +#define  CSR_GCFG_TOEP                       (_ULCAST_(0x1) << CSR_GCFG_TOEP_SHIFT)
> > +#define  CSR_GCFG_TIT_SHIFT          9
> > +#define  CSR_GCFG_TIT                        (_ULCAST_(0x1) << CSR_GCFG_TIT_SHIFT)
> > +#define  CSR_GCFG_TITP_SHIFT         8
> > +#define  CSR_GCFG_TITP                       (_ULCAST_(0x1) << CSR_GCFG_TITP_SHIFT)
> > +#define  CSR_GCFG_SIT_SHIFT          7
> > +#define  CSR_GCFG_SIT                        (_ULCAST_(0x1) << CSR_GCFG_SIT_SHIFT)
> > +#define  CSR_GCFG_SITP_SHIFT         6
> > +#define  CSR_GCFG_SITP                       (_ULCAST_(0x1) << CSR_GCFG_SITP_SHIFT)
> > +#define  CSR_GCFG_MATC_SHIFT         4
> > +#define  CSR_GCFG_MATC_WIDTH         2
> > +#define  CSR_GCFG_MATC_MASK          (_ULCAST_(0x3) << CSR_GCFG_MATC_SHIFT)
> > +#define  CSR_GCFG_MATC_GUEST         (_ULCAST_(0x0) << CSR_GCFG_MATC_SHIFT)
> > +#define  CSR_GCFG_MATC_ROOT          (_ULCAST_(0x1) << CSR_GCFG_MATC_SHIFT)
> > +#define  CSR_GCFG_MATC_NEST          (_ULCAST_(0x2) << CSR_GCFG_MATC_SHIFT)
> > +
> > +#define LOONGARCH_CSR_GINTC          0x52    /* Guest interrupt control */
> > +#define  CSR_GINTC_HC_SHIFT          16
> > +#define  CSR_GINTC_HC_WIDTH          8
> > +#define  CSR_GINTC_HC                        (_ULCAST_(0xff) << CSR_GINTC_HC_SHIFT)
> > +#define  CSR_GINTC_PIP_SHIFT         8
> > +#define  CSR_GINTC_PIP_WIDTH         8
> > +#define  CSR_GINTC_PIP                       (_ULCAST_(0xff) << CSR_GINTC_PIP_SHIFT)
> > +#define  CSR_GINTC_VIP_SHIFT         0
> > +#define  CSR_GINTC_VIP_WIDTH         8
> > +#define  CSR_GINTC_VIP                       (_ULCAST_(0xff))
> > +
> > +#define LOONGARCH_CSR_GCNTC          0x53    /* Guest timer offset */
> > +
> > +/* LLBCTL register */
> > +#define LOONGARCH_CSR_LLBCTL         0x60    /* LLBit control */
> > +#define  CSR_LLBCTL_ROLLB_SHIFT              0
> > +#define  CSR_LLBCTL_ROLLB            (_ULCAST_(1) << CSR_LLBCTL_ROLLB_SHIFT)
> > +#define  CSR_LLBCTL_WCLLB_SHIFT              1
> > +#define  CSR_LLBCTL_WCLLB            (_ULCAST_(1) << CSR_LLBCTL_WCLLB_SHIFT)
> > +#define  CSR_LLBCTL_KLO_SHIFT                2
> > +#define  CSR_LLBCTL_KLO                      (_ULCAST_(1) << CSR_LLBCTL_KLO_SHIFT)
> > +
> > +/* Implementation dependent */
> > +#define LOONGARCH_CSR_IMPCTL1                0x80    /* Loongson config1 */
> > +#define  CSR_MISPEC_SHIFT            20
> > +#define  CSR_MISPEC_WIDTH            8
> > +#define  CSR_MISPEC                  (_ULCAST_(0xff) << CSR_MISPEC_SHIFT)
> > +#define  CSR_SSEN_SHIFT                      18
> > +#define  CSR_SSEN                    (_ULCAST_(1) << CSR_SSEN_SHIFT)
> > +#define  CSR_SCRAND_SHIFT            17
> > +#define  CSR_SCRAND                  (_ULCAST_(1) << CSR_SCRAND_SHIFT)
> > +#define  CSR_LLEXCL_SHIFT            16
> > +#define  CSR_LLEXCL                  (_ULCAST_(1) << CSR_LLEXCL_SHIFT)
> > +#define  CSR_DISVC_SHIFT             15
> > +#define  CSR_DISVC                   (_ULCAST_(1) << CSR_DISVC_SHIFT)
> > +#define  CSR_VCLRU_SHIFT             14
> > +#define  CSR_VCLRU                   (_ULCAST_(1) << CSR_VCLRU_SHIFT)
> > +#define  CSR_DCLRU_SHIFT             13
> > +#define  CSR_DCLRU                   (_ULCAST_(1) << CSR_DCLRU_SHIFT)
> > +#define  CSR_FASTLDQ_SHIFT           12
> > +#define  CSR_FASTLDQ                 (_ULCAST_(1) << CSR_FASTLDQ_SHIFT)
> > +#define  CSR_USERCAC_SHIFT           11
> > +#define  CSR_USERCAC                 (_ULCAST_(1) << CSR_USERCAC_SHIFT)
> > +#define  CSR_ANTI_MISPEC_SHIFT               10
> > +#define  CSR_ANTI_MISPEC             (_ULCAST_(1) << CSR_ANTI_MISPEC_SHIFT)
> > +#define  CSR_AUTO_FLUSHSFB_SHIFT     9
> > +#define  CSR_AUTO_FLUSHSFB           (_ULCAST_(1) << CSR_AUTO_FLUSHSFB_SHIFT)
> > +#define  CSR_STFILL_SHIFT            8
> > +#define  CSR_STFILL                  (_ULCAST_(1) << CSR_STFILL_SHIFT)
> > +#define  CSR_LIFEP_SHIFT             7
> > +#define  CSR_LIFEP                   (_ULCAST_(1) << CSR_LIFEP_SHIFT)
> > +#define  CSR_LLSYNC_SHIFT            6
> > +#define  CSR_LLSYNC                  (_ULCAST_(1) << CSR_LLSYNC_SHIFT)
> > +#define  CSR_BRBTDIS_SHIFT           5
> > +#define  CSR_BRBTDIS                 (_ULCAST_(1) << CSR_BRBTDIS_SHIFT)
> > +#define  CSR_RASDIS_SHIFT            4
> > +#define  CSR_RASDIS                  (_ULCAST_(1) << CSR_RASDIS_SHIFT)
> > +#define  CSR_STPRE_SHIFT             2
> > +#define  CSR_STPRE_WIDTH             2
> > +#define  CSR_STPRE                   (_ULCAST_(3) << CSR_STPRE_SHIFT)
> > +#define  CSR_INSTPRE_SHIFT           1
> > +#define  CSR_INSTPRE                 (_ULCAST_(1) << CSR_INSTPRE_SHIFT)
> > +#define  CSR_DATAPRE_SHIFT           0
> > +#define  CSR_DATAPRE                 (_ULCAST_(1) << CSR_DATAPRE_SHIFT)
> > +
> > +#define LOONGARCH_CSR_IMPCTL2                0x81    /* Loongson config2 */
> > +#define  CSR_FLUSH_MTLB_SHIFT                0
> > +#define  CSR_FLUSH_MTLB                      (_ULCAST_(1) << CSR_FLUSH_MTLB_SHIFT)
> > +#define  CSR_FLUSH_STLB_SHIFT                1
> > +#define  CSR_FLUSH_STLB                      (_ULCAST_(1) << CSR_FLUSH_STLB_SHIFT)
> > +#define  CSR_FLUSH_DTLB_SHIFT                2
> > +#define  CSR_FLUSH_DTLB                      (_ULCAST_(1) << CSR_FLUSH_DTLB_SHIFT)
> > +#define  CSR_FLUSH_ITLB_SHIFT                3
> > +#define  CSR_FLUSH_ITLB                      (_ULCAST_(1) << CSR_FLUSH_ITLB_SHIFT)
> > +#define  CSR_FLUSH_BTAC_SHIFT                4
> > +#define  CSR_FLUSH_BTAC                      (_ULCAST_(1) << CSR_FLUSH_BTAC_SHIFT)
> > +
> > +#define LOONGARCH_CSR_GNMI           0x82
> > +
> > +/* TLB Refill registers */
> > +#define LOONGARCH_CSR_TLBRENTRY              0x88    /* TLB refill exception entry */
> > +#define LOONGARCH_CSR_TLBRBADV               0x89    /* TLB refill badvaddr */
> > +#define LOONGARCH_CSR_TLBRERA                0x8a    /* TLB refill ERA */
> > +#define LOONGARCH_CSR_TLBRSAVE               0x8b    /* KScratch for TLB refill exception */
> > +#define LOONGARCH_CSR_TLBRELO0               0x8c    /* TLB refill entrylo0 */
> > +#define LOONGARCH_CSR_TLBRELO1               0x8d    /* TLB refill entrylo1 */
> > +#define LOONGARCH_CSR_TLBREHI                0x8e    /* TLB refill entryhi */
> > +#define  CSR_TLBREHI_PS_SHIFT                0
> > +#define  CSR_TLBREHI_PS                      (_ULCAST_(0x3f) << CSR_TLBREHI_PS_SHIFT)
> > +#define LOONGARCH_CSR_TLBRPRMD               0x8f    /* TLB refill mode info */
> > +
> > +/* Machine Error registers */
> > +#define LOONGARCH_CSR_MERRCTL                0x90    /* MERRCTL */
> > +#define LOONGARCH_CSR_MERRINFO1              0x91    /* MError info1 */
> > +#define LOONGARCH_CSR_MERRINFO2              0x92    /* MError info2 */
> > +#define LOONGARCH_CSR_MERRENTRY              0x93    /* MError exception entry */
> > +#define LOONGARCH_CSR_MERRERA                0x94    /* MError exception ERA */
> > +#define LOONGARCH_CSR_MERRSAVE               0x95    /* KScratch for machine error exception */
> > +
> > +#define LOONGARCH_CSR_CTAG           0x98    /* TagLo + TagHi */
> > +
> > +#define LOONGARCH_CSR_PRID           0xc0
> > +
> > +/* Shadow MCSR : 0xc0 ~ 0xff */
> > +#define LOONGARCH_CSR_MCSR0          0xc0    /* CPUCFG0 and CPUCFG1 */
> > +#define  MCSR0_INT_IMPL_SHIFT                58
> > +#define  MCSR0_INT_IMPL                      0
> > +#define  MCSR0_IOCSR_BRD_SHIFT               57
> > +#define  MCSR0_IOCSR_BRD             (_ULCAST_(1) << MCSR0_IOCSR_BRD_SHIFT)
> > +#define  MCSR0_HUGEPG_SHIFT          56
> > +#define  MCSR0_HUGEPG                        (_ULCAST_(1) << MCSR0_HUGEPG_SHIFT)
> > +#define  MCSR0_RPLMTLB_SHIFT         55
> > +#define  MCSR0_RPLMTLB                       (_ULCAST_(1) << MCSR0_RPLMTLB_SHIFT)
> > +#define  MCSR0_EP_SHIFT                      54
> > +#define  MCSR0_EP                    (_ULCAST_(1) << MCSR0_EP_SHIFT)
> > +#define  MCSR0_RI_SHIFT                      53
> > +#define  MCSR0_RI                    (_ULCAST_(1) << MCSR0_RI_SHIFT)
> > +#define  MCSR0_UAL_SHIFT             52
> > +#define  MCSR0_UAL                   (_ULCAST_(1) << MCSR0_UAL_SHIFT)
> > +#define  MCSR0_VABIT_SHIFT           44
> > +#define  MCSR0_VABIT_WIDTH           8
> > +#define  MCSR0_VABIT                 (_ULCAST_(0xff) << MCSR0_VABIT_SHIFT)
> > +#define  VABIT_DEFAULT                       0x2f
> > +#define  MCSR0_PABIT_SHIFT           36
> > +#define  MCSR0_PABIT_WIDTH           8
> > +#define  MCSR0_PABIT                 (_ULCAST_(0xff) << MCSR0_PABIT_SHIFT)
> > +#define  PABIT_DEFAULT                       0x2f
> > +#define  MCSR0_IOCSR_SHIFT           35
> > +#define  MCSR0_IOCSR                 (_ULCAST_(1) << MCSR0_IOCSR_SHIFT)
> > +#define  MCSR0_PAGING_SHIFT          34
> > +#define  MCSR0_PAGING                        (_ULCAST_(1) << MCSR0_PAGING_SHIFT)
> > +#define  MCSR0_GR64_SHIFT            33
> > +#define  MCSR0_GR64                  (_ULCAST_(1) << MCSR0_GR64_SHIFT)
> > +#define  GR64_DEFAULT                        1
> > +#define  MCSR0_GR32_SHIFT            32
> > +#define  MCSR0_GR32                  (_ULCAST_(1) << MCSR0_GR32_SHIFT)
> > +#define  GR32_DEFAULT                        0
> > +#define  MCSR0_PRID_WIDTH            32
> > +#define  MCSR0_PRID                  0x14C010
> > +
> > +#define LOONGARCH_CSR_MCSR1          0xc1    /* CPUCFG2 and CPUCFG3 */
> > +#define  MCSR1_HPFOLD_SHIFT          43
> > +#define  MCSR1_HPFOLD                        (_ULCAST_(1) << MCSR1_HPFOLD_SHIFT)
> > +#define  MCSR1_SPW_LVL_SHIFT         40
> > +#define  MCSR1_SPW_LVL_WIDTH         3
> > +#define  MCSR1_SPW_LVL                       (_ULCAST_(7) << MCSR1_SPW_LVL_SHIFT)
> > +#define  MCSR1_ICACHET_SHIFT         39
> > +#define  MCSR1_ICACHET                       (_ULCAST_(1) << MCSR1_ICACHET_SHIFT)
> > +#define  MCSR1_ITLBT_SHIFT           38
> > +#define  MCSR1_ITLBT                 (_ULCAST_(1) << MCSR1_ITLBT_SHIFT)
> > +#define  MCSR1_LLDBAR_SHIFT          37
> > +#define  MCSR1_LLDBAR                        (_ULCAST_(1) << MCSR1_LLDBAR_SHIFT)
> > +#define  MCSR1_SCDLY_SHIFT           36
> > +#define  MCSR1_SCDLY                 (_ULCAST_(1) << MCSR1_SCDLY_SHIFT)
> > +#define  MCSR1_LLEXC_SHIFT           35
> > +#define  MCSR1_LLEXC                 (_ULCAST_(1) << MCSR1_LLEXC_SHIFT)
> > +#define  MCSR1_UCACC_SHIFT           34
> > +#define  MCSR1_UCACC                 (_ULCAST_(1) << MCSR1_UCACC_SHIFT)
> > +#define  MCSR1_SFB_SHIFT             33
> > +#define  MCSR1_SFB                   (_ULCAST_(1) << MCSR1_SFB_SHIFT)
> > +#define  MCSR1_CCDMA_SHIFT           32
> > +#define  MCSR1_CCDMA                 (_ULCAST_(1) << MCSR1_CCDMA_SHIFT)
> > +#define  MCSR1_LAMO_SHIFT            22
> > +#define  MCSR1_LAMO                  (_ULCAST_(1) << MCSR1_LAMO_SHIFT)
> > +#define  MCSR1_LSPW_SHIFT            21
> > +#define  MCSR1_LSPW                  (_ULCAST_(1) << MCSR1_LSPW_SHIFT)
> > +#define  MCSR1_MIPSBT_SHIFT          20
> > +#define  MCSR1_MIPSBT                        (_ULCAST_(1) << MCSR1_MIPSBT_SHIFT)
> > +#define  MCSR1_ARMBT_SHIFT           19
> > +#define  MCSR1_ARMBT                 (_ULCAST_(1) << MCSR1_ARMBT_SHIFT)
> > +#define  MCSR1_X86BT_SHIFT           18
> > +#define  MCSR1_X86BT                 (_ULCAST_(1) << MCSR1_X86BT_SHIFT)
> > +#define  MCSR1_LLFTPVERS_SHIFT               15
> > +#define  MCSR1_LLFTPVERS_WIDTH               3
> > +#define  MCSR1_LLFTPVERS             (_ULCAST_(7) << MCSR1_LLFTPVERS_SHIFT)
> > +#define  MCSR1_LLFTP_SHIFT           14
> > +#define  MCSR1_LLFTP                 (_ULCAST_(1) << MCSR1_LLFTP_SHIFT)
> > +#define  MCSR1_VZVERS_SHIFT          11
> > +#define  MCSR1_VZVERS_WIDTH          3
> > +#define  MCSR1_VZVERS                        (_ULCAST_(7) << MCSR1_VZVERS_SHIFT)
> > +#define  MCSR1_VZ_SHIFT                      10
> > +#define  MCSR1_VZ                    (_ULCAST_(1) << MCSR1_VZ_SHIFT)
> > +#define  MCSR1_CRYPTO_SHIFT          9
> > +#define  MCSR1_CRYPTO                        (_ULCAST_(1) << MCSR1_CRYPTO_SHIFT)
> > +#define  MCSR1_COMPLEX_SHIFT         8
> > +#define  MCSR1_COMPLEX                       (_ULCAST_(1) << MCSR1_COMPLEX_SHIFT)
> > +#define  MCSR1_LASX_SHIFT            7
> > +#define  MCSR1_LASX                  (_ULCAST_(1) << MCSR1_LASX_SHIFT)
> > +#define  MCSR1_LSX_SHIFT             6
> > +#define  MCSR1_LSX                   (_ULCAST_(1) << MCSR1_LSX_SHIFT)
> > +#define  MCSR1_FPVERS_SHIFT          3
> > +#define  MCSR1_FPVERS_WIDTH          3
> > +#define  MCSR1_FPVERS                        (_ULCAST_(7) << MCSR1_FPVERS_SHIFT)
> > +#define  MCSR1_FPDP_SHIFT            2
> > +#define  MCSR1_FPDP                  (_ULCAST_(1) << MCSR1_FPDP_SHIFT)
> > +#define  MCSR1_FPSP_SHIFT            1
> > +#define  MCSR1_FPSP                  (_ULCAST_(1) << MCSR1_FPSP_SHIFT)
> > +#define  MCSR1_FP_SHIFT                      0
> > +#define  MCSR1_FP                    (_ULCAST_(1) << MCSR1_FP_SHIFT)
> > +
> > +#define LOONGARCH_CSR_MCSR2          0xc2    /* CPUCFG4 and CPUCFG5 */
> > +#define  MCSR2_CCDIV_SHIFT           48
> > +#define  MCSR2_CCDIV_WIDTH           16
> > +#define  MCSR2_CCDIV                 (_ULCAST_(0xffff) << MCSR2_CCDIV_SHIFT)
> > +#define  MCSR2_CCMUL_SHIFT           32
> > +#define  MCSR2_CCMUL_WIDTH           16
> > +#define  MCSR2_CCMUL                 (_ULCAST_(0xffff) << MCSR2_CCMUL_SHIFT)
> > +#define  MCSR2_CCFREQ_WIDTH          32
> > +#define  MCSR2_CCFREQ                        (_ULCAST_(0xffffffff))
> > +#define  CCFREQ_DEFAULT                      0x5f5e100       /* 100MHz */
> > +
> > +#define LOONGARCH_CSR_MCSR3          0xc3    /* CPUCFG6 */
> > +#define  MCSR3_UPM_SHIFT             14
> > +#define  MCSR3_UPM                   (_ULCAST_(1) << MCSR3_UPM_SHIFT)
> > +#define  MCSR3_PMBITS_SHIFT          8
> > +#define  MCSR3_PMBITS_WIDTH          6
> > +#define  MCSR3_PMBITS                        (_ULCAST_(0x3f) << MCSR3_PMBITS_SHIFT)
> > +#define  PMBITS_DEFAULT                      0x40
> > +#define  MCSR3_PMNUM_SHIFT           4
> > +#define  MCSR3_PMNUM_WIDTH           4
> > +#define  MCSR3_PMNUM                 (_ULCAST_(0xf) << MCSR3_PMNUM_SHIFT)
> > +#define  MCSR3_PAMVER_SHIFT          1
> > +#define  MCSR3_PAMVER_WIDTH          3
> > +#define  MCSR3_PAMVER                        (_ULCAST_(0x7) << MCSR3_PAMVER_SHIFT)
> > +#define  MCSR3_PMP_SHIFT             0
> > +#define  MCSR3_PMP                   (_ULCAST_(1) << MCSR3_PMP_SHIFT)
> > +
> > +#define LOONGARCH_CSR_MCSR8          0xc8    /* CPUCFG16 and CPUCFG17 */
> > +#define  MCSR8_L1I_SIZE_SHIFT                56
> > +#define  MCSR8_L1I_SIZE_WIDTH                7
> > +#define  MCSR8_L1I_SIZE                      (_ULCAST_(0x7f) << MCSR8_L1I_SIZE_SHIFT)
> > +#define  MCSR8_L1I_IDX_SHIFT         48
> > +#define  MCSR8_L1I_IDX_WIDTH         8
> > +#define  MCSR8_L1I_IDX                       (_ULCAST_(0xff) << MCSR8_L1I_IDX_SHIFT)
> > +#define  MCSR8_L1I_WAY_SHIFT         32
> > +#define  MCSR8_L1I_WAY_WIDTH         16
> > +#define  MCSR8_L1I_WAY                       (_ULCAST_(0xffff) << MCSR8_L1I_WAY_SHIFT)
> > +#define  MCSR8_L3DINCL_SHIFT         16
> > +#define  MCSR8_L3DINCL                       (_ULCAST_(1) << MCSR8_L3DINCL_SHIFT)
> > +#define  MCSR8_L3DPRIV_SHIFT         15
> > +#define  MCSR8_L3DPRIV                       (_ULCAST_(1) << MCSR8_L3DPRIV_SHIFT)
> > +#define  MCSR8_L3DPRE_SHIFT          14
> > +#define  MCSR8_L3DPRE                        (_ULCAST_(1) << MCSR8_L3DPRE_SHIFT)
> > +#define  MCSR8_L3IUINCL_SHIFT                13
> > +#define  MCSR8_L3IUINCL                      (_ULCAST_(1) << MCSR8_L3IUINCL_SHIFT)
> > +#define  MCSR8_L3IUPRIV_SHIFT                12
> > +#define  MCSR8_L3IUPRIV                      (_ULCAST_(1) << MCSR8_L3IUPRIV_SHIFT)
> > +#define  MCSR8_L3IUUNIFY_SHIFT               11
> > +#define  MCSR8_L3IUUNIFY             (_ULCAST_(1) << MCSR8_L3IUUNIFY_SHIFT)
> > +#define  MCSR8_L3IUPRE_SHIFT         10
> > +#define  MCSR8_L3IUPRE                       (_ULCAST_(1) << MCSR8_L3IUPRE_SHIFT)
> > +#define  MCSR8_L2DINCL_SHIFT         9
> > +#define  MCSR8_L2DINCL                       (_ULCAST_(1) << MCSR8_L2DINCL_SHIFT)
> > +#define  MCSR8_L2DPRIV_SHIFT         8
> > +#define  MCSR8_L2DPRIV                       (_ULCAST_(1) << MCSR8_L2DPRIV_SHIFT)
> > +#define  MCSR8_L2DPRE_SHIFT          7
> > +#define  MCSR8_L2DPRE                        (_ULCAST_(1) << MCSR8_L2DPRE_SHIFT)
> > +#define  MCSR8_L2IUINCL_SHIFT                6
> > +#define  MCSR8_L2IUINCL                      (_ULCAST_(1) << MCSR8_L2IUINCL_SHIFT)
> > +#define  MCSR8_L2IUPRIV_SHIFT                5
> > +#define  MCSR8_L2IUPRIV                      (_ULCAST_(1) << MCSR8_L2IUPRIV_SHIFT)
> > +#define  MCSR8_L2IUUNIFY_SHIFT               4
> > +#define  MCSR8_L2IUUNIFY             (_ULCAST_(1) << MCSR8_L2IUUNIFY_SHIFT)
> > +#define  MCSR8_L2IUPRE_SHIFT         3
> > +#define  MCSR8_L2IUPRE                       (_ULCAST_(1) << MCSR8_L2IUPRE_SHIFT)
> > +#define  MCSR8_L1DPRE_SHIFT          2
> > +#define  MCSR8_L1DPRE                        (_ULCAST_(1) << MCSR8_L1DPRE_SHIFT)
> > +#define  MCSR8_L1IUUNIFY_SHIFT               1
> > +#define  MCSR8_L1IUUNIFY             (_ULCAST_(1) << MCSR8_L1IUUNIFY_SHIFT)
> > +#define  MCSR8_L1IUPRE_SHIFT         0
> > +#define  MCSR8_L1IUPRE                       (_ULCAST_(1) << MCSR8_L1IUPRE_SHIFT)
> > +
> > +#define LOONGARCH_CSR_MCSR9          0xc9    /* CPUCFG18 and CPUCFG19 */
> > +#define  MCSR9_L2U_SIZE_SHIFT                56
> > +#define  MCSR9_L2U_SIZE_WIDTH                7
> > +#define  MCSR9_L2U_SIZE                      (_ULCAST_(0x7f) << MCSR9_L2U_SIZE_SHIFT)
> > +#define  MCSR9_L2U_IDX_SHIFT         48
> > +#define  MCSR9_L2U_IDX_WIDTH         8
> > +#define  MCSR9_L2U_IDX                       (_ULCAST_(0xff) << MCSR9_L2U_IDX_SHIFT)
> > +#define  MCSR9_L2U_WAY_SHIFT         32
> > +#define  MCSR9_L2U_WAY_WIDTH         16
> > +#define  MCSR9_L2U_WAY                       (_ULCAST_(0xffff) << MCSR9_L2U_WAY_SHIFT)
> > +#define  MCSR9_L1D_SIZE_SHIFT                24
> > +#define  MCSR9_L1D_SIZE_WIDTH                7
> > +#define  MCSR9_L1D_SIZE                      (_ULCAST_(0x7f) << MCSR9_L1D_SIZE_SHIFT)
> > +#define  MCSR9_L1D_IDX_SHIFT         16
> > +#define  MCSR9_L1D_IDX_WIDTH         8
> > +#define  MCSR9_L1D_IDX                       (_ULCAST_(0xff) << MCSR9_L1D_IDX_SHIFT)
> > +#define  MCSR9_L1D_WAY_SHIFT         0
> > +#define  MCSR9_L1D_WAY_WIDTH         16
> > +#define  MCSR9_L1D_WAY                       (_ULCAST_(0xffff) << MCSR9_L1D_WAY_SHIFT)
> > +
> > +#define LOONGARCH_CSR_MCSR10         0xca    /* CPUCFG20 */
> > +#define  MCSR10_L3U_SIZE_SHIFT               24
> > +#define  MCSR10_L3U_SIZE_WIDTH               7
> > +#define  MCSR10_L3U_SIZE             (_ULCAST_(0x7f) << MCSR10_L3U_SIZE_SHIFT)
> > +#define  MCSR10_L3U_IDX_SHIFT                16
> > +#define  MCSR10_L3U_IDX_WIDTH                8
> > +#define  MCSR10_L3U_IDX                      (_ULCAST_(0xff) << MCSR10_L3U_IDX_SHIFT)
> > +#define  MCSR10_L3U_WAY_SHIFT                0
> > +#define  MCSR10_L3U_WAY_WIDTH                16
> > +#define  MCSR10_L3U_WAY                      (_ULCAST_(0xffff) << MCSR10_L3U_WAY_SHIFT)
> > +
> > +#define LOONGARCH_CSR_MCSR24         0xf0    /* CPUCFG48 */
> > +#define  MCSR24_RAMCG_SHIFT          3
> > +#define  MCSR24_RAMCG                        (_ULCAST_(1) << MCSR24_RAMCG_SHIFT)
> > +#define  MCSR24_VFPUCG_SHIFT         2
> > +#define  MCSR24_VFPUCG                       (_ULCAST_(1) << MCSR24_VFPUCG_SHIFT)
> > +#define  MCSR24_NAPEN_SHIFT          1
> > +#define  MCSR24_NAPEN                        (_ULCAST_(1) << MCSR24_NAPEN_SHIFT)
> > +#define  MCSR24_MCSRLOCK_SHIFT               0
> > +#define  MCSR24_MCSRLOCK             (_ULCAST_(1) << MCSR24_MCSRLOCK_SHIFT)
> > +
> > +/* Uncached accelerate window registers */
> > +#define LOONGARCH_CSR_UCAWIN         0x100
> > +#define LOONGARCH_CSR_UCAWIN0_LO     0x102
> > +#define LOONGARCH_CSR_UCAWIN0_HI     0x103
> > +#define LOONGARCH_CSR_UCAWIN1_LO     0x104
> > +#define LOONGARCH_CSR_UCAWIN1_HI     0x105
> > +#define LOONGARCH_CSR_UCAWIN2_LO     0x106
> > +#define LOONGARCH_CSR_UCAWIN2_HI     0x107
> > +#define LOONGARCH_CSR_UCAWIN3_LO     0x108
> > +#define LOONGARCH_CSR_UCAWIN3_HI     0x109
> > +
> > +/* Direct Map window registers */
> > +#define LOONGARCH_CSR_DMWIN0         0x180   /* 64 direct map win0: MEM & IF */
> > +#define LOONGARCH_CSR_DMWIN1         0x181   /* 64 direct map win1: MEM & IF */
> > +#define LOONGARCH_CSR_DMWIN2         0x182   /* 64 direct map win2: MEM */
> > +#define LOONGARCH_CSR_DMWIN3         0x183   /* 64 direct map win3: MEM */
> > +
> > +/* Direct Map window 0/1 */
> > +#define CSR_DMW0_PLV0                _CONST64_(1 << 0)
> > +#define CSR_DMW0_VSEG                _CONST64_(0x8000)
> > +#define CSR_DMW0_BASE                (CSR_DMW0_VSEG << DMW_PABITS)
> > +#define CSR_DMW0_INIT                (CSR_DMW0_BASE | CSR_DMW0_PLV0)
> > +
> > +#define CSR_DMW1_PLV0                _CONST64_(1 << 0)
> > +#define CSR_DMW1_MAT         _CONST64_(1 << 4)
> > +#define CSR_DMW1_VSEG                _CONST64_(0x9000)
> > +#define CSR_DMW1_BASE                (CSR_DMW1_VSEG << DMW_PABITS)
> > +#define CSR_DMW1_INIT                (CSR_DMW1_BASE | CSR_DMW1_MAT | CSR_DMW1_PLV0)
> > +
> > +/* Performance Counter registers */
> > +#define LOONGARCH_CSR_PERFCTRL0              0x200   /* 32 perf event 0 config */
> > +#define LOONGARCH_CSR_PERFCNTR0              0x201   /* 64 perf event 0 count value */
> > +#define LOONGARCH_CSR_PERFCTRL1              0x202   /* 32 perf event 1 config */
> > +#define LOONGARCH_CSR_PERFCNTR1              0x203   /* 64 perf event 1 count value */
> > +#define LOONGARCH_CSR_PERFCTRL2              0x204   /* 32 perf event 2 config */
> > +#define LOONGARCH_CSR_PERFCNTR2              0x205   /* 64 perf event 2 count value */
> > +#define LOONGARCH_CSR_PERFCTRL3              0x206   /* 32 perf event 3 config */
> > +#define LOONGARCH_CSR_PERFCNTR3              0x207   /* 64 perf event 3 count value */
> > +#define  CSR_PERFCTRL_PLV0           (_ULCAST_(1) << 16)
> > +#define  CSR_PERFCTRL_PLV1           (_ULCAST_(1) << 17)
> > +#define  CSR_PERFCTRL_PLV2           (_ULCAST_(1) << 18)
> > +#define  CSR_PERFCTRL_PLV3           (_ULCAST_(1) << 19)
> > +#define  CSR_PERFCTRL_IE             (_ULCAST_(1) << 20)
> > +#define  CSR_PERFCTRL_EVENT          0x3ff
> > +
> > +/* Debug registers */
> > +#define LOONGARCH_CSR_MWPC           0x300   /* data breakpoint config */
> > +#define LOONGARCH_CSR_MWPS           0x301   /* data breakpoint status */
> > +
> > +#define LOONGARCH_CSR_DB0ADDR                0x310   /* data breakpoint 0 address */
> > +#define LOONGARCH_CSR_DB0MASK                0x311   /* data breakpoint 0 mask */
> > +#define LOONGARCH_CSR_DB0CTL         0x312   /* data breakpoint 0 control */
> > +#define LOONGARCH_CSR_DB0ASID                0x313   /* data breakpoint 0 asid */
> > +
> > +#define LOONGARCH_CSR_DB1ADDR                0x318   /* data breakpoint 1 address */
> > +#define LOONGARCH_CSR_DB1MASK                0x319   /* data breakpoint 1 mask */
> > +#define LOONGARCH_CSR_DB1CTL         0x31a   /* data breakpoint 1 control */
> > +#define LOONGARCH_CSR_DB1ASID                0x31b   /* data breakpoint 1 asid */
> > +
> > +#define LOONGARCH_CSR_DB2ADDR                0x320   /* data breakpoint 2 address */
> > +#define LOONGARCH_CSR_DB2MASK                0x321   /* data breakpoint 2 mask */
> > +#define LOONGARCH_CSR_DB2CTL         0x322   /* data breakpoint 2 control */
> > +#define LOONGARCH_CSR_DB2ASID                0x323   /* data breakpoint 2 asid */
> > +
> > +#define LOONGARCH_CSR_DB3ADDR                0x328   /* data breakpoint 3 address */
> > +#define LOONGARCH_CSR_DB3MASK                0x329   /* data breakpoint 3 mask */
> > +#define LOONGARCH_CSR_DB3CTL         0x32a   /* data breakpoint 3 control */
> > +#define LOONGARCH_CSR_DB3ASID                0x32b   /* data breakpoint 3 asid */
> > +
> > +#define LOONGARCH_CSR_DB4ADDR                0x330   /* data breakpoint 4 address */
> > +#define LOONGARCH_CSR_DB4MASK                0x331   /* data breakpoint 4 mask */
> > +#define LOONGARCH_CSR_DB4CTL         0x332   /* data breakpoint 4 control */
> > +#define LOONGARCH_CSR_DB4ASID                0x333   /* data breakpoint 4 asid */
> > +
> > +#define LOONGARCH_CSR_DB5ADDR                0x338   /* data breakpoint 5 address */
> > +#define LOONGARCH_CSR_DB5MASK                0x339   /* data breakpoint 5 mask */
> > +#define LOONGARCH_CSR_DB5CTL         0x33a   /* data breakpoint 5 control */
> > +#define LOONGARCH_CSR_DB5ASID                0x33b   /* data breakpoint 5 asid */
> > +
> > +#define LOONGARCH_CSR_DB6ADDR                0x340   /* data breakpoint 6 address */
> > +#define LOONGARCH_CSR_DB6MASK                0x341   /* data breakpoint 6 mask */
> > +#define LOONGARCH_CSR_DB6CTL         0x342   /* data breakpoint 6 control */
> > +#define LOONGARCH_CSR_DB6ASID                0x343   /* data breakpoint 6 asid */
> > +
> > +#define LOONGARCH_CSR_DB7ADDR                0x348   /* data breakpoint 7 address */
> > +#define LOONGARCH_CSR_DB7MASK                0x349   /* data breakpoint 7 mask */
> > +#define LOONGARCH_CSR_DB7CTL         0x34a   /* data breakpoint 7 control */
> > +#define LOONGARCH_CSR_DB7ASID                0x34b   /* data breakpoint 7 asid */
> > +
> > +#define LOONGARCH_CSR_FWPC           0x380   /* instruction breakpoint config */
> > +#define LOONGARCH_CSR_FWPS           0x381   /* instruction breakpoint status */
> > +
> > +#define LOONGARCH_CSR_IB0ADDR                0x390   /* inst breakpoint 0 address */
> > +#define LOONGARCH_CSR_IB0MASK                0x391   /* inst breakpoint 0 mask */
> > +#define LOONGARCH_CSR_IB0CTL         0x392   /* inst breakpoint 0 control */
> > +#define LOONGARCH_CSR_IB0ASID                0x393   /* inst breakpoint 0 asid */
> > +
> > +#define LOONGARCH_CSR_IB1ADDR                0x398   /* inst breakpoint 1 address */
> > +#define LOONGARCH_CSR_IB1MASK                0x399   /* inst breakpoint 1 mask */
> > +#define LOONGARCH_CSR_IB1CTL         0x39a   /* inst breakpoint 1 control */
> > +#define LOONGARCH_CSR_IB1ASID                0x39b   /* inst breakpoint 1 asid */
> > +
> > +#define LOONGARCH_CSR_IB2ADDR                0x3a0   /* inst breakpoint 2 address */
> > +#define LOONGARCH_CSR_IB2MASK                0x3a1   /* inst breakpoint 2 mask */
> > +#define LOONGARCH_CSR_IB2CTL         0x3a2   /* inst breakpoint 2 control */
> > +#define LOONGARCH_CSR_IB2ASID                0x3a3   /* inst breakpoint 2 asid */
> > +
> > +#define LOONGARCH_CSR_IB3ADDR                0x3a8   /* inst breakpoint 3 address */
> > +#define LOONGARCH_CSR_IB3MASK                0x3a9   /* inst breakpoint 3 mask */
> > +#define LOONGARCH_CSR_IB3CTL         0x3aa   /* inst breakpoint 3 control */
> > +#define LOONGARCH_CSR_IB3ASID                0x3ab   /* inst breakpoint 3 asid */
> > +
> > +#define LOONGARCH_CSR_IB4ADDR                0x3b0   /* inst breakpoint 4 address */
> > +#define LOONGARCH_CSR_IB4MASK                0x3b1   /* inst breakpoint 4 mask */
> > +#define LOONGARCH_CSR_IB4CTL         0x3b2   /* inst breakpoint 4 control */
> > +#define LOONGARCH_CSR_IB4ASID                0x3b3   /* inst breakpoint 4 asid */
> > +
> > +#define LOONGARCH_CSR_IB5ADDR                0x3b8   /* inst breakpoint 5 address */
> > +#define LOONGARCH_CSR_IB5MASK                0x3b9   /* inst breakpoint 5 mask */
> > +#define LOONGARCH_CSR_IB5CTL         0x3ba   /* inst breakpoint 5 control */
> > +#define LOONGARCH_CSR_IB5ASID                0x3bb   /* inst breakpoint 5 asid */
> > +
> > +#define LOONGARCH_CSR_IB6ADDR                0x3c0   /* inst breakpoint 6 address */
> > +#define LOONGARCH_CSR_IB6MASK                0x3c1   /* inst breakpoint 6 mask */
> > +#define LOONGARCH_CSR_IB6CTL         0x3c2   /* inst breakpoint 6 control */
> > +#define LOONGARCH_CSR_IB6ASID                0x3c3   /* inst breakpoint 6 asid */
> > +
> > +#define LOONGARCH_CSR_IB7ADDR                0x3c8   /* inst breakpoint 7 address */
> > +#define LOONGARCH_CSR_IB7MASK                0x3c9   /* inst breakpoint 7 mask */
> > +#define LOONGARCH_CSR_IB7CTL         0x3ca   /* inst breakpoint 7 control */
> > +#define LOONGARCH_CSR_IB7ASID                0x3cb   /* inst breakpoint 7 asid */
> > +
> > +#define LOONGARCH_CSR_DEBUG          0x500   /* debug config */
> > +#define LOONGARCH_CSR_DERA           0x501   /* debug era */
> > +#define LOONGARCH_CSR_DESAVE         0x502   /* debug save */
> > +
> > +/*
> > + * CSR_ECFG IM
> > + */
> > +#define ECFG0_IM             0x00001fff
> > +#define ECFGB_SIP0           0
> > +#define ECFGF_SIP0           (_ULCAST_(1) << ECFGB_SIP0)
> > +#define ECFGB_SIP1           1
> > +#define ECFGF_SIP1           (_ULCAST_(1) << ECFGB_SIP1)
> > +#define ECFGB_IP0            2
> > +#define ECFGF_IP0            (_ULCAST_(1) << ECFGB_IP0)
> > +#define ECFGB_IP1            3
> > +#define ECFGF_IP1            (_ULCAST_(1) << ECFGB_IP1)
> > +#define ECFGB_IP2            4
> > +#define ECFGF_IP2            (_ULCAST_(1) << ECFGB_IP2)
> > +#define ECFGB_IP3            5
> > +#define ECFGF_IP3            (_ULCAST_(1) << ECFGB_IP3)
> > +#define ECFGB_IP4            6
> > +#define ECFGF_IP4            (_ULCAST_(1) << ECFGB_IP4)
> > +#define ECFGB_IP5            7
> > +#define ECFGF_IP5            (_ULCAST_(1) << ECFGB_IP5)
> > +#define ECFGB_IP6            8
> > +#define ECFGF_IP6            (_ULCAST_(1) << ECFGB_IP6)
> > +#define ECFGB_IP7            9
> > +#define ECFGF_IP7            (_ULCAST_(1) << ECFGB_IP7)
> > +#define ECFGB_PMC            10
> > +#define ECFGF_PMC            (_ULCAST_(1) << ECFGB_PMC)
> > +#define ECFGB_TIMER          11
> > +#define ECFGF_TIMER          (_ULCAST_(1) << ECFGB_TIMER)
> > +#define ECFGB_IPI            12
> > +#define ECFGF_IPI            (_ULCAST_(1) << ECFGB_IPI)
> > +#define ECFGF(hwirq)         (_ULCAST_(1) << (hwirq))
> > +
> > +#define ESTATF_IP            0x00001fff
> > +
> > +#define LOONGARCH_IOCSR_FEATURES     0x8
> > +#define  IOCSRF_TEMP                 BIT_ULL(0)
> > +#define  IOCSRF_NODECNT                      BIT_ULL(1)
> > +#define  IOCSRF_MSI                  BIT_ULL(2)
> > +#define  IOCSRF_EXTIOI                       BIT_ULL(3)
> > +#define  IOCSRF_CSRIPI                       BIT_ULL(4)
> > +#define  IOCSRF_FREQCSR                      BIT_ULL(5)
> > +#define  IOCSRF_FREQSCALE            BIT_ULL(6)
> > +#define  IOCSRF_DVFSV1                       BIT_ULL(7)
> > +#define  IOCSRF_EIODECODE            BIT_ULL(9)
> > +#define  IOCSRF_FLATMODE             BIT_ULL(10)
> > +#define  IOCSRF_VM                   BIT_ULL(11)
> > +
> > +#define LOONGARCH_IOCSR_VENDOR               0x10
> > +
> > +#define LOONGARCH_IOCSR_CPUNAME              0x20
> > +
> > +#define LOONGARCH_IOCSR_NODECNT              0x408
> > +
> > +#define LOONGARCH_IOCSR_MISC_FUNC    0x420
> > +#define  IOCSR_MISC_FUNC_TIMER_RESET BIT_ULL(21)
> > +#define  IOCSR_MISC_FUNC_EXT_IOI_EN  BIT_ULL(48)
> > +
> > +#define LOONGARCH_IOCSR_CPUTEMP              0x428
> > +
> > +/* PerCore CSR, only accessible by local cores */
> > +#define LOONGARCH_IOCSR_IPI_STATUS   0x1000
> > +#define LOONGARCH_IOCSR_IPI_EN               0x1004
> > +#define LOONGARCH_IOCSR_IPI_SET              0x1008
> > +#define LOONGARCH_IOCSR_IPI_CLEAR    0x100c
> > +#define LOONGARCH_IOCSR_MBUF0                0x1020
> > +#define LOONGARCH_IOCSR_MBUF1                0x1028
> > +#define LOONGARCH_IOCSR_MBUF2                0x1030
> > +#define LOONGARCH_IOCSR_MBUF3                0x1038
> > +
> > +#define LOONGARCH_IOCSR_IPI_SEND     0x1040
> > +#define  IOCSR_IPI_SEND_IP_SHIFT     0
> > +#define  IOCSR_IPI_SEND_CPU_SHIFT    16
> > +#define  IOCSR_IPI_SEND_BLOCKING     BIT(31)
> > +
> > +#define LOONGARCH_IOCSR_MBUF_SEND    0x1048
> > +#define  IOCSR_MBUF_SEND_BLOCKING    BIT_ULL(31)
> > +#define  IOCSR_MBUF_SEND_BOX_SHIFT   2
> > +#define  IOCSR_MBUF_SEND_BOX_LO(box) ((box) << 1)
> > +#define  IOCSR_MBUF_SEND_BOX_HI(box) (((box) << 1) + 1)
> > +#define  IOCSR_MBUF_SEND_CPU_SHIFT   16
> > +#define  IOCSR_MBUF_SEND_BUF_SHIFT   32
> > +#define  IOCSR_MBUF_SEND_H32_MASK    0xFFFFFFFF00000000ULL
> > +
> > +#define LOONGARCH_IOCSR_ANY_SEND     0x1158
> > +#define  IOCSR_ANY_SEND_BLOCKING     BIT_ULL(31)
> > +#define  IOCSR_ANY_SEND_CPU_SHIFT    16
> > +#define  IOCSR_ANY_SEND_MASK_SHIFT   27
> > +#define  IOCSR_ANY_SEND_BUF_SHIFT    32
> > +#define  IOCSR_ANY_SEND_H32_MASK     0xFFFFFFFF00000000ULL
> > +
> > +/* Register offset and bit definition for CSR access */
> > +#define LOONGARCH_IOCSR_TIMER_CFG       0x1060
> > +#define LOONGARCH_IOCSR_TIMER_TICK      0x1070
> > +#define  IOCSR_TIMER_CFG_RESERVED       (_ULCAST_(1) << 63)
> > +#define  IOCSR_TIMER_CFG_PERIODIC       (_ULCAST_(1) << 62)
> > +#define  IOCSR_TIMER_CFG_EN             (_ULCAST_(1) << 61)
> > +#define  IOCSR_TIMER_MASK            0x0ffffffffffffULL
> > +#define  IOCSR_TIMER_INITVAL_RST        (_ULCAST_(0xffff) << 48)
> > +
> > +#define LOONGARCH_IOCSR_EXTIOI_NODEMAP_BASE  0x14a0
> > +#define LOONGARCH_IOCSR_EXTIOI_IPMAP_BASE    0x14c0
> > +#define LOONGARCH_IOCSR_EXTIOI_EN_BASE               0x1600
> > +#define LOONGARCH_IOCSR_EXTIOI_BOUNCE_BASE   0x1680
> > +#define LOONGARCH_IOCSR_EXTIOI_ISR_BASE              0x1800
> > +#define LOONGARCH_IOCSR_EXTIOI_ROUTE_BASE    0x1c00
> > +#define IOCSR_EXTIOI_VECTOR_NUM                      256
> > +
> > +#ifndef __ASSEMBLY__
> > +
> > +static inline u64 drdtime(void)
> > +{
> > +     int rID = 0;
> > +     u64 val = 0;
> > +
> > +     __asm__ __volatile__(
> > +             "rdtime.d %0, %1 \n\t"
> > +             : "=r"(val), "=r"(rID)
> > +             :
> > +             );
> > +     return val;
> > +}
> > +
> > +static inline unsigned int get_csr_cpuid(void)
> > +{
> > +     return csr_readl(LOONGARCH_CSR_CPUID);
> > +}
> > +
> > +static inline void csr_any_send(unsigned int addr, unsigned int data,
> > +                             unsigned int data_mask, unsigned int cpu)
> > +{
> > +     uint64_t val = 0;
> > +
> > +     val = IOCSR_ANY_SEND_BLOCKING | addr;
> > +     val |= (cpu << IOCSR_ANY_SEND_CPU_SHIFT);
> > +     val |= (data_mask << IOCSR_ANY_SEND_MASK_SHIFT);
> > +     val |= ((uint64_t)data << IOCSR_ANY_SEND_BUF_SHIFT);
> > +     iocsr_writeq(val, LOONGARCH_IOCSR_ANY_SEND);
> > +}
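
To make the field layout of csr_any_send() easier to check, here is a
minimal user-space sketch that builds the same 64-bit value without
touching hardware. The shift constants are copied from the quoted hunk.
pack_any_send() is a hypothetical name, and the width checks are only
inferred from the shift values (address below the CPU field, mask in the
four bits below the blocking bit), not taken from a manual.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the quoted hunk (BIT_ULL(31) spelled out). */
#define IOCSR_ANY_SEND_BLOCKING		(UINT64_C(1) << 31)
#define IOCSR_ANY_SEND_CPU_SHIFT	16
#define IOCSR_ANY_SEND_MASK_SHIFT	27
#define IOCSR_ANY_SEND_BUF_SHIFT	32

/* Hypothetical helper: the value csr_any_send() would hand to iocsr_writeq(). */
static inline uint64_t pack_any_send(unsigned int addr, unsigned int data,
				     unsigned int data_mask, unsigned int cpu)
{
	uint64_t val;

	/* ASSUMPTION: field widths inferred from the shift constants. */
	assert(addr < (1u << IOCSR_ANY_SEND_CPU_SHIFT));
	assert(data_mask <= 0xf);

	val = IOCSR_ANY_SEND_BLOCKING | addr;
	val |= (uint64_t)cpu << IOCSR_ANY_SEND_CPU_SHIFT;
	val |= (uint64_t)data_mask << IOCSR_ANY_SEND_MASK_SHIFT;
	val |= (uint64_t)data << IOCSR_ANY_SEND_BUF_SHIFT;
	return val;
}
```

For example, pack_any_send(0x1004, 0xdeadbeef, 0, 3) yields
0xdeadbeef80031004: data in the upper 32 bits, the blocking bit, CPU 3 at
bit 16, and the register offset in the low 16 bits.
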
> > +
> > +static inline unsigned int read_csr_excode(void)
> > +{
> > +     return (csr_readl(LOONGARCH_CSR_ESTAT) & CSR_ESTAT_EXC) >> CSR_ESTAT_EXC_SHIFT;
> > +}
> > +
> > +static inline void write_csr_index(unsigned int idx)
> > +{
> > +     csr_xchgl(idx, CSR_TLBIDX_IDXM, LOONGARCH_CSR_TLBIDX);
> > +}
> > +
> > +static inline unsigned int read_csr_pagesize(void)
> > +{
> > +     return (csr_readl(LOONGARCH_CSR_TLBIDX) & CSR_TLBIDX_SIZEM) >> CSR_TLBIDX_SIZE;
> > +}
> > +
> > +static inline void write_csr_pagesize(unsigned int size)
> > +{
> > +     csr_xchgl(size << CSR_TLBIDX_SIZE, CSR_TLBIDX_SIZEM, LOONGARCH_CSR_TLBIDX);
> > +}
> > +
> > +static inline unsigned int read_csr_tlbrefill_pagesize(void)
> > +{
> > +     return (csr_readq(LOONGARCH_CSR_TLBREHI) & CSR_TLBREHI_PS) >> CSR_TLBREHI_PS_SHIFT;
> > +}
> > +
> > +static inline void write_csr_tlbrefill_pagesize(unsigned int size)
> > +{
> > +     csr_xchgq(size << CSR_TLBREHI_PS_SHIFT, CSR_TLBREHI_PS, LOONGARCH_CSR_TLBREHI);
> > +}
> > +
> > +#define read_csr_asid()                      csr_readl(LOONGARCH_CSR_ASID)
> > +#define write_csr_asid(val)          csr_writel(val, LOONGARCH_CSR_ASID)
> > +#define read_csr_entryhi()           csr_readq(LOONGARCH_CSR_TLBEHI)
> > +#define write_csr_entryhi(val)               csr_writeq(val, LOONGARCH_CSR_TLBEHI)
> > +#define read_csr_entrylo0()          csr_readq(LOONGARCH_CSR_TLBELO0)
> > +#define write_csr_entrylo0(val)              csr_writeq(val, LOONGARCH_CSR_TLBELO0)
> > +#define read_csr_entrylo1()          csr_readq(LOONGARCH_CSR_TLBELO1)
> > +#define write_csr_entrylo1(val)              csr_writeq(val, LOONGARCH_CSR_TLBELO1)
> > +#define read_csr_ecfg()                      csr_readl(LOONGARCH_CSR_ECFG)
> > +#define write_csr_ecfg(val)          csr_writel(val, LOONGARCH_CSR_ECFG)
> > +#define read_csr_estat()             csr_readl(LOONGARCH_CSR_ESTAT)
> > +#define write_csr_estat(val)         csr_writel(val, LOONGARCH_CSR_ESTAT)
> > +#define read_csr_tlbidx()            csr_readl(LOONGARCH_CSR_TLBIDX)
> > +#define write_csr_tlbidx(val)                csr_writel(val, LOONGARCH_CSR_TLBIDX)
> > +#define read_csr_euen()                      csr_readl(LOONGARCH_CSR_EUEN)
> > +#define write_csr_euen(val)          csr_writel(val, LOONGARCH_CSR_EUEN)
> > +#define read_csr_cpuid()             csr_readl(LOONGARCH_CSR_CPUID)
> > +#define read_csr_prcfg1()            csr_readq(LOONGARCH_CSR_PRCFG1)
> > +#define write_csr_prcfg1(val)                csr_writeq(val, LOONGARCH_CSR_PRCFG1)
> > +#define read_csr_prcfg2()            csr_readq(LOONGARCH_CSR_PRCFG2)
> > +#define write_csr_prcfg2(val)                csr_writeq(val, LOONGARCH_CSR_PRCFG2)
> > +#define read_csr_prcfg3()            csr_readq(LOONGARCH_CSR_PRCFG3)
> > +#define write_csr_prcfg3(val)                csr_writeq(val, LOONGARCH_CSR_PRCFG3)
> > +#define read_csr_stlbpgsize()                csr_readl(LOONGARCH_CSR_STLBPGSIZE)
> > +#define write_csr_stlbpgsize(val)    csr_writel(val, LOONGARCH_CSR_STLBPGSIZE)
> > +#define read_csr_rvacfg()            csr_readl(LOONGARCH_CSR_RVACFG)
> > +#define write_csr_rvacfg(val)                csr_writel(val, LOONGARCH_CSR_RVACFG)
> > +#define write_csr_tintclear(val)     csr_writel(val, LOONGARCH_CSR_TINTCLR)
> > +#define read_csr_impctl1()           csr_readq(LOONGARCH_CSR_IMPCTL1)
> > +#define write_csr_impctl1(val)               csr_writeq(val, LOONGARCH_CSR_IMPCTL1)
> > +#define write_csr_impctl2(val)               csr_writeq(val, LOONGARCH_CSR_IMPCTL2)
> > +
> > +#define read_csr_perfctrl0()         csr_readq(LOONGARCH_CSR_PERFCTRL0)
> > +#define read_csr_perfcntr0()         csr_readq(LOONGARCH_CSR_PERFCNTR0)
> > +#define read_csr_perfctrl1()         csr_readq(LOONGARCH_CSR_PERFCTRL1)
> > +#define read_csr_perfcntr1()         csr_readq(LOONGARCH_CSR_PERFCNTR1)
> > +#define read_csr_perfctrl2()         csr_readq(LOONGARCH_CSR_PERFCTRL2)
> > +#define read_csr_perfcntr2()         csr_readq(LOONGARCH_CSR_PERFCNTR2)
> > +#define read_csr_perfctrl3()         csr_readq(LOONGARCH_CSR_PERFCTRL3)
> > +#define read_csr_perfcntr3()         csr_readq(LOONGARCH_CSR_PERFCNTR3)
> > +#define write_csr_perfctrl0(val)     csr_writeq(val, LOONGARCH_CSR_PERFCTRL0)
> > +#define write_csr_perfcntr0(val)     csr_writeq(val, LOONGARCH_CSR_PERFCNTR0)
> > +#define write_csr_perfctrl1(val)     csr_writeq(val, LOONGARCH_CSR_PERFCTRL1)
> > +#define write_csr_perfcntr1(val)     csr_writeq(val, LOONGARCH_CSR_PERFCNTR1)
> > +#define write_csr_perfctrl2(val)     csr_writeq(val, LOONGARCH_CSR_PERFCTRL2)
> > +#define write_csr_perfcntr2(val)     csr_writeq(val, LOONGARCH_CSR_PERFCNTR2)
> > +#define write_csr_perfctrl3(val)     csr_writeq(val, LOONGARCH_CSR_PERFCTRL3)
> > +#define write_csr_perfcntr3(val)     csr_writeq(val, LOONGARCH_CSR_PERFCNTR3)
> > +
> > +/*
> > + * Manipulate bits in a register.
> > + */
> > +#define __BUILD_CSR_COMMON(name)                             \
> > +static inline unsigned long                                  \
> > +set_##name(unsigned long set)                                        \
> > +{                                                            \
> > +     unsigned long res, new;                                 \
> > +                                                             \
> > +     res = read_##name();                                    \
> > +     new = res | set;                                        \
> > +     write_##name(new);                                      \
> > +                                                             \
> > +     return res;                                             \
> > +}                                                            \
> > +                                                             \
> > +static inline unsigned long                                  \
> > +clear_##name(unsigned long clear)                            \
> > +{                                                            \
> > +     unsigned long res, new;                                 \
> > +                                                             \
> > +     res = read_##name();                                    \
> > +     new = res & ~clear;                                     \
> > +     write_##name(new);                                      \
> > +                                                             \
> > +     return res;                                             \
> > +}                                                            \
> > +                                                             \
> > +static inline unsigned long                                  \
> > +change_##name(unsigned long change, unsigned long val)               \
> > +{                                                            \
> > +     unsigned long res, new;                                 \
> > +                                                             \
> > +     res = read_##name();                                    \
> > +     new = res & ~change;                                    \
> > +     new |= (val & change);                                  \
> > +     write_##name(new);                                      \
> > +                                                             \
> > +     return res;                                             \
> > +}
> > +
> > +#define __BUILD_CSR_OP(name) __BUILD_CSR_COMMON(csr_##name)
> > +
> > +__BUILD_CSR_OP(euen)
> > +__BUILD_CSR_OP(ecfg)
> > +__BUILD_CSR_OP(tlbidx)
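
A minimal sketch of what the generated set_/clear_/change_ helpers do,
using a plain variable in place of a real CSR so it can run in user space.
fake_reg, read_fake(), write_fake() and fake_csr_demo() are hypothetical
stand-ins. The macro body has the same shape as the quoted
__BUILD_CSR_COMMON.

```c
#include <assert.h>

/* Stand-in for a CSR: an ordinary variable instead of a csrrd/csrwr pair. */
static unsigned long fake_reg;
#define read_fake()		(fake_reg)
#define write_fake(val)		(fake_reg = (val))

#define BUILD_COMMON(name)						\
static inline unsigned long set_##name(unsigned long set)		\
{									\
	unsigned long res = read_##name();				\
									\
	write_##name(res | set);					\
	return res;							\
}									\
									\
static inline unsigned long clear_##name(unsigned long clear)		\
{									\
	unsigned long res = read_##name();				\
									\
	write_##name(res & ~clear);					\
	return res;							\
}									\
									\
static inline unsigned long						\
change_##name(unsigned long change, unsigned long val)			\
{									\
	unsigned long res = read_##name();				\
									\
	write_##name((res & ~change) | (val & change));			\
	return res;							\
}

BUILD_COMMON(fake)

/* Each helper returns the value the register held before the update. */
static inline void fake_csr_demo(void)
{
	fake_reg = 0xf0;
	assert(set_fake(0x01) == 0xf0);		/* register is now 0xf1 */
	assert(clear_fake(0x10) == 0xf1);	/* register is now 0xe1 */
	assert(change_fake(0x0f, 0x05) == 0xe1);	/* register is now 0xe5 */
}
```

Note that each helper is a plain read-modify-write sequence. The
set_csr_estat()/clear_csr_estat() macros that follow go through csr_xchgl()
instead.
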
> > +
> > +#define set_csr_estat(val)   \
> > +     csr_xchgl(val, val, LOONGARCH_CSR_ESTAT)
> > +#define clear_csr_estat(val) \
> > +     csr_xchgl(~(val), val, LOONGARCH_CSR_ESTAT)
> > +
> > +#endif /* __ASSEMBLY__ */
> > +
> > +/* Generic EntryLo bit definitions */
> > +#define ENTRYLO_V            (_ULCAST_(1) << 0)
> > +#define ENTRYLO_D            (_ULCAST_(1) << 1)
> > +#define ENTRYLO_PLV_SHIFT    2
> > +#define ENTRYLO_PLV          (_ULCAST_(3) << ENTRYLO_PLV_SHIFT)
> > +#define ENTRYLO_C_SHIFT              4
> > +#define ENTRYLO_C            (_ULCAST_(3) << ENTRYLO_C_SHIFT)
> > +#define ENTRYLO_G            (_ULCAST_(1) << 6)
> > +#define ENTRYLO_NR           (_ULCAST_(1) << 61)
> > +#define ENTRYLO_NX           (_ULCAST_(1) << 62)
> > +
> > +/* LoongArch GlobalNumber definitions */
> > +#define LOONGARCH_GLOBALNUMBER_VP_SHF        0
> > +#define LOONGARCH_GLOBALNUMBER_VP            (_ULCAST_(0xff) << LOONGARCH_GLOBALNUMBER_VP_SHF)
> > +#define LOONGARCH_GLOBALNUMBER_CORE_SHF      8
> > +#define LOONGARCH_GLOBALNUMBER_CORE          (_ULCAST_(0xff) << LOONGARCH_GLOBALNUMBER_CORE_SHF)
> > +#define LOONGARCH_GLOBALNUMBER_CLUSTER_SHF   16
> > +#define LOONGARCH_GLOBALNUMBER_CLUSTER       (_ULCAST_(0xf) << LOONGARCH_GLOBALNUMBER_CLUSTER_SHF)
> > +
> > +/* Values for PageSize register */
> > +#define PS_4K                0x0000000c
> > +#define PS_8K                0x0000000d
> > +#define PS_16K               0x0000000e
> > +#define PS_32K               0x0000000f
> > +#define PS_64K               0x00000010
> > +#define PS_128K              0x00000011
> > +#define PS_256K              0x00000012
> > +#define PS_512K              0x00000013
> > +#define PS_1M                0x00000014
> > +#define PS_2M                0x00000015
> > +#define PS_4M                0x00000016
> > +#define PS_8M                0x00000017
> > +#define PS_16M               0x00000018
> > +#define PS_32M               0x00000019
> > +#define PS_64M               0x0000001a
> > +#define PS_128M              0x0000001b
> > +#define PS_256M              0x0000001c
> > +#define PS_512M              0x0000001d
> > +#define PS_1G                0x0000001e
> > +
> > +#define PS_MASK              0x3f000000
> > +#define PS_SHIFT     24
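
Reading the table above, each PS_* value is the base-2 logarithm of the
page size in bytes (PS_4K is 12, PS_1G is 30). A minimal sketch of the
conversion follows. ps_to_bytes() is a hypothetical helper and only three
of the constants are repeated here.

```c
#include <assert.h>
#include <stdint.h>

/* A few values copied from the quoted table. */
#define PS_4K	0x0000000c
#define PS_16K	0x0000000e
#define PS_1G	0x0000001e

/* Hypothetical helper: number of bytes covered by a page with this PS value. */
static inline uint64_t ps_to_bytes(unsigned int ps)
{
	assert(ps < 64);	/* keep the shift within 64 bits */
	return UINT64_C(1) << ps;
}
```

So ps_to_bytes(PS_16K) is 16384, which matches the macro name.
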
> > +
> > +/* Default page size for a given kernel configuration */
> > +#ifdef CONFIG_PAGE_SIZE_4KB
> > +#define PS_DEFAULT_SIZE PS_4K
> > +#elif defined(CONFIG_PAGE_SIZE_16KB)
> > +#define PS_DEFAULT_SIZE PS_16K
> > +#elif defined(CONFIG_PAGE_SIZE_64KB)
> > +#define PS_DEFAULT_SIZE PS_64K
> > +#else
> > +#error Bad page size configuration!
> > +#endif
> > +
> > +/* Default huge tlb size for a given kernel configuration */
> > +#ifdef CONFIG_PAGE_SIZE_4KB
> > +#define PS_HUGE_SIZE   PS_1M
> > +#elif defined(CONFIG_PAGE_SIZE_16KB)
> > +#define PS_HUGE_SIZE   PS_16M
> > +#elif defined(CONFIG_PAGE_SIZE_64KB)
> > +#define PS_HUGE_SIZE   PS_256M
> > +#else
> > +#error Bad page size configuration for hugetlbfs!
> > +#endif
> > +
> > +/* ExStatus.ExcCode */
> > +#define EXCCODE_RSV          0       /* Reserved */
> > +#define EXCCODE_TLBL         1       /* TLB miss on a load */
> > +#define EXCCODE_TLBS         2       /* TLB miss on a store */
> > +#define EXCCODE_TLBI         3       /* TLB miss on an ifetch */
> > +#define EXCCODE_TLBM         4       /* TLB modified fault */
> > +#define EXCCODE_TLBNR                5       /* TLB Read-Inhibit exception */
> > +#define EXCCODE_TLBNX                6       /* TLB Execution-Inhibit exception */
> > +#define EXCCODE_TLBPE                7       /* TLB Privilege Error */
> > +#define EXCCODE_ADE          8       /* Address Error */
> > +     #define EXSUBCODE_ADEF          0       /* Fetch Instruction */
> > +     #define EXSUBCODE_ADEM          1       /* Access Memory */
> > +#define EXCCODE_ALE          9       /* Unaligned Access */
> > +#define EXCCODE_OOB          10      /* Out of bounds */
> > +#define EXCCODE_SYS          11      /* System call */
> > +#define EXCCODE_BP           12      /* Breakpoint */
> > +#define EXCCODE_INE          13      /* Inst. Not Exist */
> > +#define EXCCODE_IPE          14      /* Inst. Privileged Error */
> "Privilege Error"?
> > +#define EXCCODE_FPDIS                15      /* FPU Disabled */
> > +#define EXCCODE_LSXDIS               16      /* LSX Disabled */
> > +#define EXCCODE_LASXDIS              17      /* LASX Disabled */
> > +#define EXCCODE_FPE          18      /* Floating Point Exception */
> > +     #define EXCSUBCODE_FPE          0       /* Floating Point Exception */
> > +     #define EXCSUBCODE_VFPE         1       /* Vector Exception */
> > +#define EXCCODE_WATCH                19      /* Watch address reference */
> > +#define EXCCODE_BTDIS                20      /* Binary Trans. Disabled */
> > +#define EXCCODE_BTE          21      /* Binary Trans. Exception */
> > +#define EXCCODE_PSI          22      /* Guest Privileged Error */
> > +#define EXCCODE_HYP          23      /* Hypercall */
> > +#define EXCCODE_GCM          24      /* Guest CSR modified */
> > +     #define EXCSUBCODE_GCSC         0       /* Software caused */
> > +     #define EXCSUBCODE_GCHC         1       /* Hardware caused */
> > +#define EXCCODE_SE           25      /* Security */
> > +
> > +#define EXCCODE_INT_START   64
> > +#define EXCCODE_SIP0        64
> > +#define EXCCODE_SIP1        65
> > +#define EXCCODE_IP0         66
> > +#define EXCCODE_IP1         67
> > +#define EXCCODE_IP2         68
> > +#define EXCCODE_IP3         69
> > +#define EXCCODE_IP4         70
> > +#define EXCCODE_IP5         71
> > +#define EXCCODE_IP6         72
> > +#define EXCCODE_IP7         73
> > +#define EXCCODE_PMC         74 /* Performance Counter */
> > +#define EXCCODE_TIMER       75
> > +#define EXCCODE_IPI         76
> > +#define EXCCODE_NMI         77
> > +#define EXCCODE_INT_END     78
> > +#define EXCCODE_INT_NUM          (EXCCODE_INT_END - EXCCODE_INT_START)
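
Comparing these values with the ECFGB_* bit numbers earlier in the hunk,
each interrupt's exception code appears to be its ECFG bit number plus
EXCCODE_INT_START (for example ECFGB_IPI is 12 and EXCCODE_IPI is 76). A
minimal sketch of that relationship follows. hwirq_to_exccode() is a
hypothetical helper, and the mapping is read off the constants rather than
confirmed against the ISA manual.

```c
#include <assert.h>

/* Values copied from the quoted hunk. */
#define ECFGB_SIP0		0
#define ECFGB_TIMER		11
#define ECFGB_IPI		12
#define EXCCODE_INT_START	64
#define EXCCODE_SIP0		64
#define EXCCODE_TIMER		75
#define EXCCODE_IPI		76
#define EXCCODE_INT_END		78
#define EXCCODE_INT_NUM		(EXCCODE_INT_END - EXCCODE_INT_START)

/* Hypothetical helper: exception code for a given interrupt bit number. */
static inline unsigned int hwirq_to_exccode(unsigned int hwirq)
{
	assert(hwirq < EXCCODE_INT_NUM);
	return EXCCODE_INT_START + hwirq;
}
```
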
> > +
> > +/* FPU register names */
> > +#define LOONGARCH_FCSR0      $r0
> > +#define LOONGARCH_FCSR1      $r1
> > +#define LOONGARCH_FCSR2      $r2
> > +#define LOONGARCH_FCSR3      $r3
> > +
> > +/* FPU Status Register Values */
> > +#define FPU_CSR_RSVD 0xe0e0fce0
> > +
> > +/*
> > + * X the exception cause indicator
> > + * E the exception enable
> > + * S the sticky/flag bit
> > + */
> > +#define FPU_CSR_ALL_X        0x1f000000
> > +#define FPU_CSR_INV_X        0x10000000
> > +#define FPU_CSR_DIV_X        0x08000000
> > +#define FPU_CSR_OVF_X        0x04000000
> > +#define FPU_CSR_UDF_X        0x02000000
> > +#define FPU_CSR_INE_X        0x01000000
> > +
> > +#define FPU_CSR_ALL_S        0x001f0000
> > +#define FPU_CSR_INV_S        0x00100000
> > +#define FPU_CSR_DIV_S        0x00080000
> > +#define FPU_CSR_OVF_S        0x00040000
> > +#define FPU_CSR_UDF_S        0x00020000
> > +#define FPU_CSR_INE_S        0x00010000
> > +
> > +#define FPU_CSR_ALL_E        0x0000001f
> > +#define FPU_CSR_INV_E        0x00000010
> > +#define FPU_CSR_DIV_E        0x00000008
> > +#define FPU_CSR_OVF_E        0x00000004
> > +#define FPU_CSR_UDF_E        0x00000002
> > +#define FPU_CSR_INE_E        0x00000001
> > +
> > +/* Bits 8 and 9 of FPU Status Register specify the rounding mode */
> > +#define FPU_CSR_RM   0x300
> > +#define FPU_CSR_RN   0x000   /* nearest */
> > +#define FPU_CSR_RZ   0x100   /* towards zero */
> > +#define FPU_CSR_RU   0x200   /* towards +Infinity */
> > +#define FPU_CSR_RD   0x300   /* towards -Infinity */
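
A minimal sketch of how the rounding-mode field would be read and updated
with these masks, operating on a plain integer rather than the real FCSR.
fcsr_get_rm() and fcsr_set_rm() are hypothetical helpers introduced for
illustration.

```c
#include <assert.h>

/* Values copied from the quoted hunk. */
#define FPU_CSR_RM	0x300
#define FPU_CSR_RN	0x000	/* nearest */
#define FPU_CSR_RZ	0x100	/* towards zero */
#define FPU_CSR_RU	0x200	/* towards +Infinity */
#define FPU_CSR_RD	0x300	/* towards -Infinity */

/* Hypothetical helper: extract the rounding-mode field (bits 8 and 9). */
static inline unsigned int fcsr_get_rm(unsigned int fcsr)
{
	return fcsr & FPU_CSR_RM;
}

/* Hypothetical helper: replace the rounding mode, leaving other bits alone. */
static inline unsigned int fcsr_set_rm(unsigned int fcsr, unsigned int mode)
{
	assert((mode & ~FPU_CSR_RM) == 0);	/* mode must be one of FPU_CSR_R* */
	return (fcsr & ~FPU_CSR_RM) | mode;
}
```

For instance, fcsr_set_rm(0x1f, FPU_CSR_RZ) returns 0x11f: the enable bits
are kept and only bits 8 and 9 change.
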
> > +
> > +#define read_fcsr(source)    \
> > +({   \
> > +     unsigned int __res;     \
> > +\
> > +     __asm__ __volatile__(   \
> > +     "       movfcsr2gr      %0, "STR(source)"       \n"     \
> > +     : "=r" (__res));        \
> > +     __res;  \
> > +})
> > +
> > +#define write_fcsr(dest, val) \
> > +do { \
> > +     __asm__ __volatile__(   \
> > +     "       movgr2fcsr      %0, "STR(dest)" \n"     \
> > +     : : "r" (val)); \
> > +} while (0)
> > +
> > +#endif /* _ASM_LOONGARCH_H */
> > diff --git a/arch/loongarch/include/asm/loongson.h b/arch/loongarch/include/asm/loongson.h
> > new file mode 100644
> > index 000000000000..4cefd393fd5c
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/loongson.h
> > @@ -0,0 +1,159 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> Better explain this file's name and purpose... This is mostly
> definitions for models produced by Loongson Corporation, much like how
> MTI stands for MIPS the corporation while MIPS stands for the
> architecture. Don't overload the "Loongson" term, it's too ambiguous
> already...
> > + */
> > +
> > +#ifndef __ASM_LOONGSON_H
> > +#define __ASM_LOONGSON_H
> > +
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/pci.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/boot_param.h>
> > +
> > +extern const struct plat_smp_ops loongson3_smp_ops;
> > +
> > +/* Loongson-specific command line, env and memory initialization */
> > +extern void __init fw_init_environ(void);
> > +extern void __init fw_init_memory(void);
> > +extern void __init fw_init_numa_memory(void);
> > +
> > +#define LOONGSON_REG(x) \
> > +     (*(volatile u32 *)((char *)TO_UNCAC(LOONGSON_REG_BASE) + (x)))
> > +
> > +#define LOONGSON_LIO_BASE    0x18000000
> > +#define LOONGSON_LIO_SIZE    0x00100000      /* 1M */
> > +#define LOONGSON_LIO_TOP     (LOONGSON_LIO_BASE+LOONGSON_LIO_SIZE-1)
> > +
> > +#define LOONGSON_BOOT_BASE   0x1c000000
> > +#define LOONGSON_BOOT_SIZE   0x02000000      /* 32M */
> > +#define LOONGSON_BOOT_TOP    (LOONGSON_BOOT_BASE+LOONGSON_BOOT_SIZE-1)
> > +
> > +#define LOONGSON_REG_BASE    0x1fe00000
> > +#define LOONGSON_REG_SIZE    0x00100000      /* 1M */
> > +#define LOONGSON_REG_TOP     (LOONGSON_REG_BASE+LOONGSON_REG_SIZE-1)
> > +
> > +/* GPIO Regs - r/w */
> > +
> > +#define LOONGSON_GPIODATA            LOONGSON_REG(0x11c)
> > +#define LOONGSON_GPIOIE                      LOONGSON_REG(0x120)
> > +#define LOONGSON_REG_GPIO_BASE          (LOONGSON_REG_BASE + 0x11c)
> > +
> > +#define MAX_PACKAGES 16
> > +
> > +/* Chip Config register of each physical cpu package */
> > +extern u64 loongson_chipcfg[MAX_PACKAGES];
> > +#define LOONGSON_CHIPCFG(id) (*(volatile u32 *)(loongson_chipcfg[id]))
> > +
> > +/* Chip Temperature register of each physical cpu package */
> > +extern u64 loongson_chiptemp[MAX_PACKAGES];
> > +#define LOONGSON_CHIPTEMP(id) (*(volatile u32 *)(loongson_chiptemp[id]))
> > +
> > +/* Freq Control register of each physical cpu package */
> > +extern u64 loongson_freqctrl[MAX_PACKAGES];
> > +#define LOONGSON_FREQCTRL(id) (*(volatile u32 *)(loongson_freqctrl[id]))
> > +
> > +#define xconf_readl(addr) readl(addr)
> > +#define xconf_readq(addr) readq(addr)
> > +
> > +static inline void xconf_writel(u32 val, volatile void __iomem *addr)
> > +{
> > +     asm volatile (
> > +     "       st.w    %[v], %[hw], 0  \n"
> > +     "       ld.b    $r0, %[hw], 0   \n"
> > +     :
> > +     : [hw] "r" (addr), [v] "r" (val)
> > +     );
> > +}
> > +
> > +static inline void xconf_writeq(u64 val64, volatile void __iomem *addr)
> > +{
> > +     asm volatile (
> > +     "       st.d    %[v], %[hw], 0  \n"
> > +     "       ld.b    $r0, %[hw], 0   \n"
> > +     :
> > +     : [hw] "r" (addr),  [v] "r" (val64)
> > +     );
> > +}
> > +
> > +/* ============== LS7A registers =============== */
> > +#define LS7A_PCH_REG_BASE            0x10000000UL
> > +/* LPC regs */
> > +#define LS7A_LPC_REG_BASE            (LS7A_PCH_REG_BASE + 0x00002000)
> > +/* CHIPCFG regs */
> > +#define LS7A_CHIPCFG_REG_BASE                (LS7A_PCH_REG_BASE + 0x00010000)
> > +/* MISC reg base */
> > +#define LS7A_MISC_REG_BASE           (LS7A_PCH_REG_BASE + 0x00080000)
> > +/* ACPI regs */
> > +#define LS7A_ACPI_REG_BASE           (LS7A_MISC_REG_BASE + 0x00050000)
> > +/* RTC regs */
> > +#define LS7A_RTC_REG_BASE            (LS7A_MISC_REG_BASE + 0x00050100)
> > +
> > +#define LS7A_DMA_CFG                 (volatile void *)TO_UNCAC(LS7A_CHIPCFG_REG_BASE + 0x041c)
> > +#define LS7A_DMA_NODE_SHF            8
> > +#define LS7A_DMA_NODE_MASK           0x1F00
> > +
> > +#define LS7A_INT_MASK_REG            (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x020)
> > +#define LS7A_INT_EDGE_REG            (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x060)
> > +#define LS7A_INT_CLEAR_REG           (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x080)
> > +#define LS7A_INT_HTMSI_EN_REG                (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x040)
> > +#define LS7A_INT_ROUTE_ENTRY_REG     (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x100)
> > +#define LS7A_INT_HTMSI_VEC_REG               (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x200)
> > +#define LS7A_INT_STATUS_REG          (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x3a0)
> > +#define LS7A_INT_POL_REG             (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x3e0)
> > +#define LS7A_LPC_INT_CTL             (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2000)
> > +#define LS7A_LPC_INT_ENA             (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2004)
> > +#define LS7A_LPC_INT_STS             (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2008)
> > +#define LS7A_LPC_INT_CLR             (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x200c)
> > +#define LS7A_LPC_INT_POL             (volatile void *)TO_UNCAC(LS7A_PCH_REG_BASE + 0x2010)
> > +
> > +#define LS7A_PMCON_SOC_REG           (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x000)
> > +#define LS7A_PMCON_RESUME_REG                (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x004)
> > +#define LS7A_PMCON_RTC_REG           (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x008)
> > +#define LS7A_PM1_EVT_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x00c)
> > +#define LS7A_PM1_ENA_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x010)
> > +#define LS7A_PM1_CNT_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x014)
> > +#define LS7A_PM1_TMR_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x018)
> > +#define LS7A_P_CNT_REG                       (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x01c)
> > +#define LS7A_GPE0_STS_REG            (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x028)
> > +#define LS7A_GPE0_ENA_REG            (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x02c)
> > +#define LS7A_RST_CNT_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x030)
> > +#define LS7A_WD_SET_REG                      (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x034)
> > +#define LS7A_WD_TIMER_REG            (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x038)
> > +#define LS7A_THSENS_CNT_REG          (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x04c)
> > +#define LS7A_GEN_RTC_1_REG           (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x050)
> > +#define LS7A_GEN_RTC_2_REG           (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x054)
> > +#define LS7A_DPM_CFG_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x400)
> > +#define LS7A_DPM_STS_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x404)
> > +#define LS7A_DPM_CNT_REG             (volatile void *)TO_UNCAC(LS7A_ACPI_REG_BASE + 0x408)
> > +
> > +typedef enum {
> > +     ACPI_PCI_HOTPLUG_STATUS = 1 << 1,
> > +     ACPI_CPU_HOTPLUG_STATUS = 1 << 2,
> > +     ACPI_MEM_HOTPLUG_STATUS = 1 << 3,
> > +     ACPI_POWERBUTTON_STATUS = 1 << 8,
> > +     ACPI_RTC_WAKE_STATUS    = 1 << 10,
> > +     ACPI_PCI_WAKE_STATUS    = 1 << 14,
> > +     ACPI_ANY_WAKE_STATUS    = 1 << 15,
> > +} AcpiEventStatusBits;
> > +
> > +#define HT1LO_OFFSET         0xe0000000000UL
> > +
> > +/* PCI Configuration Space Base */
> > +#define MCFG_EXT_PCICFG_BASE         0xefe00000000UL
> > +
> > +/* REG ACCESS*/
> Do we really need this tiny comment? The code is pretty self-explanatory
> and the comment end marker is lacking a space before.
> > +#define ls7a_readb(addr)                       (*(volatile unsigned char  *)TO_UNCAC(addr))
> > +#define ls7a_readw(addr)                       (*(volatile unsigned short *)TO_UNCAC(addr))
> > +#define ls7a_readl(addr)                       (*(volatile unsigned int   *)TO_UNCAC(addr))
> > +#define ls7a_readq(addr)                       (*(volatile unsigned long  *)TO_UNCAC(addr))
> > +#define ls7a_writeb(val, addr)               *(volatile unsigned char  *)TO_UNCAC(addr) = (val)
> > +#define ls7a_writew(val, addr)               *(volatile unsigned short *)TO_UNCAC(addr) = (val)
> > +#define ls7a_writel(val, addr)               ls7a_write_type(val, addr, uint32_t)
> > +#define ls7a_writeq(val, addr)               ls7a_write_type(val, addr, uint64_t)
> > +#define ls7a_write(val, addr)                ls7a_write_type(val, addr, uint64_t)
> > +
> > +#endif /* __ASM_LOONGSON_H */
> > diff --git a/arch/loongarch/include/asm/regdef.h b/arch/loongarch/include/asm/regdef.h
> > new file mode 100644
> > index 000000000000..9f24f0c05fe3
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/regdef.h
> > @@ -0,0 +1,43 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef _ASM_REGDEF_H
> > +#define _ASM_REGDEF_H
> > +
> > +#define zero $r0     /* wired zero */
> > +#define ra   $r1     /* return address */
> > +#define tp   $r2
> > +#define sp   $r3     /* stack pointer */
> > +#define v0   $r4     /* return value - caller saved */
> > +#define v1   $r5
> > +#define a0   $r4     /* argument registers */
> > +#define a1   $r5
> > +#define a2   $r6
> > +#define a3   $r7
> > +#define a4   $r8
> > +#define a5   $r9
> > +#define a6   $r10
> > +#define a7   $r11
> > +#define t0   $r12    /* caller saved */
> > +#define t1   $r13
> > +#define t2   $r14
> > +#define t3   $r15
> > +#define t4   $r16
> > +#define t5   $r17
> > +#define t6   $r18
> > +#define t7   $r19
> > +#define t8   $r20
> > +#define u0   $r21
> > +#define fp   $r22    /* frame pointer */
> > +#define s0   $r23    /* callee saved */
> > +#define s1   $r24
> > +#define s2   $r25
> > +#define s3   $r26
> > +#define s4   $r27
> > +#define s5   $r28
> > +#define s6   $r29
> > +#define s7   $r30
> > +#define s8   $r31
> > +
> > +#endif /* _ASM_REGDEF_H */
> Why can't this file be combined with the FP one (absorbing the FP
> definitions into this file)? While at it, remove $vX and document $u0 too.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 05/24] LoongArch: Add build infrastructure
  2022-05-01 10:09   ` WANG Xuerui
  2022-05-01 12:41     ` Huacai Chen
@ 2022-05-01 15:43     ` Xi Ruoyao
  1 sibling, 0 replies; 94+ messages in thread
From: Xi Ruoyao @ 2022-05-01 15:43 UTC (permalink / raw)
  To: WANG Xuerui, Huacai Chen, Arnd Bergmann, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, David Airlie,
	Jonathan Corbet, Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Jiaxun Yang

On Sun, 2022-05-01 at 18:09 +0800, WANG Xuerui wrote:
> > +cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> Unfortunately we're still working around the LL/SC hardware issue even
> after migrating to LoongArch... might be better to add a comment too. 
> (something along the line of "we work around the issue manually in the
> handwritten assembly, so no automatic workarounds should kick in")

No publicly available LoongArch assembler has a
-m(no)?fix-loongson3-llsc option.  People writing assembly have to add
the workarounds manually.

This line should be removed for an upstream patch.  If you need to
support non-public toolchains with this option, keep it in a separate
git repo or branch.

Regarding LL/SC workaround, GCC is using a different pattern:

(https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/loongarch/sync.md;h=0c4f1983;hb=HEAD#l132)

 132   return "%G5\\n\\t"
 133          "1:\\n\\t"
 134          "ll.<amo>\\t%0,%1\\n\\t"
 135          "bne\\t%0,%z2,2f\\n\\t"
 136          "or%i3\\t%6,$zero,%3\\n\\t"
 137          "sc.<amo>\\t%6,%1\\n\\t"
 138          "beq\\t$zero,%6,1b\\n\\t"
 139          "b\\t3f\\n\\t"
 140          "2:\\n\\t"
 141          "dbar\\t0x700\\n\\t"
 142          "3:\\n\\t";

Note that the dbar instruction has "hint" 0x700 instead of 0 (using a
special number was proposed by me, so an updated LoongArch
implementation can recognize and ignore this instruction when the
workaround is unneeded).  And the instruction is "skipped" by the
previous "b" instruction.  I guess it's enough to "force" LA464 to
behave correctly for the LL/SC loop w/o really inserting a barrier.
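
For readers less familiar with the ll/sc idiom, the control flow of the
sequence quoted above corresponds roughly to a strong compare-and-swap.
The following is only a portable sketch using C11 <stdatomic.h>; it is
not GCC's or the kernel's actual code, and cas_u32 is a hypothetical
name chosen for illustration.  The compare-failure path (label "2:" in
the assembly, where the dbar sits) is the case in which the returned
value differs from `expected`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically replace *ptr with `desired` if it
 * currently holds `expected`.  Returns the value that was observed in
 * *ptr, so the caller can tell success (== expected) from failure.
 */
static uint32_t cas_u32(_Atomic uint32_t *ptr, uint32_t expected,
			uint32_t desired)
{
	assert(ptr != NULL);

	/*
	 * On success `expected` is left untouched; on failure it is
	 * overwritten with the value currently stored in *ptr.
	 */
	atomic_compare_exchange_strong(ptr, &expected, desired);
	return expected;
}
```

On a target with ll/sc, a compiler is expected to lower such a call to
a retry loop like the one above; whether and how a barrier is placed on
the failure path is a target-specific decision.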

GCC LoongArch port maintainers have not publicly explained the rationale
for this pattern, but it works fine in userspace on a 3A5000 processor.
Maybe the kernel could also benefit from this "new" pattern, but I'm not
sure.  I suggest discussing this with your compiler team and hardware
team.
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 10/24] LoongArch: Add exception/interrupt handling
  2022-04-30  9:05 ` [PATCH V9 10/24] LoongArch: Add exception/interrupt handling Huacai Chen
@ 2022-05-01 16:27   ` Xi Ruoyao
  2022-05-01 17:08     ` Xi Ruoyao
  0 siblings, 1 reply; 94+ messages in thread
From: Xi Ruoyao @ 2022-05-01 16:27 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang

On Sat, 2022-04-30 at 17:05 +0800, Huacai Chen wrote:
> +struct acpi_madt_lio_pic;
> +struct acpi_madt_eio_pic;
> +struct acpi_madt_ht_pic;
> +struct acpi_madt_bio_pic;
> +struct acpi_madt_msi_pic;
> +struct acpi_madt_lpc_pic;

Where are those defined?  I can't find them and the compilation fails with:

arch/loongarch/kernel/irq.c: In function ‘find_pch_pic’:
arch/loongarch/kernel/irq.c:48:32: error: invalid use of undefined type ‘struct acpi_madt_bio_pic’
   48 |                 start = irq_cfg->gsi_base;
      |                                ^~
arch/loongarch/kernel/irq.c:49:32: error: invalid use of undefined type ‘struct acpi_madt_bio_pic’
   49 |                 end   = irq_cfg->gsi_base + irq_cfg->size;
      |                                ^~
arch/loongarch/kernel/irq.c:49:52: error: invalid use of undefined type ‘struct acpi_madt_bio_pic’
   49 |                 end   = irq_cfg->gsi_base + irq_cfg->size;
      |                                                    ^~

-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 10/24] LoongArch: Add exception/interrupt handling
  2022-05-01 16:27   ` Xi Ruoyao
@ 2022-05-01 17:08     ` Xi Ruoyao
  2022-05-02  0:01       ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: Xi Ruoyao @ 2022-05-01 17:08 UTC (permalink / raw)
  To: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds
  Cc: linux-arch, linux-doc, linux-kernel, Xuefeng Li, Yanteng Si,
	Huacai Chen, Guo Ren, Xuerui Wang, Jiaxun Yang

On Mon, 2022-05-02 at 00:27 +0800, Xi Ruoyao wrote:
> On Sat, 2022-04-30 at 17:05 +0800, Huacai Chen wrote:
> > +struct acpi_madt_lio_pic;
> > +struct acpi_madt_eio_pic;
> > +struct acpi_madt_ht_pic;
> > +struct acpi_madt_bio_pic;
> > +struct acpi_madt_msi_pic;
> > +struct acpi_madt_lpc_pic;
> 
> Where are those defined?  I can't find them and the compilation fails
> with:
> 
> arch/loongarch/kernel/irq.c: In function ‘find_pch_pic’:
> arch/loongarch/kernel/irq.c:48:32: error: invalid use of undefined
> type ‘struct acpi_madt_bio_pic’
>    48 |                 start = irq_cfg->gsi_base;
>       |                                ^~
> arch/loongarch/kernel/irq.c:49:32: error: invalid use of undefined
> type ‘struct acpi_madt_bio_pic’
>    49 |                 end   = irq_cfg->gsi_base + irq_cfg->size;
>       |                                ^~
> arch/loongarch/kernel/irq.c:49:52: error: invalid use of undefined
> type ‘struct acpi_madt_bio_pic’
>    49 |                 end   = irq_cfg->gsi_base + irq_cfg->size;
>       |                                                    ^~

Alright, my bad... I didn't realize the LoongArch patches are split
into multiple series for multiple lists.  But is this the SOP of kernel
patch reviewing?  Would it be easier to just send one series and CC all
relevant lists?

-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
  2022-04-30 10:07     ` Arnd Bergmann
  (?)
@ 2022-05-01 23:36       ` Ard Biesheuvel
  -1 siblings, 0 replies; 94+ messages in thread
From: Ard Biesheuvel @ 2022-05-01 23:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang, Linux ARM, Catalin Marinas, Will Deacon,
	linux-riscv, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-efi

On Sat, 30 Apr 2022 at 13:07, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds zboot (self-extracting compressed kernel) support, all
> > existing in-kernel compressing algorithm and efistub are supported.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> I have no objections to adding a decompressor in principle, and
> the implementation seems reasonable. However, I think we should try to
> be consistent between architectures. On both arm64 and riscv, the
> maintainers decided to not include a decompressor and instead leave
> it up to the boot loader to decompress the kernel and enter it from there.
>

The reason we don't want to add more decompressors is that it
forces us to do a bare-metal boot twice, i.e., create an ID map,
discover memory, etc.
If I am reading this patch correctly, the kernel image is just
decompressed to VMLINUX_LOAD_ADDRESS, regardless of what EFI thinks
that memory is being used for. That kind of misses the point of
booting with EFI.
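
To make the concern concrete: before writing to a fixed address, a
loader would normally confirm that the firmware's memory map marks the
whole destination range as free.  The sketch below uses a deliberately
simplified, hypothetical region descriptor (struct mem_region) rather
than the real UEFI memory descriptor, and assumes adjacent free regions
have already been coalesced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a firmware memory map entry. */
struct mem_region {
	uint64_t start;	/* physical start address */
	uint64_t size;	/* length in bytes */
	bool free;	/* true if the firmware reports it as usable RAM */
};

/*
 * Return true only if [start, start + size) lies entirely inside one
 * region that the map reports as free.  The comparisons are written
 * with subtractions so that they cannot overflow.
 */
static bool range_in_free_region(const struct mem_region *map, size_t n,
				 uint64_t start, uint64_t size)
{
	assert(map != NULL);

	for (size_t i = 0; i < n; i++) {
		const struct mem_region *r = &map[i];

		if (!r->free || start < r->start || size > r->size)
			continue;
		if (start - r->start <= r->size - size)
			return true;
	}
	return false;
}
```

A decompressor that skips this kind of check and writes straight to a
link-time address may overwrite memory the firmware still considers in
use.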

> As I understand it, this is not part of the UEFI boot flow though, so it
> means that you don't get any compressed kernel images at all when
> booting using UEFI (let me know if that is wrong). I assume this is why
> you decided to include the decompressor here after all.
>

The PE/COFF executable format does not support compression, and so EFI
does not support this natively. Currently, it is left to the
bootloader to figure out whether the image is compressed or not, and
perform the decompression before calling the EFI entrypoint if needed.
This is what GRUB and systemd-boot do today (on non-x86).
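
As an illustration of what "figuring out whether the image is
compressed" can look like, here is a minimal sketch that classifies a
buffer by its leading magic bytes.  It is not taken from GRUB or
systemd-boot; sniff_image and enum image_kind are hypothetical names,
and only a few formats are covered.  The magic values themselves are
the standard ones: "MZ" for PE/COFF images, 1f 8b for gzip,
fd "7zXZ" 00 for xz, and 28 b5 2f fd for zstd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum image_kind {
	IMAGE_UNKNOWN,
	IMAGE_PE,	/* uncompressed EFI executable */
	IMAGE_GZIP,
	IMAGE_XZ,
	IMAGE_ZSTD,
};

/* Classify an image by the magic bytes at the start of the buffer. */
static enum image_kind sniff_image(const unsigned char *buf, size_t len)
{
	static const unsigned char gzip_magic[] = { 0x1f, 0x8b };
	static const unsigned char xz_magic[] = { 0xfd, '7', 'z', 'X', 'Z', 0x00 };
	static const unsigned char zstd_magic[] = { 0x28, 0xb5, 0x2f, 0xfd };

	assert(buf != NULL);

	if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
		return IMAGE_PE;
	if (len >= sizeof(gzip_magic) &&
	    !memcmp(buf, gzip_magic, sizeof(gzip_magic)))
		return IMAGE_GZIP;
	if (len >= sizeof(xz_magic) &&
	    !memcmp(buf, xz_magic, sizeof(xz_magic)))
		return IMAGE_XZ;
	if (len >= sizeof(zstd_magic) &&
	    !memcmp(buf, zstd_magic, sizeof(zstd_magic)))
		return IMAGE_ZSTD;
	return IMAGE_UNKNOWN;
}
```

A loader following this approach would decompress anything that is not
already a PE image and then hand the result to the firmware's normal
image loading path.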

I had a stab at doing something similar in EFI, but relying only on
the generic EFI boot services. The advantage of EFI is that you enter
a main() function in C with MMU and caches on, with a memory map, heap
allocator, etc available.

Code for arm64 is here:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=efi-decompressor

> I think we should first aim for consistency here, and handle this the
> same way across the modern architectures, either leaving the
> decompressor code out, or adding it consistently. Maybe it would
> even be possible to have the decompressor code as part of the
> EFI stub and share it between the three architectures (x86 and
> 32-bit arm already support loading compressed kernels using EFI).
>

Indeed. One disadvantage of my approach is that both the inner and
outer EFI executables need to be signed for secure boot, as it uses
the EFI boot services. But that is the point, really. The firmware
already knows how to load and start images, so better to make use of
it.


> Adding the arm64, risc-v and uefi maintainers for further discussion here,
> see full below.
>
>        Arnd
>
> > ---
> >  arch/loongarch/Kbuild                         |   2 +-
> >  arch/loongarch/Kconfig                        |  11 ++
> >  arch/loongarch/Makefile                       |  26 ++-
> >  arch/loongarch/boot/Makefile                  |  55 ++++++
> >  arch/loongarch/boot/boot.lds.S                |  64 +++++++
> >  arch/loongarch/boot/decompress.c              |  98 +++++++++++
> >  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
> >  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
> >  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
> >  arch/loongarch/tools/Makefile                 |  15 ++
> >  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
> >  arch/loongarch/tools/elf-entry.c              |  66 +++++++
> >  12 files changed, 749 insertions(+), 4 deletions(-)
> >  create mode 100644 arch/loongarch/boot/boot.lds.S
> >  create mode 100644 arch/loongarch/boot/decompress.c
> >  create mode 100644 arch/loongarch/boot/string.c
> >  create mode 100644 arch/loongarch/boot/zheader.S
> >  create mode 100644 arch/loongarch/boot/zkernel.S
> >  create mode 100644 arch/loongarch/tools/Makefile
> >  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
> >  create mode 100644 arch/loongarch/tools/elf-entry.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index ab5373d0a24f..d907fdd7ca08 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -3,4 +3,4 @@ obj-y += mm/
> >  obj-y += vdso/
> >
> >  # for cleaning
> > -subdir- += boot
> > +subdir- += boot tools
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 55225ee5f868..6c1042746b2d 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -107,6 +107,7 @@ config LOONGARCH
> >         select PERF_USE_VMALLOC
> >         select RTC_LIB
> >         select SPARSE_IRQ
> > +       select SYS_SUPPORTS_ZBOOT
> >         select SYSCTL_EXCEPTION_TRACE
> >         select SWIOTLB
> >         select TRACE_IRQFLAGS_SUPPORT
> > @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
> >         bool
> >         default y
> >
> > +config SYS_SUPPORTS_ZBOOT
> > +       bool
> > +       select HAVE_KERNEL_GZIP
> > +       select HAVE_KERNEL_BZIP2
> > +       select HAVE_KERNEL_LZ4
> > +       select HAVE_KERNEL_LZMA
> > +       select HAVE_KERNEL_LZO
> > +       select HAVE_KERNEL_XZ
> > +       select HAVE_KERNEL_ZSTD
> > +
> >  config MACH_LOONGSON32
> >         def_bool 32BIT
> >
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index d88a792dafbe..1ed5b8466565 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -5,12 +5,31 @@
> >
> >  boot   := arch/loongarch/boot
> >
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> > +
> >  ifndef CONFIG_EFI_STUB
> >  KBUILD_IMAGE   = $(boot)/vmlinux
> >  else
> >  KBUILD_IMAGE   = $(boot)/vmlinux.efi
> >  endif
> >
> > +else
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinuz
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> > +endif
> > +
> > +endif
> > +
> > +load-y         = 0x9000000000200000
> > +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +archscripts: scripts_basic
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
> >  cflags-y += -ffreestanding
> >  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> >
> > -load-y         = 0x9000000000200000
> > -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > -
> >  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
> >
> >  KBUILD_AFLAGS  += $(cflags-y)
> > @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
> >         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> >         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +else
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> > +endif
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > index 66f2293c34b2..c26a36004ae2 100644
> > --- a/arch/loongarch/boot/Makefile
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
> >
> >  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> >         $(call if_changed,eficopy)
> > +
> > +# zboot
> > +extra-y        += boot.lds
> > +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> > +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> > +
> > +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> > +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> > +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> > +
> > +BOOT_HEAP_SIZE := 0x400000
> > +BOOT_STACK_SIZE        := 0x002000
> > +
> > +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +targets += vmlinux.bin
> > +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> > +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> > +       $(call if_changed,objcopy)
> > +
> > +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> > +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> > +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> > +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> > +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> > +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> > +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> > +
> > +targets += vmlinux.bin.z
> > +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> > +       $(call if_changed,$(tool_y))
> > +
> > +targets += $(notdir $(vmlinuzobjs-y))
> > +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> > +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> > +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> > +
> > +quiet_cmd_zld = LD      $@
> > +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> > +
> > +targets += vmlinuz
> > +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> > +       $(call if_changed,zld)
> > +       $(call if_changed,strip)
> > +
> > +targets += vmlinuz.efi
> > +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> > new file mode 100644
> > index 000000000000..23e698782afd
> > --- /dev/null
> > +++ b/arch/loongarch/boot/boot.lds.S
> > @@ -0,0 +1,64 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * ld.script for compressed kernel support of LoongArch
> > + *
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include "../kernel/image-vars.h"
> > +
> > +/*
> > + * Max avaliable Page Size is 64K, so we set SectionAlignment
> > + * field of EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > +
> > +OUTPUT_ARCH(loongarch)
> > +ENTRY(kernel_entry)
> > +PHDRS {
> > +       text PT_LOAD FLAGS(7); /* RWX */
> > +}
> > +SECTIONS
> > +{
> > +       . = VMLINUZ_LOAD_ADDRESS;
> > +
> > +       _text = .;
> > +       .head.text : {
> > +               *(.head.text)
> > +       }
> > +
> > +       .text : {
> > +               *(.text)
> > +               *(.init.text)
> > +               *(.rodata)
> > +       }: text
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _data = .;
> > +       .data : {
> > +               *(.data)
> > +               *(.init.data)
> > +               /* Put the compressed image here */
> > +               __image_begin = .;
> > +               *(.image)
> > +               __image_end = .;
> > +               CONSTRUCTORS
> > +               . = ALIGN(PECOFF_FILE_ALIGN);
> > +       }
> > +       _edata = .;
> > +
> > +       .bss : {
> > +               *(.bss)
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _end = .;
> > +
> > +       /DISCARD/ : {
> > +               *(.options)
> > +               *(.comment)
> > +               *(.note)
> > +       }
> > +}
> > diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> > new file mode 100644
> > index 000000000000..8f55fcd8f285
> > --- /dev/null
> > +++ b/arch/loongarch/boot/decompress.c
> > @@ -0,0 +1,98 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/types.h>
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +#include <linux/libfdt.h>
> > +
> > +#include <asm/addrspace.h>
> > +
> > +/*
> > + * These two variables specify the free mem region
> > + * that can be used for temporary malloc area
> > + */
> > +unsigned long free_mem_ptr;
> > +unsigned long free_mem_end_ptr;
> > +
> > +/* The linker tells us where the image is. */
> > +extern unsigned char __image_begin, __image_end;
> > +
> > +#define puts(s) do {} while (0)
> > +#define puthex(val) do {} while (0)
> > +
> > +void error(char *x)
> > +{
> > +       puts("\n\n");
> > +       puts(x);
> > +       puts("\n\n -- System halted");
> > +
> > +       while (1)
> > +               ;       /* Halt */
> > +}
> > +
> > +/* activate the code for pre-boot environment */
> > +#define STATIC static
> > +
> > +#include "../../../../lib/ashldi3.c"
> > +
> > +#ifdef CONFIG_KERNEL_GZIP
> > +#include "../../../../lib/decompress_inflate.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_BZIP2
> > +#include "../../../../lib/decompress_bunzip2.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZ4
> > +#include "../../../../lib/decompress_unlz4.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZMA
> > +#include "../../../../lib/decompress_unlzma.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZO
> > +#include "../../../../lib/decompress_unlzo.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_XZ
> > +#include "../../../../lib/decompress_unxz.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_ZSTD
> > +#include "../../../../lib/decompress_unzstd.c"
> > +#endif
> > +
> > +void decompress_kernel(unsigned long boot_heap_start)
> > +{
> > +       unsigned long zimage_start, zimage_size;
> > +
> > +       zimage_start = (unsigned long)(&__image_begin);
> > +       zimage_size = (unsigned long)(&__image_end) -
> > +           (unsigned long)(&__image_begin);
> > +
> > +       puts("zimage at:     ");
> > +       puthex(zimage_start);
> > +       puts(" ");
> > +       puthex(zimage_size + zimage_start);
> > +       puts("\n");
> > +
> > +       /* This area are prepared for mallocing when decompressing */
> > +       free_mem_ptr = boot_heap_start;
> > +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> > +
> > +       /* Display standard Linux/LoongArch boot prompt */
> > +       puts("Uncompressing Linux at load address ");
> > +       puthex(VMLINUX_LOAD_ADDRESS);
> > +       puts("\n");
> > +
> > +       /* Decompress the kernel with according algorithm */
> > +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> > +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> > +
> > +       puts("Now, booting the kernel...\n");
> > +}
> > diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> > new file mode 100644
> > index 000000000000..3f746e7c2bb5
> > --- /dev/null
> > +++ b/arch/loongarch/boot/string.c
> > @@ -0,0 +1,166 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/loongarch/boot/string.c
> > + *
> > + * Very small subset of simple string routines
> > + */
> > +
> > +#include <linux/types.h>
> > +
> > +void __weak *memset(void *s, int c, size_t n)
> > +{
> > +       int i;
> > +       char *ss = s;
> > +
> > +       for (i = 0; i < n; i++)
> > +               ss[i] = c;
> > +       return s;
> > +}
> > +
> > +void __weak *memcpy(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       for (i = 0; i < n; i++)
> > +               d[i] = s[i];
> > +       return dest;
> > +}
> > +
> > +void __weak *memmove(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       if (d < s) {
> > +               for (i = 0; i < n; i++)
> > +                       d[i] = s[i];
> > +       } else if (d > s) {
> > +               for (i = n - 1; i >= 0; i--)
> > +                       d[i] = s[i];
> > +       }
> > +
> > +       return dest;
> > +}
> > +
> > +int __weak memcmp(const void *cs, const void *ct, size_t count)
> > +{
> > +       int res = 0;
> > +       const unsigned char *su1, *su2;
> > +
> > +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> > +               res = *su1 - *su2;
> > +               if (res != 0)
> > +                       break;
> > +       }
> > +       return res;
> > +}
> > +
> > +int __weak strcmp(const char *str1, const char *str2)
> > +{
> > +       int delta = 0;
> > +       const unsigned char *s1 = (const unsigned char *)str1;
> > +       const unsigned char *s2 = (const unsigned char *)str2;
> > +
> > +       while (*s1 || *s2) {
> > +               delta = *s1 - *s2;
> > +               if (delta)
> > +                       return delta;
> > +               s1++;
> > +               s2++;
> > +       }
> > +       return 0;
> > +}
> > +
> > +size_t __weak strlen(const char *s)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +size_t __weak strnlen(const char *s, size_t count)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; count-- && *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> > +{
> > +       size_t l2;
> > +
> > +       l2 = strlen(s2);
> > +       if (!l2)
> > +               return (char *)s1;
> > +       while (len >= l2) {
> > +               len--;
> > +               if (!memcmp(s1, s2, l2))
> > +                       return (char *)s1;
> > +               s1++;
> > +       }
> > +       return NULL;
> > +}
> > +
> > +#undef strcat
> > +char * __weak strcat(char *dest, const char *src)
> > +{
> > +       char *tmp = dest;
> > +
> > +       while (*dest)
> > +               dest++;
> > +       while ((*dest++ = *src++) != '\0')
> > +               ;
> > +       return tmp;
> > +}
> > +
> > +char * __weak strncat(char *dest, const char *src, size_t count)
> > +{
> > +       char *tmp = dest;
> > +
> > +       if (count) {
> > +               while (*dest)
> > +                       dest++;
> > +               while ((*dest++ = *src++) != 0) {
> > +                       if (--count == 0) {
> > +                               *dest = '\0';
> > +                               break;
> > +                       }
> > +               }
> > +       }
> > +       return tmp;
> > +}
> > +
> > +char * __weak strpbrk(const char *cs, const char *ct)
> > +{
> > +       const char *sc1, *sc2;
> > +
> > +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> > +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> > +                       if (*sc1 == *sc2)
> > +                               return (char *)sc1;
> > +               }
> > +       }
> > +       return NULL;
> > +}
> > +
> > +char * __weak strsep(char **s, const char *ct)
> > +{
> > +       char *sbegin = *s;
> > +       char *end;
> > +
> > +       if (sbegin == NULL)
> > +               return NULL;
> > +
> > +       end = strpbrk(sbegin, ct);
> > +       if (end)
> > +               *end++ = '\0';
> > +       *s = end;
> > +       return sbegin;
> > +}
> > diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> > new file mode 100644
> > index 000000000000..4bc50d953ec7
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zheader.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:
> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   _data - efi_header_end                  /* SizeOfCode */
> > +       .long   _end - _data                            /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */
> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   _data - efi_header_end                  /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   _data - efi_header_end                  /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - _data                            /* VirtualSize */
> > +       .long   _data - _head                           /* VirtualAddress */
> > +       .long   _edata - _data                          /* SizeOfRawData */
> > +       .long   _data - _head                           /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
> > diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> > new file mode 100644
> > index 000000000000..13a8a14a2328
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zkernel.S
> > @@ -0,0 +1,99 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/linkage.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/asm.h>
> > +#include <asm/loongarch.h>
> > +#include <asm/regdef.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "zheader.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"
> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUZ_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)
> > +
> > +#endif
> > +
> > +       __INIT
> > +
> > +SYM_CODE_START(kernel_entry)
> > +       /* Save boot rom start args */
> > +       move    s0, a0
> > +       move    s1, a1
> > +       move    s2, a2
> > +       move    s3, a3
> > +
> > +       /* Config Direct Mapping */
> > +       li.d    t0, CSR_DMW0_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> > +       li.d    t0, CSR_DMW1_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> > +
> > +       /* Clear BSS */
> > +       la.abs  a0, _edata
> > +       la.abs  a2, _end
> > +1:     st.d    zero, a0, 0
> > +       addi.d  a0, a0, 8
> > +       bne     a2, a0, 1b
> > +
> > +       la.abs  a0, .heap          /* heap address */
> > +       la.abs  sp, .stack + 8192  /* stack address */
> > +
> > +       la      ra, 2f
> > +       la      t4, decompress_kernel
> > +       jirl    zero, t4, 0
> > +2:
> > +       move    a0, s0
> > +       move    a1, s1
> > +       move    a2, s2
> > +       move    a3, s3
> > +       PTR_LI  t4, KERNEL_ENTRY
> > +       jirl    zero, t4, 0
> > +3:
> > +       b       3b
> > +SYM_CODE_END(kernel_entry)
> > +
> > +       .comm .heap, BOOT_HEAP_SIZE, 4
> > +       .comm .stack, BOOT_STACK_SIZE, 4
> > +
> > +       .align 4
> > +       .section .image, "a", %progbits
> > +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> > diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> > new file mode 100644
> > index 000000000000..8a6181c82a91
> > --- /dev/null
> > +++ b/arch/loongarch/tools/Makefile
> > @@ -0,0 +1,15 @@
> > +#
> > +# arch/loongarch/boot/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +hostprogs := elf-entry
> > +PHONY += elf-entry
> > +elf-entry: $(obj)/elf-entry
> > +       @:
> > +
> > +hostprogs += calc_vmlinuz_load_addr
> > +PHONY += calc_vmlinuz_load_addr
> > +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> > +       @:
> > diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > new file mode 100644
> > index 000000000000..5e2ca6b4dff6
> > --- /dev/null
> > +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > @@ -0,0 +1,51 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <errno.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> > +       struct stat sb;
> > +
> > +       if (argc != 3) {
> > +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (stat(argv[1], &sb) == -1) {
> > +               perror("stat");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       /* Convert hex characters to dec number */
> > +       errno = 0;
> > +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> > +               if (errno != 0)
> > +                       perror("sscanf");
> > +               else
> > +                       fprintf(stderr, "No matching characters\n");
> > +
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       vmlinux_size = (uint64_t)sb.st_size;
> > +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> > +
> > +       /*
> > +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> > +        * which may be as large as 64KB depending on the kernel configuration.
> > +        */
> > +
> > +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> > +
> > +       printf("0x%llx\n", vmlinuz_load_addr);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> > new file mode 100644
> > index 000000000000..c80721e0dee1
> > --- /dev/null
> > +++ b/arch/loongarch/tools/elf-entry.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <elf.h>
> > +#include <inttypes.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +__attribute__((noreturn))
> > +static void die(const char *msg)
> > +{
> > +       fputs(msg, stderr);
> > +       exit(EXIT_FAILURE);
> > +}
> > +
> > +int main(int argc, const char *argv[])
> > +{
> > +       uint64_t entry;
> > +       size_t nread;
> > +       FILE *file;
> > +       union {
> > +               Elf32_Ehdr ehdr32;
> > +               Elf64_Ehdr ehdr64;
> > +       } hdr;
> > +
> > +       if (argc != 2)
> > +               die("Usage: elf-entry <elf-file>\n");
> > +
> > +       file = fopen(argv[1], "r");
> > +       if (!file) {
> > +               perror("Unable to open input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       nread = fread(&hdr, 1, sizeof(hdr), file);
> > +       if (nread != sizeof(hdr)) {
> > +               fclose(file);
> > +               perror("Unable to read input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> > +               fclose(file);
> > +               die("Input is not an ELF\n");
> > +       }
> > +
> > +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> > +       case ELFCLASS32:
> > +               /* Sign extend to form a canonical address */
> > +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> > +               break;
> > +
> > +       case ELFCLASS64:
> > +               entry = hdr.ehdr64.e_entry;
> > +               break;
> > +
> > +       default:
> > +               fclose(file);
> > +               die("Invalid ELF class\n");
> > +       }
> > +
> > +       fclose(file);
> > +       printf("0x%016" PRIx64 "\n", entry);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > --
> > 2.27.0
> >
>


* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
@ 2022-05-01 23:36       ` Ard Biesheuvel
  0 siblings, 0 replies; 94+ messages in thread
From: Ard Biesheuvel @ 2022-05-01 23:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang, Linux ARM, Catalin Marinas, Will Deacon,
	linux-riscv, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-efi

On Sat, 30 Apr 2022 at 13:07, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds zboot (self-extracting compressed kernel) support, all
> > existing in-kernel compressing algorithm and efistub are supported.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> I have no objections to adding a decompressor in principle, and
> the implementation seems reasonable. However, I think we should try to
> be consistent between architectures. On both arm64 and riscv, the
> maintainers decided to not include a decompressor and instead leave
> it up to the boot loader to decompress the kernel and enter it from there.
>

The reason we don't want to add more decompressors is that each one
forces us to do a bare-metal boot twice, i.e., create an ID map,
discover memory, and so on.

If I am reading this patch correctly, the kernel image is just
decompressed to VMLINUX_LOAD_ADDRESS, regardless of what EFI thinks
that memory is being used for. That kind of misses the point of
booting with EFI.
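
As a rough illustration of the concern above: a loader that respects the
firmware's view of memory would check that the destination range is reported
as free before writing to it. The sketch below is plain host-side C and is not
taken from the patch or from the EFI stub; `struct mem_region`, the
`region_type` enum and `range_is_free()` are hypothetical names, and the map is
assumed to have adjacent free regions already merged.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory map entry (not the real EFI descriptor). */
enum region_type { REGION_FREE, REGION_IN_USE };

struct mem_region {
	uint64_t start;		/* physical start address */
	uint64_t size;		/* length in bytes */
	enum region_type type;
};

/*
 * Return true if [start, start + size) lies entirely inside one region
 * marked REGION_FREE. Assumes adjacent free regions have been merged.
 */
bool range_is_free(const struct mem_region *map, size_t nr,
		   uint64_t start, uint64_t size)
{
	size_t i;

	/* Reject empty ranges and ranges that wrap around the address space. */
	if (size == 0 || start + size < start)
		return false;

	for (i = 0; i < nr; i++) {
		const struct mem_region *r = &map[i];
		uint64_t offset;

		if (r->type != REGION_FREE || start < r->start)
			continue;

		/* Written this way so none of the arithmetic can overflow. */
		offset = start - r->start;
		if (offset <= r->size && size <= r->size - offset)
			return true;
	}
	return false;
}
```

A decompressor that writes to a fixed address performs no such check, which is
the point being made: nothing guarantees that the firmware has left that range
unused.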

> As I understand it, this is not part of the UEFI boot flow though, so it
> means that you don't get any compressed kernel images at all when
> booting using UEFI (let me know if that is wrong). I assume this is why
> you decided to include the decompressor here after all.
>

The PE/COFF executable format does not support compression, and so EFI
does not support this natively. Currently, it is left to the
bootloader to figure out whether the image is compressed or not, and
perform the decompression before calling the EFI entrypoint if needed.
This is what GRUB and systemd-boot do today (on non-x86).
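
The detection step described above can be sketched as a check on the leading
bytes of the image. This is only an illustration, not how GRUB or systemd-boot
actually implement it; `enum image_kind` and `sniff_image()` are hypothetical
names, and only a few well-known magic numbers are covered (the "MZ" DOS/PE
signature, gzip, xz and zstd).

```c
#include <stddef.h>
#include <string.h>

enum image_kind { IMAGE_UNKNOWN, IMAGE_PE, IMAGE_GZIP, IMAGE_XZ, IMAGE_ZSTD };

/* Classify an image by its leading magic bytes. */
enum image_kind sniff_image(const unsigned char *buf, size_t len)
{
	static const unsigned char xz_magic[] = { 0xfd, '7', 'z', 'X', 'Z', 0x00 };
	static const unsigned char zstd_magic[] = { 0x28, 0xb5, 0x2f, 0xfd };

	/* An "MZ" signature suggests a PE/COFF image that can be loaded as-is. */
	if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
		return IMAGE_PE;

	/* gzip streams start with the two bytes 0x1f 0x8b. */
	if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
		return IMAGE_GZIP;

	if (len >= sizeof(xz_magic) && !memcmp(buf, xz_magic, sizeof(xz_magic)))
		return IMAGE_XZ;

	if (len >= sizeof(zstd_magic) &&
	    !memcmp(buf, zstd_magic, sizeof(zstd_magic)))
		return IMAGE_ZSTD;

	return IMAGE_UNKNOWN;
}
```

A bootloader following this scheme would decompress anything other than
IMAGE_PE into a buffer first and only then hand the result to the EFI
entrypoint.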

I had a stab at doing something similar in EFI, but relying only on
the generic EFI boot services. The advantage of EFI is that you enter
a main() function in C with the MMU and caches on, and with a memory
map, heap allocator, etc. available.

Code for arm64 is here:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=efi-decompressor

> I think we should first aim for consistency here, and handle this the
> same way across the modern architectures, either leaving the
> decompressor code out, or adding it consistently. Maybe it would
> even be possible to have the decompressor code as part of the
> EFI stub and share it between the three architectures (x86 and
> 32-bit arm already support loading compressed kernels using EFI).
>

Indeed. One disadvantage of my approach is that both the inner and
outer EFI executables need to be signed for secure boot, as it uses
the EFI boot services. But that is the point, really. The firmware
already knows how to load and start images, so better to make use of
it.
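
The flow described above (the outer image unpacks the payload, then asks the
firmware to load and start it) can be modelled on a host as follows. This is a
sketch under stated assumptions and not the code from the efi-decompressor
branch: `struct fw_services`, `decompress_fn` and `boot_inner_image()` are
hypothetical stand-ins for the firmware's image loading services, and the
`fake_*` helpers exist only so the flow can be exercised outside firmware.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the firmware services used by the outer image. */
struct fw_services {
	/* Validate (e.g. check the signature of) and load an in-memory image. */
	int (*load_image)(const void *buf, size_t size, void **handle);
	/* Transfer control to an image previously returned by load_image(). */
	int (*start_image)(void *handle);
};

/* Hypothetical decompressor hook: returns 0 on success and sets *out_size. */
typedef int (*decompress_fn)(const void *in, size_t in_size,
			     void *out, size_t out_cap, size_t *out_size);

/* Outer image: unpack the payload, then let the firmware load and start it. */
int boot_inner_image(const struct fw_services *fw, decompress_fn decompress,
		     const void *payload, size_t payload_size,
		     void *scratch, size_t scratch_cap)
{
	void *handle = NULL;
	size_t image_size = 0;
	int ret;

	ret = decompress(payload, payload_size, scratch, scratch_cap, &image_size);
	if (ret)
		return ret;

	/* The firmware, not the outer image, validates and places the result. */
	ret = fw->load_image(scratch, image_size, &handle);
	if (ret)
		return ret;

	return fw->start_image(handle);
}

/* Fakes for trying the flow on a host; these are not firmware code. */
int fake_loads, fake_starts;

int fake_decompress(const void *in, size_t in_size,
		    void *out, size_t out_cap, size_t *out_size)
{
	if (in_size > out_cap)
		return -1;
	memcpy(out, in, in_size);	/* "stored" payload: copy it through */
	*out_size = in_size;
	return 0;
}

int fake_load_image(const void *buf, size_t size, void **handle)
{
	/* Toy validation: accept only buffers that start with "MZ". */
	if (size < 2 || memcmp(buf, "MZ", 2))
		return -1;
	fake_loads++;
	*handle = (void *)buf;
	return 0;
}

int fake_start_image(void *handle)
{
	(void)handle;
	fake_starts++;
	return 0;
}
```

Because the inner image goes through the same load path as any other
executable, it is subject to the same validation, which is why both images
would need to be signed when secure boot is enabled.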


> Adding the arm64, risc-v and uefi maintainers for further discussion here,
> see full below.
>
>        Arnd
>
> > ---
> >  arch/loongarch/Kbuild                         |   2 +-
> >  arch/loongarch/Kconfig                        |  11 ++
> >  arch/loongarch/Makefile                       |  26 ++-
> >  arch/loongarch/boot/Makefile                  |  55 ++++++
> >  arch/loongarch/boot/boot.lds.S                |  64 +++++++
> >  arch/loongarch/boot/decompress.c              |  98 +++++++++++
> >  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
> >  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
> >  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
> >  arch/loongarch/tools/Makefile                 |  15 ++
> >  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
> >  arch/loongarch/tools/elf-entry.c              |  66 +++++++
> >  12 files changed, 749 insertions(+), 4 deletions(-)
> >  create mode 100644 arch/loongarch/boot/boot.lds.S
> >  create mode 100644 arch/loongarch/boot/decompress.c
> >  create mode 100644 arch/loongarch/boot/string.c
> >  create mode 100644 arch/loongarch/boot/zheader.S
> >  create mode 100644 arch/loongarch/boot/zkernel.S
> >  create mode 100644 arch/loongarch/tools/Makefile
> >  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
> >  create mode 100644 arch/loongarch/tools/elf-entry.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index ab5373d0a24f..d907fdd7ca08 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -3,4 +3,4 @@ obj-y += mm/
> >  obj-y += vdso/
> >
> >  # for cleaning
> > -subdir- += boot
> > +subdir- += boot tools
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 55225ee5f868..6c1042746b2d 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -107,6 +107,7 @@ config LOONGARCH
> >         select PERF_USE_VMALLOC
> >         select RTC_LIB
> >         select SPARSE_IRQ
> > +       select SYS_SUPPORTS_ZBOOT
> >         select SYSCTL_EXCEPTION_TRACE
> >         select SWIOTLB
> >         select TRACE_IRQFLAGS_SUPPORT
> > @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
> >         bool
> >         default y
> >
> > +config SYS_SUPPORTS_ZBOOT
> > +       bool
> > +       select HAVE_KERNEL_GZIP
> > +       select HAVE_KERNEL_BZIP2
> > +       select HAVE_KERNEL_LZ4
> > +       select HAVE_KERNEL_LZMA
> > +       select HAVE_KERNEL_LZO
> > +       select HAVE_KERNEL_XZ
> > +       select HAVE_KERNEL_ZSTD
> > +
> >  config MACH_LOONGSON32
> >         def_bool 32BIT
> >
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index d88a792dafbe..1ed5b8466565 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -5,12 +5,31 @@
> >
> >  boot   := arch/loongarch/boot
> >
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> > +
> >  ifndef CONFIG_EFI_STUB
> >  KBUILD_IMAGE   = $(boot)/vmlinux
> >  else
> >  KBUILD_IMAGE   = $(boot)/vmlinux.efi
> >  endif
> >
> > +else
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinuz
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> > +endif
> > +
> > +endif
> > +
> > +load-y         = 0x9000000000200000
> > +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +archscripts: scripts_basic
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
> >  cflags-y += -ffreestanding
> >  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> >
> > -load-y         = 0x9000000000200000
> > -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > -
> >  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
> >
> >  KBUILD_AFLAGS  += $(cflags-y)
> > @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
> >         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> >         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +else
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> > +endif
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > index 66f2293c34b2..c26a36004ae2 100644
> > --- a/arch/loongarch/boot/Makefile
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
> >
> >  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> >         $(call if_changed,eficopy)
> > +
> > +# zboot
> > +extra-y        += boot.lds
> > +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> > +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> > +
> > +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> > +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> > +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> > +
> > +BOOT_HEAP_SIZE := 0x400000
> > +BOOT_STACK_SIZE        := 0x002000
> > +
> > +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +targets += vmlinux.bin
> > +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> > +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> > +       $(call if_changed,objcopy)
> > +
> > +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> > +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> > +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> > +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> > +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> > +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> > +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> > +
> > +targets += vmlinux.bin.z
> > +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> > +       $(call if_changed,$(tool_y))
> > +
> > +targets += $(notdir $(vmlinuzobjs-y))
> > +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> > +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> > +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> > +
> > +quiet_cmd_zld = LD      $@
> > +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> > +
> > +targets += vmlinuz
> > +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> > +       $(call if_changed,zld)
> > +       $(call if_changed,strip)
> > +
> > +targets += vmlinuz.efi
> > +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> > new file mode 100644
> > index 000000000000..23e698782afd
> > --- /dev/null
> > +++ b/arch/loongarch/boot/boot.lds.S
> > @@ -0,0 +1,64 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * ld.script for compressed kernel support of LoongArch
> > + *
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include "../kernel/image-vars.h"
> > +
> > +/*
> > + * Max avaliable Page Size is 64K, so we set SectionAlignment
> > + * field of EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > +
> > +OUTPUT_ARCH(loongarch)
> > +ENTRY(kernel_entry)
> > +PHDRS {
> > +       text PT_LOAD FLAGS(7); /* RWX */
> > +}
> > +SECTIONS
> > +{
> > +       . = VMLINUZ_LOAD_ADDRESS;
> > +
> > +       _text = .;
> > +       .head.text : {
> > +               *(.head.text)
> > +       }
> > +
> > +       .text : {
> > +               *(.text)
> > +               *(.init.text)
> > +               *(.rodata)
> > +       }: text
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _data = .;
> > +       .data : {
> > +               *(.data)
> > +               *(.init.data)
> > +               /* Put the compressed image here */
> > +               __image_begin = .;
> > +               *(.image)
> > +               __image_end = .;
> > +               CONSTRUCTORS
> > +               . = ALIGN(PECOFF_FILE_ALIGN);
> > +       }
> > +       _edata = .;
> > +
> > +       .bss : {
> > +               *(.bss)
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _end = .;
> > +
> > +       /DISCARD/ : {
> > +               *(.options)
> > +               *(.comment)
> > +               *(.note)
> > +       }
> > +}
> > diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> > new file mode 100644
> > index 000000000000..8f55fcd8f285
> > --- /dev/null
> > +++ b/arch/loongarch/boot/decompress.c
> > @@ -0,0 +1,98 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/types.h>
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +#include <linux/libfdt.h>
> > +
> > +#include <asm/addrspace.h>
> > +
> > +/*
> > + * These two variables specify the free mem region
> > + * that can be used for temporary malloc area
> > + */
> > +unsigned long free_mem_ptr;
> > +unsigned long free_mem_end_ptr;
> > +
> > +/* The linker tells us where the image is. */
> > +extern unsigned char __image_begin, __image_end;
> > +
> > +#define puts(s) do {} while (0)
> > +#define puthex(val) do {} while (0)
> > +
> > +void error(char *x)
> > +{
> > +       puts("\n\n");
> > +       puts(x);
> > +       puts("\n\n -- System halted");
> > +
> > +       while (1)
> > +               ;       /* Halt */
> > +}
> > +
> > +/* activate the code for pre-boot environment */
> > +#define STATIC static
> > +
> > +#include "../../../../lib/ashldi3.c"
> > +
> > +#ifdef CONFIG_KERNEL_GZIP
> > +#include "../../../../lib/decompress_inflate.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_BZIP2
> > +#include "../../../../lib/decompress_bunzip2.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZ4
> > +#include "../../../../lib/decompress_unlz4.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZMA
> > +#include "../../../../lib/decompress_unlzma.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZO
> > +#include "../../../../lib/decompress_unlzo.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_XZ
> > +#include "../../../../lib/decompress_unxz.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_ZSTD
> > +#include "../../../../lib/decompress_unzstd.c"
> > +#endif
> > +
> > +void decompress_kernel(unsigned long boot_heap_start)
> > +{
> > +       unsigned long zimage_start, zimage_size;
> > +
> > +       zimage_start = (unsigned long)(&__image_begin);
> > +       zimage_size = (unsigned long)(&__image_end) -
> > +           (unsigned long)(&__image_begin);
> > +
> > +       puts("zimage at:     ");
> > +       puthex(zimage_start);
> > +       puts(" ");
> > +       puthex(zimage_size + zimage_start);
> > +       puts("\n");
> > +
> > +       /* This area are prepared for mallocing when decompressing */
> > +       free_mem_ptr = boot_heap_start;
> > +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> > +
> > +       /* Display standard Linux/LoongArch boot prompt */
> > +       puts("Uncompressing Linux at load address ");
> > +       puthex(VMLINUX_LOAD_ADDRESS);
> > +       puts("\n");
> > +
> > +       /* Decompress the kernel with according algorithm */
> > +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> > +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> > +
> > +       puts("Now, booting the kernel...\n");
> > +}
> > diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> > new file mode 100644
> > index 000000000000..3f746e7c2bb5
> > --- /dev/null
> > +++ b/arch/loongarch/boot/string.c
> > @@ -0,0 +1,166 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/loongarch/boot/string.c
> > + *
> > + * Very small subset of simple string routines
> > + */
> > +
> > +#include <linux/types.h>
> > +
> > +void __weak *memset(void *s, int c, size_t n)
> > +{
> > +       int i;
> > +       char *ss = s;
> > +
> > +       for (i = 0; i < n; i++)
> > +               ss[i] = c;
> > +       return s;
> > +}
> > +
> > +void __weak *memcpy(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       for (i = 0; i < n; i++)
> > +               d[i] = s[i];
> > +       return dest;
> > +}
> > +
> > +void __weak *memmove(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       if (d < s) {
> > +               for (i = 0; i < n; i++)
> > +                       d[i] = s[i];
> > +       } else if (d > s) {
> > +               for (i = n - 1; i >= 0; i--)
> > +                       d[i] = s[i];
> > +       }
> > +
> > +       return dest;
> > +}
> > +
> > +int __weak memcmp(const void *cs, const void *ct, size_t count)
> > +{
> > +       int res = 0;
> > +       const unsigned char *su1, *su2;
> > +
> > +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> > +               res = *su1 - *su2;
> > +               if (res != 0)
> > +                       break;
> > +       }
> > +       return res;
> > +}
> > +
> > +int __weak strcmp(const char *str1, const char *str2)
> > +{
> > +       int delta = 0;
> > +       const unsigned char *s1 = (const unsigned char *)str1;
> > +       const unsigned char *s2 = (const unsigned char *)str2;
> > +
> > +       while (*s1 || *s2) {
> > +               delta = *s1 - *s2;
> > +               if (delta)
> > +                       return delta;
> > +               s1++;
> > +               s2++;
> > +       }
> > +       return 0;
> > +}
> > +
> > +size_t __weak strlen(const char *s)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +size_t __weak strnlen(const char *s, size_t count)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; count-- && *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> > +{
> > +       size_t l2;
> > +
> > +       l2 = strlen(s2);
> > +       if (!l2)
> > +               return (char *)s1;
> > +       while (len >= l2) {
> > +               len--;
> > +               if (!memcmp(s1, s2, l2))
> > +                       return (char *)s1;
> > +               s1++;
> > +       }
> > +       return NULL;
> > +}
> > +
> > +#undef strcat
> > +char * __weak strcat(char *dest, const char *src)
> > +{
> > +       char *tmp = dest;
> > +
> > +       while (*dest)
> > +               dest++;
> > +       while ((*dest++ = *src++) != '\0')
> > +               ;
> > +       return tmp;
> > +}
> > +
> > +char * __weak strncat(char *dest, const char *src, size_t count)
> > +{
> > +       char *tmp = dest;
> > +
> > +       if (count) {
> > +               while (*dest)
> > +                       dest++;
> > +               while ((*dest++ = *src++) != 0) {
> > +                       if (--count == 0) {
> > +                               *dest = '\0';
> > +                               break;
> > +                       }
> > +               }
> > +       }
> > +       return tmp;
> > +}
> > +
> > +char * __weak strpbrk(const char *cs, const char *ct)
> > +{
> > +       const char *sc1, *sc2;
> > +
> > +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> > +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> > +                       if (*sc1 == *sc2)
> > +                               return (char *)sc1;
> > +               }
> > +       }
> > +       return NULL;
> > +}
> > +
> > +char * __weak strsep(char **s, const char *ct)
> > +{
> > +       char *sbegin = *s;
> > +       char *end;
> > +
> > +       if (sbegin == NULL)
> > +               return NULL;
> > +
> > +       end = strpbrk(sbegin, ct);
> > +       if (end)
> > +               *end++ = '\0';
> > +       *s = end;
> > +       return sbegin;
> > +}
> > diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> > new file mode 100644
> > index 000000000000..4bc50d953ec7
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zheader.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:
> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   _data - efi_header_end                  /* SizeOfCode */
> > +       .long   _end - _data                            /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */
> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   _data - efi_header_end                  /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   _data - efi_header_end                  /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - _data                            /* VirtualSize */
> > +       .long   _data - _head                           /* VirtualAddress */
> > +       .long   _edata - _data                          /* SizeOfRawData */
> > +       .long   _data - _head                           /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
> > diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> > new file mode 100644
> > index 000000000000..13a8a14a2328
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zkernel.S
> > @@ -0,0 +1,99 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/linkage.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/asm.h>
> > +#include <asm/loongarch.h>
> > +#include <asm/regdef.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "zheader.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"
> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUZ_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)
> > +
> > +#endif
> > +
> > +       __INIT
> > +
> > +SYM_CODE_START(kernel_entry)
> > +       /* Save boot rom start args */
> > +       move    s0, a0
> > +       move    s1, a1
> > +       move    s2, a2
> > +       move    s3, a3
> > +
> > +       /* Config Direct Mapping */
> > +       li.d    t0, CSR_DMW0_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> > +       li.d    t0, CSR_DMW1_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> > +
> > +       /* Clear BSS */
> > +       la.abs  a0, _edata
> > +       la.abs  a2, _end
> > +1:     st.d    zero, a0, 0
> > +       addi.d  a0, a0, 8
> > +       bne     a2, a0, 1b
> > +
> > +       la.abs  a0, .heap          /* heap address */
> > +       la.abs  sp, .stack + 8192  /* stack address */
> > +
> > +       la      ra, 2f
> > +       la      t4, decompress_kernel
> > +       jirl    zero, t4, 0
> > +2:
> > +       move    a0, s0
> > +       move    a1, s1
> > +       move    a2, s2
> > +       move    a3, s3
> > +       PTR_LI  t4, KERNEL_ENTRY
> > +       jirl    zero, t4, 0
> > +3:
> > +       b       3b
> > +SYM_CODE_END(kernel_entry)
> > +
> > +       .comm .heap, BOOT_HEAP_SIZE, 4
> > +       .comm .stack, BOOT_STACK_SIZE, 4
> > +
> > +       .align 4
> > +       .section .image, "a", %progbits
> > +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> > diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> > new file mode 100644
> > index 000000000000..8a6181c82a91
> > --- /dev/null
> > +++ b/arch/loongarch/tools/Makefile
> > @@ -0,0 +1,15 @@
> > +#
> > +# arch/loongarch/boot/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +hostprogs := elf-entry
> > +PHONY += elf-entry
> > +elf-entry: $(obj)/elf-entry
> > +       @:
> > +
> > +hostprogs += calc_vmlinuz_load_addr
> > +PHONY += calc_vmlinuz_load_addr
> > +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> > +       @:
> > diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > new file mode 100644
> > index 000000000000..5e2ca6b4dff6
> > --- /dev/null
> > +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > @@ -0,0 +1,51 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <errno.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> > +       struct stat sb;
> > +
> > +       if (argc != 3) {
> > +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (stat(argv[1], &sb) == -1) {
> > +               perror("stat");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       /* Convert hex characters to dec number */
> > +       errno = 0;
> > +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> > +               if (errno != 0)
> > +                       perror("sscanf");
> > +               else
> > +                       fprintf(stderr, "No matching characters\n");
> > +
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       vmlinux_size = (uint64_t)sb.st_size;
> > +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> > +
> > +       /*
> > +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> > +        * which may be as large as 64KB depending on the kernel configuration.
> > +        */
> > +
> > +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> > +
> > +       printf("0x%llx\n", vmlinuz_load_addr);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> > new file mode 100644
> > index 000000000000..c80721e0dee1
> > --- /dev/null
> > +++ b/arch/loongarch/tools/elf-entry.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <elf.h>
> > +#include <inttypes.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +__attribute__((noreturn))
> > +static void die(const char *msg)
> > +{
> > +       fputs(msg, stderr);
> > +       exit(EXIT_FAILURE);
> > +}
> > +
> > +int main(int argc, const char *argv[])
> > +{
> > +       uint64_t entry;
> > +       size_t nread;
> > +       FILE *file;
> > +       union {
> > +               Elf32_Ehdr ehdr32;
> > +               Elf64_Ehdr ehdr64;
> > +       } hdr;
> > +
> > +       if (argc != 2)
> > +               die("Usage: elf-entry <elf-file>\n");
> > +
> > +       file = fopen(argv[1], "r");
> > +       if (!file) {
> > +               perror("Unable to open input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       nread = fread(&hdr, 1, sizeof(hdr), file);
> > +       if (nread != sizeof(hdr)) {
> > +               fclose(file);
> > +               perror("Unable to read input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> > +               fclose(file);
> > +               die("Input is not an ELF\n");
> > +       }
> > +
> > +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> > +       case ELFCLASS32:
> > +               /* Sign extend to form a canonical address */
> > +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> > +               break;
> > +
> > +       case ELFCLASS64:
> > +               entry = hdr.ehdr64.e_entry;
> > +               break;
> > +
> > +       default:
> > +               fclose(file);
> > +               die("Invalid ELF class\n");
> > +       }
> > +
> > +       fclose(file);
> > +       printf("0x%016" PRIx64 "\n", entry);
> > +
> > +       return EXIT_SUCCESS;
> > +}
> > --
> > 2.27.0
> >
>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support
@ 2022-05-01 23:36       ` Ard Biesheuvel
  0 siblings, 0 replies; 94+ messages in thread
From: Ard Biesheuvel @ 2022-05-01 23:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang, Linux ARM, Catalin Marinas, Will Deacon,
	linux-riscv, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-efi

On Sat, 30 Apr 2022 at 13:07, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds zboot (self-extracting compressed kernel) support, all
> > existing in-kernel compressing algorithm and efistub are supported.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> I have no objections to adding a decompressor in principle, and
> the implementation seems reasonable. However, I think we should try to
> be consistent between architectures. On both arm64 and riscv, the
> maintainers decided to not include a decompressor and instead leave
> it up to the boot loader to decompress the kernel and enter it from there.
>

The reason we don't want to add more decompressors is that it
forces us to do a bare-metal boot twice, i.e., create an ID map,
discover memory, etc.
If I am reading this patch correctly, the kernel image is just
decompressed to VMLINUX_LOAD_ADDRESS, regardless of what EFI thinks
that memory is being used for. That kind of misses the point of
booting with EFI.
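
To make the concern concrete, here is a rough, self-contained sketch of
the kind of check that a fixed-address decompressor skips: before
writing to a target range, confirm that the firmware's memory map marks
it as free. This is not code from the patch or from the EFI stub. The
names toy_mem_region and toy_range_is_free are hypothetical, and the
two-state region type is a simplification of the real EFI memory
descriptor types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one firmware memory map entry; not the real EFI descriptor. */
enum toy_mem_type { TOY_MEM_FREE, TOY_MEM_IN_USE };

struct toy_mem_region {
	uint64_t start;		/* first byte of the region */
	uint64_t size;		/* length in bytes */
	enum toy_mem_type type;
};

/*
 * Return true only if [start, start + size) lies entirely inside a single
 * region that the map marks as free. A range spanning two adjacent free
 * regions is rejected, which keeps the sketch short.
 */
static bool toy_range_is_free(const struct toy_mem_region *map, size_t n,
			      uint64_t start, uint64_t size)
{
	size_t i;

	assert(map != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct toy_mem_region *r = &map[i];

		if (r->type != TOY_MEM_FREE)
			continue;
		/* Written without start + size so the sum cannot overflow. */
		if (start >= r->start && size <= r->size &&
		    start - r->start <= r->size - size)
			return true;
	}
	return false;
}
```

With a check like this in place, a caller would fall back to asking the
firmware for a different buffer when the preferred load address is not
free, rather than overwriting whatever lives there.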

> As I understand it, this is not part of the UEFI boot flow though, so it
> means that you don't get any compressed kernel images at all when
> booting using UEFI (let me know if that is wrong). I assume this is why
> you decided to include the decompressor here after all.
>

The PE/COFF executable format does not support compression, and so EFI
does not support this natively. Currently, it is left to the
bootloader to figure out whether the image is compressed or not, and
perform the decompression before calling the EFI entrypoint if needed.
This is what GRUB and systemd-boot do today (on non-x86).
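
As a minimal illustration of that detection step, a loader can look at
the first bytes of the file: a PE/COFF image starts with the MS-DOS
"MZ" signature, while a gzip stream starts with 0x1f 0x8b. The sketch
below is an assumption for illustration only, not how GRUB or
systemd-boot actually implement it. toy_classify_image and the enum are
hypothetical names, and only gzip is recognized to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification result; real loaders know more formats. */
enum toy_image_kind { TOY_IMAGE_UNKNOWN, TOY_IMAGE_PE, TOY_IMAGE_GZIP };

/* Classify an image by its leading magic bytes. */
static enum toy_image_kind toy_classify_image(const uint8_t *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	if (len < 2)
		return TOY_IMAGE_UNKNOWN;

	/* PE/COFF images begin with the MS-DOS "MZ" signature. */
	if (buf[0] == 'M' && buf[1] == 'Z')
		return TOY_IMAGE_PE;

	/* gzip streams begin with the bytes 0x1f 0x8b (RFC 1952). */
	if (buf[0] == 0x1f && buf[1] == 0x8b)
		return TOY_IMAGE_GZIP;

	return TOY_IMAGE_UNKNOWN;
}
```

A loader following this scheme would decompress a TOY_IMAGE_GZIP
payload into a buffer first and only then treat the result as a PE
image to be started through the EFI entry point.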

I had a stab at doing something similar in EFI, but relying only on
the generic EFI boot services. The advantage of EFI is that you enter
a main() function in C with MMU and caches on, with a memory map, heap
allocator, etc available.

Code for arm64 is here:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=efi-decompressor

> I think we should first aim for consistency here, and handle this the
> same way across the modern architectures, either leaving the
> decompressor code out, or adding it consistently. Maybe it would
> even be possible to have the decompressor code as part of the
> EFI stub and share it between the three architectures (x86 and
> 32-bit arm already support loading compressed kernels using EFI).
>

Indeed. One disadvantage of my approach is that both the inner and
outer EFI executables need to be signed for secure boot, as it uses
the EFI boot services. But that is the point, really. The firmware
already knows how to load and start images, so better to make use of
it.


> Adding the arm64, risc-v and uefi maintainers for further discussion here,
> see full below.
>
>        Arnd
>
> > ---
> >  arch/loongarch/Kbuild                         |   2 +-
> >  arch/loongarch/Kconfig                        |  11 ++
> >  arch/loongarch/Makefile                       |  26 ++-
> >  arch/loongarch/boot/Makefile                  |  55 ++++++
> >  arch/loongarch/boot/boot.lds.S                |  64 +++++++
> >  arch/loongarch/boot/decompress.c              |  98 +++++++++++
> >  arch/loongarch/boot/string.c                  | 166 ++++++++++++++++++
> >  arch/loongarch/boot/zheader.S                 | 100 +++++++++++
> >  arch/loongarch/boot/zkernel.S                 |  99 +++++++++++
> >  arch/loongarch/tools/Makefile                 |  15 ++
> >  arch/loongarch/tools/calc_vmlinuz_load_addr.c |  51 ++++++
> >  arch/loongarch/tools/elf-entry.c              |  66 +++++++
> >  12 files changed, 749 insertions(+), 4 deletions(-)
> >  create mode 100644 arch/loongarch/boot/boot.lds.S
> >  create mode 100644 arch/loongarch/boot/decompress.c
> >  create mode 100644 arch/loongarch/boot/string.c
> >  create mode 100644 arch/loongarch/boot/zheader.S
> >  create mode 100644 arch/loongarch/boot/zkernel.S
> >  create mode 100644 arch/loongarch/tools/Makefile
> >  create mode 100644 arch/loongarch/tools/calc_vmlinuz_load_addr.c
> >  create mode 100644 arch/loongarch/tools/elf-entry.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index ab5373d0a24f..d907fdd7ca08 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -3,4 +3,4 @@ obj-y += mm/
> >  obj-y += vdso/
> >
> >  # for cleaning
> > -subdir- += boot
> > +subdir- += boot tools
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 55225ee5f868..6c1042746b2d 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -107,6 +107,7 @@ config LOONGARCH
> >         select PERF_USE_VMALLOC
> >         select RTC_LIB
> >         select SPARSE_IRQ
> > +       select SYS_SUPPORTS_ZBOOT
> >         select SYSCTL_EXCEPTION_TRACE
> >         select SWIOTLB
> >         select TRACE_IRQFLAGS_SUPPORT
> > @@ -143,6 +144,16 @@ config LOCKDEP_SUPPORT
> >         bool
> >         default y
> >
> > +config SYS_SUPPORTS_ZBOOT
> > +       bool
> > +       select HAVE_KERNEL_GZIP
> > +       select HAVE_KERNEL_BZIP2
> > +       select HAVE_KERNEL_LZ4
> > +       select HAVE_KERNEL_LZMA
> > +       select HAVE_KERNEL_LZO
> > +       select HAVE_KERNEL_XZ
> > +       select HAVE_KERNEL_ZSTD
> > +
> >  config MACH_LOONGSON32
> >         def_bool 32BIT
> >
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index d88a792dafbe..1ed5b8466565 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -5,12 +5,31 @@
> >
> >  boot   := arch/loongarch/boot
> >
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> > +
> >  ifndef CONFIG_EFI_STUB
> >  KBUILD_IMAGE   = $(boot)/vmlinux
> >  else
> >  KBUILD_IMAGE   = $(boot)/vmlinux.efi
> >  endif
> >
> > +else
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinuz
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinuz.efi
> > +endif
> > +
> > +endif
> > +
> > +load-y         = 0x9000000000200000
> > +bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > +
> > +archscripts: scripts_basic
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools elf-entry
> > +       $(Q)$(MAKE) $(build)=arch/loongarch/tools calc_vmlinuz_load_addr
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -55,9 +74,6 @@ KBUILD_CFLAGS_MODULE          += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
> >  cflags-y += -ffreestanding
> >  cflags-y += $(call as-option,-Wa$(comma)-mno-fix-loongson3-llsc,)
> >
> > -load-y         = 0x9000000000200000
> > -bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y)
> > -
> >  drivers-$(CONFIG_PCI)          += arch/loongarch/pci/
> >
> >  KBUILD_AFLAGS  += $(cflags-y)
> > @@ -99,7 +115,11 @@ $(KBUILD_IMAGE): vmlinux
> >         $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > +ifndef CONFIG_SYS_SUPPORTS_ZBOOT
> >         $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +else
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinuz-$(KERNELRELEASE)
> > +endif
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > index 66f2293c34b2..c26a36004ae2 100644
> > --- a/arch/loongarch/boot/Makefile
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -21,3 +21,58 @@ quiet_cmd_eficopy = OBJCOPY $@
> >
> >  $(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> >         $(call if_changed,eficopy)
> > +
> > +# zboot
> > +extra-y        += boot.lds
> > +$(obj)/boot.lds: $(obj)/vmlinux.bin FORCE
> > +CPPFLAGS_boot.lds = $(KBUILD_CPPFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y)
> > +
> > +entry-y        = $(shell $(objtree)/arch/loongarch/tools/elf-entry $(obj)/vmlinux)
> > +zload-y = $(shell $(objtree)/arch/loongarch/tools/calc_vmlinuz_load_addr \
> > +                               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> > +
> > +BOOT_HEAP_SIZE := 0x400000
> > +BOOT_STACK_SIZE        := 0x002000
> > +
> > +KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +KBUILD_CFLAGS := $(KBUILD_CFLAGS) -fpic -D__KERNEL__ \
> > +       -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \
> > +       -DBOOT_STACK_SIZE=$(BOOT_STACK_SIZE)
> > +
> > +targets += vmlinux.bin
> > +OBJCOPYFLAGS_vmlinux.bin := $(OBJCOPYFLAGS) -O binary $(strip-flags)
> > +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE
> > +       $(call if_changed,objcopy)
> > +
> > +tool_$(CONFIG_KERNEL_GZIP)    = gzip
> > +tool_$(CONFIG_KERNEL_BZIP2)   = bzip2_with_size
> > +tool_$(CONFIG_KERNEL_LZ4)     = lz4_with_size
> > +tool_$(CONFIG_KERNEL_LZMA)    = lzma_with_size
> > +tool_$(CONFIG_KERNEL_LZO)     = lzo_with_size
> > +tool_$(CONFIG_KERNEL_XZ)      = xzkern_with_size
> > +tool_$(CONFIG_KERNEL_ZSTD)    = zstd22_with_size
> > +
> > +targets += vmlinux.bin.z
> > +$(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
> > +       $(call if_changed,$(tool_y))
> > +
> > +targets += $(notdir $(vmlinuzobjs-y))
> > +vmlinuzobjs-y := $(obj)/zkernel.o $(obj)/decompress.o $(obj)/string.o
> > +vmlinuzobjs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > +$(obj)/zkernel.o: $(obj)/vmlinux.bin.z
> > +AFLAGS_zkernel.o = $(KBUILD_AFLAGS) -DVMLINUZ_LOAD_ADDRESS=$(zload-y) -DKERNEL_ENTRY=$(entry-y)
> > +
> > +quiet_cmd_zld = LD      $@
> > +      cmd_zld = $(LD) $(KBUILD_LDFLAGS) -T $< $(vmlinuzobjs-y) -o $@
> > +
> > +targets += vmlinuz
> > +$(obj)/vmlinuz: $(src)/boot.lds $(vmlinuzobjs-y) FORCE
> > +       $(call if_changed,zld)
> > +       $(call if_changed,strip)
> > +
> > +targets += vmlinuz.efi
> > +$(obj)/vmlinuz.efi: $(obj)/vmlinuz FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/boot/boot.lds.S b/arch/loongarch/boot/boot.lds.S
> > new file mode 100644
> > index 000000000000..23e698782afd
> > --- /dev/null
> > +++ b/arch/loongarch/boot/boot.lds.S
> > @@ -0,0 +1,64 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * ld.script for compressed kernel support of LoongArch
> > + *
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include "../kernel/image-vars.h"
> > +
> > +/*
> > + * Max avaliable Page Size is 64K, so we set SectionAlignment
> > + * field of EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > +
> > +OUTPUT_ARCH(loongarch)
> > +ENTRY(kernel_entry)
> > +PHDRS {
> > +       text PT_LOAD FLAGS(7); /* RWX */
> > +}
> > +SECTIONS
> > +{
> > +       . = VMLINUZ_LOAD_ADDRESS;
> > +
> > +       _text = .;
> > +       .head.text : {
> > +               *(.head.text)
> > +       }
> > +
> > +       .text : {
> > +               *(.text)
> > +               *(.init.text)
> > +               *(.rodata)
> > +       }: text
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _data = .;
> > +       .data : {
> > +               *(.data)
> > +               *(.init.data)
> > +               /* Put the compressed image here */
> > +               __image_begin = .;
> > +               *(.image)
> > +               __image_end = .;
> > +               CONSTRUCTORS
> > +               . = ALIGN(PECOFF_FILE_ALIGN);
> > +       }
> > +       _edata = .;
> > +
> > +       .bss : {
> > +               *(.bss)
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > +       _end = .;
> > +
> > +       /DISCARD/ : {
> > +               *(.options)
> > +               *(.comment)
> > +               *(.note)
> > +       }
> > +}
> > diff --git a/arch/loongarch/boot/decompress.c b/arch/loongarch/boot/decompress.c
> > new file mode 100644
> > index 000000000000..8f55fcd8f285
> > --- /dev/null
> > +++ b/arch/loongarch/boot/decompress.c
> > @@ -0,0 +1,98 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Author: Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/types.h>
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +#include <linux/libfdt.h>
> > +
> > +#include <asm/addrspace.h>
> > +
> > +/*
> > + * These two variables specify the free mem region
> > + * that can be used for temporary malloc area
> > + */
> > +unsigned long free_mem_ptr;
> > +unsigned long free_mem_end_ptr;
> > +
> > +/* The linker tells us where the image is. */
> > +extern unsigned char __image_begin, __image_end;
> > +
> > +#define puts(s) do {} while (0)
> > +#define puthex(val) do {} while (0)
> > +
> > +void error(char *x)
> > +{
> > +       puts("\n\n");
> > +       puts(x);
> > +       puts("\n\n -- System halted");
> > +
> > +       while (1)
> > +               ;       /* Halt */
> > +}
> > +
> > +/* activate the code for pre-boot environment */
> > +#define STATIC static
> > +
> > +#include "../../../../lib/ashldi3.c"
> > +
> > +#ifdef CONFIG_KERNEL_GZIP
> > +#include "../../../../lib/decompress_inflate.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_BZIP2
> > +#include "../../../../lib/decompress_bunzip2.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZ4
> > +#include "../../../../lib/decompress_unlz4.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZMA
> > +#include "../../../../lib/decompress_unlzma.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_LZO
> > +#include "../../../../lib/decompress_unlzo.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_XZ
> > +#include "../../../../lib/decompress_unxz.c"
> > +#endif
> > +
> > +#ifdef CONFIG_KERNEL_ZSTD
> > +#include "../../../../lib/decompress_unzstd.c"
> > +#endif
> > +
> > +void decompress_kernel(unsigned long boot_heap_start)
> > +{
> > +       unsigned long zimage_start, zimage_size;
> > +
> > +       zimage_start = (unsigned long)(&__image_begin);
> > +       zimage_size = (unsigned long)(&__image_end) -
> > +           (unsigned long)(&__image_begin);
> > +
> > +       puts("zimage at:     ");
> > +       puthex(zimage_start);
> > +       puts(" ");
> > +       puthex(zimage_size + zimage_start);
> > +       puts("\n");
> > +
> > +       /* This area is reserved for malloc during decompression */
> > +       free_mem_ptr = boot_heap_start;
> > +       free_mem_end_ptr = boot_heap_start + BOOT_HEAP_SIZE;
> > +
> > +       /* Display standard Linux/LoongArch boot prompt */
> > +       puts("Uncompressing Linux at load address ");
> > +       puthex(VMLINUX_LOAD_ADDRESS);
> > +       puts("\n");
> > +
> > +       /* Decompress the kernel with the configured algorithm */
> > +       __decompress((char *)zimage_start, zimage_size, 0, 0,
> > +                  (void *)VMLINUX_LOAD_ADDRESS, 0, 0, error);
> > +
> > +       puts("Now, booting the kernel...\n");
> > +}
> > diff --git a/arch/loongarch/boot/string.c b/arch/loongarch/boot/string.c
> > new file mode 100644
> > index 000000000000..3f746e7c2bb5
> > --- /dev/null
> > +++ b/arch/loongarch/boot/string.c
> > @@ -0,0 +1,166 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/loongarch/boot/string.c
> > + *
> > + * Very small subset of simple string routines
> > + */
> > +
> > +#include <linux/types.h>
> > +
> > +void __weak *memset(void *s, int c, size_t n)
> > +{
> > +       int i;
> > +       char *ss = s;
> > +
> > +       for (i = 0; i < n; i++)
> > +               ss[i] = c;
> > +       return s;
> > +}
> > +
> > +void __weak *memcpy(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       for (i = 0; i < n; i++)
> > +               d[i] = s[i];
> > +       return dest;
> > +}
> > +
> > +void __weak *memmove(void *dest, const void *src, size_t n)
> > +{
> > +       int i;
> > +       const char *s = src;
> > +       char *d = dest;
> > +
> > +       if (d < s) {
> > +               for (i = 0; i < n; i++)
> > +                       d[i] = s[i];
> > +       } else if (d > s) {
> > +               for (i = n - 1; i >= 0; i--)
> > +                       d[i] = s[i];
> > +       }
> > +
> > +       return dest;
> > +}
> > +
> > +int __weak memcmp(const void *cs, const void *ct, size_t count)
> > +{
> > +       int res = 0;
> > +       const unsigned char *su1, *su2;
> > +
> > +       for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--) {
> > +               res = *su1 - *su2;
> > +               if (res != 0)
> > +                       break;
> > +       }
> > +       return res;
> > +}
> > +
> > +int __weak strcmp(const char *str1, const char *str2)
> > +{
> > +       int delta = 0;
> > +       const unsigned char *s1 = (const unsigned char *)str1;
> > +       const unsigned char *s2 = (const unsigned char *)str2;
> > +
> > +       while (*s1 || *s2) {
> > +               delta = *s1 - *s2;
> > +               if (delta)
> > +                       return delta;
> > +               s1++;
> > +               s2++;
> > +       }
> > +       return 0;
> > +}
> > +
> > +size_t __weak strlen(const char *s)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +size_t __weak strnlen(const char *s, size_t count)
> > +{
> > +       const char *sc;
> > +
> > +       for (sc = s; count-- && *sc != '\0'; ++sc)
> > +               /* nothing */;
> > +       return sc - s;
> > +}
> > +
> > +char * __weak strnstr(const char *s1, const char *s2, size_t len)
> > +{
> > +       size_t l2;
> > +
> > +       l2 = strlen(s2);
> > +       if (!l2)
> > +               return (char *)s1;
> > +       while (len >= l2) {
> > +               len--;
> > +               if (!memcmp(s1, s2, l2))
> > +                       return (char *)s1;
> > +               s1++;
> > +       }
> > +       return NULL;
> > +}
> > +
> > +#undef strcat
> > +char * __weak strcat(char *dest, const char *src)
> > +{
> > +       char *tmp = dest;
> > +
> > +       while (*dest)
> > +               dest++;
> > +       while ((*dest++ = *src++) != '\0')
> > +               ;
> > +       return tmp;
> > +}
> > +
> > +char * __weak strncat(char *dest, const char *src, size_t count)
> > +{
> > +       char *tmp = dest;
> > +
> > +       if (count) {
> > +               while (*dest)
> > +                       dest++;
> > +               while ((*dest++ = *src++) != 0) {
> > +                       if (--count == 0) {
> > +                               *dest = '\0';
> > +                               break;
> > +                       }
> > +               }
> > +       }
> > +       return tmp;
> > +}
> > +
> > +char * __weak strpbrk(const char *cs, const char *ct)
> > +{
> > +       const char *sc1, *sc2;
> > +
> > +       for (sc1 = cs; *sc1 != '\0'; ++sc1) {
> > +               for (sc2 = ct; *sc2 != '\0'; ++sc2) {
> > +                       if (*sc1 == *sc2)
> > +                               return (char *)sc1;
> > +               }
> > +       }
> > +       return NULL;
> > +}
> > +
> > +char * __weak strsep(char **s, const char *ct)
> > +{
> > +       char *sbegin = *s;
> > +       char *end;
> > +
> > +       if (sbegin == NULL)
> > +               return NULL;
> > +
> > +       end = strpbrk(sbegin, ct);
> > +       if (end)
> > +               *end++ = '\0';
> > +       *s = end;
> > +       return sbegin;
> > +}
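
The quoted memset(), memcpy() and memmove() index with an int while the length is a size_t, so a length above INT_MAX would not be handled correctly. A minimal sketch of one way to avoid that, assuming a size_t index is acceptable in the boot environment; sketch_memmove is a hypothetical name and this is not the code from the patch:

```c
#include <stddef.h>

/*
 * Hypothetical variant of the quoted memmove(), not part of the patch.
 * The index is a size_t, so it cannot overflow or go negative for
 * lengths above INT_MAX.
 */
void *sketch_memmove(void *dest, const void *src, size_t n)
{
	const unsigned char *s = src;
	unsigned char *d = dest;
	size_t i;

	if (d < s) {
		/* Destination is lower in memory: copy forwards. */
		for (i = 0; i < n; i++)
			d[i] = s[i];
	} else if (d > s) {
		/* Destination is higher: copy backwards, counting down from n. */
		for (i = n; i > 0; i--)
			d[i - 1] = s[i - 1];
	}

	return dest;
}
```

Counting down from n and indexing with i - 1 keeps the loop condition valid for an unsigned index, which the "i >= 0" test in the quoted code could not do.
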
> > diff --git a/arch/loongarch/boot/zheader.S b/arch/loongarch/boot/zheader.S
> > new file mode 100644
> > index 000000000000..4bc50d953ec7
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zheader.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:
> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   _data - efi_header_end                  /* SizeOfCode */
> > +       .long   _end - _data                            /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */
> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   _data - efi_header_end                  /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   _data - efi_header_end                  /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - _data                            /* VirtualSize */
> > +       .long   _data - _head                           /* VirtualAddress */
> > +       .long   _edata - _data                          /* SizeOfRawData */
> > +       .long   _data - _head                           /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
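
The header macro above divides by 40 to get the section count and by 8 to get the number of data directory entries. Those constants are the sizes of a PE section table entry and of one data directory entry. A small sketch that mirrors the fields emitted above and checks both sizes at compile time; the struct and field names are illustrative assumptions, not the definitions from include/linux/pe.h:

```c
#include <stdint.h>

/* Illustrative layout of one PE section table entry (names are assumptions). */
struct sketch_pe_section_header {
	char     name[8];                  /* e.g. ".text\0\0\0" */
	uint32_t virtual_size;
	uint32_t virtual_address;
	uint32_t size_of_raw_data;
	uint32_t pointer_to_raw_data;
	uint32_t pointer_to_relocations;
	uint32_t pointer_to_line_numbers;
	uint16_t number_of_relocations;
	uint16_t number_of_line_numbers;
	uint32_t characteristics;
};

/* Illustrative layout of one data directory entry (emitted as a .quad above). */
struct sketch_pe_data_directory {
	uint32_t virtual_address;
	uint32_t size;
};

/* The divisors used in the assembly correspond to these sizes. */
_Static_assert(sizeof(struct sketch_pe_section_header) == 40,
	       "section table entry is 40 bytes");
_Static_assert(sizeof(struct sketch_pe_data_directory) == 8,
	       "data directory entry is 8 bytes");
```
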
> > diff --git a/arch/loongarch/boot/zkernel.S b/arch/loongarch/boot/zkernel.S
> > new file mode 100644
> > index 000000000000..13a8a14a2328
> > --- /dev/null
> > +++ b/arch/loongarch/boot/zkernel.S
> > @@ -0,0 +1,99 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/linkage.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/asm.h>
> > +#include <asm/loongarch.h>
> > +#include <asm/regdef.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "zheader.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"
> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUZ_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)
> > +
> > +#endif
> > +
> > +       __INIT
> > +
> > +SYM_CODE_START(kernel_entry)
> > +       /* Save boot rom start args */
> > +       move    s0, a0
> > +       move    s1, a1
> > +       move    s2, a2
> > +       move    s3, a3
> > +
> > +       /* Config Direct Mapping */
> > +       li.d    t0, CSR_DMW0_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN0
> > +       li.d    t0, CSR_DMW1_INIT
> > +       csrwr   t0, LOONGARCH_CSR_DMWIN1
> > +
> > +       /* Clear BSS */
> > +       la.abs  a0, _edata
> > +       la.abs  a2, _end
> > +1:     st.d    zero, a0, 0
> > +       addi.d  a0, a0, 8
> > +       bne     a2, a0, 1b
> > +
> > +       la.abs  a0, .heap          /* heap address */
> > +       la.abs  sp, .stack + 8192  /* stack address */
> > +
> > +       la      ra, 2f
> > +       la      t4, decompress_kernel
> > +       jirl    zero, t4, 0
> > +2:
> > +       move    a0, s0
> > +       move    a1, s1
> > +       move    a2, s2
> > +       move    a3, s3
> > +       PTR_LI  t4, KERNEL_ENTRY
> > +       jirl    zero, t4, 0
> > +3:
> > +       b       3b
> > +SYM_CODE_END(kernel_entry)
> > +
> > +       .comm .heap, BOOT_HEAP_SIZE, 4
> > +       .comm .stack, BOOT_STACK_SIZE, 4
> > +
> > +       .align 4
> > +       .section .image, "a", %progbits
> > +       .incbin "arch/loongarch/boot/vmlinux.bin.z"
> > diff --git a/arch/loongarch/tools/Makefile b/arch/loongarch/tools/Makefile
> > new file mode 100644
> > index 000000000000..8a6181c82a91
> > --- /dev/null
> > +++ b/arch/loongarch/tools/Makefile
> > @@ -0,0 +1,15 @@
> > +#
> > +# arch/loongarch/tools/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +hostprogs := elf-entry
> > +PHONY += elf-entry
> > +elf-entry: $(obj)/elf-entry
> > +       @:
> > +
> > +hostprogs += calc_vmlinuz_load_addr
> > +PHONY += calc_vmlinuz_load_addr
> > +calc_vmlinuz_load_addr: $(obj)/calc_vmlinuz_load_addr
> > +       @:
> > diff --git a/arch/loongarch/tools/calc_vmlinuz_load_addr.c b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > new file mode 100644
> > index 000000000000..5e2ca6b4dff6
> > --- /dev/null
> > +++ b/arch/loongarch/tools/calc_vmlinuz_load_addr.c
> > @@ -0,0 +1,51 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <errno.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +       unsigned long long vmlinux_size, vmlinux_load_addr, vmlinuz_load_addr;
> > +       struct stat sb;
> > +
> > +       if (argc != 3) {
> > +               fprintf(stderr, "Usage: %s <pathname> <vmlinux_load_addr>\n", argv[0]);
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (stat(argv[1], &sb) == -1) {
> > +               perror("stat");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       /* Parse the hexadecimal load address argument */
> > +       errno = 0;
> > +       if (sscanf(argv[2], "%llx", &vmlinux_load_addr) != 1) {
> > +               if (errno != 0)
> > +                       perror("sscanf");
> > +               else
> > +                       fprintf(stderr, "No matching characters\n");
> > +
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       vmlinux_size = (uint64_t)sb.st_size;
> > +       vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
> > +
> > +       /*
> > +        * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
> > +        * which may be as large as 64KB depending on the kernel configuration.
> > +        */
> > +
> > +       vmlinuz_load_addr += (0x10000 - vmlinux_size % 0x10000);
> > +
> > +       printf("0x%llx\n", vmlinuz_load_addr);
> > +
> > +       return EXIT_SUCCESS;
> > +}
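
The alignment step in the quoted tool pads by 0x10000 - vmlinux_size % 0x10000. That gives a 64KB-aligned result only when vmlinux_load_addr is itself 64KB-aligned, and it adds a full extra 64KB when the size is already a multiple of 64KB. The sketch below puts the quoted arithmetic next to a conventional round-up of the end address for comparison. Both function names are hypothetical, and the round-up variant is an alternative for illustration, not something the patch does:

```c
#include <stdint.h>

#define SKETCH_ALIGN 0x10000ULL	/* 64KB, as in the quoted tool */

/* Same arithmetic as the quoted calc_vmlinuz_load_addr.c. */
uint64_t sketch_patch_addr(uint64_t load_addr, uint64_t size)
{
	uint64_t addr = load_addr + size;

	/* Pads by a full 64KB when size is already a multiple of 64KB. */
	addr += SKETCH_ALIGN - size % SKETCH_ALIGN;
	return addr;
}

/* Conventional round-up of the end address to the next 64KB boundary. */
uint64_t sketch_round_up_addr(uint64_t load_addr, uint64_t size)
{
	return (load_addr + size + SKETCH_ALIGN - 1) & ~(SKETCH_ALIGN - 1);
}
```

For a load address of 0x200000 and a size of 0x123456 both functions return 0x330000. For a size of 0x120000 the quoted arithmetic still returns 0x330000, while the round-up returns 0x320000.
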
> > diff --git a/arch/loongarch/tools/elf-entry.c b/arch/loongarch/tools/elf-entry.c
> > new file mode 100644
> > index 000000000000..c80721e0dee1
> > --- /dev/null
> > +++ b/arch/loongarch/tools/elf-entry.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <elf.h>
> > +#include <inttypes.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +__attribute__((noreturn))
> > +static void die(const char *msg)
> > +{
> > +       fputs(msg, stderr);
> > +       exit(EXIT_FAILURE);
> > +}
> > +
> > +int main(int argc, const char *argv[])
> > +{
> > +       uint64_t entry;
> > +       size_t nread;
> > +       FILE *file;
> > +       union {
> > +               Elf32_Ehdr ehdr32;
> > +               Elf64_Ehdr ehdr64;
> > +       } hdr;
> > +
> > +       if (argc != 2)
> > +               die("Usage: elf-entry <elf-file>\n");
> > +
> > +       file = fopen(argv[1], "r");
> > +       if (!file) {
> > +               perror("Unable to open input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       nread = fread(&hdr, 1, sizeof(hdr), file);
> > +       if (nread != sizeof(hdr)) {
> > +               fclose(file);
> > +               perror("Unable to read input file");
> > +               return EXIT_FAILURE;
> > +       }
> > +
> > +       if (memcmp(hdr.ehdr32.e_ident, ELFMAG, SELFMAG)) {
> > +               fclose(file);
> > +               die("Input is not an ELF\n");
> > +       }
> > +
> > +       switch (hdr.ehdr32.e_ident[EI_CLASS]) {
> > +       case ELFCLASS32:
> > +               /* Sign extend to form a canonical address */
> > +               entry = (int64_t)(int32_t)hdr.ehdr32.e_entry;
> > +               break;
> > +
> > +       case ELFCLASS64:
> > +               entry = hdr.ehdr64.e_entry;
> > +               break;
> > +
> > +       default:
> > +               fclose(file);
> > +               die("Invalid ELF class\n");
> > +       }
> > +
> > +       fclose(file);
> > +       printf("0x%016" PRIx64 "\n", entry);
> > +
> > +       return EXIT_SUCCESS;
> > +}
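
The ELFCLASS32 branch above sign-extends the 32-bit entry point so that a kernel-space address keeps its upper bits set when printed as a 64-bit value. A minimal sketch of that cast chain in isolation; sketch_canonical_entry32 is a hypothetical helper name, and the conversion to int32_t relies on the usual two's complement behaviour of GCC and Clang:

```c
#include <stdint.h>

/*
 * Sign-extend a 32-bit ELF entry address to 64 bits, using the same
 * cast chain as the quoted elf-entry.c.
 */
uint64_t sketch_canonical_entry32(uint32_t e_entry)
{
	return (uint64_t)(int64_t)(int32_t)e_entry;
}
```

With this helper, 0x80001000 becomes 0xffffffff80001000, while an address below 0x80000000 such as 0x00001000 is unchanged.
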
> > --
> > 2.27.0
> >
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 10/24] LoongArch: Add exception/interrupt handling
  2022-05-01 17:08     ` Xi Ruoyao
@ 2022-05-02  0:01       ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-02  0:01 UTC (permalink / raw)
  To: Xi Ruoyao
  Cc: Huacai Chen, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION, LKML,
	Xuefeng Li, Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang

Hi, Ruoyao,

On Mon, May 2, 2022 at 1:08 AM Xi Ruoyao <xry111@mengyan1223.wang> wrote:
>
> On Mon, 2022-05-02 at 00:27 +0800, Xi Ruoyao wrote:
> > On Sat, 2022-04-30 at 17:05 +0800, Huacai Chen wrote:
> > > +struct acpi_madt_lio_pic;
> > > +struct acpi_madt_eio_pic;
> > > +struct acpi_madt_ht_pic;
> > > +struct acpi_madt_bio_pic;
> > > +struct acpi_madt_msi_pic;
> > > +struct acpi_madt_lpc_pic;
> >
> > Where are those defined?  I can't find them and the compilation fails
> > with:
> >
> > arch/loongarch/kernel/irq.c: In function ‘find_pch_pic’:
> > arch/loongarch/kernel/irq.c:48:32: error: invalid use of undefined
> > type ‘struct acpi_madt_bio_pic’
> >    48 |                 start = irq_cfg->gsi_base;
> >       |                                ^~
> > arch/loongarch/kernel/irq.c:49:32: error: invalid use of undefined
> > type ‘struct acpi_madt_bio_pic’
> >    49 |                 end   = irq_cfg->gsi_base + irq_cfg->size;
> >       |                                ^~
> > arch/loongarch/kernel/irq.c:49:52: error: invalid use of undefined
> > type ‘struct acpi_madt_bio_pic’
> >    49 |                 end   = irq_cfg->gsi_base + irq_cfg->size;
> >       |                                                    ^~
>
> Alright, my bad... I didn't realize the LoongArch patches are split
> into multiple series for multiple lists.  But is this the SOP for kernel
> patch review?  Would it be easier to just send one series and CC all
> relevant lists?
The ACPI definitions have to go to the ACPICA project first, and then
Rafael syncs the code into the kernel. ACPICA has already merged
LoongArch support, but it has not yet reached the kernel. See:
"ACPI: Add LoongArch-related definitions" by chenhuacai, Pull Request
#757, acpica/acpica on GitHub.

Huacai
>
> --
> Xi Ruoyao <xry111@mengyan1223.wang>
> School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-04-30  9:56   ` Arnd Bergmann
  2022-04-30 10:02     ` Huacai Chen
@ 2022-05-03  7:23     ` Ard Biesheuvel
  2022-05-05  9:59       ` Huacai Chen
  1 sibling, 1 reply; 94+ messages in thread
From: Ard Biesheuvel @ 2022-05-03  7:23 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, David Airlie, Jonathan Corbet, Linus Torvalds,
	linux-arch, open list:DOCUMENTATION, Linux Kernel Mailing List,
	Xuefeng Li, Yanteng Si, Huacai Chen, Guo Ren, Xuerui Wang,
	Jiaxun Yang

On Sat, 30 Apr 2022 at 11:56, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> >
> > This patch adds efistub booting support, which is the standard UEFI boot
> > protocol for us to use.
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> It's good to see that you completed this. Unfortunately you did not add Ard
> Biesheuvel to Cc, he is the one who needs to review this code. Adding him
> to Cc now, with the full patch quoted below for him (no more comments
> from me there).
>

Thanks Arnd,

>
> > ---
> >  arch/loongarch/Kbuild                         |   3 +
> >  arch/loongarch/Kconfig                        |   8 +
> >  arch/loongarch/Makefile                       |  18 +-
> >  arch/loongarch/boot/Makefile                  |  23 +
> >  arch/loongarch/kernel/efi-header.S            | 100 +++++
> >  arch/loongarch/kernel/head.S                  |  44 +-
> >  arch/loongarch/kernel/image-vars.h            |  30 ++
> >  arch/loongarch/kernel/vmlinux.lds.S           |  23 +-
> >  drivers/firmware/efi/Kconfig                  |   4 +-
> >  drivers/firmware/efi/libstub/Makefile         |  14 +-
> >  drivers/firmware/efi/libstub/loongarch-stub.c | 425 ++++++++++++++++++
> >  include/linux/pe.h                            |   1 +
> >  12 files changed, 680 insertions(+), 13 deletions(-)
> >  create mode 100644 arch/loongarch/boot/Makefile
> >  create mode 100644 arch/loongarch/kernel/efi-header.S
> >  create mode 100644 arch/loongarch/kernel/image-vars.h
> >  create mode 100644 drivers/firmware/efi/libstub/loongarch-stub.c
> >
> > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > index 1ad35aabdd16..ab5373d0a24f 100644
> > --- a/arch/loongarch/Kbuild
> > +++ b/arch/loongarch/Kbuild
> > @@ -1,3 +1,6 @@
> >  obj-y += kernel/
> >  obj-y += mm/
> >  obj-y += vdso/
> > +
> > +# for cleaning
> > +subdir- += boot
> > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > index 44b763046893..55225ee5f868 100644
> > --- a/arch/loongarch/Kconfig
> > +++ b/arch/loongarch/Kconfig
> > @@ -265,6 +265,14 @@ config EFI
> >           resultant kernel should continue to boot on existing non-EFI
> >           platforms.
> >
> > +config EFI_STUB
> > +       bool "EFI boot stub support"
> > +       default y
> > +       depends on EFI
> > +       help
> > +         This kernel feature allows the kernel to be loaded directly by
> > +         EFI firmware without the use of a bootloader.
> > +

Please enable EFI_GENERIC_STUB here

> >  config FORCE_MAX_ZONEORDER
> >         int "Maximum zone order"
> >         range 14 64 if PAGE_SIZE_64KB
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index c4b3f53cd276..d88a792dafbe 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -3,6 +3,14 @@
> >  # Author: Huacai Chen <chenhuacai@loongson.cn>
> >  # Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> >
> > +boot   := arch/loongarch/boot
> > +
> > +ifndef CONFIG_EFI_STUB
> > +KBUILD_IMAGE   = $(boot)/vmlinux
> > +else
> > +KBUILD_IMAGE   = $(boot)/vmlinux.efi
> > +endif
> > +
> >  #
> >  # Select the object file format to substitute into the linker script.
> >  #
> > @@ -30,8 +38,6 @@ ld-emul                       = $(64bit-emul)
> >  cflags-y               += -mabi=lp64s
> >  endif
> >
> > -all-y                  := vmlinux
> > -
> >  #
> >  # GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
> >  # code since it only slows down the whole thing.  At some point we might make
> > @@ -75,6 +81,7 @@ endif
> >  head-y := arch/loongarch/kernel/head.o
> >
> >  libs-y += arch/loongarch/lib/
> > +libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> >
> >  ifeq ($(KBUILD_EXTMOD),)
> >  prepare: vdso_prepare
> > @@ -86,12 +93,13 @@ PHONY += vdso_install
> >  vdso_install:
> >         $(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
> >
> > -all:   $(all-y)
> > +all:   $(KBUILD_IMAGE)
> >
> > -CLEAN_FILES += vmlinux
> > +$(KBUILD_IMAGE): vmlinux
> > +       $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> >
> >  install:
> > -       $(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> >
> > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > new file mode 100644
> > index 000000000000..66f2293c34b2
> > --- /dev/null
> > +++ b/arch/loongarch/boot/Makefile
> > @@ -0,0 +1,23 @@
> > +#
> > +# arch/loongarch/boot/Makefile
> > +#
> > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > +#
> > +
> > +drop-sections := .comment .note .options .note.gnu.build-id
> > +strip-flags   := $(addprefix --remove-section=,$(drop-sections)) -S
> > +
> > +targets := vmlinux
> > +quiet_cmd_strip = STRIP          $@
> > +      cmd_strip = $(STRIP) -s $@
> > +
> > +$(obj)/vmlinux: vmlinux FORCE
> > +       $(call if_changed,copy)
> > +       $(call if_changed,strip)
> > +

I don't think you are supposed to use if_changed twice on the same target.

> > +targets += vmlinux.efi
> > +quiet_cmd_eficopy = OBJCOPY $@
> > +      cmd_eficopy = $(OBJCOPY) -O binary $(strip-flags) $< $@
> > +

You could use the generic cmd_objcopy here instead of inventing your
own. Just set OBJCOPYFLAGS_vmlinux.efi to the right value.

> > +$(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> > +       $(call if_changed,eficopy)
> > diff --git a/arch/loongarch/kernel/efi-header.S b/arch/loongarch/kernel/efi-header.S
> > new file mode 100644
> > index 000000000000..ceb44524944a
> > --- /dev/null
> > +++ b/arch/loongarch/kernel/efi-header.S
> > @@ -0,0 +1,100 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/pe.h>
> > +#include <linux/sizes.h>
> > +
> > +       .macro  __EFI_PE_HEADER
> > +       .long   PE_MAGIC
> > +coff_header:

Please use .L prefixed local symbol definitions in this file, so we
don't clutter up the core kernel's global symbol table.

> > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > +       .short  section_count                           /* NumberOfSections */
> > +       .long   0                                       /* TimeDateStamp */
> > +       .long   0                                       /* PointerToSymbolTable */
> > +       .long   0                                       /* NumberOfSymbols */
> > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > +
> > +optional_header:
> > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > +       .byte   0x02                                    /* MajorLinkerVersion */
> > +       .byte   0x14                                    /* MinorLinkerVersion */
> > +       .long   __inittext_end - efi_header_end         /* SizeOfCode */
> > +       .long   _end - __initdata_begin                 /* SizeOfInitializedData */
> > +       .long   0                                       /* SizeOfUninitializedData */
> > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > +
> > +extra_header_fields:
> > +       .quad   0                                       /* ImageBase */
> > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > +       .short  0                                       /* MajorOperatingSystemVersion */
> > +       .short  0                                       /* MinorOperatingSystemVersion */
> > +       .short  0                                       /* MajorImageVersion */
> > +       .short  0                                       /* MinorImageVersion */

Once you enable EFI_GENERIC_STUB, set the above fields to
EFISTUB_MAJOR_IMAGE_VERSION/EFISTUB_MINOR_IMAGE_VERSION, so
bootloaders know they can use the LoadFile2 based initrd loader.

> > +       .short  0                                       /* MajorSubsystemVersion */
> > +       .short  0                                       /* MinorSubsystemVersion */
> > +       .long   0                                       /* Win32VersionValue */
> > +
> > +       .long   _end - _head                            /* SizeOfImage */
> > +
> > +       /* Everything before the kernel image is considered part of the header */
> > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > +       .long   0                                       /* CheckSum */
> > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > +       .short  0                                       /* DllCharacteristics */
> > +       .quad   0                                       /* SizeOfStackReserve */
> > +       .quad   0                                       /* SizeOfStackCommit */
> > +       .quad   0                                       /* SizeOfHeapReserve */
> > +       .quad   0                                       /* SizeOfHeapCommit */
> > +       .long   0                                       /* LoaderFlags */
> > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > +
> > +       .quad   0                                       /* ExportTable */
> > +       .quad   0                                       /* ImportTable */
> > +       .quad   0                                       /* ResourceTable */
> > +       .quad   0                                       /* ExceptionTable */
> > +       .quad   0                                       /* CertificationTable */
> > +       .quad   0                                       /* BaseRelocationTable */
> > +
> > +       /* Section table */
> > +section_table:
> > +       .ascii  ".text\0\0\0"
> > +       .long   __inittext_end - efi_header_end         /* VirtualSize */
> > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > +       .long   __inittext_end - efi_header_end         /* SizeOfRawData */
> > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_CODE | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > +
> > +       .ascii  ".data\0\0\0"
> > +       .long   _end - __initdata_begin                 /* VirtualSize */
> > +       .long   __initdata_begin - _head                /* VirtualAddress */
> > +       .long   _edata - __initdata_begin               /* SizeOfRawData */
> > +       .long   __initdata_begin - _head                /* PointerToRawData */
> > +
> > +       .long   0                                       /* PointerToRelocations */
> > +       .long   0                                       /* PointerToLineNumbers */
> > +       .short  0                                       /* NumberOfRelocations */
> > +       .short  0                                       /* NumberOfLineNumbers */
> > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > +               IMAGE_SCN_MEM_READ | \
> > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > +
> > +       .org 0x20e
> > +       .word kernel_version - 512 -  _head
> > +
> > +       .set    section_count, (. - section_table) / 40
> > +efi_header_end:
> > +       .endm
> > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> > index b4a0b28da3e7..361b72e8bfc5 100644
> > --- a/arch/loongarch/kernel/head.S
> > +++ b/arch/loongarch/kernel/head.S
> > @@ -11,11 +11,53 @@
> >  #include <asm/regdef.h>
> >  #include <asm/loongarch.h>
> >  #include <asm/stackframe.h>
> > +#include <generated/compile.h>
> > +#include <generated/utsrelease.h>
> >
> > -SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +#include "efi-header.S"
> > +
> > +       __HEAD
> > +
> > +_head:
> > +       /* "MZ", MS-DOS header */
> > +       .word   MZ_MAGIC
> > +       .org    0x28
> > +       .ascii  "Loongson\0"

Is this part of a special boot protocol? It would be better not to
overload EFI and PE/COFF with your own hacks if we can avoid it.

> > +       .org    0x3c
> > +       /* Offset to the PE header */
> > +       .long   pe_header - _head
> > +
> > +pe_header:
> > +       __EFI_PE_HEADER
> > +
> > +kernel_asize:
> > +       .long _end - _text
> > +
> > +kernel_fsize:
> > +       .long _edata - _text
> > +
> > +kernel_vaddr:
> > +       .quad VMLINUX_LOAD_ADDRESS
> > +
> > +kernel_offset:
> > +       .long kernel_offset - _text
> > +
> > +kernel_version:
> > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > +
> > +SYM_L_GLOBAL(kernel_asize)
> > +SYM_L_GLOBAL(kernel_fsize)
> > +SYM_L_GLOBAL(kernel_vaddr)
> > +SYM_L_GLOBAL(kernel_offset)

I think you can simplify this to

SYM_DATA(kernel_asize, .long _end - _text);

etc etc (which implies the .globl annotation)


> > +
> > +#endif
> >
> >         __REF
> >
> > +SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > +
> >  SYM_CODE_START(kernel_entry)                   # kernel entry point
> >
> >         /* Config direct window and set PG */
> > diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
> > new file mode 100644
> > index 000000000000..0162402b6212
> > --- /dev/null
> > +++ b/arch/loongarch/kernel/image-vars.h
> > @@ -0,0 +1,30 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +#ifndef __LOONGARCH_KERNEL_IMAGE_VARS_H
> > +#define __LOONGARCH_KERNEL_IMAGE_VARS_H
> > +
> > +#ifdef CONFIG_EFI_STUB
> > +
> > +__efistub_memcmp               = memcmp;
> > +__efistub_memcpy               = memcpy;
> > +__efistub_memmove              = memmove;
> > +__efistub_memset               = memset;
> > +__efistub_strcat               = strcat;
> > +__efistub_strcmp               = strcmp;
> > +__efistub_strlen               = strlen;
> > +__efistub_strncat              = strncat;
> > +__efistub_strnstr              = strnstr;
> > +__efistub_strnlen              = strnlen;
> > +__efistub_strpbrk              = strpbrk;
> > +__efistub_strsep               = strsep;
> > +__efistub_kernel_entry         = kernel_entry;
> > +__efistub_kernel_asize         = kernel_asize;
> > +__efistub_kernel_fsize         = kernel_fsize;
> > +__efistub_kernel_vaddr         = kernel_vaddr;
> > +__efistub_kernel_offset                = kernel_offset;
> > +
> > +#endif
> > +
> > +#endif /* __LOONGARCH_KERNEL_IMAGE_VARS_H */
> > diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> > index 02abfaaa4892..7da4c4d7c50d 100644
> > --- a/arch/loongarch/kernel/vmlinux.lds.S
> > +++ b/arch/loongarch/kernel/vmlinux.lds.S
> > @@ -12,6 +12,14 @@
> >  #define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
> >
> >  #include <asm-generic/vmlinux.lds.h>
> > +#include "image-vars.h"
> > +
> > +/*
> > + * Max avaliable Page Size is 64K, so we set SectionAlignment
> > + * field of EFI application to 64K.
> > + */
> > +PECOFF_FILE_ALIGN = 0x200;
> > +PECOFF_SEGMENT_ALIGN = 0x10000;
> >
> >  OUTPUT_ARCH(loongarch)
> >  ENTRY(kernel_entry)
> > @@ -27,6 +35,9 @@ SECTIONS
> >         . = VMLINUX_LOAD_ADDRESS;
> >
> >         _text = .;
> > +       HEAD_TEXT_SECTION
> > +
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         .text : {
> >                 TEXT_TEXT
> >                 SCHED_TEXT
> > @@ -38,11 +49,12 @@ SECTIONS
> >                 *(.fixup)
> >                 *(.gnu.warning)
> >         } :text = 0
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         _etext = .;
> >
> >         EXCEPTION_TABLE(16)
> >
> > -       . = ALIGN(PAGE_SIZE);
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         __init_begin = .;
> >         __inittext_begin = .;
> >
> > @@ -51,6 +63,7 @@ SECTIONS
> >                 EXIT_TEXT
> >         }
> >
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         __inittext_end = .;
> >
> >         __initdata_begin = .;
> > @@ -60,6 +73,10 @@ SECTIONS
> >                 EXIT_DATA
> >         }
> >
> > +       .init.bss : {
> > +               *(.init.bss)
> > +       }
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >         __initdata_end = .;
> >
> >         __init_end = .;
> > @@ -71,11 +88,11 @@ SECTIONS
> >         .sdata : {
> >                 *(.sdata)
> >         }
> > -
> > -       . = ALIGN(SZ_64K);
> > +       .edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGN); }
> >         _edata =  .;
> >
> >         BSS_SECTION(0, SZ_64K, 8)
> > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> >
> >         _end = .;
> >
> > diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
> > index 2c3dac5ecb36..ecb4e0b1295a 100644
> > --- a/drivers/firmware/efi/Kconfig
> > +++ b/drivers/firmware/efi/Kconfig
> > @@ -121,9 +121,9 @@ config EFI_ARMSTUB_DTB_LOADER
> >
> >  config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
> >         bool "Enable the command line initrd loader" if !X86
> > -       depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
> > -       default y if X86
> >         depends on !RISCV
> > +       depends on EFI_STUB && (EFI_GENERIC_STUB || X86 || LOONGARCH)
> > +       default y if (X86 || LOONGARCH)

Don't enable the command line initrd loader please. It is deprecated,
and has been replaced with the LoadFile2 protocol based one, which is
more flexible.

U-Boot already implements it, as does EDK2. GRUB does not implement
this yet AFAIR, but it should not be that hard to add.

> >         help
> >           Select this config option to add support for the initrd= command
> >           line parameter, allowing an initrd that resides on the same volume
> > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > index d0537573501e..663e9d317299 100644
> > --- a/drivers/firmware/efi/libstub/Makefile
> > +++ b/drivers/firmware/efi/libstub/Makefile
> > @@ -26,6 +26,8 @@ cflags-$(CONFIG_ARM)          := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >                                    $(call cc-option,-mno-single-pic-base)
> >  cflags-$(CONFIG_RISCV)         := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >                                    -fpic
> > +cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > +                                  -fpic
> >
> >  cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> >
> > @@ -55,7 +57,7 @@ KCOV_INSTRUMENT                       := n
> >  lib-y                          := efi-stub-helper.o gop.o secureboot.o tpm.o \
> >                                    file.o mem.o random.o randomalloc.o pci.o \
> >                                    skip_spaces.o lib-cmdline.o lib-ctype.o \
> > -                                  alignedmem.o relocate.o vsprintf.o
> > +                                  alignedmem.o relocate.o string.o vsprintf.o
> >
> >  # include the stub's generic dependencies from lib/ when building for ARM/arm64
> >  efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > @@ -63,13 +65,15 @@ efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> >  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
> >         $(call if_changed_rule,cc_o_c)
> >
> > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o \
> > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o \
> >                                    $(patsubst %.c,lib-%.o,$(efi-deps-y))
> >
> >  lib-$(CONFIG_ARM)              += arm32-stub.o
> >  lib-$(CONFIG_ARM64)            += arm64-stub.o
> >  lib-$(CONFIG_X86)              += x86-stub.o
> >  lib-$(CONFIG_RISCV)            += riscv-stub.o
> > +lib-$(CONFIG_LOONGARCH)                += loongarch-stub.o
> > +
> >  CFLAGS_arm32-stub.o            := -DTEXT_OFFSET=$(TEXT_OFFSET)
> >
> >  # Even when -mbranch-protection=none is set, Clang will generate a
> > @@ -125,6 +129,12 @@ STUBCOPY_FLAGS-$(CONFIG_RISCV)     += --prefix-alloc-sections=.init \
> >                                    --prefix-symbols=__efistub_
> >  STUBCOPY_RELOC-$(CONFIG_RISCV) := R_RISCV_HI20
> >
> > +# For LoongArch, keep all the symbols in .init section and make sure that no
> > +# absolute symbols references doesn't exist.
> > +STUBCOPY_FLAGS-$(CONFIG_LOONGARCH)     += --prefix-alloc-sections=.init \
> > +                                          --prefix-symbols=__efistub_
> > +STUBCOPY_RELOC-$(CONFIG_LOONGARCH)     := R_LARCH_MARK_LA
> > +
> >  $(obj)/%.stub.o: $(obj)/%.o FORCE
> >         $(call if_changed,stubcopy)
> >
> > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> > new file mode 100644
> > index 000000000000..399641a0b0cb
> > --- /dev/null
> > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> > @@ -0,0 +1,425 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Author: Yun Liu <liuyun@loongson.cn>
> > + *         Huacai Chen <chenhuacai@loongson.cn>
> > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > + */
> > +
> > +#include <linux/efi.h>
> > +#include <linux/sort.h>
> > +#include <asm/efi.h>
> > +#include <asm/addrspace.h>
> > +#include <asm/boot_param.h>
> > +#include "efistub.h"
> > +
> > +#define MAX_ARG_COUNT          128
> > +#define CMDLINE_MAX_SIZE       0x200
> > +
> > +static int argc;
> > +static char **argv;
> > +const efi_system_table_t *efi_system_table;
> > +static efi_guid_t screen_info_guid = LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID;
> > +static unsigned int map_entry[LOONGSON3_BOOT_MEM_MAP_MAX];
> > +static struct efi_mmap mmap_array[EFI_MAX_MEMORY_TYPE][LOONGSON3_BOOT_MEM_MAP_MAX];
> > +
> > +struct exit_boot_struct {
> > +       struct boot_params *bp;
> > +       unsigned int *runtime_entry_count;
> > +};
> > +
> > +typedef void (*kernel_entry_t)(int argc, char *argv[], struct boot_params *boot_p);
> > +
> > +extern int kernel_asize;
> > +extern int kernel_fsize;
> > +extern int kernel_offset;
> > +extern unsigned long kernel_vaddr;
> > +extern kernel_entry_t kernel_entry;
> > +
> > +unsigned char efi_crc8(char *buff, int size)
> > +{
> > +       int sum, cnt;
> > +
> > +       for (sum = 0, cnt = 0; cnt < size; cnt++)
> > +               sum = (char) (sum + *(buff + cnt));
> > +
> > +       return (char)(0x100 - sum);
> > +}
> > +
> > +struct screen_info *alloc_screen_info(void)
> > +{
> > +       efi_status_t status;
> > +       struct screen_info *si;
> > +
> > +       status = efi_bs_call(allocate_pool,
> > +                       EFI_RUNTIME_SERVICES_DATA, sizeof(*si), (void **)&si);
> > +       if (status != EFI_SUCCESS)
> > +               return NULL;
> > +
> > +       status = efi_bs_call(install_configuration_table, &screen_info_guid, si);
> > +       if (status == EFI_SUCCESS)
> > +               return si;
> > +
> > +       efi_bs_call(free_pool, si);
> > +
> > +       return NULL;
> > +}
> > +
> > +static void setup_graphics(void)
> > +{
> > +       unsigned long size;
> > +       efi_status_t status;
> > +       efi_guid_t gop_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
> > +       void **gop_handle = NULL;
> > +       struct screen_info *si = NULL;
> > +
> > +       size = 0;
> > +       status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL,
> > +                               &gop_proto, NULL, &size, gop_handle);
> > +       if (status == EFI_BUFFER_TOO_SMALL) {
> > +               si = alloc_screen_info();
> > +               efi_setup_gop(si, &gop_proto, size);
> > +       }
> > +}
> > +
> > +struct boot_params *bootparams_init(efi_system_table_t *sys_table)
> > +{
> > +       efi_status_t status;
> > +       struct boot_params *p;
> > +       unsigned char sig[8] = {'B', 'P', 'I', '0', '1', '0', '0', '2'};
> > +
> > +       status = efi_bs_call(allocate_pool, EFI_RUNTIME_SERVICES_DATA, SZ_64K, (void **)&p);
> > +       if (status != EFI_SUCCESS)
> > +               return NULL;
> > +
> > +       memset(p, 0, SZ_64K);
> > +       memcpy(&p->signature, sig, sizeof(long));
> > +
> > +       return p;
> > +}
> > +
> > +static unsigned long convert_priv_cmdline(char *cmdline_ptr,
> > +               unsigned long rd_addr, unsigned long rd_size)
> > +{
> > +       unsigned int rdprev_size;
> > +       unsigned int cmdline_size;
> > +       efi_status_t status;
> > +       char *pstr, *substr;
> > +       char *initrd_ptr = NULL;
> > +       char convert_str[CMDLINE_MAX_SIZE];
> > +       static char cmdline_array[CMDLINE_MAX_SIZE];
> > +
> > +       cmdline_size = strlen(cmdline_ptr);
> > +       snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel ");
> > +
> > +       initrd_ptr = strstr(cmdline_ptr, "initrd=");
> > +       if (!initrd_ptr) {
> > +               snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel %s", cmdline_ptr);
> > +               goto completed;
> > +       }
> > +       snprintf(convert_str, CMDLINE_MAX_SIZE, " initrd=0x%lx,0x%lx", rd_addr, rd_size);
> > +       rdprev_size = cmdline_size - strlen(initrd_ptr);
> > +       strncat(cmdline_array, cmdline_ptr, rdprev_size);
> > +
> > +       cmdline_ptr = strnstr(initrd_ptr, " ", CMDLINE_MAX_SIZE);
> > +       strcat(cmdline_array, convert_str);
> > +       if (!cmdline_ptr)
> > +               goto completed;
> > +
> > +       strcat(cmdline_array, cmdline_ptr);
> > +
> > +completed:
> > +       status = efi_allocate_pages((MAX_ARG_COUNT + 1) * (sizeof(char *)),
> > +                                       (unsigned long *)&argv, ULONG_MAX);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Alloc argv mmap_array error\n");
> > +               return status;
> > +       }
> > +
> > +       argc = 0;
> > +       pstr = cmdline_array;
> > +
> > +       substr = strsep(&pstr, " \t");
> > +       while (substr != NULL) {
> > +               if (strlen(substr)) {
> > +                       argv[argc++] = substr;
> > +                       if (argc == MAX_ARG_COUNT) {
> > +                               efi_err("Argv mmap_array full!\n");
> > +                               break;
> > +                       }
> > +               }
> > +               substr = strsep(&pstr, " \t");
> > +       }
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +unsigned int efi_memmap_sort(struct loongsonlist_mem_map *memmap,
> > +                       unsigned int index, unsigned int mem_type)
> > +{
> > +       unsigned int i, t;
> > +       unsigned long msize;
> > +
> > +       for (i = 0; i < map_entry[mem_type]; i = t) {
> > +               msize = mmap_array[mem_type][i].mem_size;
> > +               for (t = i + 1; t < map_entry[mem_type]; t++) {
> > +                       if (mmap_array[mem_type][i].mem_start + msize <
> > +                                       mmap_array[mem_type][t].mem_start)
> > +                               break;
> > +
> > +                       msize += mmap_array[mem_type][t].mem_size;
> > +               }
> > +               memmap->map[index].mem_type = mem_type;
> > +               memmap->map[index].mem_start = mmap_array[mem_type][i].mem_start;
> > +               memmap->map[index].mem_size = msize;
> > +               memmap->map[index].attribute = mmap_array[mem_type][i].attribute;
> > +               index++;
> > +       }
> > +
> > +       return index;
> > +}
> > +
> > +static efi_status_t mk_mmap(struct efi_boot_memmap *map, struct boot_params *p)
> > +{

Are you passing a different representation of the memory map to the
core kernel? I think it would be easier just to pass the EFI memory
map like other EFI arches do, and reuse all of the code that we
already have.
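
To illustrate what consuming the raw EFI memory map involves, here is a
minimal standalone sketch. It is not the kernel's code: struct mem_desc
is a stand-in for efi_memory_desc_t, and the DEMO_* constants and
count_conventional_bytes() are names invented for the example. The one
point it is meant to show is that the walk must step by the desc_size
reported by the firmware, not by sizeof() of the descriptor struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for efi_memory_desc_t (layout of EFI_MEMORY_DESCRIPTOR). */
struct mem_desc {
	uint32_t type;
	uint32_t pad;
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

#define DEMO_CONVENTIONAL_MEMORY	7	/* EfiConventionalMemory */
#define DEMO_PAGE_SHIFT			12	/* EFI pages are 4 KiB */

/*
 * Hypothetical helper: sum the bytes of conventional memory in a raw
 * EFI memory map. The stride is desc_size, which the firmware may
 * report as larger than sizeof(struct mem_desc).
 */
uint64_t count_conventional_bytes(const void *map, size_t map_size,
				  size_t desc_size)
{
	uint64_t total = 0;
	size_t off;

	assert(map && desc_size >= sizeof(struct mem_desc));

	for (off = 0; off + desc_size <= map_size; off += desc_size) {
		struct mem_desc d;

		/* Copy out to avoid relying on the buffer's alignment. */
		memcpy(&d, (const uint8_t *)map + off, sizeof(d));
		if (d.type == DEMO_CONVENTIONAL_MEMORY)
			total += d.num_pages << DEMO_PAGE_SHIFT;
	}
	return total;
}
```

The other EFI arches get this walk, plus the type classification, from
the shared code, so no private intermediate format is needed.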

> > +       char checksum;
> > +       unsigned int i;
> > +       unsigned int nr_desc;
> > +       unsigned int mem_type;
> > +       unsigned long count;
> > +       efi_memory_desc_t *mem_desc;
> > +       struct loongsonlist_mem_map *mhp = NULL;
> > +
> > +       memset(map_entry, 0, sizeof(map_entry));
> > +       memset(mmap_array, 0, sizeof(mmap_array));
> > +
> > +       if (!strncmp((char *)p, "BPI", 3)) {
> > +               p->flags |= BPI_FLAGS_UEFI_SUPPORTED;
> > +               p->systemtable = (efi_system_table_t *)efi_system_table;
> > +               p->extlist_offset = sizeof(*p) + sizeof(unsigned long);
> > +               mhp = (struct loongsonlist_mem_map *)((char *)p + p->extlist_offset);
> > +
> > +               memcpy(&mhp->header.signature, "MEM", sizeof(unsigned long));
> > +               mhp->header.length = sizeof(*mhp);
> > +               mhp->desc_version = *map->desc_ver;
> > +               mhp->map_count = 0;
> > +       }
> > +       if (!(*(map->map_size)) || !(*(map->desc_size)) || !mhp) {
> > +               efi_err("get memory info error\n");
> > +               return EFI_INVALID_PARAMETER;
> > +       }
> > +       nr_desc = *(map->map_size) / *(map->desc_size);
> > +
> > +       /*
> > +        * According to UEFI SPEC, mmap_buf is the accurate Memory Map
> > +        * mmap_array now we can fill platform specific memory structure.
> > +        */
> > +       for (i = 0; i < nr_desc; i++) {
> > +               mem_desc = (efi_memory_desc_t *)((void *)(*map->map) + (i * (*(map->desc_size))));
> > +               switch (mem_desc->type) {
> > +               case EFI_RESERVED_TYPE:
> > +               case EFI_RUNTIME_SERVICES_CODE:
> > +               case EFI_RUNTIME_SERVICES_DATA:
> > +               case EFI_MEMORY_MAPPED_IO:
> > +               case EFI_MEMORY_MAPPED_IO_PORT_SPACE:
> > +               case EFI_UNUSABLE_MEMORY:
> > +               case EFI_PAL_CODE:
> > +                       mem_type = ADDRESS_TYPE_RESERVED;
> > +                       break;
> > +
> > +               case EFI_ACPI_MEMORY_NVS:
> > +                       mem_type = ADDRESS_TYPE_NVS;
> > +                       break;
> > +
> > +               case EFI_ACPI_RECLAIM_MEMORY:
> > +                       mem_type = ADDRESS_TYPE_ACPI;
> > +                       break;
> > +
> > +               case EFI_LOADER_CODE:
> > +               case EFI_LOADER_DATA:
> > +               case EFI_PERSISTENT_MEMORY:
> > +               case EFI_BOOT_SERVICES_CODE:
> > +               case EFI_BOOT_SERVICES_DATA:
> > +               case EFI_CONVENTIONAL_MEMORY:
> > +                       mem_type = ADDRESS_TYPE_SYSRAM;
> > +                       break;
> > +
> > +               default:
> > +                       continue;
> > +               }
> > +
> > +               mmap_array[mem_type][map_entry[mem_type]].mem_type = mem_type;
> > +               mmap_array[mem_type][map_entry[mem_type]].mem_start =
> > +                                               mem_desc->phys_addr & TO_PHYS_MASK;
> > +               mmap_array[mem_type][map_entry[mem_type]].mem_size =
> > +                                               mem_desc->num_pages << EFI_PAGE_SHIFT;
> > +               mmap_array[mem_type][map_entry[mem_type]].attribute =
> > +                                               mem_desc->attribute;
> > +               map_entry[mem_type]++;
> > +       }
> > +
> > +       count = mhp->map_count;
> > +       /* Sort EFI memmap and add to BPI for kernel */
> > +       for (i = 0; i < LOONGSON3_BOOT_MEM_MAP_MAX; i++) {
> > +               if (!map_entry[i])
> > +                       continue;
> > +               count = efi_memmap_sort(mhp, count, i);
> > +       }
> > +
> > +       mhp->map_count = count;
> > +       mhp->header.checksum = 0;
> > +
> > +       checksum = efi_crc8((char *)mhp, mhp->header.length);
> > +       mhp->header.checksum = checksum;
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> > +{
> > +       efi_status_t status;
> > +       struct exit_boot_struct *p = priv;
> > +
> > +       status = mk_mmap(map, p->bp);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Make kernel memory map failed!\n");
> > +               return status;
> > +       }
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +static efi_status_t exit_boot_services(struct boot_params *boot_params, void *handle)
> > +{
> > +       unsigned int desc_version;
> > +       unsigned int runtime_entry_count = 0;
> > +       unsigned long map_size, key, desc_size, buff_size;
> > +       efi_status_t status;
> > +       efi_memory_desc_t *mem_map;
> > +       struct efi_boot_memmap map;
> > +       struct exit_boot_struct priv;
> > +
> > +       map.map                 = &mem_map;
> > +       map.map_size            = &map_size;
> > +       map.desc_size           = &desc_size;
> > +       map.desc_ver            = &desc_version;
> > +       map.key_ptr             = &key;
> > +       map.buff_size           = &buff_size;
> > +       status = efi_get_memory_map(&map);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Unable to retrieve UEFI memory map.\n");
> > +               return status;
> > +       }
> > +
> > +       priv.bp = boot_params;
> > +       priv.runtime_entry_count = &runtime_entry_count;
> > +
> > +       /* Might as well exit boot services now */
> > +       status = efi_exit_boot_services(handle, &map, &priv, exit_boot_func);
> > +       if (status != EFI_SUCCESS)
> > +               return status;
> > +
> > +       return EFI_SUCCESS;
> > +}
> > +
> > +/*
> > + * EFI entry point for the LoongArch EFI stub.
> > + */
> > +efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table)

Why are you not using the generic EFI stub boot flow?

> > +{
> > +       unsigned int cmdline_size = 0;
> > +       unsigned long kernel_addr = 0;
> > +       unsigned long initrd_addr = 0;
> > +       unsigned long initrd_size = 0;
> > +       enum efi_secureboot_mode secure_boot;
> > +       char *cmdline_ptr = NULL;
> > +       struct boot_params *boot_p;
> > +       efi_status_t status;
> > +       efi_loaded_image_t *image;
> > +       efi_guid_t loaded_image_proto;
> > +       kernel_entry_t real_kernel_entry;
> > +
> > +       /* Config Direct Mapping */
> > +       csr_writeq(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
> > +       csr_writeq(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
> > +

Why is this needed? Doesn't the EFI firmware enter the EFI loader with
this mapping enabled?

> > +       efi_system_table = sys_table;
> > +       loaded_image_proto = LOADED_IMAGE_PROTOCOL_GUID;
> > +       kernel_addr = (unsigned long)&kernel_offset - kernel_offset;
> > +       real_kernel_entry = (kernel_entry_t)
> > +               ((unsigned long)&kernel_entry - kernel_addr + kernel_vaddr);
> > +
> > +       /* Check if we were booted by the EFI firmware */
> > +       if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
> > +               goto fail;
> > +
> > +       /*
> > +        * Get a handle to the loaded image protocol.  This is used to get
> > +        * information about the running image, such as size and the command
> > +        * line.
> > +        */
> > +       status = sys_table->boottime->handle_protocol(handle,
> > +                                       &loaded_image_proto, (void *)&image);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Failed to get loaded image protocol\n");
> > +               goto fail;
> > +       }
> > +
> > +       /* Get the command line from EFI, using the LOADED_IMAGE protocol. */
> > +       cmdline_ptr = efi_convert_cmdline(image, &cmdline_size);
> > +       if (!cmdline_ptr) {
> > +               efi_err("Getting command line failed!\n");
> > +               goto fail_free_cmdline;
> > +       }
> > +
> > +#ifdef CONFIG_CMDLINE_BOOL
> > +       if (cmdline_size == 0)
> > +               efi_parse_options(CONFIG_CMDLINE);
> > +#endif
> > +       if (!IS_ENABLED(CONFIG_CMDLINE_OVERRIDE) && cmdline_size > 0)
> > +               efi_parse_options(cmdline_ptr);
> > +
> > +       efi_info("Booting Linux Kernel...\n");
> > +
> > +       efi_relocate_kernel(&kernel_addr, kernel_fsize, kernel_asize,
> > +                           PHYSADDR(kernel_vaddr), SZ_2M, PHYSADDR(kernel_vaddr));
> > +
> > +       setup_graphics();
> > +       secure_boot = efi_get_secureboot();
> > +       efi_enable_reset_attack_mitigation();
> > +
> > +       status = efi_load_initrd(image, &initrd_addr, &initrd_size, SZ_4G, ULONG_MAX);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Failed get initrd addr!\n");
> > +               goto fail_free;
> > +       }
> > +
> > +       status = convert_priv_cmdline(cmdline_ptr, initrd_addr, initrd_size);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("Covert cmdline failed!\n");
> > +               goto fail_free;
> > +       }
> > +
> > +       boot_p = bootparams_init(sys_table);
> > +       if (!boot_p) {
> > +               efi_err("Create BPI struct error!\n");
> > +               goto fail;
> > +       }
> > +
> > +       status = exit_boot_services(boot_p, handle);
> > +       if (status != EFI_SUCCESS) {
> > +               efi_err("exit_boot services failed!\n");
> > +               goto fail_free;
> > +       }
> > +
> > +       real_kernel_entry(argc, argv, boot_p);
> > +
> > +       return EFI_SUCCESS;
> > +
> > +fail_free:
> > +       efi_free(initrd_size, initrd_addr);
> > +
> > +fail_free_cmdline:
> > +       efi_free(cmdline_size, (unsigned long)cmdline_ptr);
> > +
> > +fail:
> > +       return status;
> > +}
> > diff --git a/include/linux/pe.h b/include/linux/pe.h
> > index daf09ffffe38..f4bb0b6a416d 100644
> > --- a/include/linux/pe.h
> > +++ b/include/linux/pe.h
> > @@ -65,6 +65,7 @@
> >  #define        IMAGE_FILE_MACHINE_SH5          0x01a8
> >  #define        IMAGE_FILE_MACHINE_THUMB        0x01c2
> >  #define        IMAGE_FILE_MACHINE_WCEMIPSV2    0x0169
> > +#define        IMAGE_FILE_MACHINE_LOONGARCH    0x6264
> >
> >  /* flags */
> >  #define IMAGE_FILE_RELOCS_STRIPPED           0x0001
> > --
> > 2.27.0
> >


* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-05-03  7:23     ` Ard Biesheuvel
@ 2022-05-05  9:59       ` Huacai Chen
  2022-05-06  8:14         ` Ard Biesheuvel
  0 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-05-05  9:59 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang

Hi, Ard,

On Tue, May 3, 2022 at 3:24 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Sat, 30 Apr 2022 at 11:56, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > >
> > > This patch adds efistub booting support, which is the standard UEFI boot
> > > protocol for us to use.
> > >
> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> >
> > It's good to see that you completed this. Unfortunately you did not add Ard
> > Biesheuvel to Cc, he is the one who needs to review this code. Adding him
> > to Cc now, with the full patch quoted below for him (no more comments
> > from me there).
> >
>
> Thanks Arnd,
>
> >
> > > ---
> > >  arch/loongarch/Kbuild                         |   3 +
> > >  arch/loongarch/Kconfig                        |   8 +
> > >  arch/loongarch/Makefile                       |  18 +-
> > >  arch/loongarch/boot/Makefile                  |  23 +
> > >  arch/loongarch/kernel/efi-header.S            | 100 +++++
> > >  arch/loongarch/kernel/head.S                  |  44 +-
> > >  arch/loongarch/kernel/image-vars.h            |  30 ++
> > >  arch/loongarch/kernel/vmlinux.lds.S           |  23 +-
> > >  drivers/firmware/efi/Kconfig                  |   4 +-
> > >  drivers/firmware/efi/libstub/Makefile         |  14 +-
> > >  drivers/firmware/efi/libstub/loongarch-stub.c | 425 ++++++++++++++++++
> > >  include/linux/pe.h                            |   1 +
> > >  12 files changed, 680 insertions(+), 13 deletions(-)
> > >  create mode 100644 arch/loongarch/boot/Makefile
> > >  create mode 100644 arch/loongarch/kernel/efi-header.S
> > >  create mode 100644 arch/loongarch/kernel/image-vars.h
> > >  create mode 100644 drivers/firmware/efi/libstub/loongarch-stub.c
> > >
> > > diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild
> > > index 1ad35aabdd16..ab5373d0a24f 100644
> > > --- a/arch/loongarch/Kbuild
> > > +++ b/arch/loongarch/Kbuild
> > > @@ -1,3 +1,6 @@
> > >  obj-y += kernel/
> > >  obj-y += mm/
> > >  obj-y += vdso/
> > > +
> > > +# for cleaning
> > > +subdir- += boot
> > > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> > > index 44b763046893..55225ee5f868 100644
> > > --- a/arch/loongarch/Kconfig
> > > +++ b/arch/loongarch/Kconfig
> > > @@ -265,6 +265,14 @@ config EFI
> > >           resultant kernel should continue to boot on existing non-EFI
> > >           platforms.
> > >
> > > +config EFI_STUB
> > > +       bool "EFI boot stub support"
> > > +       default y
> > > +       depends on EFI
> > > +       help
> > > +         This kernel feature allows the kernel to be loaded directly by
> > > +         EFI firmware without the use of a bootloader.
> > > +
>
> Please enable EFI_GENERIC_STUB here
>
> > >  config FORCE_MAX_ZONEORDER
> > >         int "Maximum zone order"
> > >         range 14 64 if PAGE_SIZE_64KB
> > > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > > index c4b3f53cd276..d88a792dafbe 100644
> > > --- a/arch/loongarch/Makefile
> > > +++ b/arch/loongarch/Makefile
> > > @@ -3,6 +3,14 @@
> > >  # Author: Huacai Chen <chenhuacai@loongson.cn>
> > >  # Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > >
> > > +boot   := arch/loongarch/boot
> > > +
> > > +ifndef CONFIG_EFI_STUB
> > > +KBUILD_IMAGE   = $(boot)/vmlinux
> > > +else
> > > +KBUILD_IMAGE   = $(boot)/vmlinux.efi
> > > +endif
> > > +
> > >  #
> > >  # Select the object file format to substitute into the linker script.
> > >  #
> > > @@ -30,8 +38,6 @@ ld-emul                       = $(64bit-emul)
> > >  cflags-y               += -mabi=lp64s
> > >  endif
> > >
> > > -all-y                  := vmlinux
> > > -
> > >  #
> > >  # GCC uses -G0 -mabicalls -fpic as default.  We don't want PIC in the kernel
> > >  # code since it only slows down the whole thing.  At some point we might make
> > > @@ -75,6 +81,7 @@ endif
> > >  head-y := arch/loongarch/kernel/head.o
> > >
> > >  libs-y += arch/loongarch/lib/
> > > +libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > >
> > >  ifeq ($(KBUILD_EXTMOD),)
> > >  prepare: vdso_prepare
> > > @@ -86,12 +93,13 @@ PHONY += vdso_install
> > >  vdso_install:
> > >         $(Q)$(MAKE) $(build)=arch/loongarch/vdso $@
> > >
> > > -all:   $(all-y)
> > > +all:   $(KBUILD_IMAGE)
> > >
> > > -CLEAN_FILES += vmlinux
> > > +$(KBUILD_IMAGE): vmlinux
> > > +       $(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $@
> > >
> > >  install:
> > > -       $(Q)install -D -m 755 vmlinux $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > > +       $(Q)install -D -m 755 $(KBUILD_IMAGE) $(INSTALL_PATH)/vmlinux-$(KERNELRELEASE)
> > >         $(Q)install -D -m 644 .config $(INSTALL_PATH)/config-$(KERNELRELEASE)
> > >         $(Q)install -D -m 644 System.map $(INSTALL_PATH)/System.map-$(KERNELRELEASE)
> > >
> > > diff --git a/arch/loongarch/boot/Makefile b/arch/loongarch/boot/Makefile
> > > new file mode 100644
> > > index 000000000000..66f2293c34b2
> > > --- /dev/null
> > > +++ b/arch/loongarch/boot/Makefile
> > > @@ -0,0 +1,23 @@
> > > +#
> > > +# arch/loongarch/boot/Makefile
> > > +#
> > > +# Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > > +#
> > > +
> > > +drop-sections := .comment .note .options .note.gnu.build-id
> > > +strip-flags   := $(addprefix --remove-section=,$(drop-sections)) -S
> > > +
> > > +targets := vmlinux
> > > +quiet_cmd_strip = STRIP          $@
> > > +      cmd_strip = $(STRIP) -s $@
> > > +
> > > +$(obj)/vmlinux: vmlinux FORCE
> > > +       $(call if_changed,copy)
> > > +       $(call if_changed,strip)
> > > +
>
> I don't think you are supposed to use if_changed twice on the same target.
OK, thanks.

>
> > > +targets += vmlinux.efi
> > > +quiet_cmd_eficopy = OBJCOPY $@
> > > +      cmd_eficopy = $(OBJCOPY) -O binary $(strip-flags) $< $@
> > > +
>
> You could use the generic cmd_objcopy here instead of inventing your
> own. Just set OBJCOPYFLAGS_vmlinux.efi to the right value.
OK, thanks.

>
> > > +$(obj)/vmlinux.efi: $(obj)/vmlinux FORCE
> > > +       $(call if_changed,eficopy)
> > > diff --git a/arch/loongarch/kernel/efi-header.S b/arch/loongarch/kernel/efi-header.S
> > > new file mode 100644
> > > index 000000000000..ceb44524944a
> > > --- /dev/null
> > > +++ b/arch/loongarch/kernel/efi-header.S
> > > @@ -0,0 +1,100 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +/*
> > > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > > + */
> > > +
> > > +#include <linux/pe.h>
> > > +#include <linux/sizes.h>
> > > +
> > > +       .macro  __EFI_PE_HEADER
> > > +       .long   PE_MAGIC
> > > +coff_header:
>
> Please use .L prefixed local symbol definitions in this file, so we
> don't clutter up the core kernel's global symbol table.
I found that ARM64 uses the .L prefix while RISC-V doesn't, so I suppose
that both are OK?

>
> > > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > > +       .short  section_count                           /* NumberOfSections */
> > > +       .long   0                                       /* TimeDateStamp */
> > > +       .long   0                                       /* PointerToSymbolTable */
> > > +       .long   0                                       /* NumberOfSymbols */
> > > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > > +
> > > +optional_header:
> > > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > > +       .byte   0x02                                    /* MajorLinkerVersion */
> > > +       .byte   0x14                                    /* MinorLinkerVersion */
> > > +       .long   __inittext_end - efi_header_end         /* SizeOfCode */
> > > +       .long   _end - __initdata_begin                 /* SizeOfInitializedData */
> > > +       .long   0                                       /* SizeOfUninitializedData */
> > > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > > +
> > > +extra_header_fields:
> > > +       .quad   0                                       /* ImageBase */
> > > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > > +       .short  0                                       /* MajorOperatingSystemVersion */
> > > +       .short  0                                       /* MinorOperatingSystemVersion */
> > > +       .short  0                                       /* MajorImageVersion */
> > > +       .short  0                                       /* MinorImageVersion */
>
> Once you enable EFI_GENERIC_STUB, set the above fields to
> EFISTUB_MAJOR_IMAGE_VERSION/EFISTUB_MINOR_IMAGE_VERSION, so
> bootloaders know they can use the LoadFile2 based initrd loader.
OK, the version fields will be filled in.
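
As a rough illustration of how a loader could read these two fields back,
here is a minimal sketch in C. The helper name pe_image_version() and the
struct are hypothetical; the offsets are simply counted from the field order
in the quoted efi-header.S (4-byte PE signature, 20-byte COFF header, then
44 bytes of PE32+ optional header before MajorImageVersion). It is not taken
from any existing bootloader.

```c
#include <stddef.h>
#include <stdint.h>

#define PE_SIG_SIZE              4   /* "PE\0\0" */
#define COFF_HDR_SIZE            20  /* Machine .. Characteristics */
#define OPT_MAJOR_IMAGE_VERSION  44  /* offset inside the PE32+ optional header */

/* Hypothetical container for the two version fields. */
struct image_version {
	uint16_t major;
	uint16_t minor;
};

/* Read a little-endian 16-bit value without assuming alignment. */
static uint16_t read_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: fetch MajorImageVersion/MinorImageVersion from an
 * image whose PE header starts at pe_off. Returns 0 on success, -1 if the
 * buffer is too short.
 */
static int pe_image_version(const uint8_t *img, size_t len, uint32_t pe_off,
			    struct image_version *v)
{
	size_t off = (size_t)pe_off + PE_SIG_SIZE + COFF_HDR_SIZE +
		     OPT_MAJOR_IMAGE_VERSION;

	if (off + 4 > len)
		return -1;

	v->major = read_le16(img + off);
	v->minor = read_le16(img + off + 2);
	return 0;
}
```

A loader would then compare the returned values against whatever version it
expects before choosing the LoadFile2 path.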

>
> > > +       .short  0                                       /* MajorSubsystemVersion */
> > > +       .short  0                                       /* MinorSubsystemVersion */
> > > +       .long   0                                       /* Win32VersionValue */
> > > +
> > > +       .long   _end - _head                            /* SizeOfImage */
> > > +
> > > +       /* Everything before the kernel image is considered part of the header */
> > > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > > +       .long   0                                       /* CheckSum */
> > > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > > +       .short  0                                       /* DllCharacteristics */
> > > +       .quad   0                                       /* SizeOfStackReserve */
> > > +       .quad   0                                       /* SizeOfStackCommit */
> > > +       .quad   0                                       /* SizeOfHeapReserve */
> > > +       .quad   0                                       /* SizeOfHeapCommit */
> > > +       .long   0                                       /* LoaderFlags */
> > > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > > +
> > > +       .quad   0                                       /* ExportTable */
> > > +       .quad   0                                       /* ImportTable */
> > > +       .quad   0                                       /* ResourceTable */
> > > +       .quad   0                                       /* ExceptionTable */
> > > +       .quad   0                                       /* CertificationTable */
> > > +       .quad   0                                       /* BaseRelocationTable */
> > > +
> > > +       /* Section table */
> > > +section_table:
> > > +       .ascii  ".text\0\0\0"
> > > +       .long   __inittext_end - efi_header_end         /* VirtualSize */
> > > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > > +       .long   __inittext_end - efi_header_end         /* SizeOfRawData */
> > > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > > +
> > > +       .long   0                                       /* PointerToRelocations */
> > > +       .long   0                                       /* PointerToLineNumbers */
> > > +       .short  0                                       /* NumberOfRelocations */
> > > +       .short  0                                       /* NumberOfLineNumbers */
> > > +       .long   IMAGE_SCN_CNT_CODE | \
> > > +               IMAGE_SCN_MEM_READ | \
> > > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > > +
> > > +       .ascii  ".data\0\0\0"
> > > +       .long   _end - __initdata_begin                 /* VirtualSize */
> > > +       .long   __initdata_begin - _head                /* VirtualAddress */
> > > +       .long   _edata - __initdata_begin               /* SizeOfRawData */
> > > +       .long   __initdata_begin - _head                /* PointerToRawData */
> > > +
> > > +       .long   0                                       /* PointerToRelocations */
> > > +       .long   0                                       /* PointerToLineNumbers */
> > > +       .short  0                                       /* NumberOfRelocations */
> > > +       .short  0                                       /* NumberOfLineNumbers */
> > > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > > +               IMAGE_SCN_MEM_READ | \
> > > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > > +
> > > +       .org 0x20e
> > > +       .word kernel_version - 512 -  _head
> > > +
> > > +       .set    section_count, (. - section_table) / 40
> > > +efi_header_end:
> > > +       .endm
> > > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> > > index b4a0b28da3e7..361b72e8bfc5 100644
> > > --- a/arch/loongarch/kernel/head.S
> > > +++ b/arch/loongarch/kernel/head.S
> > > @@ -11,11 +11,53 @@
> > >  #include <asm/regdef.h>
> > >  #include <asm/loongarch.h>
> > >  #include <asm/stackframe.h>
> > > +#include <generated/compile.h>
> > > +#include <generated/utsrelease.h>
> > >
> > > -SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > > +#ifdef CONFIG_EFI_STUB
> > > +
> > > +#include "efi-header.S"
> > > +
> > > +       __HEAD
> > > +
> > > +_head:
> > > +       /* "MZ", MS-DOS header */
> > > +       .word   MZ_MAGIC
> > > +       .org    0x28
> > > +       .ascii  "Loongson\0"
>
> Is this part of a special boot protocol? It would be better not to
> overload EFI and PE/COFF with your own hacks if we can avoid it.
This is used as a magic string, and GRUB will check it.
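
To make the layout concrete, here is a minimal sketch in C of the kind of
check a loader could do on the image header. The offsets come from the
.org directives in the quoted head.S (0x28 for the string, 0x3c for the PE
header offset); the function name check_loongarch_image() is hypothetical
and this is not GRUB's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOONGSON_MAGIC_OFFSET 0x28 /* from ".org 0x28" in head.S */
#define PE_OFFSET_FIELD       0x3c /* from ".org 0x3c" in head.S */

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: returns the PE header offset if the buffer carries
 * the "MZ" magic, the "Loongson" string and a "PE\0\0" signature, or -1
 * otherwise.
 */
static long check_loongarch_image(const uint8_t *img, size_t len)
{
	uint32_t pe_off;

	if (len < 0x40)
		return -1;
	if (img[0] != 'M' || img[1] != 'Z')
		return -1;
	if (memcmp(img + LOONGSON_MAGIC_OFFSET, "Loongson", 8) != 0)
		return -1;

	pe_off = read_le32(img + PE_OFFSET_FIELD);
	if ((size_t)pe_off > len - 4)
		return -1;
	if (memcmp(img + pe_off, "PE\0\0", 4) != 0)
		return -1;

	return (long)pe_off;
}
```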

>
> > > +       .org    0x3c
> > > +       /* Offset to the PE header */
> > > +       .long   pe_header - _head
> > > +
> > > +pe_header:
> > > +       __EFI_PE_HEADER
> > > +
> > > +kernel_asize:
> > > +       .long _end - _text
> > > +
> > > +kernel_fsize:
> > > +       .long _edata - _text
> > > +
> > > +kernel_vaddr:
> > > +       .quad VMLINUX_LOAD_ADDRESS
> > > +
> > > +kernel_offset:
> > > +       .long kernel_offset - _text
> > > +
> > > +kernel_version:
> > > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > > +
> > > +SYM_L_GLOBAL(kernel_asize)
> > > +SYM_L_GLOBAL(kernel_fsize)
> > > +SYM_L_GLOBAL(kernel_vaddr)
> > > +SYM_L_GLOBAL(kernel_offset)
>
> I think you can simplify this to
>
> SYM_DATA(kernel_asize, .long _end - _text);
>
> etc etc (which implies the .globl annotation)
OK, thanks.

>
>
> > > +
> > > +#endif
> > >
> > >         __REF
> > >
> > > +SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > > +
> > >  SYM_CODE_START(kernel_entry)                   # kernel entry point
> > >
> > >         /* Config direct window and set PG */
> > > diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
> > > new file mode 100644
> > > index 000000000000..0162402b6212
> > > --- /dev/null
> > > +++ b/arch/loongarch/kernel/image-vars.h
> > > @@ -0,0 +1,30 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +/*
> > > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > > + */
> > > +#ifndef __LOONGARCH_KERNEL_IMAGE_VARS_H
> > > +#define __LOONGARCH_KERNEL_IMAGE_VARS_H
> > > +
> > > +#ifdef CONFIG_EFI_STUB
> > > +
> > > +__efistub_memcmp               = memcmp;
> > > +__efistub_memcpy               = memcpy;
> > > +__efistub_memmove              = memmove;
> > > +__efistub_memset               = memset;
> > > +__efistub_strcat               = strcat;
> > > +__efistub_strcmp               = strcmp;
> > > +__efistub_strlen               = strlen;
> > > +__efistub_strncat              = strncat;
> > > +__efistub_strnstr              = strnstr;
> > > +__efistub_strnlen              = strnlen;
> > > +__efistub_strpbrk              = strpbrk;
> > > +__efistub_strsep               = strsep;
> > > +__efistub_kernel_entry         = kernel_entry;
> > > +__efistub_kernel_asize         = kernel_asize;
> > > +__efistub_kernel_fsize         = kernel_fsize;
> > > +__efistub_kernel_vaddr         = kernel_vaddr;
> > > +__efistub_kernel_offset                = kernel_offset;
> > > +
> > > +#endif
> > > +
> > > +#endif /* __LOONGARCH_KERNEL_IMAGE_VARS_H */
> > > diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> > > index 02abfaaa4892..7da4c4d7c50d 100644
> > > --- a/arch/loongarch/kernel/vmlinux.lds.S
> > > +++ b/arch/loongarch/kernel/vmlinux.lds.S
> > > @@ -12,6 +12,14 @@
> > >  #define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
> > >
> > >  #include <asm-generic/vmlinux.lds.h>
> > > +#include "image-vars.h"
> > > +
> > > +/*
> > > + * Max available Page Size is 64K, so we set SectionAlignment
> > > + * field of EFI application to 64K.
> > > + */
> > > +PECOFF_FILE_ALIGN = 0x200;
> > > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > >
> > >  OUTPUT_ARCH(loongarch)
> > >  ENTRY(kernel_entry)
> > > @@ -27,6 +35,9 @@ SECTIONS
> > >         . = VMLINUX_LOAD_ADDRESS;
> > >
> > >         _text = .;
> > > +       HEAD_TEXT_SECTION
> > > +
> > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > >         .text : {
> > >                 TEXT_TEXT
> > >                 SCHED_TEXT
> > > @@ -38,11 +49,12 @@ SECTIONS
> > >                 *(.fixup)
> > >                 *(.gnu.warning)
> > >         } :text = 0
> > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > >         _etext = .;
> > >
> > >         EXCEPTION_TABLE(16)
> > >
> > > -       . = ALIGN(PAGE_SIZE);
> > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > >         __init_begin = .;
> > >         __inittext_begin = .;
> > >
> > > @@ -51,6 +63,7 @@ SECTIONS
> > >                 EXIT_TEXT
> > >         }
> > >
> > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > >         __inittext_end = .;
> > >
> > >         __initdata_begin = .;
> > > @@ -60,6 +73,10 @@ SECTIONS
> > >                 EXIT_DATA
> > >         }
> > >
> > > +       .init.bss : {
> > > +               *(.init.bss)
> > > +       }
> > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > >         __initdata_end = .;
> > >
> > >         __init_end = .;
> > > @@ -71,11 +88,11 @@ SECTIONS
> > >         .sdata : {
> > >                 *(.sdata)
> > >         }
> > > -
> > > -       . = ALIGN(SZ_64K);
> > > +       .edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGN); }
> > >         _edata =  .;
> > >
> > >         BSS_SECTION(0, SZ_64K, 8)
> > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > >
> > >         _end = .;
> > >
> > > diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
> > > index 2c3dac5ecb36..ecb4e0b1295a 100644
> > > --- a/drivers/firmware/efi/Kconfig
> > > +++ b/drivers/firmware/efi/Kconfig
> > > @@ -121,9 +121,9 @@ config EFI_ARMSTUB_DTB_LOADER
> > >
> > >  config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
> > >         bool "Enable the command line initrd loader" if !X86
> > > -       depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
> > > -       default y if X86
> > >         depends on !RISCV
> > > +       depends on EFI_STUB && (EFI_GENERIC_STUB || X86 || LOONGARCH)
> > > +       default y if (X86 || LOONGARCH)
>
> Don't enable the command line initrd loader please. It is deprecated,
> and has been replaced with the LoadFile2 protocol based one, which is
> more flexible.
>
> Uboot already implements it, as well as EDK2. GRUB does not implement
> this yet afair, but it should not be that hard to add.
If we don't select this, is it still possible to load an initrd from the
UEFI shell?
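
For context, convert_priv_cmdline() in this patch replaces a user-supplied
initrd= argument with an "initrd=<addr>,<size>" pair before the command line
is handed to the kernel. A simplified sketch of that rewrite in C follows.
The helper name rewrite_initrd_arg() is hypothetical, and unlike the stub it
does not prepend "kernel " or split the result into argv.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy cmdline into out, replacing the first
 * "initrd=..." token with "initrd=0x<addr>,0x<size>". Returns 0 on
 * success, -1 if out is too small.
 */
static int rewrite_initrd_arg(const char *cmdline, unsigned long rd_addr,
			      unsigned long rd_size, char *out, size_t out_len)
{
	const char *initrd = strstr(cmdline, "initrd=");
	const char *rest;
	int n;

	if (!initrd) {
		/* Nothing to rewrite: pass the command line through. */
		n = snprintf(out, out_len, "%s", cmdline);
		return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
	}

	/* The token ends at the next space or at the end of the string. */
	rest = strchr(initrd, ' ');
	if (!rest)
		rest = initrd + strlen(initrd);

	n = snprintf(out, out_len, "%.*sinitrd=0x%lx,0x%lx%s",
		     (int)(initrd - cmdline), cmdline, rd_addr, rd_size, rest);
	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```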

>
> > >         help
> > >           Select this config option to add support for the initrd= command
> > >           line parameter, allowing an initrd that resides on the same volume
> > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > > index d0537573501e..663e9d317299 100644
> > > --- a/drivers/firmware/efi/libstub/Makefile
> > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > @@ -26,6 +26,8 @@ cflags-$(CONFIG_ARM)          := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > >                                    $(call cc-option,-mno-single-pic-base)
> > >  cflags-$(CONFIG_RISCV)         := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > >                                    -fpic
> > > +cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > +                                  -fpic
> > >
> > >  cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > >
> > > @@ -55,7 +57,7 @@ KCOV_INSTRUMENT                       := n
> > >  lib-y                          := efi-stub-helper.o gop.o secureboot.o tpm.o \
> > >                                    file.o mem.o random.o randomalloc.o pci.o \
> > >                                    skip_spaces.o lib-cmdline.o lib-ctype.o \
> > > -                                  alignedmem.o relocate.o vsprintf.o
> > > +                                  alignedmem.o relocate.o string.o vsprintf.o
> > >
> > >  # include the stub's generic dependencies from lib/ when building for ARM/arm64
> > >  efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > > @@ -63,13 +65,15 @@ efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > >  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
> > >         $(call if_changed_rule,cc_o_c)
> > >
> > > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o \
> > > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o \
> > >                                    $(patsubst %.c,lib-%.o,$(efi-deps-y))
> > >
> > >  lib-$(CONFIG_ARM)              += arm32-stub.o
> > >  lib-$(CONFIG_ARM64)            += arm64-stub.o
> > >  lib-$(CONFIG_X86)              += x86-stub.o
> > >  lib-$(CONFIG_RISCV)            += riscv-stub.o
> > > +lib-$(CONFIG_LOONGARCH)                += loongarch-stub.o
> > > +
> > >  CFLAGS_arm32-stub.o            := -DTEXT_OFFSET=$(TEXT_OFFSET)
> > >
> > >  # Even when -mbranch-protection=none is set, Clang will generate a
> > > @@ -125,6 +129,12 @@ STUBCOPY_FLAGS-$(CONFIG_RISCV)     += --prefix-alloc-sections=.init \
> > >                                    --prefix-symbols=__efistub_
> > >  STUBCOPY_RELOC-$(CONFIG_RISCV) := R_RISCV_HI20
> > >
> > > +# For LoongArch, keep all the symbols in .init section and make sure that no
> > > +# absolute symbol references exist.
> > > +STUBCOPY_FLAGS-$(CONFIG_LOONGARCH)     += --prefix-alloc-sections=.init \
> > > +                                          --prefix-symbols=__efistub_
> > > +STUBCOPY_RELOC-$(CONFIG_LOONGARCH)     := R_LARCH_MARK_LA
> > > +
> > >  $(obj)/%.stub.o: $(obj)/%.o FORCE
> > >         $(call if_changed,stubcopy)
> > >
> > > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> > > new file mode 100644
> > > index 000000000000..399641a0b0cb
> > > --- /dev/null
> > > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> > > @@ -0,0 +1,425 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Author: Yun Liu <liuyun@loongson.cn>
> > > + *         Huacai Chen <chenhuacai@loongson.cn>
> > > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > > + */
> > > +
> > > +#include <linux/efi.h>
> > > +#include <linux/sort.h>
> > > +#include <asm/efi.h>
> > > +#include <asm/addrspace.h>
> > > +#include <asm/boot_param.h>
> > > +#include "efistub.h"
> > > +
> > > +#define MAX_ARG_COUNT          128
> > > +#define CMDLINE_MAX_SIZE       0x200
> > > +
> > > +static int argc;
> > > +static char **argv;
> > > +const efi_system_table_t *efi_system_table;
> > > +static efi_guid_t screen_info_guid = LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID;
> > > +static unsigned int map_entry[LOONGSON3_BOOT_MEM_MAP_MAX];
> > > +static struct efi_mmap mmap_array[EFI_MAX_MEMORY_TYPE][LOONGSON3_BOOT_MEM_MAP_MAX];
> > > +
> > > +struct exit_boot_struct {
> > > +       struct boot_params *bp;
> > > +       unsigned int *runtime_entry_count;
> > > +};
> > > +
> > > +typedef void (*kernel_entry_t)(int argc, char *argv[], struct boot_params *boot_p);
> > > +
> > > +extern int kernel_asize;
> > > +extern int kernel_fsize;
> > > +extern int kernel_offset;
> > > +extern unsigned long kernel_vaddr;
> > > +extern kernel_entry_t kernel_entry;
> > > +
> > > +unsigned char efi_crc8(char *buff, int size)
> > > +{
> > > +       int sum, cnt;
> > > +
> > > +       for (sum = 0, cnt = 0; cnt < size; cnt++)
> > > +               sum = (char) (sum + *(buff + cnt));
> > > +
> > > +       return (char)(0x100 - sum);
> > > +}
> > > +
> > > +struct screen_info *alloc_screen_info(void)
> > > +{
> > > +       efi_status_t status;
> > > +       struct screen_info *si;
> > > +
> > > +       status = efi_bs_call(allocate_pool,
> > > +                       EFI_RUNTIME_SERVICES_DATA, sizeof(*si), (void **)&si);
> > > +       if (status != EFI_SUCCESS)
> > > +               return NULL;
> > > +
> > > +       status = efi_bs_call(install_configuration_table, &screen_info_guid, si);
> > > +       if (status == EFI_SUCCESS)
> > > +               return si;
> > > +
> > > +       efi_bs_call(free_pool, si);
> > > +
> > > +       return NULL;
> > > +}
> > > +
> > > +static void setup_graphics(void)
> > > +{
> > > +       unsigned long size;
> > > +       efi_status_t status;
> > > +       efi_guid_t gop_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
> > > +       void **gop_handle = NULL;
> > > +       struct screen_info *si = NULL;
> > > +
> > > +       size = 0;
> > > +       status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL,
> > > +                               &gop_proto, NULL, &size, gop_handle);
> > > +       if (status == EFI_BUFFER_TOO_SMALL) {
> > > +               si = alloc_screen_info();
> > > +               efi_setup_gop(si, &gop_proto, size);
> > > +       }
> > > +}
> > > +
> > > +struct boot_params *bootparams_init(efi_system_table_t *sys_table)
> > > +{
> > > +       efi_status_t status;
> > > +       struct boot_params *p;
> > > +       unsigned char sig[8] = {'B', 'P', 'I', '0', '1', '0', '0', '2'};
> > > +
> > > +       status = efi_bs_call(allocate_pool, EFI_RUNTIME_SERVICES_DATA, SZ_64K, (void **)&p);
> > > +       if (status != EFI_SUCCESS)
> > > +               return NULL;
> > > +
> > > +       memset(p, 0, SZ_64K);
> > > +       memcpy(&p->signature, sig, sizeof(long));
> > > +
> > > +       return p;
> > > +}
> > > +
> > > +static unsigned long convert_priv_cmdline(char *cmdline_ptr,
> > > +               unsigned long rd_addr, unsigned long rd_size)
> > > +{
> > > +       unsigned int rdprev_size;
> > > +       unsigned int cmdline_size;
> > > +       efi_status_t status;
> > > +       char *pstr, *substr;
> > > +       char *initrd_ptr = NULL;
> > > +       char convert_str[CMDLINE_MAX_SIZE];
> > > +       static char cmdline_array[CMDLINE_MAX_SIZE];
> > > +
> > > +       cmdline_size = strlen(cmdline_ptr);
> > > +       snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel ");
> > > +
> > > +       initrd_ptr = strstr(cmdline_ptr, "initrd=");
> > > +       if (!initrd_ptr) {
> > > +               snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel %s", cmdline_ptr);
> > > +               goto completed;
> > > +       }
> > > +       snprintf(convert_str, CMDLINE_MAX_SIZE, " initrd=0x%lx,0x%lx", rd_addr, rd_size);
> > > +       rdprev_size = cmdline_size - strlen(initrd_ptr);
> > > +       strncat(cmdline_array, cmdline_ptr, rdprev_size);
> > > +
> > > +       cmdline_ptr = strnstr(initrd_ptr, " ", CMDLINE_MAX_SIZE);
> > > +       strcat(cmdline_array, convert_str);
> > > +       if (!cmdline_ptr)
> > > +               goto completed;
> > > +
> > > +       strcat(cmdline_array, cmdline_ptr);
> > > +
> > > +completed:
> > > +       status = efi_allocate_pages((MAX_ARG_COUNT + 1) * (sizeof(char *)),
> > > +                                       (unsigned long *)&argv, ULONG_MAX);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("Alloc argv mmap_array error\n");
> > > +               return status;
> > > +       }
> > > +
> > > +       argc = 0;
> > > +       pstr = cmdline_array;
> > > +
> > > +       substr = strsep(&pstr, " \t");
> > > +       while (substr != NULL) {
> > > +               if (strlen(substr)) {
> > > +                       argv[argc++] = substr;
> > > +                       if (argc == MAX_ARG_COUNT) {
> > > +                               efi_err("Argv mmap_array full!\n");
> > > +                               break;
> > > +                       }
> > > +               }
> > > +               substr = strsep(&pstr, " \t");
> > > +       }
> > > +
> > > +       return EFI_SUCCESS;
> > > +}
> > > +
> > > +unsigned int efi_memmap_sort(struct loongsonlist_mem_map *memmap,
> > > +                       unsigned int index, unsigned int mem_type)
> > > +{
> > > +       unsigned int i, t;
> > > +       unsigned long msize;
> > > +
> > > +       for (i = 0; i < map_entry[mem_type]; i = t) {
> > > +               msize = mmap_array[mem_type][i].mem_size;
> > > +               for (t = i + 1; t < map_entry[mem_type]; t++) {
> > > +                       if (mmap_array[mem_type][i].mem_start + msize <
> > > +                                       mmap_array[mem_type][t].mem_start)
> > > +                               break;
> > > +
> > > +                       msize += mmap_array[mem_type][t].mem_size;
> > > +               }
> > > +               memmap->map[index].mem_type = mem_type;
> > > +               memmap->map[index].mem_start = mmap_array[mem_type][i].mem_start;
> > > +               memmap->map[index].mem_size = msize;
> > > +               memmap->map[index].attribute = mmap_array[mem_type][i].attribute;
> > > +               index++;
> > > +       }
> > > +
> > > +       return index;
> > > +}
> > > +
> > > +static efi_status_t mk_mmap(struct efi_boot_memmap *map, struct boot_params *p)
> > > +{
>
> Are you passing a different representation of the memory map to the
> core kernel? I think it would be easier just to pass the EFI memory
> map like other EFI arches do, and reuse all of the code that we
> already have.
Yes, this different representation is used by our "boot_params", the
interface between the bootloader (including efistub) and the core kernel.
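
Roughly, the conversion groups EFI descriptors by type, merges adjacent
ranges, and then stamps the table with a one-byte checksum. A minimal sketch
of those two steps in C is below. The struct region type and both function
names are hypothetical simplifications, not the real boot_params layout; the
input is assumed to be sorted by start address, and the merge tracks the
furthest end address rather than summing sizes as efi_memmap_sort() does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory range (not the real boot_params entry). */
struct region {
	uint64_t start;
	uint64_t size;
};

/*
 * Merge each entry with the following entries that touch or overlap it.
 * "in" must be sorted by start; "out" must have room for n entries.
 * Returns the number of merged entries written to "out".
 */
static size_t coalesce_regions(const struct region *in, size_t n,
			       struct region *out)
{
	size_t i, t, count = 0;

	for (i = 0; i < n; i = t) {
		uint64_t end = in[i].start + in[i].size;

		for (t = i + 1; t < n; t++) {
			if (end < in[t].start)
				break;	/* gap: start a new output entry */
			if (in[t].start + in[t].size > end)
				end = in[t].start + in[t].size;
		}
		out[count].start = in[i].start;
		out[count].size = end - in[i].start;
		count++;
	}
	return count;
}

/*
 * Byte-sum checksum: the returned value makes the sum of all bytes,
 * including the checksum itself, equal to zero modulo 256.
 */
static uint8_t checksum8(const uint8_t *buf, size_t len)
{
	uint8_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum = (uint8_t)(sum + buf[i]);

	return (uint8_t)(0x100 - sum);
}
```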
>
> > > +       char checksum;
> > > +       unsigned int i;
> > > +       unsigned int nr_desc;
> > > +       unsigned int mem_type;
> > > +       unsigned long count;
> > > +       efi_memory_desc_t *mem_desc;
> > > +       struct loongsonlist_mem_map *mhp = NULL;
> > > +
> > > +       memset(map_entry, 0, sizeof(map_entry));
> > > +       memset(mmap_array, 0, sizeof(mmap_array));
> > > +
> > > +       if (!strncmp((char *)p, "BPI", 3)) {
> > > +               p->flags |= BPI_FLAGS_UEFI_SUPPORTED;
> > > +               p->systemtable = (efi_system_table_t *)efi_system_table;
> > > +               p->extlist_offset = sizeof(*p) + sizeof(unsigned long);
> > > +               mhp = (struct loongsonlist_mem_map *)((char *)p + p->extlist_offset);
> > > +
> > > +               memcpy(&mhp->header.signature, "MEM", sizeof(unsigned long));
> > > +               mhp->header.length = sizeof(*mhp);
> > > +               mhp->desc_version = *map->desc_ver;
> > > +               mhp->map_count = 0;
> > > +       }
> > > +       if (!(*(map->map_size)) || !(*(map->desc_size)) || !mhp) {
> > > +               efi_err("get memory info error\n");
> > > +               return EFI_INVALID_PARAMETER;
> > > +       }
> > > +       nr_desc = *(map->map_size) / *(map->desc_size);
> > > +
> > > +       /*
> > > +        * According to UEFI SPEC, mmap_buf is the accurate Memory Map
> > > +        * mmap_array now we can fill platform specific memory structure.
> > > +        */
> > > +       for (i = 0; i < nr_desc; i++) {
> > > +               mem_desc = (efi_memory_desc_t *)((void *)(*map->map) + (i * (*(map->desc_size))));
> > > +               switch (mem_desc->type) {
> > > +               case EFI_RESERVED_TYPE:
> > > +               case EFI_RUNTIME_SERVICES_CODE:
> > > +               case EFI_RUNTIME_SERVICES_DATA:
> > > +               case EFI_MEMORY_MAPPED_IO:
> > > +               case EFI_MEMORY_MAPPED_IO_PORT_SPACE:
> > > +               case EFI_UNUSABLE_MEMORY:
> > > +               case EFI_PAL_CODE:
> > > +                       mem_type = ADDRESS_TYPE_RESERVED;
> > > +                       break;
> > > +
> > > +               case EFI_ACPI_MEMORY_NVS:
> > > +                       mem_type = ADDRESS_TYPE_NVS;
> > > +                       break;
> > > +
> > > +               case EFI_ACPI_RECLAIM_MEMORY:
> > > +                       mem_type = ADDRESS_TYPE_ACPI;
> > > +                       break;
> > > +
> > > +               case EFI_LOADER_CODE:
> > > +               case EFI_LOADER_DATA:
> > > +               case EFI_PERSISTENT_MEMORY:
> > > +               case EFI_BOOT_SERVICES_CODE:
> > > +               case EFI_BOOT_SERVICES_DATA:
> > > +               case EFI_CONVENTIONAL_MEMORY:
> > > +                       mem_type = ADDRESS_TYPE_SYSRAM;
> > > +                       break;
> > > +
> > > +               default:
> > > +                       continue;
> > > +               }
> > > +
> > > +               mmap_array[mem_type][map_entry[mem_type]].mem_type = mem_type;
> > > +               mmap_array[mem_type][map_entry[mem_type]].mem_start =
> > > +                                               mem_desc->phys_addr & TO_PHYS_MASK;
> > > +               mmap_array[mem_type][map_entry[mem_type]].mem_size =
> > > +                                               mem_desc->num_pages << EFI_PAGE_SHIFT;
> > > +               mmap_array[mem_type][map_entry[mem_type]].attribute =
> > > +                                               mem_desc->attribute;
> > > +               map_entry[mem_type]++;
> > > +       }
> > > +
> > > +       count = mhp->map_count;
> > > +       /* Sort EFI memmap and add to BPI for kernel */
> > > +       for (i = 0; i < LOONGSON3_BOOT_MEM_MAP_MAX; i++) {
> > > +               if (!map_entry[i])
> > > +                       continue;
> > > +               count = efi_memmap_sort(mhp, count, i);
> > > +       }
> > > +
> > > +       mhp->map_count = count;
> > > +       mhp->header.checksum = 0;
> > > +
> > > +       checksum = efi_crc8((char *)mhp, mhp->header.length);
> > > +       mhp->header.checksum = checksum;
> > > +
> > > +       return EFI_SUCCESS;
> > > +}
> > > +
> > > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> > > +{
> > > +       efi_status_t status;
> > > +       struct exit_boot_struct *p = priv;
> > > +
> > > +       status = mk_mmap(map, p->bp);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("Make kernel memory map failed!\n");
> > > +               return status;
> > > +       }
> > > +
> > > +       return EFI_SUCCESS;
> > > +}
> > > +
> > > +static efi_status_t exit_boot_services(struct boot_params *boot_params, void *handle)
> > > +{
> > > +       unsigned int desc_version;
> > > +       unsigned int runtime_entry_count = 0;
> > > +       unsigned long map_size, key, desc_size, buff_size;
> > > +       efi_status_t status;
> > > +       efi_memory_desc_t *mem_map;
> > > +       struct efi_boot_memmap map;
> > > +       struct exit_boot_struct priv;
> > > +
> > > +       map.map                 = &mem_map;
> > > +       map.map_size            = &map_size;
> > > +       map.desc_size           = &desc_size;
> > > +       map.desc_ver            = &desc_version;
> > > +       map.key_ptr             = &key;
> > > +       map.buff_size           = &buff_size;
> > > +       status = efi_get_memory_map(&map);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("Unable to retrieve UEFI memory map.\n");
> > > +               return status;
> > > +       }
> > > +
> > > +       priv.bp = boot_params;
> > > +       priv.runtime_entry_count = &runtime_entry_count;
> > > +
> > > +       /* Might as well exit boot services now */
> > > +       status = efi_exit_boot_services(handle, &map, &priv, exit_boot_func);
> > > +       if (status != EFI_SUCCESS)
> > > +               return status;
> > > +
> > > +       return EFI_SUCCESS;
> > > +}
> > > +
> > > +/*
> > > + * EFI entry point for the LoongArch EFI stub.
> > > + */
> > > +efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table)
>
> Why are you not using the generic EFI stub boot flow?
Hmmm, as far as I know, we defined our own "boot_params" three years
ago: an interface between the bootloader (including efistub) and the
core kernel that passes memmap, cmdline and initrd information. This
method looks like the X86 way, and differs from the generic stub (which
was called the ARM stub before 5.8). Over these years, many products
have already used the "boot_params" interface (including UEFI, PMON,
Grub, Kernel, etc., though most of them haven't been upstreamed).
Replacing boot_params with FDT (i.e., the generic stub way) is difficult
for us, because it would be a big break in compatibility.

Huacai
>
> > > +{
> > > +       unsigned int cmdline_size = 0;
> > > +       unsigned long kernel_addr = 0;
> > > +       unsigned long initrd_addr = 0;
> > > +       unsigned long initrd_size = 0;
> > > +       enum efi_secureboot_mode secure_boot;
> > > +       char *cmdline_ptr = NULL;
> > > +       struct boot_params *boot_p;
> > > +       efi_status_t status;
> > > +       efi_loaded_image_t *image;
> > > +       efi_guid_t loaded_image_proto;
> > > +       kernel_entry_t real_kernel_entry;
> > > +
> > > +       /* Config Direct Mapping */
> > > +       csr_writeq(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
> > > +       csr_writeq(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
> > > +
>
> Why is this needed? Doesn't the EFI firmware enter the EFI loader with
> this mapping enabled?
>
> > > +       efi_system_table = sys_table;
> > > +       loaded_image_proto = LOADED_IMAGE_PROTOCOL_GUID;
> > > +       kernel_addr = (unsigned long)&kernel_offset - kernel_offset;
> > > +       real_kernel_entry = (kernel_entry_t)
> > > +               ((unsigned long)&kernel_entry - kernel_addr + kernel_vaddr);
> > > +
> > > +       /* Check if we were booted by the EFI firmware */
> > > +       if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
> > > +               goto fail;
> > > +
> > > +       /*
> > > +        * Get a handle to the loaded image protocol.  This is used to get
> > > +        * information about the running image, such as size and the command
> > > +        * line.
> > > +        */
> > > +       status = sys_table->boottime->handle_protocol(handle,
> > > +                                       &loaded_image_proto, (void *)&image);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("Failed to get loaded image protocol\n");
> > > +               goto fail;
> > > +       }
> > > +
> > > +       /* Get the command line from EFI, using the LOADED_IMAGE protocol. */
> > > +       cmdline_ptr = efi_convert_cmdline(image, &cmdline_size);
> > > +       if (!cmdline_ptr) {
> > > +               efi_err("Getting command line failed!\n");
> > > +               goto fail_free_cmdline;
> > > +       }
> > > +
> > > +#ifdef CONFIG_CMDLINE_BOOL
> > > +       if (cmdline_size == 0)
> > > +               efi_parse_options(CONFIG_CMDLINE);
> > > +#endif
> > > +       if (!IS_ENABLED(CONFIG_CMDLINE_OVERRIDE) && cmdline_size > 0)
> > > +               efi_parse_options(cmdline_ptr);
> > > +
> > > +       efi_info("Booting Linux Kernel...\n");
> > > +
> > > +       efi_relocate_kernel(&kernel_addr, kernel_fsize, kernel_asize,
> > > +                           PHYSADDR(kernel_vaddr), SZ_2M, PHYSADDR(kernel_vaddr));
> > > +
> > > +       setup_graphics();
> > > +       secure_boot = efi_get_secureboot();
> > > +       efi_enable_reset_attack_mitigation();
> > > +
> > > +       status = efi_load_initrd(image, &initrd_addr, &initrd_size, SZ_4G, ULONG_MAX);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("Failed to get initrd addr!\n");
> > > +               goto fail_free;
> > > +       }
> > > +
> > > +       status = convert_priv_cmdline(cmdline_ptr, initrd_addr, initrd_size);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("Convert cmdline failed!\n");
> > > +               goto fail_free;
> > > +       }
> > > +
> > > +       boot_p = bootparams_init(sys_table);
> > > +       if (!boot_p) {
> > > +               efi_err("Create BPI struct error!\n");
> > > +               goto fail;
> > > +       }
> > > +
> > > +       status = exit_boot_services(boot_p, handle);
> > > +       if (status != EFI_SUCCESS) {
> > > +               efi_err("exit_boot services failed!\n");
> > > +               goto fail_free;
> > > +       }
> > > +
> > > +       real_kernel_entry(argc, argv, boot_p);
> > > +
> > > +       return EFI_SUCCESS;
> > > +
> > > +fail_free:
> > > +       efi_free(initrd_size, initrd_addr);
> > > +
> > > +fail_free_cmdline:
> > > +       efi_free(cmdline_size, (unsigned long)cmdline_ptr);
> > > +
> > > +fail:
> > > +       return status;
> > > +}
> > > diff --git a/include/linux/pe.h b/include/linux/pe.h
> > > index daf09ffffe38..f4bb0b6a416d 100644
> > > --- a/include/linux/pe.h
> > > +++ b/include/linux/pe.h
> > > @@ -65,6 +65,7 @@
> > >  #define        IMAGE_FILE_MACHINE_SH5          0x01a8
> > >  #define        IMAGE_FILE_MACHINE_THUMB        0x01c2
> > >  #define        IMAGE_FILE_MACHINE_WCEMIPSV2    0x0169
> > > +#define        IMAGE_FILE_MACHINE_LOONGARCH    0x6264
> > >
> > >  /* flags */
> > >  #define IMAGE_FILE_RELOCS_STRIPPED           0x0001
> > > --
> > > 2.27.0
> > >

* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-05-05  9:59       ` Huacai Chen
@ 2022-05-06  8:14         ` Ard Biesheuvel
  2022-05-06 11:26           ` WANG Xuerui
  0 siblings, 1 reply; 94+ messages in thread
From: Ard Biesheuvel @ 2022-05-06  8:14 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang

On Thu, 5 May 2022 at 11:59, Huacai Chen <chenhuacai@gmail.com> wrote:
>
> Hi, Ard,
>
> On Tue, May 3, 2022 at 3:24 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Sat, 30 Apr 2022 at 11:56, Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > > >
> > > > This patch adds efistub booting support, which is the standard UEFI boot
> > > > protocol for us to use.
> > > >
> > > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > >
> > > It's good to see that you completed this. Unfortunately you did not add Ard
> > > Biesheuvel to Cc, he is the one who needs to review this code. Adding him
> > > to Cc now, with the full patch quoted below for him (no more comments
> > > from me there).
> > >
> >
> > Thanks Arnd,
> >
> > >
> > > > ---
> > > >  arch/loongarch/Kbuild                         |   3 +
> > > >  arch/loongarch/Kconfig                        |   8 +
> > > >  arch/loongarch/Makefile                       |  18 +-
> > > >  arch/loongarch/boot/Makefile                  |  23 +
> > > >  arch/loongarch/kernel/efi-header.S            | 100 +++++
> > > >  arch/loongarch/kernel/head.S                  |  44 +-
> > > >  arch/loongarch/kernel/image-vars.h            |  30 ++
> > > >  arch/loongarch/kernel/vmlinux.lds.S           |  23 +-
> > > >  drivers/firmware/efi/Kconfig                  |   4 +-
> > > >  drivers/firmware/efi/libstub/Makefile         |  14 +-
> > > >  drivers/firmware/efi/libstub/loongarch-stub.c | 425 ++++++++++++++++++
> > > >  include/linux/pe.h                            |   1 +
> > > >  12 files changed, 680 insertions(+), 13 deletions(-)
> > > >  create mode 100644 arch/loongarch/boot/Makefile
> > > >  create mode 100644 arch/loongarch/kernel/efi-header.S
> > > >  create mode 100644 arch/loongarch/kernel/image-vars.h
> > > >  create mode 100644 drivers/firmware/efi/libstub/loongarch-stub.c
> > > >
...
> > > > diff --git a/arch/loongarch/kernel/efi-header.S b/arch/loongarch/kernel/efi-header.S
> > > > new file mode 100644
> > > > index 000000000000..ceb44524944a
> > > > --- /dev/null
> > > > +++ b/arch/loongarch/kernel/efi-header.S
> > > > @@ -0,0 +1,100 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > +/*
> > > > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > > > + */
> > > > +
> > > > +#include <linux/pe.h>
> > > > +#include <linux/sizes.h>
> > > > +
> > > > +       .macro  __EFI_PE_HEADER
> > > > +       .long   PE_MAGIC
> > > > +coff_header:
> >
> > Please use .L prefixed local symbol definitions in this file, so we
> > don't clutter up the core kernel's global symbol table.
> > I found that ARM64 uses the .L prefix while RISCV doesn't, so I
> > suppose that both are OK?
>

No, please change this.

> >
> > > > +       .short  IMAGE_FILE_MACHINE_LOONGARCH            /* Machine */
> > > > +       .short  section_count                           /* NumberOfSections */
> > > > +       .long   0                                       /* TimeDateStamp */
> > > > +       .long   0                                       /* PointerToSymbolTable */
> > > > +       .long   0                                       /* NumberOfSymbols */
> > > > +       .short  section_table - optional_header         /* SizeOfOptionalHeader */
> > > > +       .short  IMAGE_FILE_DEBUG_STRIPPED | \
> > > > +               IMAGE_FILE_EXECUTABLE_IMAGE | \
> > > > +               IMAGE_FILE_LINE_NUMS_STRIPPED           /* Characteristics */
> > > > +
> > > > +optional_header:
> > > > +       .short  PE_OPT_MAGIC_PE32PLUS                   /* PE32+ format */
> > > > +       .byte   0x02                                    /* MajorLinkerVersion */
> > > > +       .byte   0x14                                    /* MinorLinkerVersion */
> > > > +       .long   __inittext_end - efi_header_end         /* SizeOfCode */
> > > > +       .long   _end - __initdata_begin                 /* SizeOfInitializedData */
> > > > +       .long   0                                       /* SizeOfUninitializedData */
> > > > +       .long   __efistub_efi_pe_entry - _head          /* AddressOfEntryPoint */
> > > > +       .long   efi_header_end - _head                  /* BaseOfCode */
> > > > +
> > > > +extra_header_fields:
> > > > +       .quad   0                                       /* ImageBase */
> > > > +       .long   PECOFF_SEGMENT_ALIGN                    /* SectionAlignment */
> > > > +       .long   PECOFF_FILE_ALIGN                       /* FileAlignment */
> > > > +       .short  0                                       /* MajorOperatingSystemVersion */
> > > > +       .short  0                                       /* MinorOperatingSystemVersion */
> > > > +       .short  0                                       /* MajorImageVersion */
> > > > +       .short  0                                       /* MinorImageVersion */
> >
> > Once you enable EFI_GENERIC_STUB, set the above fields to
> > EFISTUB_MAJOR_IMAGE_VERSION/EFISTUB_MINOR_IMAGE_VERSION, so
> > bootloaders know they can use the LoadFile2 based initrd loader.
> OK, versions will be filled.
>
> >
> > > > +       .short  0                                       /* MajorSubsystemVersion */
> > > > +       .short  0                                       /* MinorSubsystemVersion */
> > > > +       .long   0                                       /* Win32VersionValue */
> > > > +
> > > > +       .long   _end - _head                            /* SizeOfImage */
> > > > +
> > > > +       /* Everything before the kernel image is considered part of the header */
> > > > +       .long   efi_header_end - _head                  /* SizeOfHeaders */
> > > > +       .long   0                                       /* CheckSum */
> > > > +       .short  IMAGE_SUBSYSTEM_EFI_APPLICATION         /* Subsystem */
> > > > +       .short  0                                       /* DllCharacteristics */
> > > > +       .quad   0                                       /* SizeOfStackReserve */
> > > > +       .quad   0                                       /* SizeOfStackCommit */
> > > > +       .quad   0                                       /* SizeOfHeapReserve */
> > > > +       .quad   0                                       /* SizeOfHeapCommit */
> > > > +       .long   0                                       /* LoaderFlags */
> > > > +       .long   (section_table - .) / 8                 /* NumberOfRvaAndSizes */
> > > > +
> > > > +       .quad   0                                       /* ExportTable */
> > > > +       .quad   0                                       /* ImportTable */
> > > > +       .quad   0                                       /* ResourceTable */
> > > > +       .quad   0                                       /* ExceptionTable */
> > > > +       .quad   0                                       /* CertificationTable */
> > > > +       .quad   0                                       /* BaseRelocationTable */
> > > > +
> > > > +       /* Section table */
> > > > +section_table:
> > > > +       .ascii  ".text\0\0\0"
> > > > +       .long   __inittext_end - efi_header_end         /* VirtualSize */
> > > > +       .long   efi_header_end - _head                  /* VirtualAddress */
> > > > +       .long   __inittext_end - efi_header_end         /* SizeOfRawData */
> > > > +       .long   efi_header_end - _head                  /* PointerToRawData */
> > > > +
> > > > +       .long   0                                       /* PointerToRelocations */
> > > > +       .long   0                                       /* PointerToLineNumbers */
> > > > +       .short  0                                       /* NumberOfRelocations */
> > > > +       .short  0                                       /* NumberOfLineNumbers */
> > > > +       .long   IMAGE_SCN_CNT_CODE | \
> > > > +               IMAGE_SCN_MEM_READ | \
> > > > +               IMAGE_SCN_MEM_EXECUTE                   /* Characteristics */
> > > > +
> > > > +       .ascii  ".data\0\0\0"
> > > > +       .long   _end - __initdata_begin                 /* VirtualSize */
> > > > +       .long   __initdata_begin - _head                /* VirtualAddress */
> > > > +       .long   _edata - __initdata_begin               /* SizeOfRawData */
> > > > +       .long   __initdata_begin - _head                /* PointerToRawData */
> > > > +
> > > > +       .long   0                                       /* PointerToRelocations */
> > > > +       .long   0                                       /* PointerToLineNumbers */
> > > > +       .short  0                                       /* NumberOfRelocations */
> > > > +       .short  0                                       /* NumberOfLineNumbers */
> > > > +       .long   IMAGE_SCN_CNT_INITIALIZED_DATA | \
> > > > +               IMAGE_SCN_MEM_READ | \
> > > > +               IMAGE_SCN_MEM_WRITE                     /* Characteristics */
> > > > +
> > > > +       .org 0x20e
> > > > +       .word kernel_version - 512 -  _head
> > > > +
> > > > +       .set    section_count, (. - section_table) / 40
> > > > +efi_header_end:
> > > > +       .endm
> > > > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> > > > index b4a0b28da3e7..361b72e8bfc5 100644
> > > > --- a/arch/loongarch/kernel/head.S
> > > > +++ b/arch/loongarch/kernel/head.S
> > > > @@ -11,11 +11,53 @@
> > > >  #include <asm/regdef.h>
> > > >  #include <asm/loongarch.h>
> > > >  #include <asm/stackframe.h>
> > > > +#include <generated/compile.h>
> > > > +#include <generated/utsrelease.h>
> > > >
> > > > -SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > > > +#ifdef CONFIG_EFI_STUB
> > > > +
> > > > +#include "efi-header.S"
> > > > +
> > > > +       __HEAD
> > > > +
> > > > +_head:
> > > > +       /* "MZ", MS-DOS header */
> > > > +       .word   MZ_MAGIC
> > > > +       .org    0x28
> > > > +       .ascii  "Loongson\0"
> >
> > Is this part of a special boot protocol? It would be better not to
> > overload EFI and PE/COFF with your own hacks if we can avoid it.
> This is used as a magic string and Grub will check it.
>

Why? I don't think it is a good idea to have hacks like this, because
it means that
a) the kernel is no longer a standard PE/COFF image, and so you need a
special loader to load it;
b) the loader has to parse the file manually before loading it, which
is problematic when using EFI device paths that, e.g., evaluate to an
HTTP boot target or something like that.
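
For illustration, here is a minimal sketch (not taken from the patch or
from any real loader) of the only checks a standard PE/COFF loader needs
before it accepts an image: the "MZ" magic at offset 0, the PE header
offset stored at 0x3c, and the "PE\0\0" signature. The helper names
(rd16, rd32, pe_machine) are hypothetical; the 0x6264 machine value is
the one this patch adds to pe.h. Nothing here looks at the string at
offset 0x28.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MZ_MAGIC                     0x5a4d      /* "MZ", little-endian */
#define PE_MAGIC                     0x00004550  /* "PE\0\0", little-endian */
#define PE_OFFSET_FIELD              0x3c        /* where the PE header offset lives */
#define IMAGE_FILE_MACHINE_LOONGARCH 0x6264      /* value added by this patch */

/* Hypothetical helpers: read little-endian fields without alignment issues. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical check: return the COFF Machine field of a PE image,
 * or -1 if the buffer does not start with a valid MZ/PE header.
 */
int pe_machine(const uint8_t *img, size_t len)
{
	uint32_t pe_off;

	if (len < PE_OFFSET_FIELD + 4)
		return -1;
	if (rd16(img) != MZ_MAGIC)
		return -1;

	pe_off = rd32(img + PE_OFFSET_FIELD);
	/* Need the 4-byte signature plus the 2-byte Machine field. */
	if (pe_off > len || len - pe_off < 6)
		return -1;
	if (rd32(img + pe_off) != PE_MAGIC)
		return -1;

	return rd16(img + pe_off + 4);
}
```

A loader along these lines accepts the image whether or not a magic
string is present at 0x28, which is the point of keeping the header
standard.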



> >
> > > > +       .org    0x3c
> > > > +       /* Offset to the PE header */
> > > > +       .long   pe_header - _head
> > > > +
> > > > +pe_header:
> > > > +       __EFI_PE_HEADER
> > > > +
> > > > +kernel_asize:
> > > > +       .long _end - _text
> > > > +
> > > > +kernel_fsize:
> > > > +       .long _edata - _text
> > > > +
> > > > +kernel_vaddr:
> > > > +       .quad VMLINUX_LOAD_ADDRESS
> > > > +
> > > > +kernel_offset:
> > > > +       .long kernel_offset - _text
> > > > +
> > > > +kernel_version:
> > > > +       .ascii  UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") " UTS_VERSION "\0"
> > > > +
> > > > +SYM_L_GLOBAL(kernel_asize)
> > > > +SYM_L_GLOBAL(kernel_fsize)
> > > > +SYM_L_GLOBAL(kernel_vaddr)
> > > > +SYM_L_GLOBAL(kernel_offset)
> >
> > I think you can simplify this to
> >
> > SYM_DATA(kernel_asize, .long _end - _text);
> >
> > etc etc (which implies the .globl annotation)
> OK, thanks.
>
> >
> >
> > > > +
> > > > +#endif
> > > >
> > > >         __REF
> > > >
> > > > +SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
> > > > +
> > > >  SYM_CODE_START(kernel_entry)                   # kernel entry point
> > > >
> > > >         /* Config direct window and set PG */
> > > > diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
> > > > new file mode 100644
> > > > index 000000000000..0162402b6212
> > > > --- /dev/null
> > > > +++ b/arch/loongarch/kernel/image-vars.h
> > > > @@ -0,0 +1,30 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > +/*
> > > > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > > > + */
> > > > +#ifndef __LOONGARCH_KERNEL_IMAGE_VARS_H
> > > > +#define __LOONGARCH_KERNEL_IMAGE_VARS_H
> > > > +
> > > > +#ifdef CONFIG_EFI_STUB
> > > > +
> > > > +__efistub_memcmp               = memcmp;
> > > > +__efistub_memcpy               = memcpy;
> > > > +__efistub_memmove              = memmove;
> > > > +__efistub_memset               = memset;
> > > > +__efistub_strcat               = strcat;
> > > > +__efistub_strcmp               = strcmp;
> > > > +__efistub_strlen               = strlen;
> > > > +__efistub_strncat              = strncat;
> > > > +__efistub_strnstr              = strnstr;
> > > > +__efistub_strnlen              = strnlen;
> > > > +__efistub_strpbrk              = strpbrk;
> > > > +__efistub_strsep               = strsep;
> > > > +__efistub_kernel_entry         = kernel_entry;
> > > > +__efistub_kernel_asize         = kernel_asize;
> > > > +__efistub_kernel_fsize         = kernel_fsize;
> > > > +__efistub_kernel_vaddr         = kernel_vaddr;
> > > > +__efistub_kernel_offset                = kernel_offset;
> > > > +
> > > > +#endif
> > > > +
> > > > +#endif /* __LOONGARCH_KERNEL_IMAGE_VARS_H */
> > > > diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> > > > index 02abfaaa4892..7da4c4d7c50d 100644
> > > > --- a/arch/loongarch/kernel/vmlinux.lds.S
> > > > +++ b/arch/loongarch/kernel/vmlinux.lds.S
> > > > @@ -12,6 +12,14 @@
> > > >  #define BSS_FIRST_SECTIONS *(.bss..swapper_pg_dir)
> > > >
> > > >  #include <asm-generic/vmlinux.lds.h>
> > > > +#include "image-vars.h"
> > > > +
> > > > +/*
> > > > + * Max available Page Size is 64K, so we set SectionAlignment
> > > > + * field of EFI application to 64K.
> > > > + */
> > > > +PECOFF_FILE_ALIGN = 0x200;
> > > > +PECOFF_SEGMENT_ALIGN = 0x10000;
> > > >
> > > >  OUTPUT_ARCH(loongarch)
> > > >  ENTRY(kernel_entry)
> > > > @@ -27,6 +35,9 @@ SECTIONS
> > > >         . = VMLINUX_LOAD_ADDRESS;
> > > >
> > > >         _text = .;
> > > > +       HEAD_TEXT_SECTION
> > > > +
> > > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > > >         .text : {
> > > >                 TEXT_TEXT
> > > >                 SCHED_TEXT
> > > > @@ -38,11 +49,12 @@ SECTIONS
> > > >                 *(.fixup)
> > > >                 *(.gnu.warning)
> > > >         } :text = 0
> > > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > > >         _etext = .;
> > > >
> > > >         EXCEPTION_TABLE(16)
> > > >
> > > > -       . = ALIGN(PAGE_SIZE);
> > > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > > >         __init_begin = .;
> > > >         __inittext_begin = .;
> > > >
> > > > @@ -51,6 +63,7 @@ SECTIONS
> > > >                 EXIT_TEXT
> > > >         }
> > > >
> > > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > > >         __inittext_end = .;
> > > >
> > > >         __initdata_begin = .;
> > > > @@ -60,6 +73,10 @@ SECTIONS
> > > >                 EXIT_DATA
> > > >         }
> > > >
> > > > +       .init.bss : {
> > > > +               *(.init.bss)
> > > > +       }
> > > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > > >         __initdata_end = .;
> > > >
> > > >         __init_end = .;
> > > > @@ -71,11 +88,11 @@ SECTIONS
> > > >         .sdata : {
> > > >                 *(.sdata)
> > > >         }
> > > > -
> > > > -       . = ALIGN(SZ_64K);
> > > > +       .edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGN); }
> > > >         _edata =  .;
> > > >
> > > >         BSS_SECTION(0, SZ_64K, 8)
> > > > +       . = ALIGN(PECOFF_SEGMENT_ALIGN);
> > > >
> > > >         _end = .;
> > > >
> > > > diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
> > > > index 2c3dac5ecb36..ecb4e0b1295a 100644
> > > > --- a/drivers/firmware/efi/Kconfig
> > > > +++ b/drivers/firmware/efi/Kconfig
> > > > @@ -121,9 +121,9 @@ config EFI_ARMSTUB_DTB_LOADER
> > > >
> > > >  config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
> > > >         bool "Enable the command line initrd loader" if !X86
> > > > -       depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
> > > > -       default y if X86
> > > >         depends on !RISCV
> > > > +       depends on EFI_STUB && (EFI_GENERIC_STUB || X86 || LOONGARCH)
> > > > +       default y if (X86 || LOONGARCH)
> >
> > Don't enable the command line initrd loader please. It is deprecated,
> > and has been replaced with the LoadFile2 protocol based one, which is
> > more flexible.
> >
> > Uboot already implements it, as well as EDK2. GRUB does not implement
> > this yet afair, but it should not be that hard to add.
> If we don't select this, is it possible to load initrd in the UEFI shell?
>

Yes, if you build the shell to include the 'initrd' command.

Please refer to the following EDK2 module:
OvmfPkg/LinuxInitrdDynamicShellCommand/LinuxInitrdDynamicShellCommand.inf

The initrd= command line parameter only permits initrd images that are
in the same file system as the one the kernel was loaded from, which
is overly restrictive, and doesn't work at all if the Image was not
loaded from a file system to begin with.

> >
> > > >         help
> > > >           Select this config option to add support for the initrd= command
> > > >           line parameter, allowing an initrd that resides on the same volume
> > > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > > > index d0537573501e..663e9d317299 100644
> > > > --- a/drivers/firmware/efi/libstub/Makefile
> > > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > > @@ -26,6 +26,8 @@ cflags-$(CONFIG_ARM)          := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > >                                    $(call cc-option,-mno-single-pic-base)
> > > >  cflags-$(CONFIG_RISCV)         := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > >                                    -fpic
> > > > +cflags-$(CONFIG_LOONGARCH)     := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > > +                                  -fpic
> > > >
> > > >  cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > > >
> > > > @@ -55,7 +57,7 @@ KCOV_INSTRUMENT                       := n
> > > >  lib-y                          := efi-stub-helper.o gop.o secureboot.o tpm.o \
> > > >                                    file.o mem.o random.o randomalloc.o pci.o \
> > > >                                    skip_spaces.o lib-cmdline.o lib-ctype.o \
> > > > -                                  alignedmem.o relocate.o vsprintf.o
> > > > +                                  alignedmem.o relocate.o string.o vsprintf.o
> > > >
> > > >  # include the stub's generic dependencies from lib/ when building for ARM/arm64
> > > >  efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > > > @@ -63,13 +65,15 @@ efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
> > > >  $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
> > > >         $(call if_changed_rule,cc_o_c)
> > > >
> > > > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o \
> > > > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o \
> > > >                                    $(patsubst %.c,lib-%.o,$(efi-deps-y))
> > > >
> > > >  lib-$(CONFIG_ARM)              += arm32-stub.o
> > > >  lib-$(CONFIG_ARM64)            += arm64-stub.o
> > > >  lib-$(CONFIG_X86)              += x86-stub.o
> > > >  lib-$(CONFIG_RISCV)            += riscv-stub.o
> > > > +lib-$(CONFIG_LOONGARCH)                += loongarch-stub.o
> > > > +
> > > >  CFLAGS_arm32-stub.o            := -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > >
> > > >  # Even when -mbranch-protection=none is set, Clang will generate a
> > > > @@ -125,6 +129,12 @@ STUBCOPY_FLAGS-$(CONFIG_RISCV)     += --prefix-alloc-sections=.init \
> > > >                                    --prefix-symbols=__efistub_
> > > >  STUBCOPY_RELOC-$(CONFIG_RISCV) := R_RISCV_HI20
> > > >
> > > > +# For LoongArch, keep all the symbols in .init section and make sure that no
> > > > +# absolute symbol references exist.
> > > > +STUBCOPY_FLAGS-$(CONFIG_LOONGARCH)     += --prefix-alloc-sections=.init \
> > > > +                                          --prefix-symbols=__efistub_
> > > > +STUBCOPY_RELOC-$(CONFIG_LOONGARCH)     := R_LARCH_MARK_LA
> > > > +
> > > >  $(obj)/%.stub.o: $(obj)/%.o FORCE
> > > >         $(call if_changed,stubcopy)
> > > >
> > > > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c
> > > > new file mode 100644
> > > > index 000000000000..399641a0b0cb
> > > > --- /dev/null
> > > > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c
> > > > @@ -0,0 +1,425 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * Author: Yun Liu <liuyun@loongson.cn>
> > > > + *         Huacai Chen <chenhuacai@loongson.cn>
> > > > + * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
> > > > + */
> > > > +
> > > > +#include <linux/efi.h>
> > > > +#include <linux/sort.h>
> > > > +#include <asm/efi.h>
> > > > +#include <asm/addrspace.h>
> > > > +#include <asm/boot_param.h>
> > > > +#include "efistub.h"
> > > > +
> > > > +#define MAX_ARG_COUNT          128
> > > > +#define CMDLINE_MAX_SIZE       0x200
> > > > +
> > > > +static int argc;
> > > > +static char **argv;
> > > > +const efi_system_table_t *efi_system_table;
> > > > +static efi_guid_t screen_info_guid = LINUX_EFI_LARCH_SCREEN_INFO_TABLE_GUID;
> > > > +static unsigned int map_entry[LOONGSON3_BOOT_MEM_MAP_MAX];
> > > > +static struct efi_mmap mmap_array[EFI_MAX_MEMORY_TYPE][LOONGSON3_BOOT_MEM_MAP_MAX];
> > > > +
> > > > +struct exit_boot_struct {
> > > > +       struct boot_params *bp;
> > > > +       unsigned int *runtime_entry_count;
> > > > +};
> > > > +
> > > > +typedef void (*kernel_entry_t)(int argc, char *argv[], struct boot_params *boot_p);
> > > > +
> > > > +extern int kernel_asize;
> > > > +extern int kernel_fsize;
> > > > +extern int kernel_offset;
> > > > +extern unsigned long kernel_vaddr;
> > > > +extern kernel_entry_t kernel_entry;
> > > > +
> > > > +unsigned char efi_crc8(char *buff, int size)
> > > > +{
> > > > +       int sum, cnt;
> > > > +
> > > > +       for (sum = 0, cnt = 0; cnt < size; cnt++)
> > > > +               sum = (char) (sum + *(buff + cnt));
> > > > +
> > > > +       return (char)(0x100 - sum);
> > > > +}
> > > > +
> > > > +struct screen_info *alloc_screen_info(void)
> > > > +{
> > > > +       efi_status_t status;
> > > > +       struct screen_info *si;
> > > > +
> > > > +       status = efi_bs_call(allocate_pool,
> > > > +                       EFI_RUNTIME_SERVICES_DATA, sizeof(*si), (void **)&si);
> > > > +       if (status != EFI_SUCCESS)
> > > > +               return NULL;
> > > > +
> > > > +       status = efi_bs_call(install_configuration_table, &screen_info_guid, si);
> > > > +       if (status == EFI_SUCCESS)
> > > > +               return si;
> > > > +
> > > > +       efi_bs_call(free_pool, si);
> > > > +
> > > > +       return NULL;
> > > > +}
> > > > +
> > > > +static void setup_graphics(void)
> > > > +{
> > > > +       unsigned long size;
> > > > +       efi_status_t status;
> > > > +       efi_guid_t gop_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
> > > > +       void **gop_handle = NULL;
> > > > +       struct screen_info *si = NULL;
> > > > +
> > > > +       size = 0;
> > > > +       status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL,
> > > > +                               &gop_proto, NULL, &size, gop_handle);
> > > > +       if (status == EFI_BUFFER_TOO_SMALL) {
> > > > +               si = alloc_screen_info();
> > > > +               efi_setup_gop(si, &gop_proto, size);
> > > > +       }
> > > > +}
> > > > +
> > > > +struct boot_params *bootparams_init(efi_system_table_t *sys_table)
> > > > +{
> > > > +       efi_status_t status;
> > > > +       struct boot_params *p;
> > > > +       unsigned char sig[8] = {'B', 'P', 'I', '0', '1', '0', '0', '2'};
> > > > +
> > > > +       status = efi_bs_call(allocate_pool, EFI_RUNTIME_SERVICES_DATA, SZ_64K, (void **)&p);
> > > > +       if (status != EFI_SUCCESS)
> > > > +               return NULL;
> > > > +
> > > > +       memset(p, 0, SZ_64K);
> > > > +       memcpy(&p->signature, sig, sizeof(long));
> > > > +
> > > > +       return p;
> > > > +}
> > > > +
> > > > +static unsigned long convert_priv_cmdline(char *cmdline_ptr,
> > > > +               unsigned long rd_addr, unsigned long rd_size)
> > > > +{
> > > > +       unsigned int rdprev_size;
> > > > +       unsigned int cmdline_size;
> > > > +       efi_status_t status;
> > > > +       char *pstr, *substr;
> > > > +       char *initrd_ptr = NULL;
> > > > +       char convert_str[CMDLINE_MAX_SIZE];
> > > > +       static char cmdline_array[CMDLINE_MAX_SIZE];
> > > > +
> > > > +       cmdline_size = strlen(cmdline_ptr);
> > > > +       snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel ");
> > > > +
> > > > +       initrd_ptr = strstr(cmdline_ptr, "initrd=");
> > > > +       if (!initrd_ptr) {
> > > > +               snprintf(cmdline_array, CMDLINE_MAX_SIZE, "kernel %s", cmdline_ptr);
> > > > +               goto completed;
> > > > +       }
> > > > +       snprintf(convert_str, CMDLINE_MAX_SIZE, " initrd=0x%lx,0x%lx", rd_addr, rd_size);
> > > > +       rdprev_size = cmdline_size - strlen(initrd_ptr);
> > > > +       strncat(cmdline_array, cmdline_ptr, rdprev_size);
> > > > +
> > > > +       cmdline_ptr = strnstr(initrd_ptr, " ", CMDLINE_MAX_SIZE);
> > > > +       strcat(cmdline_array, convert_str);
> > > > +       if (!cmdline_ptr)
> > > > +               goto completed;
> > > > +
> > > > +       strcat(cmdline_array, cmdline_ptr);
> > > > +
> > > > +completed:
> > > > +       status = efi_allocate_pages((MAX_ARG_COUNT + 1) * (sizeof(char *)),
> > > > +                                       (unsigned long *)&argv, ULONG_MAX);
> > > > +       if (status != EFI_SUCCESS) {
> > > > +               efi_err("Alloc argv mmap_array error\n");
> > > > +               return status;
> > > > +       }
> > > > +
> > > > +       argc = 0;
> > > > +       pstr = cmdline_array;
> > > > +
> > > > +       substr = strsep(&pstr, " \t");
> > > > +       while (substr != NULL) {
> > > > +               if (strlen(substr)) {
> > > > +                       argv[argc++] = substr;
> > > > +                       if (argc == MAX_ARG_COUNT) {
> > > > +                               efi_err("Argv mmap_array full!\n");
> > > > +                               break;
> > > > +                       }
> > > > +               }
> > > > +               substr = strsep(&pstr, " \t");
> > > > +       }
> > > > +
> > > > +       return EFI_SUCCESS;
> > > > +}
> > > > +
> > > > +unsigned int efi_memmap_sort(struct loongsonlist_mem_map *memmap,
> > > > +                       unsigned int index, unsigned int mem_type)
> > > > +{
> > > > +       unsigned int i, t;
> > > > +       unsigned long msize;
> > > > +
> > > > +       for (i = 0; i < map_entry[mem_type]; i = t) {
> > > > +               msize = mmap_array[mem_type][i].mem_size;
> > > > +               for (t = i + 1; t < map_entry[mem_type]; t++) {
> > > > +                       if (mmap_array[mem_type][i].mem_start + msize <
> > > > +                                       mmap_array[mem_type][t].mem_start)
> > > > +                               break;
> > > > +
> > > > +                       msize += mmap_array[mem_type][t].mem_size;
> > > > +               }
> > > > +               memmap->map[index].mem_type = mem_type;
> > > > +               memmap->map[index].mem_start = mmap_array[mem_type][i].mem_start;
> > > > +               memmap->map[index].mem_size = msize;
> > > > +               memmap->map[index].attribute = mmap_array[mem_type][i].attribute;
> > > > +               index++;
> > > > +       }
> > > > +
> > > > +       return index;
> > > > +}
> > > > +
> > > > +static efi_status_t mk_mmap(struct efi_boot_memmap *map, struct boot_params *p)
> > > > +{
> >
> > Are you passing a different representation of the memory map to the
> > core kernel? I think it would be easier just to pass the EFI memory
> > map like other EFI arches do, and reuse all of the code that we
> > already have.
> Yes, this different representation is used by our "boot_params", the
> interface between bootloader (including efistub) and the core kernel.

So how does the core kernel consume the EFI memory map? Only through
this mechanism?
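
For readers following the question: the conversion done by the quoted mk_mmap()/efi_memmap_sort() pair amounts to bucketing EFI descriptors by type and then folding neighbouring ranges into one entry. Below is a minimal userspace sketch of the folding step only. The struct region type and the coalesce_regions() name are made up for illustration and are not part of the patch; the sketch assumes the input is already sorted by start address, which the quoted code also appears to rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one per-type entry of the quoted mmap_array. */
struct region {
	uint64_t start;	/* physical start address */
	uint64_t size;	/* length in bytes */
};

/*
 * Fold neighbouring regions of a single memory type into runs.
 * A region joins the current run when it starts at or before the end of
 * that run, mirroring the "start + msize < next.start" break condition in
 * the quoted efi_memmap_sort().  Sizes are simply added, as in the quoted
 * code, so overlapping input would be over-counted.
 * Returns the number of entries written to out.
 */
size_t coalesce_regions(const struct region *in, size_t n_in,
			struct region *out, size_t n_out)
{
	size_t i = 0, written = 0;

	assert(in != NULL && out != NULL);

	while (i < n_in && written < n_out) {
		struct region run = in[i++];

		while (i < n_in && in[i].start <= run.start + run.size)
			run.size += in[i++].size;

		out[written++] = run;
	}

	return written;
}
```

Passing the unmodified EFI memory map, as other EFI architectures do, would make a step like this unnecessary in the stub.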

> >
> > > > +       char checksum;
> > > > +       unsigned int i;
> > > > +       unsigned int nr_desc;
> > > > +       unsigned int mem_type;
> > > > +       unsigned long count;
> > > > +       efi_memory_desc_t *mem_desc;
> > > > +       struct loongsonlist_mem_map *mhp = NULL;
> > > > +
> > > > +       memset(map_entry, 0, sizeof(map_entry));
> > > > +       memset(mmap_array, 0, sizeof(mmap_array));
> > > > +
> > > > +       if (!strncmp((char *)p, "BPI", 3)) {
> > > > +               p->flags |= BPI_FLAGS_UEFI_SUPPORTED;
> > > > +               p->systemtable = (efi_system_table_t *)efi_system_table;
> > > > +               p->extlist_offset = sizeof(*p) + sizeof(unsigned long);
> > > > +               mhp = (struct loongsonlist_mem_map *)((char *)p + p->extlist_offset);
> > > > +
> > > > +               memcpy(&mhp->header.signature, "MEM", sizeof(unsigned long));
> > > > +               mhp->header.length = sizeof(*mhp);
> > > > +               mhp->desc_version = *map->desc_ver;
> > > > +               mhp->map_count = 0;
> > > > +       }
> > > > +       if (!(*(map->map_size)) || !(*(map->desc_size)) || !mhp) {
> > > > +               efi_err("get memory info error\n");
> > > > +               return EFI_INVALID_PARAMETER;
> > > > +       }
> > > > +       nr_desc = *(map->map_size) / *(map->desc_size);
> > > > +
> > > > +       /*
> > > > +        * According to UEFI SPEC, mmap_buf is the accurate Memory Map
> > > > +        * mmap_array now we can fill platform specific memory structure.
> > > > +        */
> > > > +       for (i = 0; i < nr_desc; i++) {
> > > > +               mem_desc = (efi_memory_desc_t *)((void *)(*map->map) + (i * (*(map->desc_size))));
> > > > +               switch (mem_desc->type) {
> > > > +               case EFI_RESERVED_TYPE:
> > > > +               case EFI_RUNTIME_SERVICES_CODE:
> > > > +               case EFI_RUNTIME_SERVICES_DATA:
> > > > +               case EFI_MEMORY_MAPPED_IO:
> > > > +               case EFI_MEMORY_MAPPED_IO_PORT_SPACE:
> > > > +               case EFI_UNUSABLE_MEMORY:
> > > > +               case EFI_PAL_CODE:
> > > > +                       mem_type = ADDRESS_TYPE_RESERVED;
> > > > +                       break;
> > > > +
> > > > +               case EFI_ACPI_MEMORY_NVS:
> > > > +                       mem_type = ADDRESS_TYPE_NVS;
> > > > +                       break;
> > > > +
> > > > +               case EFI_ACPI_RECLAIM_MEMORY:
> > > > +                       mem_type = ADDRESS_TYPE_ACPI;
> > > > +                       break;
> > > > +
> > > > +               case EFI_LOADER_CODE:
> > > > +               case EFI_LOADER_DATA:
> > > > +               case EFI_PERSISTENT_MEMORY:
> > > > +               case EFI_BOOT_SERVICES_CODE:
> > > > +               case EFI_BOOT_SERVICES_DATA:
> > > > +               case EFI_CONVENTIONAL_MEMORY:
> > > > +                       mem_type = ADDRESS_TYPE_SYSRAM;
> > > > +                       break;
> > > > +
> > > > +               default:
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               mmap_array[mem_type][map_entry[mem_type]].mem_type = mem_type;
> > > > +               mmap_array[mem_type][map_entry[mem_type]].mem_start =
> > > > +                                               mem_desc->phys_addr & TO_PHYS_MASK;
> > > > +               mmap_array[mem_type][map_entry[mem_type]].mem_size =
> > > > +                                               mem_desc->num_pages << EFI_PAGE_SHIFT;
> > > > +               mmap_array[mem_type][map_entry[mem_type]].attribute =
> > > > +                                               mem_desc->attribute;
> > > > +               map_entry[mem_type]++;
> > > > +       }
> > > > +
> > > > +       count = mhp->map_count;
> > > > +       /* Sort EFI memmap and add to BPI for kernel */
> > > > +       for (i = 0; i < LOONGSON3_BOOT_MEM_MAP_MAX; i++) {
> > > > +               if (!map_entry[i])
> > > > +                       continue;
> > > > +               count = efi_memmap_sort(mhp, count, i);
> > > > +       }
> > > > +
> > > > +       mhp->map_count = count;
> > > > +       mhp->header.checksum = 0;
> > > > +
> > > > +       checksum = efi_crc8((char *)mhp, mhp->header.length);
> > > > +       mhp->header.checksum = checksum;
> > > > +
> > > > +       return EFI_SUCCESS;
> > > > +}
> > > > +
> > > > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
> > > > +{
> > > > +       efi_status_t status;
> > > > +       struct exit_boot_struct *p = priv;
> > > > +
> > > > +       status = mk_mmap(map, p->bp);
> > > > +       if (status != EFI_SUCCESS) {
> > > > +               efi_err("Make kernel memory map failed!\n");
> > > > +               return status;
> > > > +       }
> > > > +
> > > > +       return EFI_SUCCESS;
> > > > +}
> > > > +
> > > > +static efi_status_t exit_boot_services(struct boot_params *boot_params, void *handle)
> > > > +{
> > > > +       unsigned int desc_version;
> > > > +       unsigned int runtime_entry_count = 0;
> > > > +       unsigned long map_size, key, desc_size, buff_size;
> > > > +       efi_status_t status;
> > > > +       efi_memory_desc_t *mem_map;
> > > > +       struct efi_boot_memmap map;
> > > > +       struct exit_boot_struct priv;
> > > > +
> > > > +       map.map                 = &mem_map;
> > > > +       map.map_size            = &map_size;
> > > > +       map.desc_size           = &desc_size;
> > > > +       map.desc_ver            = &desc_version;
> > > > +       map.key_ptr             = &key;
> > > > +       map.buff_size           = &buff_size;
> > > > +       status = efi_get_memory_map(&map);
> > > > +       if (status != EFI_SUCCESS) {
> > > > +               efi_err("Unable to retrieve UEFI memory map.\n");
> > > > +               return status;
> > > > +       }
> > > > +
> > > > +       priv.bp = boot_params;
> > > > +       priv.runtime_entry_count = &runtime_entry_count;
> > > > +
> > > > +       /* Might as well exit boot services now */
> > > > +       status = efi_exit_boot_services(handle, &map, &priv, exit_boot_func);
> > > > +       if (status != EFI_SUCCESS)
> > > > +               return status;
> > > > +
> > > > +       return EFI_SUCCESS;
> > > > +}
> > > > +
> > > > +/*
> > > > + * EFI entry point for the LoongArch EFI stub.
> > > > + */
> > > > +efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table)
> >
> > Why are you not using the generic EFI stub boot flow?
> Hmmm, as far as I know, we defined our own "boot_params", an interface
> between the bootloader (including efistub) and the core kernel that
> passes memmap, cmdline and initrd information, three years ago. This
> method resembles the x86 way and differs from the generic stub (which
> was called the arm stub before 5.8). Over these years, many products
> have already adopted the "boot_params" interface (including UEFI, PMON,
> Grub, Kernel, etc., though most of them haven't been upstreamed).
> Replacing boot_params with FDT (i.e., the generic stub way) is difficult
> for us, because it would badly break compatibility.
>

OK, I understand. So using the generic stub is not possible for you.

So as long as you don't enable deprecated features such as initrd=, or
rely on special hacks like putting magic numbers at fixed offsets in
the image, I'm fine with this approach.


* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-05-06  8:14         ` Ard Biesheuvel
@ 2022-05-06 11:26           ` WANG Xuerui
  2022-05-06 11:41             ` Arnd Bergmann
  0 siblings, 1 reply; 94+ messages in thread
From: WANG Xuerui @ 2022-05-06 11:26 UTC (permalink / raw)
  To: Ard Biesheuvel, Huacai Chen
  Cc: Arnd Bergmann, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang

Hi,

On 5/6/22 16:14, Ard Biesheuvel wrote:
> [snip]
>>>>> +
>>>>> +static efi_status_t mk_mmap(struct efi_boot_memmap *map, struct boot_params *p)
>>>>> +{
>>> Are you passing a different representation of the memory map to the
>>> core kernel? I think it would be easier just to pass the EFI memory
>>> map like other EFI arches do, and reuse all of the code that we
>>> already have.
>> Yes, this different representation is used by our "boot_params", the
>> interface between bootloader (including efistub) and the core kernel.
> So how does the core kernel consume the EFI memory map? Only through
> this mechanism?
>
>>>>> +       char checksum;
>>>>> +       unsigned int i;
>>>>> +       unsigned int nr_desc;
>>>>> +       unsigned int mem_type;
>>>>> +       unsigned long count;
>>>>> +       efi_memory_desc_t *mem_desc;
>>>>> +       struct loongsonlist_mem_map *mhp = NULL;
>>>>> +
>>>>> +       memset(map_entry, 0, sizeof(map_entry));
>>>>> +       memset(mmap_array, 0, sizeof(mmap_array));
>>>>> +
>>>>> +       if (!strncmp((char *)p, "BPI", 3)) {
>>>>> +               p->flags |= BPI_FLAGS_UEFI_SUPPORTED;
>>>>> +               p->systemtable = (efi_system_table_t *)efi_system_table;
>>>>> +               p->extlist_offset = sizeof(*p) + sizeof(unsigned long);
>>>>> +               mhp = (struct loongsonlist_mem_map *)((char *)p + p->extlist_offset);
>>>>> +
>>>>> +               memcpy(&mhp->header.signature, "MEM", sizeof(unsigned long));
>>>>> +               mhp->header.length = sizeof(*mhp);
>>>>> +               mhp->desc_version = *map->desc_ver;
>>>>> +               mhp->map_count = 0;
>>>>> +       }
>>>>> +       if (!(*(map->map_size)) || !(*(map->desc_size)) || !mhp) {
>>>>> +               efi_err("get memory info error\n");
>>>>> +               return EFI_INVALID_PARAMETER;
>>>>> +       }
>>>>> +       nr_desc = *(map->map_size) / *(map->desc_size);
>>>>> +
>>>>> +       /*
>>>>> +        * According to UEFI SPEC, mmap_buf is the accurate Memory Map
>>>>> +        * mmap_array now we can fill platform specific memory structure.
>>>>> +        */
>>>>> +       for (i = 0; i < nr_desc; i++) {
>>>>> +               mem_desc = (efi_memory_desc_t *)((void *)(*map->map) + (i * (*(map->desc_size))));
>>>>> +               switch (mem_desc->type) {
>>>>> +               case EFI_RESERVED_TYPE:
>>>>> +               case EFI_RUNTIME_SERVICES_CODE:
>>>>> +               case EFI_RUNTIME_SERVICES_DATA:
>>>>> +               case EFI_MEMORY_MAPPED_IO:
>>>>> +               case EFI_MEMORY_MAPPED_IO_PORT_SPACE:
>>>>> +               case EFI_UNUSABLE_MEMORY:
>>>>> +               case EFI_PAL_CODE:
>>>>> +                       mem_type = ADDRESS_TYPE_RESERVED;
>>>>> +                       break;
>>>>> +
>>>>> +               case EFI_ACPI_MEMORY_NVS:
>>>>> +                       mem_type = ADDRESS_TYPE_NVS;
>>>>> +                       break;
>>>>> +
>>>>> +               case EFI_ACPI_RECLAIM_MEMORY:
>>>>> +                       mem_type = ADDRESS_TYPE_ACPI;
>>>>> +                       break;
>>>>> +
>>>>> +               case EFI_LOADER_CODE:
>>>>> +               case EFI_LOADER_DATA:
>>>>> +               case EFI_PERSISTENT_MEMORY:
>>>>> +               case EFI_BOOT_SERVICES_CODE:
>>>>> +               case EFI_BOOT_SERVICES_DATA:
>>>>> +               case EFI_CONVENTIONAL_MEMORY:
>>>>> +                       mem_type = ADDRESS_TYPE_SYSRAM;
>>>>> +                       break;
>>>>> +
>>>>> +               default:
>>>>> +                       continue;
>>>>> +               }
>>>>> +
>>>>> +               mmap_array[mem_type][map_entry[mem_type]].mem_type = mem_type;
>>>>> +               mmap_array[mem_type][map_entry[mem_type]].mem_start =
>>>>> +                                               mem_desc->phys_addr & TO_PHYS_MASK;
>>>>> +               mmap_array[mem_type][map_entry[mem_type]].mem_size =
>>>>> +                                               mem_desc->num_pages << EFI_PAGE_SHIFT;
>>>>> +               mmap_array[mem_type][map_entry[mem_type]].attribute =
>>>>> +                                               mem_desc->attribute;
>>>>> +               map_entry[mem_type]++;
>>>>> +       }
>>>>> +
>>>>> +       count = mhp->map_count;
>>>>> +       /* Sort EFI memmap and add to BPI for kernel */
>>>>> +       for (i = 0; i < LOONGSON3_BOOT_MEM_MAP_MAX; i++) {
>>>>> +               if (!map_entry[i])
>>>>> +                       continue;
>>>>> +               count = efi_memmap_sort(mhp, count, i);
>>>>> +       }
>>>>> +
>>>>> +       mhp->map_count = count;
>>>>> +       mhp->header.checksum = 0;
>>>>> +
>>>>> +       checksum = efi_crc8((char *)mhp, mhp->header.length);
>>>>> +       mhp->header.checksum = checksum;
>>>>> +
>>>>> +       return EFI_SUCCESS;
>>>>> +}
>>>>> +
>>>>> +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv)
>>>>> +{
>>>>> +       efi_status_t status;
>>>>> +       struct exit_boot_struct *p = priv;
>>>>> +
>>>>> +       status = mk_mmap(map, p->bp);
>>>>> +       if (status != EFI_SUCCESS) {
>>>>> +               efi_err("Make kernel memory map failed!\n");
>>>>> +               return status;
>>>>> +       }
>>>>> +
>>>>> +       return EFI_SUCCESS;
>>>>> +}
>>>>> +
>>>>> +static efi_status_t exit_boot_services(struct boot_params *boot_params, void *handle)
>>>>> +{
>>>>> +       unsigned int desc_version;
>>>>> +       unsigned int runtime_entry_count = 0;
>>>>> +       unsigned long map_size, key, desc_size, buff_size;
>>>>> +       efi_status_t status;
>>>>> +       efi_memory_desc_t *mem_map;
>>>>> +       struct efi_boot_memmap map;
>>>>> +       struct exit_boot_struct priv;
>>>>> +
>>>>> +       map.map                 = &mem_map;
>>>>> +       map.map_size            = &map_size;
>>>>> +       map.desc_size           = &desc_size;
>>>>> +       map.desc_ver            = &desc_version;
>>>>> +       map.key_ptr             = &key;
>>>>> +       map.buff_size           = &buff_size;
>>>>> +       status = efi_get_memory_map(&map);
>>>>> +       if (status != EFI_SUCCESS) {
>>>>> +               efi_err("Unable to retrieve UEFI memory map.\n");
>>>>> +               return status;
>>>>> +       }
>>>>> +
>>>>> +       priv.bp = boot_params;
>>>>> +       priv.runtime_entry_count = &runtime_entry_count;
>>>>> +
>>>>> +       /* Might as well exit boot services now */
>>>>> +       status = efi_exit_boot_services(handle, &map, &priv, exit_boot_func);
>>>>> +       if (status != EFI_SUCCESS)
>>>>> +               return status;
>>>>> +
>>>>> +       return EFI_SUCCESS;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * EFI entry point for the LoongArch EFI stub.
>>>>> + */
>>>>> +efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table)
>>> Why are you not using the generic EFI stub boot flow?
>> Hmmm, as far as I know, we defined our own "boot_params", an interface
>> between the bootloader (including efistub) and the core kernel that
>> passes memmap, cmdline and initrd information, three years ago. This
>> method resembles the x86 way and differs from the generic stub (which
>> was called the arm stub before 5.8). Over these years, many products
>> have already adopted the "boot_params" interface (including UEFI, PMON,
>> Grub, Kernel, etc., though most of them haven't been upstreamed).
>> Replacing boot_params with FDT (i.e., the generic stub way) is difficult
>> for us, because it would badly break compatibility.
>>
> OK, I understand. So using the generic stub is not possible for you.
>
> So as long as you don't enable deprecated features such as initrd=, or
> rely on special hacks like putting magic numbers at fixed offsets in
> the image, I'm fine with this approach.

I'd like to add some relevant background: this "struct boot_params"
thingy is actually a Loongson corporate standard. It is available at
[1]; it is only in Chinese but should be minimally recognizable given
much of it is C code, and you can see this struct and its friends have
barely changed since 2019.

The standard was in place long before the inception of LoongArch (the
earliest spec dates back to 2014). Back when Loongson was still doing
MIPS this was somewhat acceptable, due to fragmentation of the MIPS
world, but they didn't take the chance to re-think most of this for
LoongArch, instead simply porting everything over as-is. Hence the ship
has more-or-less already sailed, and we indeed have to support this flow
to keep compatibility...
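
As a rough illustration of what such an interface looks like on the wire: the quoted mk_mmap() tags its extension block with a short signature ("MEM") and a one-byte checksum computed with the checksum field zeroed, so that all bytes of the block sum to zero modulo 256. A minimal sketch of that pattern follows. The struct ext_header layout and all function names here are hypothetical and chosen for illustration; the real struct boot_params and its extension headers in [1] differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only. */
struct ext_header {
	char     signature[8];
	uint32_t length;	/* number of bytes covered by the checksum */
	uint8_t  checksum;
	uint8_t  pad[3];
};

/* Signature length clamped to the size of the signature field. */
static size_t sig_len(const char *sig)
{
	size_t n = strlen(sig);

	return n > 8 ? 8 : n;
}

/* Byte checksum: the value that makes all covered bytes sum to 0 mod 256. */
uint8_t byte_checksum(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint8_t sum = 0;

	assert(buf != NULL);
	while (len--)
		sum += *p++;

	return (uint8_t)(0x100 - sum);
}

/* Producer side: fill in the header, then checksum it with the field at 0. */
void ext_header_seal(struct ext_header *h, const char *sig)
{
	memset(h, 0, sizeof(*h));
	memcpy(h->signature, sig, sig_len(sig));
	h->length = sizeof(*h);
	h->checksum = byte_checksum(h, h->length);
}

/* Consumer side: check signature, length, and that the bytes sum to zero. */
int ext_header_valid(const struct ext_header *h, const char *sig)
{
	if (memcmp(h->signature, sig, sig_len(sig)) != 0)
		return 0;
	if (h->length != sizeof(*h))
		return 0;

	return byte_checksum(h, h->length) == 0;
}
```

Note that this is a plain additive checksum, not a CRC, despite the efi_crc8() name used in the quoted patch.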

Or is there compatibility at all?

It turns out that this port is already incompatible with shipped 
systems, in other ways, at least since the March revision or so.

For one thing, the exact definition of this "struct boot_params" has
already been incompatibly revised; this version [2] is the one actually
compatible with existing firmware, so people already have to write shims
(not started yet) or flash their firmware (not open-sourced or provided
by Loongson yet) to actually compile and run this port. (You haven't
read that wrong; indeed no one outside Loongson is able to run this
kernel so far.)

For another thing, the kernel ABI and the userland (mainly glibc) are
also incompatible with the shipped systems and their pre-installed
vendor systems. Things like different NSIG, sigcontext, and glibc symbol
versions already ensure that no binary can run in "the other world".

So, in effect, this port is starting from scratch, and taking the chance 
to fix early mistakes and oversights all over; hence my opinion is, 
better do the Right Thing (tm) and give the generic codepath a chance.

For the Loongson devs: at least, declare the struct boot_params flow 
deprecated from day one, then work to eliminate it from future products, 
if you really don't want to delay merging even further (it's already 
unlikely to land in 5.19, given the discussion happening in LKML [3]). 
It's not embarrassing to admit mistakes; we all make mistakes, and 
what's important is to learn from them so we don't collectively repeat 
ourselves.


[1]: 
https://web.archive.org/web/20190713081851/http://www.loongson.cn/uploadfile/devsysmanual/loongson_devsys_firmware_kernel_interface_specification.pdf
[2]: 
https://github.com/xen0n/linux/commit/a55739f8e748dc9164c12da504696161bb8b9911
[3]: https://lwn.net/ml/linux-kernel/87v8uk6kfa.wl-maz@kernel.org/



* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-05-06 11:26           ` WANG Xuerui
@ 2022-05-06 11:41             ` Arnd Bergmann
  2022-05-06 13:20               ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-05-06 11:41 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Ard Biesheuvel, Huacai Chen, Arnd Bergmann, Huacai Chen,
	Andy Lutomirski, Thomas Gleixner, Peter Zijlstra, Andrew Morton,
	David Airlie, Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Jiaxun Yang

On Fri, May 6, 2022 at 1:26 PM WANG Xuerui <kernel@xen0n.name> wrote:
> On 5/6/22 16:14, Ard Biesheuvel wrote:

> Or is there compatibility at all?
>
> It turns out that this port is already incompatible with shipped
> systems, in other ways, at least since the March revision or so.

I think we can treat user space compatibility separately from firmware
compatibility.

> So, in effect, this port is starting from scratch, and taking the chance
> to fix early mistakes and oversights all over; hence my opinion is,
> better do the Right Thing (tm) and give the generic codepath a chance.
>
> For the Loongson devs: at least, declare the struct boot_params flow
> deprecated from day one, then work to eliminate it from future products,
> if you really don't want to delay merging even further (it's already
> unlikely to land in 5.19, given the discussion happening in LKML [3]).
> It's not embarrassing to admit mistakes; we all make mistakes, and
> what's important is to learn from them so we don't collectively repeat
> ourselves.

Agreed. I think there can be limited compatibility support for old
firmware though, at least to help with the migration: As long as
the interface between grub and linux has a proper definition following
the normal UEFI standard, there can be both a modern grub
that is booted using the same protocol and a backwards-compatible
grub that can be booted from existing firmware and that is able
to boot the kernel.

The compatibility version of grub can be retired after the firmware
itself is able to speak the normal boot protocol.

       Arnd


* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-05-06 11:41             ` Arnd Bergmann
@ 2022-05-06 13:20               ` Huacai Chen
  2022-05-13 19:32                 ` Arnd Bergmann
  0 siblings, 1 reply; 94+ messages in thread
From: Huacai Chen @ 2022-05-06 13:20 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: WANG Xuerui, Ard Biesheuvel, Huacai Chen, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, David Airlie,
	Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Jiaxun Yang

Hi, Ard, Arnd and Xuerui,

On Fri, May 6, 2022 at 7:41 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, May 6, 2022 at 1:26 PM WANG Xuerui <kernel@xen0n.name> wrote:
> > On 5/6/22 16:14, Ard Biesheuvel wrote:
>
> > Or is there compatibility at all?
> >
> > It turns out that this port is already incompatible with shipped
> > systems, in other ways, at least since the March revision or so.
>
> I think we can treat user space compatibility separately from firmware
> compatibility.
>
> > So, in effect, this port is starting from scratch, and taking the chance
> > to fix early mistakes and oversights all over; hence my opinion is,
> > better do the Right Thing (tm) and give the generic codepath a chance.
> >
> > For the Loongson devs: at least, declare the struct boot_params flow
> > deprecated from day one, then work to eliminate it from future products,
> > if you really don't want to delay merging even further (it's already
> > unlikely to land in 5.19, given the discussion happening in LKML [3]).
> > It's not embarrassing to admit mistakes; we all make mistakes, and
> > what's important is to learn from them so we don't collectively repeat
> > ourselves.
>
> Agreed. I think there can be limited compatibility support for old
> firmware though, at least to help with the migration: As long as
> the interface between grub and linux has a proper definition following
> the normal UEFI standard, there can be both a modern grub
> that is booted using the same protocol and a backwards-compatible
> grub that can be booted from existing firmware and that is able
> to boot the kernel.
>
> The compatibility version of grub can be retired after the firmware
> itself is able to speak the normal boot protocol.
After an internal discussion, we decided to use the generic stub, and
we now have a draft version of it [1]. I hope V10 can solve all
problems. :)
[1] https://github.com/loongson/linux/tree/loongarch-next-generic-stub

Huacai
>
>        Arnd


* Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-04-30 10:34       ` Arnd Bergmann
@ 2022-05-07 12:11         ` Christian Brauner
  2022-05-09 10:00           ` Christian Brauner
  0 siblings, 1 reply; 94+ messages in thread
From: Christian Brauner @ 2022-05-07 12:11 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Linux API

On Sat, Apr 30, 2022 at 12:34:52PM +0200, Arnd Bergmann wrote:
> On Sat, Apr 30, 2022 at 12:05 PM Huacai Chen <chenhuacai@gmail.com> wrote:
> > On Sat, Apr 30, 2022 at 5:45 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > > >
> > > > This patch adds system call support and related uaccess.h for LoongArch.
> > > >
> > > > Q: Why keep __ARCH_WANT_NEW_STAT definition while there is statx:
> > > > A: Until the latest glibc release (2.34), statx is only used for 32-bit
> > > >    platforms, or 64-bit platforms with 32-bit timestamp. I.e., Most 64-
> > > >    bit platforms still use newstat now.
> > > >
> > > > Q: Why keep _ARCH_WANT_SYS_CLONE definition while there is clone3:
> > > > A: The latest glibc release (2.34) has some basic support for clone3 but
> > > >    it isn't complete. E.g., pthread_create() and spawni() have converted
> > > >    to use clone3 but fork() will still use clone. Moreover, some seccomp
> > > >    related applications can still not work perfectly with clone3. E.g.,
> > > >    Chromium sandbox cannot work at all and there is no solution for it,
> > > >    which is more terrible than the fork() story [1].
> > > >
> > > > [1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184
> > >
> > > I still think these have to be removed. There is no mainline glibc or musl
> > > port yet, and neither of them should actually be required. Please remove
> > > them here, and modify your libc patches accordingly when you send those
> > > upstream.
> >
> > If this is just a problem that can be resolved by upgrading
> > glibc/musl, I will remove them. But the Chromium problem (or sandbox
> > problem in general) seems to have no solution now.
> 
> I added Christian Brauner to Cc now, maybe he has come across the
> sandbox problem before and has an idea for a solution.

(I just got back from LSFMM so I'll reply in more detail next week. I'm
still pretty jet-lagged.)


* Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-05-07 12:11         ` Christian Brauner
@ 2022-05-09 10:00           ` Christian Brauner
  2022-05-11  7:11             ` Arnd Bergmann
  2022-05-11 16:17             ` Florian Weimer
  0 siblings, 2 replies; 94+ messages in thread
From: Christian Brauner @ 2022-05-09 10:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Huacai Chen, Huacai Chen, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, David Airlie, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Xuerui Wang, Jiaxun Yang, Linux API

On Sat, May 07, 2022 at 02:11:04PM +0200, Christian Brauner wrote:
> On Sat, Apr 30, 2022 at 12:34:52PM +0200, Arnd Bergmann wrote:
> > On Sat, Apr 30, 2022 at 12:05 PM Huacai Chen <chenhuacai@gmail.com> wrote:
> > > On Sat, Apr 30, 2022 at 5:45 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > > > >
> > > > > This patch adds system call support and related uaccess.h for LoongArch.
> > > > >
> > > > > Q: Why keep __ARCH_WANT_NEW_STAT definition while there is statx:
> > > > > A: Until the latest glibc release (2.34), statx is only used for 32-bit
> > > > > >    platforms, or 64-bit platforms with 32-bit timestamps. I.e., most
> > > > > >    64-bit platforms still use newstat now.
> > > > >
> > > > > > Q: Why keep __ARCH_WANT_SYS_CLONE definition while there is clone3:
> > > > > A: The latest glibc release (2.34) has some basic support for clone3 but
> > > > >    it isn't complete. E.g., pthread_create() and spawni() have converted
> > > > >    to use clone3 but fork() will still use clone. Moreover, some seccomp
> > > > >    related applications can still not work perfectly with clone3. E.g.,
> > > > >    Chromium sandbox cannot work at all and there is no solution for it,
> > > > >    which is more terrible than the fork() story [1].
> > > > >
> > > > > [1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184
> > > >
> > > > I still think these have to be removed. There is no mainline glibc or musl
> > > > port yet, and neither of them should actually be required. Please remove
> > > > them here, and modify your libc patches accordingly when you send those
> > > > upstream.
> > >
> > > If this is just a problem that can be resolved by upgrading
> > > glibc/musl, I will remove them. But the Chromium problem (or sandbox
> > > problem in general) seems to have no solution now.
> > 
> > I added Christian Brauner to Cc now, maybe he has come across the
> > sandbox problem before and has an idea for a solution.
> 
> (I just got back from LSFMM so I'll reply in more detail next week. I'm
> still pretty jet-lagged.)

Right, I forgot about the EPERM/ENOSYS sandbox thread.

Kees and I gave a talk about this problem at LPC 2019 (see [2]). The
proposed solution back then was to add basic deep argument inspection
for first-level pointers to seccomp.

There are problems with this approach: it is not usable on second-level
pointers (although we concluded that's ok), and if the input args are
very large, copying them from within seccomp becomes rather costly. In
general, the various approaches seemed handwavy at the time.

If seccomp were to be made to support some basic form of eBPF such that
it can still be safely called by unprivileged users then this would
likely be easier to do (famous last words), but given that the stance has
traditionally been to not port seccomp to eBPF, it remains a tricky patch.

Some time after that I talked about this issue to Mathieu Desnoyers, who
took another angle of attack. The idea seems less complicated to me.
Instead of argument inspection we introduce basic syscall argument
checksumming for seccomp. It would only be done when seccomp is
interested in syscall input args and checksumming would be per syscall
argument. It would be validated within the syscall when it actually
reads the arguments; again, only if seccomp is used. If the checksums
mismatch, an error is returned or the calling process is terminated.

There's one case that deserves mentioning: since we introduced the
seccomp notifier we do allow advanced syscall interception and we do use
it extensively in various projects.

Roughly, it works by allowing a userspace process (the "supervisor") to
listen on a seccomp fd. The seccomp fd is an fd referring to the filter
of a target task (the "supervisee"). When the supervisee performs a
syscall listed in the seccomp notify filter the supervisor will receive
a notification on the seccomp fd for the filter.
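
The supervisor side described above can be sketched roughly as follows.
This is only an illustration, not code from any of the projects
mentioned: the function name deny_one_notification() is made up, the
notify fd is assumed to have been obtained elsewhere (e.g. from
installing a filter with SECCOMP_FILTER_FLAG_NEW_LISTENER), and the
fixed-size structs are a simplification (robust code would query
SECCOMP_GET_NOTIF_SIZES first).

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/seccomp.h>

/*
 * Hypothetical helper: wait for one notification on the seccomp notify
 * fd and make the supervisee's syscall fail with EPERM.
 * Returns the intercepted syscall number, or -1 if either ioctl fails.
 */
int deny_one_notification(int notify_fd)
{
	struct seccomp_notif req;
	struct seccomp_notif_resp resp;

	/* The kernel expects the receive buffer to be zeroed. */
	memset(&req, 0, sizeof(req));
	if (ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_RECV, &req) < 0)
		return -1;

	/* req.pid and req.data.args[] describe the pending syscall. */
	memset(&resp, 0, sizeof(resp));
	resp.id = req.id;        /* must match the request being answered */
	resp.error = -EPERM;     /* negative errno seen by the supervisee */
	resp.val = 0;
	if (ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_SEND, &resp) < 0)
		return -1;

	return req.data.nr;
}
```

A real supervisor would run this in a loop and decide per syscall
whether to emulate it, deny it, or let it continue.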

I mention this because it is possible for the supervisor to e.g.
intercept a bpf() system call and then modify/create/attach a bpf
program for the supervisee and then update fields in the supervisee's
bpf struct that was passed to the bpf() syscall by it. So the supervisor
might rewrite syscall args and continue the syscall. (In general, it's
not recommended because of TOCTOU, but it is still doable in certain
scenarios where we can guarantee that this is safe even if syscall args
are rewritten to something else by a MITM attack.)

Arguably, the checksumming approach could even be made to work with this
if the seccomp fd learns a new ioctl() or similar to safely update the
checksum.

I can try and move a poc for this up the todo list.

Without an approach like this certain sandboxes will fall back to
ENOSYSing system calls they can't filter. This is a generic problem
though, with clone3() being one prominent example.

[2]: https://www.youtube.com/watch?v=PnOSPsRzVYM&list=PLVsQ_xZBEyN2Ol7y8axxhbTsG47Va3Se2

* Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-05-09 10:00           ` Christian Brauner
@ 2022-05-11  7:11             ` Arnd Bergmann
  2022-05-11 21:12               ` [musl] " Rich Felker
  2022-05-11 16:17             ` Florian Weimer
  1 sibling, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-05-11  7:11 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Arnd Bergmann, Huacai Chen, Huacai Chen, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, David Airlie,
	Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang, Linux API,
	GNU C Library, musl

On Mon, May 9, 2022 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
....
> I can try and move a poc for this up the todo list.
>
> Without an approach like this certain sandboxes will fall back to
> ENOSYSing system calls they can't filter. This is a generic problem
> though, with clone3() being one prominent example.

Thank you for the detailed reply. It sounds to me like this will eventually have
to get solved anyway, so we could move ahead without clone() on loongarch,
and just not have support for Chrome until this is fully solved.

As both the glibc and musl ports are being proposed for inclusion right
now, we should try to come to a decision so the libc ports can adjust if
necessary. Adding both mailing lists to Cc here, the discussion is archived
at [1].

         Arnd

[1] https://lore.kernel.org/linux-arch/20220509100058.vmrgn5fkk3ayt63v@wittgenstein/

* Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-05-09 10:00           ` Christian Brauner
  2022-05-11  7:11             ` Arnd Bergmann
@ 2022-05-11 16:17             ` Florian Weimer
  1 sibling, 0 replies; 94+ messages in thread
From: Florian Weimer @ 2022-05-11 16:17 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Arnd Bergmann, Huacai Chen, Huacai Chen, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, David Airlie,
	Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang, Linux API

* Christian Brauner:

> Without an approach like this certain sandboxes will fall back to
> ENOSYSing system calls they can't filter. This is a generic problem
> though, with clone3() being one prominent example.

Furthermore, for glibc (and I believe musl as well), the trick with
in-process emulation of clone3 using SIGSYS does not work here because
we must inhibit delivery of signals on the nascent thread, before it is
fully set up.  This means that we have to block signals around the
clone/clone3 system call, so that the new thread is created with all
signals blocked.  This means that instead of calling the SIGSYS handler,
the filtered system call simply terminates the process.
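
A rough sketch of the pattern, for illustration only: the helper names
below are made up, and a real libc uses its own internal equivalents
rather than sigprocmask(). The callback stands in for the clone/clone3
call; anything that runs inside it (including a seccomp filter that
tries to raise SIGSYS) sees SIGSYS blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper: run create() with every signal blocked and
 * restore the previous mask afterwards. Returns create()'s result,
 * or -1 if the mask could not be changed.
 */
int call_with_signals_blocked(int (*create)(void *), void *arg)
{
	sigset_t all, saved;
	int ret;

	sigfillset(&all);
	if (sigprocmask(SIG_BLOCK, &all, &saved) != 0)
		return -1;

	ret = create(arg);       /* stands in for the clone/clone3 call */

	sigprocmask(SIG_SETMASK, &saved, NULL);
	return ret;
}

/* Reports whether SIGSYS is blocked right now: 1 yes, 0 no, -1 error. */
int sigsys_is_blocked(void *unused)
{
	sigset_t cur;

	(void)unused;
	if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
		return -1;
	return sigismember(&cur, SIGSYS);
}
```

Calling call_with_signals_blocked(sigsys_is_blocked, NULL) shows that
SIGSYS is blocked inside the window, which is why a SIGSYS-based
emulation handler cannot run there.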

(I think there have been discussions of using out-of-process filtering,
but I don't know where we are with that.)

Thanks,
Florian


* Re: [musl] Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-05-11  7:11             ` Arnd Bergmann
@ 2022-05-11 21:12               ` Rich Felker
  2022-05-12  7:21                 ` Arnd Bergmann
  0 siblings, 1 reply; 94+ messages in thread
From: Rich Felker @ 2022-05-11 21:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christian Brauner, Huacai Chen, Huacai Chen, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, David Airlie,
	Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang, Linux API,
	GNU C Library, musl

On Wed, May 11, 2022 at 09:11:56AM +0200, Arnd Bergmann wrote:
> On Mon, May 9, 2022 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> .....
> > I can try and move a poc for this up the todo list.
> >
> > Without an approach like this certain sandboxes will fall back to
> > ENOSYSing system calls they can't filter. This is a generic problem
> > though, with clone3() being one prominent example.
> 
> Thank you for the detailed reply. It sounds to me like this will eventually have
> to get solved anyway, so we could move ahead without clone() on loongarch,
> and just not have support for Chrome until this is fully solved.
> 
> As both the glibc and musl ports are being proposed for inclusion right
> now, we should try to come to a decision so the libc ports can adjust if
> necessary. Adding both mailing lists to Cc here, the discussion is archived
> at [1].
> 
>          Arnd
> 
> [1] https://lore.kernel.org/linux-arch/20220509100058.vmrgn5fkk3ayt63v@wittgenstein/

Having read about the seccomp issue, I think it's a very strong
argument that __NR_clone should be kept permanently for all future
archs. Otherwise, at least AIUI, it's impossible to seccomp-sandbox
multithreaded programs (since you can't allow the creation of threads
without also allowing other unwanted use of clone3). It sounds like
there's some interest in extending seccomp to allow filtering of
argument blocks like clone3 uses, but some of what I read about was
checksum-based (thus a weak hardening measure at best, not a hard
privilege boundary) and even if something is eventually created that
works, it won't be available right away, and it won't be nearly as
easy to use as just allowing thread-creating clone syscalls on
existing archs.
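
To illustrate the difference, here is a minimal sketch of a classic-BPF
seccomp filter of the kind a sandbox might install. It is not taken
from Chromium or any other project; the names thread_only_clone and
thread_only_clone_len() are made up. It assumes a little-endian ABI
where the clone() flags are the first syscall argument (as on x86-64),
and it omits the seccomp_data.arch check that a real filter needs.
clone() flags are visible in a register, so they can be tested;
clone3() passes them behind a pointer, so the filter can only reject
the call outright.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/sched.h>     /* CLONE_THREAD */
#include <linux/seccomp.h>
#include <sys/syscall.h>     /* __NR_clone */

#ifndef __NR_clone3
#define __NR_clone3 435      /* value from the asm-generic syscall table */
#endif

/* Low 32 bits of args[0]; assumes little-endian and flags in args[0]. */
#define ARG0_LO offsetof(struct seccomp_data, args[0])

static const struct sock_filter thread_only_clone[] = {
	/* [0] load the syscall number */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
	/* [1] clone3: flags live in user memory, cannot be inspected */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone3, 0, 1),
	/* [2] pretend clone3 does not exist so libc falls back to clone */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | ENOSYS),
	/* [3] anything other than clone goes straight to [7] */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone, 0, 3),
	/* [4] clone: load the flags argument from the register copy */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, ARG0_LO),
	/* [5] thread creation is fine, jump to [7] */
	BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, CLONE_THREAD, 1, 0),
	/* [6] any other use of clone is refused */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
	/* [7] allow */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Number of instructions, as needed for struct sock_fprog.len. */
size_t thread_only_clone_len(void)
{
	return sizeof(thread_only_clone) / sizeof(thread_only_clone[0]);
}
```

The array would be wrapped in a struct sock_fprog and installed with
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...) after setting
PR_SET_NO_NEW_PRIVS. On an arch without __NR_clone, step [2] would
leave a multithreaded program with no way to create threads.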

Rich

* Re: [musl] Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-05-11 21:12               ` [musl] " Rich Felker
@ 2022-05-12  7:21                 ` Arnd Bergmann
  2022-05-12 12:11                   ` Rich Felker
  0 siblings, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-05-12  7:21 UTC (permalink / raw)
  To: musl
  Cc: Arnd Bergmann, Christian Brauner, Huacai Chen, Huacai Chen,
	Andy Lutomirski, Thomas Gleixner, Peter Zijlstra, Andrew Morton,
	David Airlie, Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang, Linux API,
	GNU C Library

On Wed, May 11, 2022 at 11:12 PM Rich Felker <dalias@libc.org> wrote:
> On Wed, May 11, 2022 at 09:11:56AM +0200, Arnd Bergmann wrote:
> > On Mon, May 9, 2022 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> > .....
> > > I can try and move a poc for this up the todo list.
> > >
> > > Without an approach like this certain sandboxes will fall back to
> > > ENOSYSing system calls they can't filter. This is a generic problem
> > > though, with clone3() being one prominent example.
> >
> > Thank you for the detailed reply. It sounds to me like this will eventually have
> > to get solved anyway, so we could move ahead without clone() on loongarch,
> > and just not have support for Chrome until this is fully solved.
> >
> > As both the glibc and musl ports are being proposed for inclusion right
> > now, we should try to come to a decision so the libc ports can adjust if
> > necessary. Adding both mailing lists to Cc here, the discussion is archived
> > at [1].
> >
> >          Arnd
> >
> > [1] https://lore.kernel.org/linux-arch/20220509100058.vmrgn5fkk3ayt63v@wittgenstein/
>
> Having read about the seccomp issue, I think it's a very strong
> argument that __NR_clone should be kept permanently for all future
> archs.

Ok, let's keep clone() around for all architectures then. We should probably
just remove the __ARCH_WANT_SYS_CLONE macro and build the
code into the kernel unconditionally, but at the moment there
are still private versions for ia64 and sparc with the same name as
the generic version. Both are also still lacking support for clone3() and
don't have anyone actively working on them.

In this case, we probably don't need to change clone3() to allow the
zero-length stack after all, and the wrapper that was added to the
musl port should get removed again.

For the other syscalls, I think the latest musl patches already dropped
the old-style stat() implementation, but the glibc patches still have those
and need to drop them as well to match what the kernel will get.

       Arnd

* Re: [musl] Re: [PATCH V9 13/24] LoongArch: Add system call support
  2022-05-12  7:21                 ` Arnd Bergmann
@ 2022-05-12 12:11                   ` Rich Felker
  0 siblings, 0 replies; 94+ messages in thread
From: Rich Felker @ 2022-05-12 12:11 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: musl, Christian Brauner, Huacai Chen, Huacai Chen,
	Andy Lutomirski, Thomas Gleixner, Peter Zijlstra, Andrew Morton,
	David Airlie, Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Xuerui Wang, Jiaxun Yang, Linux API,
	GNU C Library

On Thu, May 12, 2022 at 09:21:13AM +0200, Arnd Bergmann wrote:
> On Wed, May 11, 2022 at 11:12 PM Rich Felker <dalias@libc.org> wrote:
> > On Wed, May 11, 2022 at 09:11:56AM +0200, Arnd Bergmann wrote:
> > > On Mon, May 9, 2022 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> > > .....
> > > > I can try and move a poc for this up the todo list.
> > > >
> > > > Without an approach like this certain sandboxes will fall back to
> > > > ENOSYSing system calls they can't filter. This is a generic problem
> > > > though, with clone3() being one prominent example.
> > >
> > > Thank you for the detailed reply. It sounds to me like this will eventually have
> > > to get solved anyway, so we could move ahead without clone() on loongarch,
> > > and just not have support for Chrome until this is fully solved.
> > >
> > > As both the glibc and musl ports are being proposed for inclusion right
> > > now, we should try to come to a decision so the libc ports can adjust if
> > > necessary. Adding both mailing lists to Cc here, the discussion is archived
> > > at [1].
> > >
> > >          Arnd
> > >
> > > [1] https://lore.kernel.org/linux-arch/20220509100058.vmrgn5fkk3ayt63v@wittgenstein/
> >
> > Having read about the seccomp issue, I think it's a very strong
> > argument that __NR_clone should be kept permanently for all future
> > archs.
> 
> Ok, let's keep clone() around for all architectures then. We should probably
> just remove the __ARCH_WANT_SYS_CLONE macro and build the
> code into the kernel unconditionally, but at the moment there
> are still private versions for ia64 and sparc with the same name as
> the generic version. Both are also still lacking support for clone3() and
> don't have anyone actively working on them.
> 
> In this case, we probably don't need to change clone3() to allow the
> zero-length stack after all, and the wrapper that was added to the
> musl port should get removed again.

I still think disallowing a zero length (unknown length with caller
providing the start address only) stack is a gratuitous limitation on
the clone3 interface, and would welcome leaving the change to allow
zero-length in place. There does not seem to be any good justification
for forbidding it, and it does pose other real-world obstructions to
use. For example if your main thread had exited (or if you're forking
from a non-main thread) and you wanted to create a new process using
the old main thread stack as your stack, you would not know a
size/lowest-address, only a starting address from which it extends
some long (and possibly expanding) amount.
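
A small sketch of the interface difference, with made-up helper names
and assuming a stack that grows down: legacy clone() takes only the
initial stack pointer, while clone3() takes the lowest address plus a
size in struct clone_args, and a non-zero stack with a zero stack_size
is what is currently refused.

```c
#include <stdint.h>
#include <string.h>
#include <linux/sched.h>     /* struct clone_args */

/*
 * Legacy clone(): the caller passes only the initial stack pointer,
 * which for a downward-growing stack is the highest address.
 */
void *clone_stack_pointer(void *lowest, size_t size)
{
	return (char *)lowest + size;
}

/*
 * clone3(): the caller must describe the whole region. If only a
 * starting address is known, there is no honest value for stack_size.
 */
void clone3_set_stack(struct clone_args *args, void *lowest, size_t size)
{
	memset(args, 0, sizeof(*args));
	args->stack = (uint64_t)(uintptr_t)lowest;
	args->stack_size = size;
}
```

With the old main thread stack, only the value that
clone_stack_pointer() would return is known, so the clone3() form
cannot be filled in without guessing a size.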

Rich

* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-05-06 13:20               ` Huacai Chen
@ 2022-05-13 19:32                 ` Arnd Bergmann
  2022-05-14  2:27                   ` Huacai Chen
  0 siblings, 1 reply; 94+ messages in thread
From: Arnd Bergmann @ 2022-05-13 19:32 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Arnd Bergmann, WANG Xuerui, Ard Biesheuvel, Huacai Chen,
	Andy Lutomirski, Thomas Gleixner, Peter Zijlstra, Andrew Morton,
	Jonathan Corbet, Linus Torvalds, linux-arch,
	open list:DOCUMENTATION, Linux Kernel Mailing List, Xuefeng Li,
	Yanteng Si, Guo Ren, Jiaxun Yang, Stephen Rothwell

On Fri, May 6, 2022 at 3:20 PM Huacai Chen <chenhuacai@gmail.com> wrote:
> On Fri, May 6, 2022 at 7:41 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > Agreed. I think there can be limited compatibility support for old
> > firmware though, at least to help with the migration: As long as
> > the interface between grub and linux has a proper definition following
> > the normal UEFI standard, there can be both a modern grub
> > that is booted using the same protocol and a backwards-compatible
> > grub that can be booted from existing firmware and that is able
> > to boot the kernel.
> >
> > The compatibility version of grub can be retired after the firmware
> > itself is able to speak the normal boot protocol.
> After an internal discussion, we decided to use the generic stub, and
> we have a draft version of the generic stub now [1]. I hope V10 can solve
> all problems. :)
> [1] https://github.com/loongson/linux/tree/loongarch-next-generic-stub

Can you post v10 to the list? As we have resolved the question on clone()
now (I hope), and you have a prototype for the boot protocol, it sounds
like this can make it into v5.19 after all, but we need to be sure that the
remaining points that Xuerui Wang and Ard Biesheuvel raised are
all addressed, and there is not much time before the merge window.

I have built a gcc-12.1 based toolchain at
https://mirrors.edge.kernel.org/pub/tools/crosstool/ that now includes
loongarch64 support, please point to that in the cover letter for v10
in case someone wants to start test building.

I will be travelling next week, and won't be able to pull your tree
into the asm-generic tree during that time, as I had originally planned.

However, you can ask Stephen Rothwell (added to Cc) to add your
git tree to linux-next once you think that you have addressed all of the
remaining review comments, and posted the same version to the
list. This will allow others to more easily test your tree in combination
with the other work that has been queued for the 5.19 release.

If there are no new show-stoppers, I can help you coordinate
the pull request during the merge window.

         Arnd

* Re: [PATCH V9 20/24] LoongArch: Add efistub booting support
  2022-05-13 19:32                 ` Arnd Bergmann
@ 2022-05-14  2:27                   ` Huacai Chen
  0 siblings, 0 replies; 94+ messages in thread
From: Huacai Chen @ 2022-05-14  2:27 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: WANG Xuerui, Ard Biesheuvel, Huacai Chen, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, Jonathan Corbet,
	Linus Torvalds, linux-arch, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Xuefeng Li, Yanteng Si, Guo Ren,
	Jiaxun Yang, Stephen Rothwell

Hi, Arnd,

On Sat, May 14, 2022 at 3:33 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, May 6, 2022 at 3:20 PM Huacai Chen <chenhuacai@gmail.com> wrote:
> > On Fri, May 6, 2022 at 7:41 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > Agreed. I think there can be limited compatibility support for old
> > > firmware though, at least to help with the migration: As long as
> > > the interface between grub and linux has a proper definition following
> > > the normal UEFI standard, there can be both a modern grub
> > > that is booted using the same protocol and a backwards-compatible
> > > grub that can be booted from existing firmware and that is able
> > > to boot the kernel.
> > >
> > > The compatibility version of grub can be retired after the firmware
> > > itself is able to speak the normal boot protocol.
> > > After an internal discussion, we decided to use the generic stub, and
> > > we have a draft version of the generic stub now [1]. I hope V10 can solve
> > all problems. :)
> > [1] https://github.com/loongson/linux/tree/loongarch-next-generic-stub
>
> Can you post v10 to the list? As we have resolved the question on clone()
> now (I hope), and you have a prototype for the boot protocol, it sounds
> like this can make it into v5.19 after all, but we need to be sure that the
> remaining points that Xuerui Wang and Ard Biesheuvel raised are
> all addressed, and there is not much time before the merge window.
>
> I have built a gcc-12.1 based toolchain at
> https://mirrors.edge.kernel.org/pub/tools/crosstool/ that now includes
> loongarch64 support, please point to that in the cover letter for v10
> in case someone wants to start test building.
>
> I will be travelling next week, and won't be able to pull your tree
> into the asm-generic tree during that time, as I had originally planned.
>
OK, thanks, I will send a new version today.

Huacai

> However, you can ask Stephen Rothwell (added to Cc) to add your
> git tree to linux-next once you think that you have addressed all of the
> remaining review comments, and posted the same version to the
> list. This will allow others to more easily test your tree in combination
> with the other work that has been queued for the 5.19 release.
>
> If there are no new show-stoppers, I can help you coordinate
> the pull request during the merge window.
>
>          Arnd

Thread overview: 94+ messages
2022-04-30  9:04 [PATCH V9 00/22] arch: Add basic LoongArch support Huacai Chen
2022-04-30  9:04 ` [PATCH V9 01/24] Documentation: LoongArch: Add basic documentations Huacai Chen
2022-05-01  7:48   ` Bagas Sanjaya
2022-05-01  8:55     ` Huacai Chen
2022-05-01  9:32   ` WANG Xuerui
2022-05-01 10:17     ` Huacai Chen
2022-04-30  9:04 ` [PATCH V9 02/24] Documentation/zh_CN: Add basic LoongArch documentations Huacai Chen
2022-05-01  9:38   ` WANG Xuerui
2022-04-30  9:04 ` [PATCH V9 03/24] LoongArch: Add elf-related definitions Huacai Chen
2022-05-01  9:41   ` WANG Xuerui
2022-05-01 14:27     ` Huacai Chen
2022-04-30  9:04 ` [PATCH V9 04/24] LoongArch: Add writecombine support for drm Huacai Chen
2022-04-30  9:04 ` [PATCH V9 05/24] LoongArch: Add build infrastructure Huacai Chen
2022-05-01 10:09   ` WANG Xuerui
2022-05-01 12:41     ` Huacai Chen
2022-05-01 15:43     ` Xi Ruoyao
2022-04-30  9:05 ` [PATCH V9 06/24] LoongArch: Add CPU definition headers Huacai Chen
2022-05-01 11:05   ` WANG Xuerui
2022-05-01 15:07     ` Huacai Chen
2022-04-30  9:05 ` [PATCH V9 07/24] LoongArch: Add atomic/locking headers Huacai Chen
2022-05-01 11:16   ` WANG Xuerui
2022-05-01 13:16     ` Huacai Chen
2022-04-30  9:05 ` [PATCH V9 08/24] LoongArch: Add other common headers Huacai Chen
2022-05-01 11:39   ` WANG Xuerui
2022-05-01 14:26     ` Huacai Chen
2022-04-30  9:05 ` [PATCH V9 09/24] LoongArch: Add boot and setup routines Huacai Chen
2022-04-30  9:05 ` [PATCH V9 10/24] LoongArch: Add exception/interrupt handling Huacai Chen
2022-05-01 16:27   ` Xi Ruoyao
2022-05-01 17:08     ` Xi Ruoyao
2022-05-02  0:01       ` Huacai Chen
2022-04-30  9:05 ` [PATCH V9 11/24] LoongArch: Add process management Huacai Chen
2022-04-30  9:05 ` [PATCH V9 12/24] LoongArch: Add memory management Huacai Chen
2022-04-30  9:05 ` [PATCH V9 13/24] LoongArch: Add system call support Huacai Chen
2022-04-30  9:44   ` Arnd Bergmann
2022-04-30 10:05     ` Huacai Chen
2022-04-30 10:34       ` Arnd Bergmann
2022-05-07 12:11         ` Christian Brauner
2022-05-09 10:00           ` Christian Brauner
2022-05-11  7:11             ` Arnd Bergmann
2022-05-11 21:12               ` [musl] " Rich Felker
2022-05-12  7:21                 ` Arnd Bergmann
2022-05-12 12:11                   ` Rich Felker
2022-05-11 16:17             ` Florian Weimer
2022-04-30  9:05 ` [PATCH V9 14/24] LoongArch: Add signal handling support Huacai Chen
2022-04-30  9:05 ` [PATCH V9 15/24] LoongArch: Add elf and module support Huacai Chen
2022-04-30  9:05 ` [PATCH V9 16/24] LoongArch: Add misc common routines Huacai Chen
2022-04-30  9:50   ` Arnd Bergmann
2022-04-30 10:00     ` Huacai Chen
2022-04-30 10:41       ` Arnd Bergmann
2022-04-30 13:22         ` Palmer Dabbelt
2022-05-01  5:12           ` Huacai Chen
2022-04-30  9:05 ` [PATCH V9 17/24] LoongArch: Add some library functions Huacai Chen
2022-05-01 10:55   ` Guo Ren
2022-05-01 12:18     ` Huacai Chen
2022-04-30  9:05 ` [PATCH V9 18/24] LoongArch: Add PCI controller support Huacai Chen
2022-04-30  9:05 ` [PATCH V9 19/24] LoongArch: Add VDSO and VSYSCALL support Huacai Chen
2022-04-30  9:05 ` [PATCH V9 20/24] LoongArch: Add efistub booting support Huacai Chen
2022-04-30  9:56   ` Arnd Bergmann
2022-04-30 10:02     ` Huacai Chen
2022-05-03  7:23     ` Ard Biesheuvel
2022-05-05  9:59       ` Huacai Chen
2022-05-06  8:14         ` Ard Biesheuvel
2022-05-06 11:26           ` WANG Xuerui
2022-05-06 11:41             ` Arnd Bergmann
2022-05-06 13:20               ` Huacai Chen
2022-05-13 19:32                 ` Arnd Bergmann
2022-05-14  2:27                   ` Huacai Chen
2022-04-30  9:05 ` [PATCH V9 21/24] LoongArch: Add zboot (compressed kernel) support Huacai Chen
2022-04-30 10:07   ` Arnd Bergmann
2022-05-01  5:22     ` Huacai Chen
2022-05-01  6:35       ` Russell King (Oracle)
2022-05-01  8:46         ` Huacai Chen
2022-05-01 11:28           ` Russell King (Oracle)
2022-05-01  8:33       ` Arnd Bergmann
2022-05-01 23:36     ` Ard Biesheuvel
2022-04-30  9:05 ` [PATCH V9 22/24] LoongArch: Add multi-processor (SMP) support Huacai Chen
2022-04-30  9:05 ` [PATCH V9 23/24] LoongArch: Add Non-Uniform Memory Access (NUMA) support Huacai Chen
2022-04-30  9:05 ` [PATCH V9 24/24] LoongArch: Add Loongson-3 default config file Huacai Chen
2022-05-01  8:19 ` [PATCH V9 00/22] arch: Add basic LoongArch support Bagas Sanjaya
2022-05-01  8:55   ` Huacai Chen
