From: Song Liu
Subject: [PATCH bpf-next v2 3/5] bpf: use execmem_alloc for bpf program and bpf dispatcher
Date: Mon, 7 Nov 2022 14:39:19 -0800
Message-ID: <20221107223921.3451913-4-song@kernel.org>
In-Reply-To: <20221107223921.3451913-1-song@kernel.org>
References: <20221107223921.3451913-1-song@kernel.org>

Use execmem_alloc, execmem_free, and execmem_fill instead of
bpf_prog_pack_alloc, bpf_prog_pack_free, and bpf_arch_text_copy.

execmem_free doesn't require extra size information. Therefore, the free
and error handling path can be simplified.

Signed-off-by: Song Liu
---
 arch/x86/net/bpf_jit_comp.c |  23 +----
 include/linux/bpf.h         |   3 -
 include/linux/filter.h      |   5 -
 kernel/bpf/core.c           | 180 +++---------------------------------
 kernel/bpf/dispatcher.c     |  11 +--
 5 files changed, 21 insertions(+), 201 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index cec5195602bc..43b93570d8f9 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -229,11 +229,6 @@ static void jit_fill_hole(void *area, unsigned int size)
 	memset(area, 0xcc, size);
 }
 
-int bpf_arch_text_invalidate(void *dst, size_t len)
-{
-	return IS_ERR_OR_NULL(text_poke_set(dst, 0xcc, len));
-}
-
 struct jit_context {
 	int cleanup_addr; /* Epilogue code offset */
 
@@ -2509,11 +2504,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		if (proglen <= 0) {
 out_image:
 			image = NULL;
-			if (header) {
-				bpf_arch_text_copy(&header->size, &rw_header->size,
-						   sizeof(rw_header->size));
+			if (header)
 				bpf_jit_binary_pack_free(header, rw_header);
-			}
+
 			/* Fall back to interpreter mode */
 			prog = orig_prog;
 			if (extra_pass) {
@@ -2563,8 +2556,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		if (!prog->is_func || extra_pass) {
 			/*
 			 * bpf_jit_binary_pack_finalize fails in two scenarios:
-			 *   1) header is not pointing to proper module memory;
-			 *   2) the arch doesn't support bpf_arch_text_copy().
+			 *   1) header is not pointing to memory allocated by
+			 *      execmem_alloc;
+			 *   2) the arch doesn't support execmem_free().
 			 *
 			 * Both cases are serious bugs and justify WARN_ON.
 			 */
@@ -2610,13 +2604,6 @@ bool bpf_jit_supports_kfunc_call(void)
 	return true;
 }
 
-void *bpf_arch_text_copy(void *dst, void *src, size_t len)
-{
-	if (text_poke_copy(dst, src, len) == NULL)
-		return ERR_PTR(-EINVAL);
-	return dst;
-}
-
 /* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
 bool bpf_jit_supports_subprog_tailcalls(void)
 {
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 798aec816970..e9f4806cec93 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2721,9 +2721,6 @@ enum bpf_text_poke_type {
 int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 		       void *addr1, void *addr2);
 
-void *bpf_arch_text_copy(void *dst, void *src, size_t len);
-int bpf_arch_text_invalidate(void *dst, size_t len);
-
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
diff --git a/include/linux/filter.h b/include/linux/filter.h
index efc42a6e3aed..98e28126c24b 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1023,8 +1023,6 @@ extern long bpf_jit_limit_max;
 
 typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size);
 
-void bpf_jit_fill_hole_with_zero(void *area, unsigned int size);
-
 struct bpf_binary_header *
 bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 		     unsigned int alignment,
@@ -1037,9 +1035,6 @@ void bpf_jit_free(struct bpf_prog *fp);
 struct bpf_binary_header *
 bpf_jit_binary_pack_hdr(const struct bpf_prog *fp);
 
-void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns);
-void bpf_prog_pack_free(struct bpf_binary_header *hdr);
-
 static inline bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
 {
 	return list_empty(&fp->aux->ksym.lnode) ||
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 9c16338bcbe8..86b4f640e3c0 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -806,149 +806,6 @@ int bpf_jit_add_poke_descriptor(struct bpf_prog *prog,
 	return slot;
 }
 
-/*
- * BPF program pack allocator.
- *
- * Most BPF programs are pretty small. Allocating a hole page for each
- * program is sometime a waste. Many small bpf program also adds pressure
- * to instruction TLB. To solve this issue, we introduce a BPF program pack
- * allocator. The prog_pack allocator uses HPAGE_PMD_SIZE page (2MB on x86)
- * to host BPF programs.
- */
-#define BPF_PROG_CHUNK_SHIFT	6
-#define BPF_PROG_CHUNK_SIZE	(1 << BPF_PROG_CHUNK_SHIFT)
-#define BPF_PROG_CHUNK_MASK	(~(BPF_PROG_CHUNK_SIZE - 1))
-
-struct bpf_prog_pack {
-	struct list_head list;
-	void *ptr;
-	unsigned long bitmap[];
-};
-
-void bpf_jit_fill_hole_with_zero(void *area, unsigned int size)
-{
-	memset(area, 0, size);
-}
-
-#define BPF_PROG_SIZE_TO_NBITS(size)	(round_up(size, BPF_PROG_CHUNK_SIZE) / BPF_PROG_CHUNK_SIZE)
-
-static DEFINE_MUTEX(pack_mutex);
-static LIST_HEAD(pack_list);
-
-/* PMD_SIZE is not available in some special config, e.g. ARCH=arm with
- * CONFIG_MMU=n. Use PAGE_SIZE in these cases.
- */
-#ifdef PMD_SIZE
-#define BPF_PROG_PACK_SIZE (PMD_SIZE * num_possible_nodes())
-#else
-#define BPF_PROG_PACK_SIZE PAGE_SIZE
-#endif
-
-#define BPF_PROG_CHUNK_COUNT (BPF_PROG_PACK_SIZE / BPF_PROG_CHUNK_SIZE)
-
-static struct bpf_prog_pack *alloc_new_pack(bpf_jit_fill_hole_t bpf_fill_ill_insns)
-{
-	struct bpf_prog_pack *pack;
-
-	pack = kzalloc(struct_size(pack, bitmap, BITS_TO_LONGS(BPF_PROG_CHUNK_COUNT)),
-		       GFP_KERNEL);
-	if (!pack)
-		return NULL;
-	pack->ptr = module_alloc(BPF_PROG_PACK_SIZE);
-	if (!pack->ptr) {
-		kfree(pack);
-		return NULL;
-	}
-	bpf_fill_ill_insns(pack->ptr, BPF_PROG_PACK_SIZE);
-	bitmap_zero(pack->bitmap, BPF_PROG_PACK_SIZE / BPF_PROG_CHUNK_SIZE);
-	list_add_tail(&pack->list, &pack_list);
-
-	set_vm_flush_reset_perms(pack->ptr);
-	set_memory_ro((unsigned long)pack->ptr, BPF_PROG_PACK_SIZE / PAGE_SIZE);
-	set_memory_x((unsigned long)pack->ptr, BPF_PROG_PACK_SIZE / PAGE_SIZE);
-	return pack;
-}
-
-void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns)
-{
-	unsigned int nbits = BPF_PROG_SIZE_TO_NBITS(size);
-	struct bpf_prog_pack *pack;
-	unsigned long pos;
-	void *ptr = NULL;
-
-	mutex_lock(&pack_mutex);
-	if (size > BPF_PROG_PACK_SIZE) {
-		size = round_up(size, PAGE_SIZE);
-		ptr = module_alloc(size);
-		if (ptr) {
-			bpf_fill_ill_insns(ptr, size);
-			set_vm_flush_reset_perms(ptr);
-			set_memory_ro((unsigned long)ptr, size / PAGE_SIZE);
-			set_memory_x((unsigned long)ptr, size / PAGE_SIZE);
-		}
-		goto out;
-	}
-	list_for_each_entry(pack, &pack_list, list) {
-		pos = bitmap_find_next_zero_area(pack->bitmap, BPF_PROG_CHUNK_COUNT, 0,
-						 nbits, 0);
-		if (pos < BPF_PROG_CHUNK_COUNT)
-			goto found_free_area;
-	}
-
-	pack = alloc_new_pack(bpf_fill_ill_insns);
-	if (!pack)
-		goto out;
-
-	pos = 0;
-
-found_free_area:
-	bitmap_set(pack->bitmap, pos, nbits);
-	ptr = (void *)(pack->ptr) + (pos << BPF_PROG_CHUNK_SHIFT);
-
-out:
-	mutex_unlock(&pack_mutex);
-	return ptr;
-}
-
-void bpf_prog_pack_free(struct bpf_binary_header *hdr)
-{
-	struct bpf_prog_pack *pack = NULL, *tmp;
-	unsigned int nbits;
-	unsigned long pos;
-
-	mutex_lock(&pack_mutex);
-	if (hdr->size > BPF_PROG_PACK_SIZE) {
-		module_memfree(hdr);
-		goto out;
-	}
-
-	list_for_each_entry(tmp, &pack_list, list) {
-		if ((void *)hdr >= tmp->ptr && (tmp->ptr + BPF_PROG_PACK_SIZE) > (void *)hdr) {
-			pack = tmp;
-			break;
-		}
-	}
-
-	if (WARN_ONCE(!pack, "bpf_prog_pack bug\n"))
-		goto out;
-
-	nbits = BPF_PROG_SIZE_TO_NBITS(hdr->size);
-	pos = ((unsigned long)hdr - (unsigned long)pack->ptr) >> BPF_PROG_CHUNK_SHIFT;
-
-	WARN_ONCE(bpf_arch_text_invalidate(hdr, hdr->size),
-		  "bpf_prog_pack bug: missing bpf_arch_text_invalidate?\n");
-
-	bitmap_clear(pack->bitmap, pos, nbits);
-	if (bitmap_find_next_zero_area(pack->bitmap, BPF_PROG_CHUNK_COUNT, 0,
-				       BPF_PROG_CHUNK_COUNT, 0) == 0) {
-		list_del(&pack->list);
-		module_memfree(pack->ptr);
-		kfree(pack);
-	}
-out:
-	mutex_unlock(&pack_mutex);
-}
-
 static atomic_long_t bpf_jit_current;
 
 /* Can be overridden by an arch's JIT compiler if it has a custom,
@@ -1048,6 +905,9 @@ void bpf_jit_binary_free(struct bpf_binary_header *hdr)
 	bpf_jit_uncharge_modmem(size);
 }
 
+#define BPF_PROG_EXEC_ALIGN	64
+#define BPF_PROG_EXEC_MASK	(~(BPF_PROG_EXEC_ALIGN - 1))
+
 /* Allocate jit binary from bpf_prog_pack allocator.
  * Since the allocated memory is RO+X, the JIT engine cannot write directly
  * to the memory. To solve this problem, a RW buffer is also allocated at
@@ -1070,11 +930,11 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr,
 		     alignment > BPF_IMAGE_ALIGNMENT);
 
 	/* add 16 bytes for a random section of illegal instructions */
-	size = round_up(proglen + sizeof(*ro_header) + 16, BPF_PROG_CHUNK_SIZE);
+	size = round_up(proglen + sizeof(*ro_header) + 16, BPF_PROG_EXEC_ALIGN);
 
 	if (bpf_jit_charge_modmem(size))
 		return NULL;
-	ro_header = bpf_prog_pack_alloc(size, bpf_fill_ill_insns);
+	ro_header = execmem_alloc(size, BPF_PROG_EXEC_ALIGN);
 	if (!ro_header) {
 		bpf_jit_uncharge_modmem(size);
 		return NULL;
@@ -1082,8 +942,7 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr,
 
 	*rw_header = kvmalloc(size, GFP_KERNEL);
 	if (!*rw_header) {
-		bpf_arch_text_copy(&ro_header->size, &size, sizeof(size));
-		bpf_prog_pack_free(ro_header);
+		execmem_free(ro_header);
 		bpf_jit_uncharge_modmem(size);
 		return NULL;
 	}
@@ -1093,7 +952,7 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr,
 	(*rw_header)->size = size;
 
 	hole = min_t(unsigned int, size - (proglen + sizeof(*ro_header)),
-		     BPF_PROG_CHUNK_SIZE - sizeof(*ro_header));
+		     BPF_PROG_EXEC_ALIGN - sizeof(*ro_header));
 	start = prandom_u32_max(hole) & ~(alignment - 1);
 
 	*image_ptr = &ro_header->image[start];
@@ -1109,12 +968,12 @@ int bpf_jit_binary_pack_finalize(struct bpf_prog *prog,
 {
 	void *ptr;
 
-	ptr = bpf_arch_text_copy(ro_header, rw_header, rw_header->size);
+	ptr = execmem_fill(ro_header, rw_header, rw_header->size);
 
 	kvfree(rw_header);
 
 	if (IS_ERR(ptr)) {
-		bpf_prog_pack_free(ro_header);
+		execmem_free(ro_header);
 		return PTR_ERR(ptr);
 	}
 	return 0;
@@ -1124,18 +983,13 @@ int bpf_jit_binary_pack_finalize(struct bpf_prog *prog,
  *   1) when the program is freed after;
  *   2) when the JIT engine fails (before bpf_jit_binary_pack_finalize).
  * For case 2), we need to free both the RO memory and the RW buffer.
- *
- * bpf_jit_binary_pack_free requires proper ro_header->size. However,
- * bpf_jit_binary_pack_alloc does not set it. Therefore, ro_header->size
- * must be set with either bpf_jit_binary_pack_finalize (normal path) or
- * bpf_arch_text_copy (when jit fails).
  */
 void bpf_jit_binary_pack_free(struct bpf_binary_header *ro_header,
 			      struct bpf_binary_header *rw_header)
 {
-	u32 size = ro_header->size;
+	u32 size = rw_header ? rw_header->size : ro_header->size;
 
-	bpf_prog_pack_free(ro_header);
+	execmem_free(ro_header);
 	kvfree(rw_header);
 	bpf_jit_uncharge_modmem(size);
 }
@@ -1146,7 +1000,7 @@ bpf_jit_binary_pack_hdr(const struct bpf_prog *fp)
 	unsigned long real_start = (unsigned long)fp->bpf_func;
 	unsigned long addr;
 
-	addr = real_start & BPF_PROG_CHUNK_MASK;
+	addr = real_start & BPF_PROG_EXEC_MASK;
 	return (void *)addr;
 }
 
@@ -2736,16 +2590,6 @@ int __weak bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 	return -ENOTSUPP;
 }
 
-void * __weak bpf_arch_text_copy(void *dst, void *src, size_t len)
-{
-	return ERR_PTR(-ENOTSUPP);
-}
-
-int __weak bpf_arch_text_invalidate(void *dst, size_t len)
-{
-	return -ENOTSUPP;
-}
-
 DEFINE_STATIC_KEY_FALSE(bpf_stats_enabled_key);
 EXPORT_SYMBOL(bpf_stats_enabled_key);
 
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index 04f0a045dcaa..c41dff3379db 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -126,11 +126,11 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
 	tmp = d->num_progs ? d->rw_image + noff : NULL;
 	if (new) {
 		/* Prepare the dispatcher in d->rw_image. Then use
-		 * bpf_arch_text_copy to update d->image, which is RO+X.
+		 * execmem_fill to update d->image, which is RO+X.
 		 */
 		if (bpf_dispatcher_prepare(d, new, tmp))
 			return;
-		if (IS_ERR(bpf_arch_text_copy(new, tmp, PAGE_SIZE / 2)))
+		if (IS_ERR(execmem_fill(new, tmp, PAGE_SIZE / 2)))
 			return;
 	}
 
@@ -152,15 +152,12 @@ void bpf_dispatcher_change_prog(struct bpf_dispatcher *d, struct bpf_prog *from,
 
 	mutex_lock(&d->mutex);
 	if (!d->image) {
-		d->image = bpf_prog_pack_alloc(PAGE_SIZE, bpf_jit_fill_hole_with_zero);
+		d->image = execmem_alloc(PAGE_SIZE, PAGE_SIZE /* align */);
 		if (!d->image)
 			goto out;
 		d->rw_image = bpf_jit_alloc_exec(PAGE_SIZE);
 		if (!d->rw_image) {
-			u32 size = PAGE_SIZE;
-
-			bpf_arch_text_copy(d->image, &size, sizeof(size));
-			bpf_prog_pack_free((struct bpf_binary_header *)d->image);
+			execmem_free((struct bpf_binary_header *)d->image);
 			d->image = NULL;
 			goto out;
 		}
-- 
2.30.2