All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] netfilter: nft_socket: socket expressions for GID & UID
@ 2022-04-20 18:54 Topi Miettinen
  2022-04-20 21:15 ` Jan Engelhardt
  2022-04-25 18:45 ` Topi Miettinen
  0 siblings, 2 replies; 17+ messages in thread
From: Topi Miettinen @ 2022-04-20 18:54 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Topi Miettinen

Add socket expressions for checking GID or UID of the originating
socket. These work also on input side, unlike meta skuid/skgid.

Signed-off-by: Topi Miettinen <toiwoton@gmail.com>
---
 include/uapi/linux/netfilter/nf_tables.h |  4 ++++
 net/netfilter/nft_socket.c               | 28 ++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 466fd3f4447c..b3c09c67d13a 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -1040,12 +1040,16 @@ enum nft_socket_attributes {
  * @NFT_SOCKET_MARK: Value of the socket mark
  * @NFT_SOCKET_WILDCARD: Whether the socket is zero-bound (e.g. 0.0.0.0 or ::0)
  * @NFT_SOCKET_CGROUPV2: Match on cgroups version 2
+ * @NFT_SOCKET_GID: Match on GID of socket owner
+ * @NFT_SOCKET_GID: Match on UID of socket owner
  */
 enum nft_socket_keys {
 	NFT_SOCKET_TRANSPARENT,
 	NFT_SOCKET_MARK,
 	NFT_SOCKET_WILDCARD,
 	NFT_SOCKET_CGROUPV2,
+	NFT_SOCKET_GID,
+	NFT_SOCKET_UID,
 	__NFT_SOCKET_MAX
 };
 #define NFT_SOCKET_MAX	(__NFT_SOCKET_MAX - 1)
diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
index b8f011145765..2f0fe9d886f3 100644
--- a/net/netfilter/nft_socket.c
+++ b/net/netfilter/nft_socket.c
@@ -113,6 +113,32 @@ static void nft_socket_eval(const struct nft_expr *expr,
 		}
 		break;
 #endif
+	case NFT_SOCKET_GID:
+		if (sk_fullsock(sk)) {
+			struct socket *sock;
+
+			sock = sk->sk_socket;
+			if (sock && sock->file)
+				*dest = from_kgid_munged(sock_net(sk)->user_ns,
+							 sock->file->f_cred->fsgid);
+		} else {
+			regs->verdict.code = NFT_BREAK;
+			return;
+		}
+		break;
+	case NFT_SOCKET_UID:
+		if (sk_fullsock(sk)) {
+			struct socket *sock;
+
+			sock = sk->sk_socket;
+			if (sock && sock->file)
+				*dest = from_kuid_munged(sock_net(sk)->user_ns,
+							 sock->file->f_cred->fsuid);
+		} else {
+			regs->verdict.code = NFT_BREAK;
+			return;
+		}
+		break;
 	default:
 		WARN_ON(1);
 		regs->verdict.code = NFT_BREAK;
@@ -156,6 +182,8 @@ static int nft_socket_init(const struct nft_ctx *ctx,
 		len = sizeof(u8);
 		break;
 	case NFT_SOCKET_MARK:
+	case NFT_SOCKET_GID:
+	case NFT_SOCKET_UID:
 		len = sizeof(u32);
 		break;
 #ifdef CONFIG_CGROUPS
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-20 18:54 [PATCH] netfilter: nft_socket: socket expressions for GID & UID Topi Miettinen
@ 2022-04-20 21:15 ` Jan Engelhardt
  2022-04-21 16:35   ` Topi Miettinen
  2022-04-25 18:45 ` Topi Miettinen
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Engelhardt @ 2022-04-20 21:15 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: netfilter-devel


On Wednesday 2022-04-20 20:54, Topi Miettinen wrote:

>Add socket expressions for checking GID or UID of the originating
>socket. These work also on input side, unlike meta skuid/skgid.

Why exactly is it that meta skuid does not work?
Because of the skb_to_full_sk() call in nft_meta_get_eval_skugid()?

>+	case NFT_SOCKET_GID:
>+		if (sk_fullsock(sk)) {
>+			struct socket *sock;
>+
>+			sock = sk->sk_socket;
>+			if (sock && sock->file)
>+				*dest = from_kgid_munged(sock_net(sk)->user_ns,
>+							 sock->file->f_cred->fsgid);

The code is quite the same as nft_meta_get_eval_skugid's, save for the BH
locking and skb_to_full_sk. Perhaps nft_socket.c could still call into a
suitably augmented nft_meta_get_eval_skugid function to share code.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-20 21:15 ` Jan Engelhardt
@ 2022-04-21 16:35   ` Topi Miettinen
  2022-04-26 21:05     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 17+ messages in thread
From: Topi Miettinen @ 2022-04-21 16:35 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netfilter-devel

On 21.4.2022 0.15, Jan Engelhardt wrote:
> 
> On Wednesday 2022-04-20 20:54, Topi Miettinen wrote:
> 
>> Add socket expressions for checking GID or UID of the originating
>> socket. These work also on input side, unlike meta skuid/skgid.
> 
> Why exactly is it that meta skuid does not work?
> Because of the skb_to_full_sk() call in nft_meta_get_eval_skugid()?

I don't know the details, but early demux isn't reliable and filters 
aren't run after final demux. In my case, something like "ct state new 
meta skuid < 1000 drop" as part of input filter doesn't do anything. 
Making "meta skuid" 100% reliable would be of course preferable to 
adding a new expression.

> 
>> +	case NFT_SOCKET_GID:
>> +		if (sk_fullsock(sk)) {
>> +			struct socket *sock;
>> +
>> +			sock = sk->sk_socket;
>> +			if (sock && sock->file)
>> +				*dest = from_kgid_munged(sock_net(sk)->user_ns,
>> +							 sock->file->f_cred->fsgid);
> 
> The code is quite the same as nft_meta_get_eval_skugid's, save for the BH
> locking and skb_to_full_sk. Perhaps nft_socket.c could still call into a
> suitably augmented nft_meta_get_eval_skugid function to share code.

Makes sense.

-Topi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-20 18:54 [PATCH] netfilter: nft_socket: socket expressions for GID & UID Topi Miettinen
  2022-04-20 21:15 ` Jan Engelhardt
@ 2022-04-25 18:45 ` Topi Miettinen
  2022-04-25 22:34   ` Florian Westphal
  1 sibling, 1 reply; 17+ messages in thread
From: Topi Miettinen @ 2022-04-25 18:45 UTC (permalink / raw)
  To: netfilter-devel

On 20.4.2022 21.54, Topi Miettinen wrote:
> Add socket expressions for checking GID or UID of the originating
> socket. These work also on input side, unlike meta skuid/skgid.

Unfortunately, there's a reproducible kernel BUG when closing a local 
connection:

Apr 25 21:18:13 kernel: 
==================================================================
Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in 
nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
Apr 25 21:18:13 kernel: Read of size 4 at addr 00000000000000d8 by task 
ssh/1754
Apr 25 21:18:13 kernel:
Apr 25 21:18:13 kernel: CPU: 8 PID: 1754 Comm: ssh Tainted: G 
  E     5.17.0-rc7+ #6
Apr 25 21:18:13 kernel: Hardware name: XXX
Apr 25 21:18:13 kernel: Call Trace:
Apr 25 21:18:13 kernel:  <IRQ>
Apr 25 21:18:13 kernel:  dump_stack_lvl+0x34/0x44
Apr 25 21:18:13 kernel:  ? nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
Apr 25 21:18:13 kernel:  kasan_report.cold+0x66/0xdc
Apr 25 21:18:13 kernel:  ? nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
Apr 25 21:18:13 kernel:  nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
Apr 25 21:18:13 kernel:  ? 0xffffffffc141c000
Apr 25 21:18:13 kernel:  ? preempt_count_sub+0xf/0xb0
Apr 25 21:18:13 kernel:  ? unwind_next_frame+0x6c6/0xbf0
Apr 25 21:18:13 kernel:  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
Apr 25 21:18:13 kernel:  ? bpf_ksym_find+0x8f/0xe0
Apr 25 21:18:13 kernel:  ? __rcu_read_unlock+0x2a/0x60
Apr 25 21:18:13 kernel:  ? is_bpf_text_address+0x1a/0x30
Apr 25 21:18:13 kernel:  ? kernel_text_address+0x57/0xb0
Apr 25 21:18:13 kernel:  ? __kernel_text_address+0x9/0x30
Apr 25 21:18:13 kernel:  ? unwind_get_return_address+0x2a/0x40
Apr 25 21:18:13 kernel:  ? create_prof_cpu_mask+0x20/0x20
Apr 25 21:18:13 kernel:  ? arch_stack_walk+0x99/0xf0
Apr 25 21:18:13 kernel:  ? __orc_find+0x63/0xc0
Apr 25 21:18:13 kernel:  ? deref_stack_reg+0x7a/0xb0
Apr 25 21:18:13 kernel:  ? get_stack_info_noinstr+0x12/0xf0
Apr 25 21:18:13 kernel:  nft_socket_eval+0xea/0x491 [nft_socket]
Apr 25 21:18:13 kernel:  nft_do_chain+0x240/0x860 [nf_tables]
Apr 25 21:18:13 kernel:  ? bpf_ksym_find+0x8f/0xe0
Apr 25 21:18:13 kernel:  ? __nft_trace_verdict.isra.0+0x20/0x20 [nf_tables]
Apr 25 21:18:13 kernel:  ? __kernel_text_address+0x9/0x30
Apr 25 21:18:13 kernel:  ? unwind_get_return_address+0x2a/0x40
Apr 25 21:18:13 kernel:  ? create_prof_cpu_mask+0x20/0x20
Apr 25 21:18:13 kernel:  ? _raw_spin_lock_irqsave+0x88/0xe0
Apr 25 21:18:13 kernel:  ? __cpuidle_text_end+0x3/0x3
Apr 25 21:18:13 kernel:  ? selinux_netlbl_skbuff_setsid+0x215/0x2a0
Apr 25 21:18:13 kernel:  ? selinux_netlbl_skbuff_setsid+0x215/0x2a0
Apr 25 21:18:13 kernel:  ? stack_trace_save+0x8c/0xc0
Apr 25 21:18:13 kernel:  ? _raw_spin_lock_bh+0x82/0xe0
Apr 25 21:18:13 kernel:  ? _raw_write_lock_irq+0xd0/0xd0
Apr 25 21:18:13 kernel:  ? __nf_ct_refresh_acct+0xa6/0xd0 [nf_conntrack]
Apr 25 21:18:13 kernel:  ? nf_ct_acct_add+0x32/0x80 [nf_conntrack]
Apr 25 21:18:13 kernel:  ? nf_conntrack_tcp_packet+0xef7/0x2c20 
[nf_conntrack]
Apr 25 21:18:13 kernel:  ? kasan_record_aux_stack_noalloc+0x5/0x10
Apr 25 21:18:13 kernel:  ? selinux_netlbl_skbuff_setsid+0x215/0x2a0
Apr 25 21:18:13 kernel:  ? selinux_ip_output+0x7b/0xa0
Apr 25 21:18:13 kernel:  ? ipv6_find_hdr+0x102/0x500
Apr 25 21:18:13 kernel:  ? ipv6_skip_exthdr+0x240/0x240
Apr 25 21:18:13 kernel:  ? ipv6_find_tlv+0xf0/0xf0
Apr 25 21:18:13 kernel:  ? tcp_new+0x420/0x420 [nf_conntrack]
Apr 25 21:18:13 kernel:  ? __nf_conntrack_find_get+0x52e/0x750 
[nf_conntrack]
Apr 25 21:18:13 kernel:  nf_route_table_hook6+0x216/0x400 [nf_tables]
Apr 25 21:18:13 kernel:  ? nf_route_table_hook4+0x280/0x280 [nf_tables]
Apr 25 21:18:13 kernel:  ? __kasan_slab_alloc+0x2c/0x80
Apr 25 21:18:13 kernel:  ? security_netlbl_sid_to_secattr+0xb6/0x130
Apr 25 21:18:13 kernel:  ? nf_conntrack_in+0x768/0xa50 [nf_conntrack]
Apr 25 21:18:13 kernel:  ? nf_route_table_hook6+0x400/0x400 [nf_tables]
Apr 25 21:18:13 kernel:  nf_route_table_inet+0xdf/0xf0 [nf_tables]
Apr 25 21:18:13 kernel:  ? nf_route_table_hook6+0x400/0x400 [nf_tables]
Apr 25 21:18:13 kernel:  nf_hook_slow+0x57/0xd0
Apr 25 21:18:13 kernel:  ip6_xmit+0x6d3/0xaa0
Apr 25 21:18:13 kernel:  ? ip6_forward_finish+0x1b0/0x1b0
Apr 25 21:18:13 kernel:  ? tcp_v6_send_response+0x19f/0xc00
Apr 25 21:18:13 kernel:  ? ip6_output+0x220/0x220
Apr 25 21:18:13 kernel:  ? ip6_dst_lookup_tail.constprop.0+0x860/0x860
Apr 25 21:18:13 kernel:  ? __build_skb_around+0x109/0x130
Apr 25 21:18:13 kernel:  ? selinux_xfrm_skb_sid_ingress+0xe1/0x110
Apr 25 21:18:13 kernel:  tcp_v6_send_response+0x7bd/0xc00
Apr 25 21:18:13 kernel:  ? tcp_v6_connect+0xbb0/0xbb0
Apr 25 21:18:13 kernel:  ? tcp_rcv_state_process+0x1d9c/0x1de0
Apr 25 21:18:13 kernel:  tcp_v6_send_reset+0x2b2/0x630
Apr 25 21:18:13 kernel:  ? tcp_parse_md5sig_option+0x16/0xa0
Apr 25 21:18:13 kernel:  ? reqsk_put+0x150/0x150
Apr 25 21:18:13 kernel:  ? tcp_v6_inbound_md5_hash+0xc4/0x260
Apr 25 21:18:13 kernel:  ? bpf_skb_vlan_pop+0xa0/0xa0
Apr 25 21:18:13 kernel:  tcp_v6_do_rcv+0x394/0x740
Apr 25 21:18:13 kernel:  tcp_v6_rcv+0x13e5/0x15d0
Apr 25 21:18:13 kernel:  ? tcp_v6_do_rcv+0x740/0x740
Apr 25 21:18:13 kernel:  ? ipv6_confirm+0x11f/0x260 [nf_conntrack]
Apr 25 21:18:13 kernel:  ? ipv4_confirm+0x130/0x130 [nf_conntrack]
Apr 25 21:18:13 kernel:  ip6_protocol_deliver_rcu+0x182/0x910
Apr 25 21:18:13 kernel:  ip6_input+0x156/0x170
Apr 25 21:18:13 kernel:  ? ip6_input_finish+0x30/0x30
Apr 25 21:18:13 kernel:  ? ip6_protocol_deliver_rcu+0x910/0x910
Apr 25 21:18:13 kernel:  ? nf_nat_ipv6_fn+0x1a0/0x1a0 [nf_nat]
Apr 25 21:18:13 kernel:  ? nf_hook_slow+0x98/0xd0
Apr 25 21:18:13 kernel:  ipv6_rcv+0x22f/0x270
Apr 25 21:18:13 kernel:  ? ip6_input+0x170/0x170
Apr 25 21:18:13 kernel:  ? __bitmap_and+0x6e/0x100
Apr 25 21:18:13 kernel:  ? _find_next_bit+0x5a/0x110
Apr 25 21:18:13 kernel:  ? ipv6_list_rcv+0x260/0x260
Apr 25 21:18:13 kernel:  ? load_balance+0x1181/0x1290
Apr 25 21:18:13 kernel:  ? ip6_input+0x170/0x170
Apr 25 21:18:13 kernel:  __netif_receive_skb_one_core+0xd4/0x130
Apr 25 21:18:13 kernel:  ? __netif_receive_skb_list_core+0x4c0/0x4c0
Apr 25 21:18:13 kernel:  ? _raw_spin_lock+0x82/0xe0
Apr 25 21:18:13 kernel:  ? _raw_spin_lock_bh+0xe0/0xe0
Apr 25 21:18:13 kernel:  process_backlog+0xec/0x270
Apr 25 21:18:13 kernel:  __napi_poll+0x57/0x1c0
Apr 25 21:18:13 kernel:  net_rx_action+0x1df/0x450
Apr 25 21:18:13 kernel:  ? napi_threaded_poll+0x1a0/0x1a0
Apr 25 21:18:13 kernel:  ? read_hpet+0x100/0x1d0
Apr 25 21:18:13 kernel:  ? native_flush_tlb_global+0xcc/0xe0
Apr 25 21:18:13 kernel:  __do_softirq+0x108/0x2b1
Apr 25 21:18:13 kernel:  ? sched_clock_cpu+0x113/0x130
Apr 25 21:18:13 kernel:  do_softirq+0xa1/0xd0
Apr 25 21:18:13 kernel:  </IRQ>
Apr 25 21:18:13 kernel:  <TASK>
Apr 25 21:18:13 kernel:  __local_bh_enable_ip+0x60/0x70
Apr 25 21:18:13 kernel:  ip6_finish_output2+0x408/0x9e0
Apr 25 21:18:13 kernel:  ? ip6_dst_lookup+0x40/0x40
Apr 25 21:18:13 kernel:  ? __rcu_read_unlock+0x2a/0x60
Apr 25 21:18:13 kernel:  ? ip6_mtu+0x7b/0xc0
Apr 25 21:18:13 kernel:  ? __ip6_finish_output+0x18d/0x420
Apr 25 21:18:13 kernel:  ip6_output+0x110/0x220
Apr 25 21:18:13 kernel:  ? ip6_finish_output+0xc0/0xc0
Apr 25 21:18:13 kernel:  ? __ip6_finish_output+0x420/0x420
Apr 25 21:18:13 kernel:  ip6_xmit+0x7ea/0xaa0
Apr 25 21:18:13 kernel:  ? ip6_forward_finish+0x1b0/0x1b0
Apr 25 21:18:13 kernel:  ? cpu_weight_nice_read_s64+0x46/0x90
Apr 25 21:18:13 kernel:  ? __rcu_read_unlock+0x43/0x60
Apr 25 21:18:13 kernel:  ? ip6_output+0x220/0x220
Apr 25 21:18:13 kernel:  ? __sk_dst_check+0x64/0xe0
Apr 25 21:18:13 kernel:  ? inet6_csk_route_socket+0x29e/0x3e0
Apr 25 21:18:13 kernel:  ? inet6_csk_addr2sockaddr+0xd0/0xd0
Apr 25 21:18:13 kernel:  ? unwind_get_return_address+0x2a/0x40
Apr 25 21:18:13 kernel:  ? create_prof_cpu_mask+0x20/0x20
Apr 25 21:18:13 kernel:  ? arch_stack_walk+0x99/0xf0
Apr 25 21:18:13 kernel:  inet6_csk_xmit+0x1b2/0x250
Apr 25 21:18:13 kernel:  ? inet6_csk_update_pmtu+0x110/0x110
Apr 25 21:18:13 kernel:  ? bpf_skops_hdr_opt_len+0x1e0/0x1e0
Apr 25 21:18:13 kernel:  ? __tcp_select_window+0x143/0x470
Apr 25 21:18:13 kernel:  ? tcp_options_write+0xc9/0x370
Apr 25 21:18:13 kernel:  __tcp_transmit_skb+0xa8a/0x14b0
Apr 25 21:18:13 kernel:  ? __tcp_select_window+0x470/0x470
Apr 25 21:18:13 kernel:  ? hpet_msi_interrupt_handler+0x30/0x30
Apr 25 21:18:13 kernel:  ? tcp_stream_alloc_skb+0x47/0x3d0
Apr 25 21:18:13 kernel:  tcp_write_xmit+0x72a/0x2510
Apr 25 21:18:13 kernel:  ? skb_page_frag_refill+0x15c/0x190
Apr 25 21:18:13 kernel:  ? __virt_addr_valid+0xb9/0x130
Apr 25 21:18:13 kernel:  __tcp_push_pending_frames+0x51/0x170
Apr 25 21:18:13 kernel:  tcp_sendmsg_locked+0x4a7/0x1460
Apr 25 21:18:13 kernel:  ? tcp_sendpage+0x80/0x80
Apr 25 21:18:13 kernel:  ? _raw_spin_lock_bh+0x82/0xe0
Apr 25 21:18:13 kernel:  ? _raw_write_lock_irq+0xd0/0xd0
Apr 25 21:18:13 kernel:  ? inet6_ioctl+0x1b0/0x1b0
Apr 25 21:18:13 kernel:  tcp_sendmsg+0x23/0x40
Apr 25 21:18:13 kernel:  sock_sendmsg+0x73/0xa0
Apr 25 21:18:13 kernel:  sock_write_iter+0x125/0x1d0
Apr 25 21:18:13 kernel:  ? sock_sendmsg+0xa0/0xa0
Apr 25 21:18:13 kernel:  ? bpf_local_storage_map_alloc_check+0x40/0xc0
Apr 25 21:18:13 kernel:  ? new_sync_read+0x33d/0x360
Apr 25 21:18:13 kernel:  ? audit_filter_rules.constprop.0+0x1326/0x1ef0
Apr 25 21:18:13 kernel:  ? audit_filter_rules.constprop.0+0x1326/0x1ef0
Apr 25 21:18:13 kernel:  new_sync_write+0x348/0x360
Apr 25 21:18:13 kernel:  ? new_sync_read+0x360/0x360
Apr 25 21:18:13 kernel:  ? bpf_local_storage_map_alloc_check+0x40/0xc0
Apr 25 21:18:13 kernel:  ? bpf_fd_pass+0xf0/0xf0
Apr 25 21:18:13 kernel:  ? selinux_file_permission+0x11c/0x1f0
Apr 25 21:18:13 kernel:  vfs_write+0x33e/0x3e0
Apr 25 21:18:13 kernel:  ksys_write+0x11b/0x150
Apr 25 21:18:13 kernel:  ? __ia32_sys_read+0x40/0x40
Apr 25 21:18:13 kernel:  ? __audit_syscall_entry+0x173/0x1f0
Apr 25 21:18:13 kernel:  ? ktime_get_coarse_real_ts64+0x45/0x60
Apr 25 21:18:13 kernel:  do_syscall_64+0x5c/0x80
Apr 25 21:18:13 kernel:  ? syscall_exit_to_user_mode+0x1d/0x40
Apr 25 21:18:13 kernel:  ? do_syscall_64+0x69/0x80
Apr 25 21:18:13 kernel:  ? do_syscall_64+0x69/0x80
Apr 25 21:18:13 kernel:  ? do_syscall_64+0x69/0x80
Apr 25 21:18:13 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Apr 25 21:18:13 kernel: RIP: 0033:0x75f2a694c603
Apr 25 21:18:13 kernel: Code: 8b 15 71 38 0e 00 f7 d8 64 89 02 48 c7 c0 
ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 
00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 
54 24 18
Apr 25 21:18:13 kernel: RSP: 002b:00004a29af4792c8 EFLAGS: 00000246 
ORIG_RAX: 0000000000000001
Apr 25 21:18:13 kernel: RAX: ffffffffffffffda RBX: 000000000000003c RCX: 
000075f2a694c603
Apr 25 21:18:13 kernel: RDX: 000000000000003c RSI: 000065287ad9af00 RDI: 
0000000000000003
Apr 25 21:18:13 kernel: RBP: 000065287ad8f380 R08: 0000000000000000 R09: 
0000000000000000
Apr 25 21:18:13 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 
0000000000000000
Apr 25 21:18:13 kernel: R13: 00000000ffffffe8 R14: 000065287ad939c0 R15: 
0000000000000000
Apr 25 21:18:13 kernel:  </TASK>
Apr 25 21:18:13 kernel: 
==================================================================
Apr 25 21:18:13 kernel: Disabling lock debugging due to kernel taint
Apr 25 21:18:13 kernel: BUG: kernel NULL pointer dereference, address: 
00000000000000d8
Apr 25 21:18:13 kernel: #PF: supervisor read access in kernel mode
Apr 25 21:18:13 kernel: #PF: error_code(0x0000) - not-present page

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-25 18:45 ` Topi Miettinen
@ 2022-04-25 22:34   ` Florian Westphal
  2022-04-26 19:02     ` Topi Miettinen
  0 siblings, 1 reply; 17+ messages in thread
From: Florian Westphal @ 2022-04-25 22:34 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: netfilter-devel

Topi Miettinen <toiwoton@gmail.com> wrote:
> On 20.4.2022 21.54, Topi Miettinen wrote:
> > Add socket expressions for checking GID or UID of the originating
> > socket. These work also on input side, unlike meta skuid/skgid.
> 
> Unfortunately, there's a reproducible kernel BUG when closing a local
> connection:
> 
> Apr 25 21:18:13 kernel:
> ==================================================================
> Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in
> nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]

You can pass this to scripts/faddr2line to get the location of the null deref.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-25 22:34   ` Florian Westphal
@ 2022-04-26 19:02     ` Topi Miettinen
  2022-04-27  5:48       ` Florian Westphal
  0 siblings, 1 reply; 17+ messages in thread
From: Topi Miettinen @ 2022-04-26 19:02 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On 26.4.2022 1.34, Florian Westphal wrote:
> Topi Miettinen <toiwoton@gmail.com> wrote:
>> On 20.4.2022 21.54, Topi Miettinen wrote:
>>> Add socket expressions for checking GID or UID of the originating
>>> socket. These work also on input side, unlike meta skuid/skgid.
>>
>> Unfortunately, there's a reproducible kernel BUG when closing a local
>> connection:
>>
>> Apr 25 21:18:13 kernel:
>> ==================================================================
>> Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in
>> nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
> 
> You can pass this to scripts/faddr2line to get the location of the null deref.

Didn't work, but with grep and gdb I think I located the error to 
net/ipv6/netfilter/nf_socket_ipv6.c:

static struct sock *
nf_socket_get_sock_v6(struct net *net, struct sk_buff *skb, int doff,
                       const u8 protocol,
                       const struct in6_addr *saddr, const struct 
in6_addr *daddr,
                       const __be16 sport, const __be16 dport,
                       const struct net_device *in)
{
         switch (protocol) {
         case IPPROTO_TCP:
                 return inet6_lookup(net, &tcp_hashinfo, skb, doff,
                                     saddr, sport, daddr, dport,
                                     in->ifindex);
                                     ^^^^^^^^^^^
where in->ifindex would be a NULL deref.

-Topi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-21 16:35   ` Topi Miettinen
@ 2022-04-26 21:05     ` Pablo Neira Ayuso
  2022-04-26 21:07       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-26 21:05 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: Jan Engelhardt, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]

On Thu, Apr 21, 2022 at 07:35:06PM +0300, Topi Miettinen wrote:
> On 21.4.2022 0.15, Jan Engelhardt wrote:
> > 
> > On Wednesday 2022-04-20 20:54, Topi Miettinen wrote:
> > 
> > > Add socket expressions for checking GID or UID of the originating
> > > socket. These work also on input side, unlike meta skuid/skgid.
> > 
> > Why exactly is it that meta skuid does not work?
> > Because of the skb_to_full_sk() call in nft_meta_get_eval_skugid()?
> 
> I don't know the details, but early demux isn't reliable and filters aren't
> run after final demux. In my case, something like "ct state new meta skuid <
> 1000 drop" as part of input filter doesn't do anything. Making "meta skuid"
> 100% reliable would be of course preferable to adding a new expression.

Could you give a try to this kernel patch?

This patch adds a new socket hook for inet layer 4 protocols, it is
coming after the NF_LOCAL_IN hook, where the socket information is
available for all cases.

You also need a small patch for userspace nft.

[-- Attachment #2: kernel-socket-hook.patch --]
[-- Type: text/x-diff, Size: 9321 bytes --]

diff --git a/include/linux/netfilter_socket.h b/include/linux/netfilter_socket.h
new file mode 100644
index 000000000000..7acdb5463e14
--- /dev/null
+++ b/include/linux/netfilter_socket.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _NETFILTER_SOCKET_H_
+#define _NETFILTER_SOCKET_H_
+
+#include <linux/netfilter.h>
+
+static inline bool nf_hook_socket_active(struct net *net, const struct sk_buff *skb)
+{
+#ifdef CONFIG_JUMP_LABEL
+	if (!static_key_false(&nf_hooks_needed[NFPROTO_INET][NF_INET_SOCKET]))
+		return false;
+#endif
+	return rcu_access_pointer(net->nf.hooks_socket[0]);
+}
+
+/* caller must hold rcu_read_lock */
+static inline int nf_hook_socket(struct net *net, struct sk_buff *skb)
+{
+	struct nf_hook_entries *e = rcu_dereference(net->nf.hooks_socket[0]);
+	struct nf_hook_state state;
+	int ret;
+
+	if (unlikely(!e))
+		return 0;
+
+	nf_hook_state_init(&state, NF_INET_SOCKET,
+			   NFPROTO_INET, skb->dev, NULL, NULL,
+			   dev_net(skb->dev), NULL);
+	ret = nf_hook_slow(skb, &state, e, 0);
+	if (ret == 0)
+		return -1;
+
+	return ret;
+}
+
+#else /* CONFIG_NETFILTER_SOCKET */
+static inline int nf_hook_socket_active(struct sk_buff *skb)
+{
+	return 0;
+}
+
+static inline int nf_hook_socket(struct sk_buff *skb)
+{
+	return 0;
+}
+#endif /* _NETFILTER_SOCKET_H_ */
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 20af9d3557b9..790c073629e8 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -15,7 +15,7 @@
 #include <net/flow_offload.h>
 #include <net/netns/generic.h>
 
-#define NFT_MAX_HOOKS	(NF_INET_INGRESS + 1)
+#define NFT_MAX_HOOKS	(NF_INET_SOCKET + 1)
 
 struct module;
 
diff --git a/include/net/netns/netfilter.h b/include/net/netns/netfilter.h
index b593f95e9991..d65922f0d65a 100644
--- a/include/net/netns/netfilter.h
+++ b/include/net/netns/netfilter.h
@@ -18,6 +18,7 @@ struct netns_nf {
 #endif
 	struct nf_hook_entries __rcu *hooks_ipv4[NF_INET_NUMHOOKS];
 	struct nf_hook_entries __rcu *hooks_ipv6[NF_INET_NUMHOOKS];
+	struct nf_hook_entries __rcu *hooks_socket[1];
 #ifdef CONFIG_NETFILTER_FAMILY_ARP
 	struct nf_hook_entries __rcu *hooks_arp[NF_ARP_NUMHOOKS];
 #endif
diff --git a/include/net/sock.h b/include/net/sock.h
index c4b91fc19b9c..4e40823d2a43 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2907,4 +2907,7 @@ static inline bool sk_is_readable(struct sock *sk)
 		return sk->sk_prot->sock_is_readable(sk);
 	return false;
 }
+
+int sk_inet_filter(struct sock *sk, struct sk_buff *skb, unsigned int cap);
+
 #endif	/* _SOCK_H */
diff --git a/include/uapi/linux/netfilter.h b/include/uapi/linux/netfilter.h
index 53411ccc69db..65c6691249e6 100644
--- a/include/uapi/linux/netfilter.h
+++ b/include/uapi/linux/netfilter.h
@@ -45,8 +45,9 @@ enum nf_inet_hooks {
 	NF_INET_FORWARD,
 	NF_INET_LOCAL_OUT,
 	NF_INET_POST_ROUTING,
+	NF_INET_INGRESS,
 	NF_INET_NUMHOOKS,
-	NF_INET_INGRESS = NF_INET_NUMHOOKS,
+	NF_INET_SOCKET = NF_INET_NUMHOOKS,
 };
 
 enum nf_dev_hooks {
diff --git a/net/core/sock.c b/net/core/sock.c
index 1180a0cb0110..19e7906f8955 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -129,6 +129,7 @@
 #include <net/cls_cgroup.h>
 #include <net/netprio_cgroup.h>
 #include <linux/sock_diag.h>
+#include <linux/netfilter_socket.h>
 
 #include <linux/filter.h>
 #include <net/sock_reuseport.h>
@@ -520,7 +521,7 @@ int __sk_receive_skb(struct sock *sk, struct sk_buff *skb,
 {
 	int rc = NET_RX_SUCCESS;
 
-	if (sk_filter_trim_cap(sk, skb, trim_cap))
+	if (sk_inet_filter(sk, skb, trim_cap))
 		goto discard_and_relse;
 
 	skb->dev = NULL;
@@ -3222,6 +3223,25 @@ void sk_stop_timer_sync(struct sock *sk, struct timer_list *timer)
 }
 EXPORT_SYMBOL(sk_stop_timer_sync);
 
+int sk_inet_filter(struct sock *sk, struct sk_buff *skb, unsigned int cap)
+{
+	struct net *net = sock_net(sk);
+	int ret;
+
+	ret = sk_filter_trim_cap(sk, skb, cap);
+	if (ret < 0)
+		return ret;
+
+	if (nf_hook_socket_active(net, skb)) {
+		ret = nf_hook_socket(net, skb);
+		if (ret == 0)
+			return -EPERM;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(sk_inet_filter);
+
 void sock_init_data(struct socket *sock, struct sock *sk)
 {
 	sk_init_common(sk);
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index eab3bd1ee9a0..0f7bffde01ae 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -593,7 +593,7 @@ static int dccp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	if (skb->protocol == htons(ETH_P_IP))
 		return dccp_v4_do_rcv(sk, skb);
 
-	if (sk_filter(sk, skb))
+	if (sk_inet_filter(sk, skb, 1))
 		goto discard;
 
 	/*
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f9cec624068d..0aa5af583f71 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1872,7 +1872,7 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcphdr *th = (struct tcphdr *)skb->data;
 
-	return sk_filter_trim_cap(sk, skb, th->doff * 4);
+	return sk_inet_filter(sk, skb, th->doff * 4);
 }
 EXPORT_SYMBOL(tcp_filter);
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6b4d8361560f..ac4a64071ac7 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2211,7 +2211,7 @@ static int udp_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 	    udp_lib_checksum_complete(skb))
 			goto csum_error;
 
-	if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr))) {
+	if (sk_inet_filter(sk, skb, sizeof(struct udphdr))) {
 		drop_reason = SKB_DROP_REASON_SOCKET_FILTER;
 		goto drop;
 	}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 7f0fa9bd9ffe..374bc818c94f 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -734,7 +734,7 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 	    udp_lib_checksum_complete(skb))
 		goto csum_error;
 
-	if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr)))
+	if (sk_inet_filter(sk, skb, sizeof(struct udphdr)))
 		goto drop;
 
 	udp_csum_pull_header(skb);
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index ddc54b6d18ee..3eb7d54225f5 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -18,6 +18,13 @@ config NETFILTER_EGRESS
 	  This allows you to classify packets before transmission using the
 	  Netfilter infrastructure.
 
+config NETFILTER_SOCKET
+	bool "Netfilter socket support"
+	default y
+	help
+	  This allows you to classify packets for local sockets using the
+	  Netfilter infrastructure.
+
 config NETFILTER_SKIP_EGRESS
 	def_bool NETFILTER_EGRESS && (NET_CLS_ACT || IFB)
 
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index dcf752b55a52..0dfdf69e389a 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -284,13 +284,17 @@ nf_hook_entry_head(struct net *net, int pf, unsigned int hooknum,
 #endif
 #ifdef CONFIG_NETFILTER_INGRESS
 	case NFPROTO_INET:
-		if (WARN_ON_ONCE(hooknum != NF_INET_INGRESS))
-			return NULL;
-		if (!dev || dev_net(dev) != net) {
-			WARN_ON_ONCE(1);
-			return NULL;
+		if (hooknum == NF_INET_INGRESS) {
+			if (!dev || dev_net(dev) != net) {
+				WARN_ON_ONCE(1);
+				return NULL;
+			}
+			return &dev->nf_hooks_ingress;
+		} else if (hooknum == NF_INET_SOCKET) {
+			return net->nf.hooks_socket;
 		}
-		return &dev->nf_hooks_ingress;
+		WARN_ON_ONCE(1);
+		return NULL;
 #endif
 	case NFPROTO_IPV4:
 		if (WARN_ON_ONCE(ARRAY_SIZE(net->nf.hooks_ipv4) <= hooknum))
@@ -523,7 +527,8 @@ static void __nf_unregister_net_hook(struct net *net, int pf,
 void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
 	if (reg->pf == NFPROTO_INET) {
-		if (reg->hooknum == NF_INET_INGRESS) {
+		if (reg->hooknum == NF_INET_INGRESS ||
+		    reg->hooknum == NF_INET_SOCKET) {
 			__nf_unregister_net_hook(net, NFPROTO_INET, reg);
 		} else {
 			__nf_unregister_net_hook(net, NFPROTO_IPV4, reg);
@@ -553,7 +558,8 @@ int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 	int err;
 
 	if (reg->pf == NFPROTO_INET) {
-		if (reg->hooknum == NF_INET_INGRESS) {
+		if (reg->hooknum == NF_INET_INGRESS ||
+		    reg->hooknum == NF_INET_SOCKET) {
 			err = __nf_register_net_hook(net, NFPROTO_INET, reg);
 			if (err < 0)
 				return err;
diff --git a/net/netfilter/nft_chain_filter.c b/net/netfilter/nft_chain_filter.c
index c3563f0be269..eceeebfd0ab2 100644
--- a/net/netfilter/nft_chain_filter.c
+++ b/net/netfilter/nft_chain_filter.c
@@ -201,7 +201,8 @@ static const struct nft_chain_type nft_chain_filter_inet = {
 			  (1 << NF_INET_LOCAL_OUT) |
 			  (1 << NF_INET_FORWARD) |
 			  (1 << NF_INET_PRE_ROUTING) |
-			  (1 << NF_INET_POST_ROUTING),
+			  (1 << NF_INET_POST_ROUTING) |
+			  (1 << NF_INET_SOCKET),
 	.hooks		= {
 		[NF_INET_INGRESS]	= nft_do_chain_inet_ingress,
 		[NF_INET_LOCAL_IN]	= nft_do_chain_inet,
@@ -209,6 +210,7 @@ static const struct nft_chain_type nft_chain_filter_inet = {
 		[NF_INET_FORWARD]	= nft_do_chain_inet,
 		[NF_INET_PRE_ROUTING]	= nft_do_chain_inet,
 		[NF_INET_POST_ROUTING]	= nft_do_chain_inet,
+		[NF_INET_SOCKET]	= nft_do_chain_inet,
         },
 };
 
diff --git a/net/sctp/input.c b/net/sctp/input.c
index 90e12bafdd48..2ea84d1f607f 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -203,7 +203,7 @@ int sctp_rcv(struct sk_buff *skb)
 		goto discard_release;
 	nf_reset_ct(skb);
 
-	if (sk_filter(sk, skb))
+	if (sk_inet_filter(sk, skb, 1))
 		goto discard_release;
 
 	/* Create an SCTP packet structure. */

[-- Attachment #3: nft.patch --]
[-- Type: text/x-diff, Size: 2249 bytes --]

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 9e07888013e8..5fd054c42212 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -49,6 +49,7 @@ enum nf_inet_hooks {
 	NF_INET_LOCAL_OUT,
 	NF_INET_POST_ROUTING,
 	NF_INET_INGRESS,
+	NF_INET_SOCKET,
 	NF_INET_NUMHOOKS
 };
 
diff --git a/src/evaluate.c b/src/evaluate.c
index b5f74d2f5051..03c18f413b09 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -4465,6 +4465,8 @@ static uint32_t str2hooknum(uint32_t family, const char *hook)
 	case NFPROTO_INET:
 		if (!strcmp(hook, "ingress"))
 			return NF_INET_INGRESS;
+		if (!strcmp(hook, "socket"))
+			return NF_INET_SOCKET;
 		/* fall through */
 	case NFPROTO_IPV4:
 	case NFPROTO_BRIDGE:
diff --git a/src/parser_bison.y b/src/parser_bison.y
index ca5c488cd5ff..046ce92c0ab6 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -618,8 +618,8 @@ int nft_lex(void *, void *, void *);
 %type <limit_rate>		limit_rate_pkts
 %type <limit_rate>		limit_rate_bytes
 
-%type <string>			identifier type_identifier string comment_spec
-%destructor { xfree($$); }	identifier type_identifier string comment_spec
+%type <string>			identifier type_identifier string comment_spec hook_str
+%destructor { xfree($$); }	identifier type_identifier string comment_spec hook_str
 
 %type <val>			time_spec quota_used
 
@@ -2390,7 +2390,11 @@ type_identifier		:	STRING	{ $$ = $1; }
 			|	CLASSID { $$ = xstrdup("classid"); }
 			;
 
-hook_spec		:	TYPE		close_scope_type	STRING		HOOK		STRING		dev_spec	prio_spec
+hook_str		:	STRING { $$ = $1; }
+			|	SOCKET { $$ = xstrdup("socket"); }
+			;
+
+hook_spec		:	TYPE		close_scope_type	STRING		HOOK		hook_str	dev_spec	prio_spec
 			{
 				const char *chain_type = chain_type_name_lookup($3);
 
diff --git a/src/rule.c b/src/rule.c
index 799092eb15c5..b05351144c62 100644
--- a/src/rule.c
+++ b/src/rule.c
@@ -677,6 +677,7 @@ static const char * const chain_hookname_str_array[] = {
 	"output",
 	"ingress",
 	"egress",
+	"socket",
 	NULL,
 };
 
@@ -815,6 +816,8 @@ const char *hooknum2str(unsigned int family, unsigned int hooknum)
 			return "output";
 		case NF_INET_INGRESS:
 			return "ingress";
+		case NF_INET_SOCKET:
+			return "socket";
 		default:
 			break;
 		};

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-26 21:05     ` Pablo Neira Ayuso
@ 2022-04-26 21:07       ` Pablo Neira Ayuso
  2022-04-27 18:07         ` Topi Miettinen
  0 siblings, 1 reply; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-26 21:07 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: Jan Engelhardt, netfilter-devel

On Tue, Apr 26, 2022 at 11:05:09PM +0200, Pablo Neira Ayuso wrote:
> On Thu, Apr 21, 2022 at 07:35:06PM +0300, Topi Miettinen wrote:
> > On 21.4.2022 0.15, Jan Engelhardt wrote:
> > > 
> > > On Wednesday 2022-04-20 20:54, Topi Miettinen wrote:
> > > 
> > > > Add socket expressions for checking GID or UID of the originating
> > > > socket. These work also on input side, unlike meta skuid/skgid.
> > > 
> > > Why exactly is it that meta skuid does not work?
> > > Because of the skb_to_full_sk() call in nft_meta_get_eval_skugid()?
> > 
> > I don't know the details, but early demux isn't reliable and filters aren't
> > run after final demux. In my case, something like "ct state new meta skuid <
> > 1000 drop" as part of input filter doesn't do anything. Making "meta skuid"
> > 100% reliable would be of course preferable to adding a new expression.
> 
> Could you give a try to this kernel patch?
> 
> This patch adds a new socket hook for inet layer 4 protocols, it is
> coming after the NF_LOCAL_IN hook, where the socket information is
> available for all cases.
> 
> You also need a small patch for userspace nft.

Quickly tested it with:

 table inet x {
        chain y {
                type filter hook socket priority 0; policy accept;
                counter
        }
 }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-26 19:02     ` Topi Miettinen
@ 2022-04-27  5:48       ` Florian Westphal
  2022-04-27  7:01         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 17+ messages in thread
From: Florian Westphal @ 2022-04-27  5:48 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: Florian Westphal, netfilter-devel

Topi Miettinen <toiwoton@gmail.com> wrote:
> On 26.4.2022 1.34, Florian Westphal wrote:
> > Topi Miettinen <toiwoton@gmail.com> wrote:
> > > On 20.4.2022 21.54, Topi Miettinen wrote:
> > > > Add socket expressions for checking GID or UID of the originating
> > > > socket. These work also on input side, unlike meta skuid/skgid.
> > > 
> > > Unfortunately, there's a reproducible kernel BUG when closing a local
> > > connection:
> > > 
> > > Apr 25 21:18:13 kernel:
> > > ==================================================================
> > > Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in
> > > nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
> > 
> > You can pass this to scripts/faddr2line to get the location of the null deref.
> 
> Didn't work,

?

You pass the object file and the nf_sk_lookup_slow_v6+0x45b/0x590 info.
I can't do it for you because I lack the object file and the exact
source code.

> net/ipv6/netfilter/nf_socket_ipv6.c:
> 
> static struct sock *
> nf_socket_get_sock_v6(struct net *net, struct sk_buff *skb, int doff,
>                       const u8 protocol,
>                       const struct in6_addr *saddr, const struct in6_addr
> *daddr,
>                       const __be16 sport, const __be16 dport,
>                       const struct net_device *in)
> {
>         switch (protocol) {
>         case IPPROTO_TCP:
>                 return inet6_lookup(net, &tcp_hashinfo, skb, doff,
>                                     saddr, sport, daddr, dport,
>                                     in->ifindex);

What does that rule look like?  Seems like no input interface is
available, seems like a bug in existing code?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-27  5:48       ` Florian Westphal
@ 2022-04-27  7:01         ` Pablo Neira Ayuso
  2022-04-27 15:00           ` Topi Miettinen
  0 siblings, 1 reply; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-27  7:01 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Topi Miettinen, netfilter-devel

On Wed, Apr 27, 2022 at 07:48:20AM +0200, Florian Westphal wrote:
> Topi Miettinen <toiwoton@gmail.com> wrote:
> > On 26.4.2022 1.34, Florian Westphal wrote:
> > > Topi Miettinen <toiwoton@gmail.com> wrote:
> > > > On 20.4.2022 21.54, Topi Miettinen wrote:
> > > > > Add socket expressions for checking GID or UID of the originating
> > > > > socket. These work also on input side, unlike meta skuid/skgid.
> > > > 
> > > > Unfortunately, there's a reproducible kernel BUG when closing a local
> > > > connection:
> > > > 
> > > > Apr 25 21:18:13 kernel:
> > > > ==================================================================
> > > > Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in
> > > > nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
> > > 
> > > You can pass this to scripts/faddr2line to get the location of the null deref.
> > 
> > Didn't work,
> 
> ?
> 
> You pass the object file and the nf_sk_lookup_slow_v6+0x45b/0x590 info.
> I can't do it for you because I lack the object file and the exact
> source code.
> 
> > net/ipv6/netfilter/nf_socket_ipv6.c:
> > 
> > static struct sock *
> > nf_socket_get_sock_v6(struct net *net, struct sk_buff *skb, int doff,
> >                       const u8 protocol,
> >                       const struct in6_addr *saddr, const struct in6_addr
> > *daddr,
> >                       const __be16 sport, const __be16 dport,
> >                       const struct net_device *in)
> > {
> >         switch (protocol) {
> >         case IPPROTO_TCP:
> >                 return inet6_lookup(net, &tcp_hashinfo, skb, doff,
> >                                     saddr, sport, daddr, dport,
> >                                     in->ifindex);
> 
> What does that rule look like?  Seems like no input interface is
> available, seems like a bug in existing code?

nft_socket_eval() assumes it always run from input path.

@Topi: How does you test ruleset look like?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-27  7:01         ` Pablo Neira Ayuso
@ 2022-04-27 15:00           ` Topi Miettinen
  2022-04-27 15:28             ` Florian Westphal
  2022-04-27 15:30             ` Pablo Neira Ayuso
  0 siblings, 2 replies; 17+ messages in thread
From: Topi Miettinen @ 2022-04-27 15:00 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal; +Cc: netfilter-devel

On 27.4.2022 10.01, Pablo Neira Ayuso wrote:
> On Wed, Apr 27, 2022 at 07:48:20AM +0200, Florian Westphal wrote:
>> Topi Miettinen <toiwoton@gmail.com> wrote:
>>> On 26.4.2022 1.34, Florian Westphal wrote:
>>>> Topi Miettinen <toiwoton@gmail.com> wrote:
>>>>> On 20.4.2022 21.54, Topi Miettinen wrote:
>>>>>> Add socket expressions for checking GID or UID of the originating
>>>>>> socket. These work also on input side, unlike meta skuid/skgid.
>>>>>
>>>>> Unfortunately, there's a reproducible kernel BUG when closing a local
>>>>> connection:
>>>>>
>>>>> Apr 25 21:18:13 kernel:
>>>>> ==================================================================
>>>>> Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in
>>>>> nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
>>>>
>>>> You can pass this to scripts/faddr2line to get the location of the null deref.
>>>
>>> Didn't work,
>>
>> ?
>>
>> You pass the object file and the nf_sk_lookup_slow_v6+0x45b/0x590 info.
>> I can't do it for you because I lack the object file and the exact
>> source code.
>>

$ faddr2line nf_socket_ipv6.ko nf_sk_lookup_slow_v6+0x45b/0x590
bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
$ faddr2line nf_socket_ipv6.o nf_sk_lookup_slow_v6+0x45b/0x590
bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
$ faddr2line nf_socket_ipv6.mod nf_sk_lookup_slow_v6+0x45b/0x590
readelf: Error: nf_socket_ipv6.mod: Failed to read file header
size: nf_socket_ipv6.mod: file format not recognized
nm: nf_socket_ipv6.mod: file format not recognized
size: nf_socket_ipv6.mod: file format not recognized
nm: nf_socket_ipv6.mod: file format not recognized
no match for nf_sk_lookup_slow_v6+0x45b/0x590
$ faddr2line nf_socket_ipv6.mod.o nf_sk_lookup_slow_v6+0x45b/0x590
no match for nf_sk_lookup_slow_v6+0x45b/0x590
$ faddr2line vmlinux nf_sk_lookup_slow_v6+0x45b/0x590
no match for nf_sk_lookup_slow_v6+0x45b/0x590

>>> net/ipv6/netfilter/nf_socket_ipv6.c:
>>>
>>> static struct sock *
>>> nf_socket_get_sock_v6(struct net *net, struct sk_buff *skb, int doff,
>>>                        const u8 protocol,
>>>                        const struct in6_addr *saddr, const struct in6_addr
>>> *daddr,
>>>                        const __be16 sport, const __be16 dport,
>>>                        const struct net_device *in)
>>> {
>>>          switch (protocol) {
>>>          case IPPROTO_TCP:
>>>                  return inet6_lookup(net, &tcp_hashinfo, skb, doff,
>>>                                      saddr, sport, daddr, dport,
>>>                                      in->ifindex);
>>
>> What does that rule look like?  Seems like no input interface is
>> available, seems like a bug in existing code?
> 
> nft_socket_eval() assumes it always run from input path.
> 
> @Topi: How does you test ruleset look like?

Here's a reproducer:
#!/usr/sbin/nft -f

table inet mangle {
         chain output {
                 type route hook output priority mangle; policy accept;

                 socket uid != 0 reject with icmpx type admin-prohibited
         }
}

Start nc -6l 1 as root

Try 'telnet ::1 1' as root, press enter and close the connection. After 
1-3 tries, system hangs and Caps Lock starts blinking.

-Topi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-27 15:00           ` Topi Miettinen
@ 2022-04-27 15:28             ` Florian Westphal
  2022-04-27 15:30             ` Pablo Neira Ayuso
  1 sibling, 0 replies; 17+ messages in thread
From: Florian Westphal @ 2022-04-27 15:28 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: Pablo Neira Ayuso, Florian Westphal, netfilter-devel

Topi Miettinen <toiwoton@gmail.com> wrote:
> On 27.4.2022 10.01, Pablo Neira Ayuso wrote:
> > On Wed, Apr 27, 2022 at 07:48:20AM +0200, Florian Westphal wrote:
> > > Topi Miettinen <toiwoton@gmail.com> wrote:
> > > > On 26.4.2022 1.34, Florian Westphal wrote:
> > > > > Topi Miettinen <toiwoton@gmail.com> wrote:
> > > > > > On 20.4.2022 21.54, Topi Miettinen wrote:
> > > > > > > Add socket expressions for checking GID or UID of the originating
> > > > > > > socket. These work also on input side, unlike meta skuid/skgid.
> > > > > > 
> > > > > > Unfortunately, there's a reproducible kernel BUG when closing a local
> > > > > > connection:
> > > > > > 
> > > > > > Apr 25 21:18:13 kernel:
> > > > > > ==================================================================
> > > > > > Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in
> > > > > > nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
> > > > > 
> > > > > You can pass this to scripts/faddr2line to get the location of the null deref.
> > > > 
> > > > Didn't work,
> > > 
> > > ?
> > > 
> > > You pass the object file and the nf_sk_lookup_slow_v6+0x45b/0x590 info.
> > > I can't do it for you because I lack the object file and the exact
> > > source code.
> > > 
> 
> $ faddr2line nf_socket_ipv6.ko nf_sk_lookup_slow_v6+0x45b/0x590
> bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
> $ faddr2line nf_socket_ipv6.o nf_sk_lookup_slow_v6+0x45b/0x590
> bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
> $ faddr2line nf_socket_ipv6.mod nf_sk_lookup_slow_v6+0x45b/0x590
> readelf: Error: nf_socket_ipv6.mod: Failed to read file header
> size: nf_socket_ipv6.mod: file format not recognized
> nm: nf_socket_ipv6.mod: file format not recognized
> size: nf_socket_ipv6.mod: file format not recognized
> nm: nf_socket_ipv6.mod: file format not recognized
> no match for nf_sk_lookup_slow_v6+0x45b/0x590
> $ faddr2line nf_socket_ipv6.mod.o nf_sk_lookup_slow_v6+0x45b/0x590
> no match for nf_sk_lookup_slow_v6+0x45b/0x590
> $ faddr2line vmlinux nf_sk_lookup_slow_v6+0x45b/0x590
> no match for nf_sk_lookup_slow_v6+0x45b/0x590
> 
> > > > net/ipv6/netfilter/nf_socket_ipv6.c:
> > > > 
> > > > static struct sock *
> > > > nf_socket_get_sock_v6(struct net *net, struct sk_buff *skb, int doff,
> > > >                        const u8 protocol,
> > > >                        const struct in6_addr *saddr, const struct in6_addr
> > > > *daddr,
> > > >                        const __be16 sport, const __be16 dport,
> > > >                        const struct net_device *in)
> > > > {
> > > >          switch (protocol) {
> > > >          case IPPROTO_TCP:
> > > >                  return inet6_lookup(net, &tcp_hashinfo, skb, doff,
> > > >                                      saddr, sport, daddr, dport,
> > > >                                      in->ifindex);
> > > 
> > > What does that rule look like?  Seems like no input interface is
> > > available, seems like a bug in existing code?
> > 
> > nft_socket_eval() assumes it always run from input path.

Oof, there is no restriction.
I don't think we can make it illegal now, so probably best to check for
indev != NULL first.

I'll send a patch.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-27 15:00           ` Topi Miettinen
  2022-04-27 15:28             ` Florian Westphal
@ 2022-04-27 15:30             ` Pablo Neira Ayuso
  2022-04-27 15:42               ` Florian Westphal
  1 sibling, 1 reply; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-27 15:30 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: Florian Westphal, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 3589 bytes --]

On Wed, Apr 27, 2022 at 06:00:49PM +0300, Topi Miettinen wrote:
> On 27.4.2022 10.01, Pablo Neira Ayuso wrote:
> > On Wed, Apr 27, 2022 at 07:48:20AM +0200, Florian Westphal wrote:
> > > Topi Miettinen <toiwoton@gmail.com> wrote:
> > > > On 26.4.2022 1.34, Florian Westphal wrote:
> > > > > Topi Miettinen <toiwoton@gmail.com> wrote:
> > > > > > On 20.4.2022 21.54, Topi Miettinen wrote:
> > > > > > > Add socket expressions for checking GID or UID of the originating
> > > > > > > socket. These work also on input side, unlike meta skuid/skgid.
> > > > > > 
> > > > > > Unfortunately, there's a reproducible kernel BUG when closing a local
> > > > > > connection:
> > > > > > 
> > > > > > Apr 25 21:18:13 kernel:
> > > > > > ==================================================================
> > > > > > Apr 25 21:18:13 kernel: BUG: KASAN: null-ptr-deref in
> > > > > > nf_sk_lookup_slow_v6+0x45b/0x590 [nf_socket_ipv6]
> > > > > 
> > > > > You can pass this to scripts/faddr2line to get the location of the null deref.
> > > > 
> > > > Didn't work,
> > > 
> > > ?
> > > 
> > > You pass the object file and the nf_sk_lookup_slow_v6+0x45b/0x590 info.
> > > I can't do it for you because I lack the object file and the exact
> > > source code.
> > > 
> 
> $ faddr2line nf_socket_ipv6.ko nf_sk_lookup_slow_v6+0x45b/0x590
> bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
> $ faddr2line nf_socket_ipv6.o nf_sk_lookup_slow_v6+0x45b/0x590
> bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
> $ faddr2line nf_socket_ipv6.mod nf_sk_lookup_slow_v6+0x45b/0x590
> readelf: Error: nf_socket_ipv6.mod: Failed to read file header
> size: nf_socket_ipv6.mod: file format not recognized
> nm: nf_socket_ipv6.mod: file format not recognized
> size: nf_socket_ipv6.mod: file format not recognized
> nm: nf_socket_ipv6.mod: file format not recognized
> no match for nf_sk_lookup_slow_v6+0x45b/0x590
> $ faddr2line nf_socket_ipv6.mod.o nf_sk_lookup_slow_v6+0x45b/0x590
> no match for nf_sk_lookup_slow_v6+0x45b/0x590
> $ faddr2line vmlinux nf_sk_lookup_slow_v6+0x45b/0x590
> no match for nf_sk_lookup_slow_v6+0x45b/0x590
>
> > > > net/ipv6/netfilter/nf_socket_ipv6.c:
> > > > 
> > > > static struct sock *
> > > > nf_socket_get_sock_v6(struct net *net, struct sk_buff *skb, int doff,
> > > >                        const u8 protocol,
> > > >                        const struct in6_addr *saddr, const struct in6_addr
> > > > *daddr,
> > > >                        const __be16 sport, const __be16 dport,
> > > >                        const struct net_device *in)
> > > > {
> > > >          switch (protocol) {
> > > >          case IPPROTO_TCP:
> > > >                  return inet6_lookup(net, &tcp_hashinfo, skb, doff,
> > > >                                      saddr, sport, daddr, dport,
> > > >                                      in->ifindex);
> > > 
> > > What does that rule look like?  Seems like no input interface is
> > > available, seems like a bug in existing code?
> > 
> > nft_socket_eval() assumes it always run from input path.
> > 
> > @Topi: How does you test ruleset look like?
> 
> Here's a reproducer:
> #!/usr/sbin/nft -f
> 
> table inet mangle {
>         chain output {
>                 type route hook output priority mangle; policy accept;
> 
>                 socket uid != 0 reject with icmpx type admin-prohibited
>         }
> }
> 
> Start nc -6l 1 as root
> 
> Try 'telnet ::1 1' as root, press enter and close the connection. After 1-3
> tries, system hangs and Caps Lock starts blinking.

Looks like skb->sk is NULL? Patch attached.

[-- Attachment #2: fix-nft-socket.patch --]
[-- Type: text/x-diff, Size: 2057 bytes --]

diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
index 6d9e8e0a3a7d..d6da68a3b739 100644
--- a/net/netfilter/nft_socket.c
+++ b/net/netfilter/nft_socket.c
@@ -59,21 +59,27 @@ static void nft_socket_eval(const struct nft_expr *expr,
 			    const struct nft_pktinfo *pkt)
 {
 	const struct nft_socket *priv = nft_expr_priv(expr);
+	u32 *dest = &regs->data[priv->dreg];
 	struct sk_buff *skb = pkt->skb;
+	const struct net_device *dev;
 	struct sock *sk = skb->sk;
-	u32 *dest = &regs->data[priv->dreg];
 
 	if (sk && !net_eq(nft_net(pkt), sock_net(sk)))
 		sk = NULL;
 
-	if (!sk)
+	if (nft_hook(pkt) == NF_INET_LOCAL_OUT)
+		dev = nft_out(pkt);
+	else
+		dev = nft_in(pkt);
+
+	if (!sk) {
 		switch(nft_pf(pkt)) {
 		case NFPROTO_IPV4:
-			sk = nf_sk_lookup_slow_v4(nft_net(pkt), skb, nft_in(pkt));
+			sk = nf_sk_lookup_slow_v4(nft_net(pkt), skb, dev);
 			break;
 #if IS_ENABLED(CONFIG_NF_TABLES_IPV6)
 		case NFPROTO_IPV6:
-			sk = nf_sk_lookup_slow_v6(nft_net(pkt), skb, nft_in(pkt));
+			sk = nf_sk_lookup_slow_v6(nft_net(pkt), skb, dev);
 			break;
 #endif
 		default:
@@ -81,6 +87,7 @@ static void nft_socket_eval(const struct nft_expr *expr,
 			regs->verdict.code = NFT_BREAK;
 			return;
 		}
+	}
 
 	if (!sk) {
 		regs->verdict.code = NFT_BREAK;
@@ -184,6 +191,15 @@ static int nft_socket_init(const struct nft_ctx *ctx,
 					NULL, NFT_DATA_VALUE, len);
 }
 
+static int nft_socket_validate(const struct nft_ctx *ctx,
+			       const struct nft_expr *expr,
+			       const struct nft_data **data)
+{
+	return nft_chain_validate_hooks(ctx->chain, (1 << NF_INET_PRE_ROUTING) |
+						    (1 << NF_INET_LOCAL_IN) |
+						    (1 << NF_INET_LOCAL_OUT));
+}
+
 static int nft_socket_dump(struct sk_buff *skb,
 			   const struct nft_expr *expr)
 {
@@ -230,6 +246,7 @@ static const struct nft_expr_ops nft_socket_ops = {
 	.size		= NFT_EXPR_SIZE(sizeof(struct nft_socket)),
 	.eval		= nft_socket_eval,
 	.init		= nft_socket_init,
+	.validate	= nft_socket_validate,
 	.dump		= nft_socket_dump,
 	.reduce		= nft_socket_reduce,
 };

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-27 15:30             ` Pablo Neira Ayuso
@ 2022-04-27 15:42               ` Florian Westphal
  2022-04-27 15:45                 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 17+ messages in thread
From: Florian Westphal @ 2022-04-27 15:42 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Topi Miettinen, Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Looks like skb->sk is NULL? Patch attached.

> diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
> index 6d9e8e0a3a7d..d6da68a3b739 100644
> --- a/net/netfilter/nft_socket.c
> +++ b/net/netfilter/nft_socket.c
> @@ -59,21 +59,27 @@ static void nft_socket_eval(const struct nft_expr *expr,
>  			    const struct nft_pktinfo *pkt)
>  {
>  	const struct nft_socket *priv = nft_expr_priv(expr);
> +	u32 *dest = &regs->data[priv->dreg];
>  	struct sk_buff *skb = pkt->skb;
> +	const struct net_device *dev;
>  	struct sock *sk = skb->sk;
> -	u32 *dest = &regs->data[priv->dreg];
>  
>  	if (sk && !net_eq(nft_net(pkt), sock_net(sk)))
>  		sk = NULL;
>  
> -	if (!sk)
> +	if (nft_hook(pkt) == NF_INET_LOCAL_OUT)
> +		dev = nft_out(pkt);
> +	else
> +		dev = nft_in(pkt);

I think its better to just NFT_BREAK for NF_INET_LOCAL_OUT && skb->sk == NULL,
I don't see how nf_sk_lookup_slow_.() could provide meaningful result
here, they assume packet header daddr/dport are the local, not the
remote addresses.

Or, check nft_in(pkt) == NULL || !sk -> BREAK, whatever seems simpler to
you.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-27 15:42               ` Florian Westphal
@ 2022-04-27 15:45                 ` Pablo Neira Ayuso
  0 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-27 15:45 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Topi Miettinen, netfilter-devel

On Wed, Apr 27, 2022 at 05:42:19PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Looks like skb->sk is NULL? Patch attached.
> 
> > diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
> > index 6d9e8e0a3a7d..d6da68a3b739 100644
> > --- a/net/netfilter/nft_socket.c
> > +++ b/net/netfilter/nft_socket.c
> > @@ -59,21 +59,27 @@ static void nft_socket_eval(const struct nft_expr *expr,
> >  			    const struct nft_pktinfo *pkt)
> >  {
> >  	const struct nft_socket *priv = nft_expr_priv(expr);
> > +	u32 *dest = &regs->data[priv->dreg];
> >  	struct sk_buff *skb = pkt->skb;
> > +	const struct net_device *dev;
> >  	struct sock *sk = skb->sk;
> > -	u32 *dest = &regs->data[priv->dreg];
> >  
> >  	if (sk && !net_eq(nft_net(pkt), sock_net(sk)))
> >  		sk = NULL;
> >  
> > -	if (!sk)
> > +	if (nft_hook(pkt) == NF_INET_LOCAL_OUT)
> > +		dev = nft_out(pkt);
> > +	else
> > +		dev = nft_in(pkt);
> 
> I think its better to just NFT_BREAK for NF_INET_LOCAL_OUT && skb->sk == NULL,
> I don't see how nf_sk_lookup_slow_.() could provide meaningful result
> here, they assume packet header daddr/dport are the local, not the
> remote addresses.
> 
> Or, check nft_in(pkt) == NULL || !sk -> BREAK, whatever seems simpler to
> you.

Makes sense.

I'll let you follow up on this.

I'll tag

https://patchwork.ozlabs.org/project/netfilter-devel/patch/20220427153333.18424-1-pablo@netfilter.org/

as changed requested too

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-26 21:07       ` Pablo Neira Ayuso
@ 2022-04-27 18:07         ` Topi Miettinen
  2022-05-02 17:02           ` Pablo Neira Ayuso
  0 siblings, 1 reply; 17+ messages in thread
From: Topi Miettinen @ 2022-04-27 18:07 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Jan Engelhardt, netfilter-devel

On 27.4.2022 0.07, Pablo Neira Ayuso wrote:
> On Tue, Apr 26, 2022 at 11:05:09PM +0200, Pablo Neira Ayuso wrote:
>> On Thu, Apr 21, 2022 at 07:35:06PM +0300, Topi Miettinen wrote:
>>> On 21.4.2022 0.15, Jan Engelhardt wrote:
>>>>
>>>> On Wednesday 2022-04-20 20:54, Topi Miettinen wrote:
>>>>
>>>>> Add socket expressions for checking GID or UID of the originating
>>>>> socket. These work also on input side, unlike meta skuid/skgid.
>>>>
>>>> Why exactly is it that meta skuid does not work?
>>>> Because of the skb_to_full_sk() call in nft_meta_get_eval_skugid()?
>>>
>>> I don't know the details, but early demux isn't reliable and filters aren't
>>> run after final demux. In my case, something like "ct state new meta skuid <
>>> 1000 drop" as part of input filter doesn't do anything. Making "meta skuid"
>>> 100% reliable would be of course preferable to adding a new expression.
>>
>> Could you give a try to this kernel patch?
>>
>> This patch adds a new socket hook for inet layer 4 protocols, it is
>> coming after the NF_LOCAL_IN hook, where the socket information is
>> available for all cases.
>>
>> You also need a small patch for userspace nft.
> 
> Quickly tested it with:
> 
>   table inet x {
>          chain y {
>                  type filter hook socket priority 0; policy accept;
>                  counter
>          }
>   }

Thanks. Assuming that this makes the 'meta skuid' and 'meta cgroupv2' 
always usable, I'd prefer this approach to new 'socket uid'.

I changed the hook of my input and output filters to 'socket' but then 
there are lots of errors:

/etc/nftables.conf:411:3-67: Error: Could not process rule: Operation 
not supported
                 ct state new socket cgroupv2 level 1 vmap 
@dict_cgroup_level_1_in
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:412:3-79: Error: Could not process rule: Operation 
not supported
                 ct state new meta skuid < 1000 meta l4proto vmap { tcp 
: jump tcp_input_sys }
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:413:3-108: Error: Could not process rule: Operation 
not supported
                 ct state new meta skuid >= 1000 meta l4proto vmap { tcp 
: jump tcp_input_user, udp : jump udp_input_user }
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:417:3-36: Error: Could not process rule: Operation 
not supported
                 icmpv6 type 144-147 counter accept
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:418:3-56: Error: Could not process rule: Operation 
not supported
                 meta l4proto { icmp, icmpv6 } counter jump icmp_common
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:423:3-51: Error: Could not process rule: Operation 
not supported
                 counter log prefix "[inet-input] " flags all drop
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:429:3-53: Error: Could not process rule: Operation 
not supported
                 counter log prefix "[inet-forward] " flags all drop
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:435:3-100: Error: Could not process rule: Operation 
not supported
                 rt type 0 counter log prefix "[inet-output-rt] " flags 
all reject with icmpx type admin-prohibited
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:436:3-122: Error: Could not process rule: Operation 
not supported
                 meta protocol != { ip, ip6 } counter log prefix 
"[inet-output-proto] " flags all reject with icmpx type admin-prohibited
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:437:3-78: Error: Could not process rule: Operation 
not supported
                 meta l4proto != { icmp, icmpv6, tcp, udp } counter jump 
log_bad_proto_output
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:438:3-50: Error: Could not process rule: Operation 
not supported
                 ct state { established, related } counter accept
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:441:3-68: Error: Could not process rule: Operation 
not supported
                 ct state new socket cgroupv2 level 1 vmap 
@dict_cgroup_level_1_out
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:443:3-56: Error: Could not process rule: Operation 
not supported
                 meta l4proto { icmp, icmpv6 } counter jump icmp_common
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/nftables.conf:448:3-87: Error: Could not process rule: Operation 
not supported
                 counter log prefix "[inet-output] " flags all reject 
with icmpx type admin-prohibited
 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Topi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] netfilter: nft_socket: socket expressions for GID & UID
  2022-04-27 18:07         ` Topi Miettinen
@ 2022-05-02 17:02           ` Pablo Neira Ayuso
  0 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-02 17:02 UTC (permalink / raw)
  To: Topi Miettinen; +Cc: Jan Engelhardt, netfilter-devel

On Wed, Apr 27, 2022 at 09:07:07PM +0300, Topi Miettinen wrote:
> On 27.4.2022 0.07, Pablo Neira Ayuso wrote:
> > On Tue, Apr 26, 2022 at 11:05:09PM +0200, Pablo Neira Ayuso wrote:
> > > On Thu, Apr 21, 2022 at 07:35:06PM +0300, Topi Miettinen wrote:
> > > > On 21.4.2022 0.15, Jan Engelhardt wrote:
> > > > > 
> > > > > On Wednesday 2022-04-20 20:54, Topi Miettinen wrote:
> > > > > 
> > > > > > Add socket expressions for checking GID or UID of the originating
> > > > > > socket. These work also on input side, unlike meta skuid/skgid.
> > > > > 
> > > > > Why exactly is it that meta skuid does not work?
> > > > > Because of the skb_to_full_sk() call in nft_meta_get_eval_skugid()?
> > > > 
> > > > I don't know the details, but early demux isn't reliable and filters aren't
> > > > run after final demux. In my case, something like "ct state new meta skuid <
> > > > 1000 drop" as part of input filter doesn't do anything. Making "meta skuid"
> > > > 100% reliable would be of course preferable to adding a new expression.
> > > 
> > > Could you give a try to this kernel patch?
> > > 
> > > This patch adds a new socket hook for inet layer 4 protocols, it is
> > > coming after the NF_LOCAL_IN hook, where the socket information is
> > > available for all cases.
> > > 
> > > You also need a small patch for userspace nft.
> > 
> > Quickly tested it with:
> > 
> >   table inet x {
> >          chain y {
> >                  type filter hook socket priority 0; policy accept;
> >                  counter
> >          }
> >   }
> 
> Thanks. Assuming that this makes the 'meta skuid' and 'meta cgroupv2' always
> usable, I'd prefer this approach to new 'socket uid'.
> 
> I changed the hook of my input and output filters to 'socket' but then there
> are lots of errors:
> 
> /etc/nftables.conf:411:3-67: Error: Could not process rule: Operation not
> supported
>                 ct state new socket cgroupv2 level 1 vmap
> @dict_cgroup_level_1_in

My patch is missing an update for nft_socket.

Could you send me your ruleset so I can test my next patchset works
with it? Private email is fine.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-05-02 17:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-20 18:54 [PATCH] netfilter: nft_socket: socket expressions for GID & UID Topi Miettinen
2022-04-20 21:15 ` Jan Engelhardt
2022-04-21 16:35   ` Topi Miettinen
2022-04-26 21:05     ` Pablo Neira Ayuso
2022-04-26 21:07       ` Pablo Neira Ayuso
2022-04-27 18:07         ` Topi Miettinen
2022-05-02 17:02           ` Pablo Neira Ayuso
2022-04-25 18:45 ` Topi Miettinen
2022-04-25 22:34   ` Florian Westphal
2022-04-26 19:02     ` Topi Miettinen
2022-04-27  5:48       ` Florian Westphal
2022-04-27  7:01         ` Pablo Neira Ayuso
2022-04-27 15:00           ` Topi Miettinen
2022-04-27 15:28             ` Florian Westphal
2022-04-27 15:30             ` Pablo Neira Ayuso
2022-04-27 15:42               ` Florian Westphal
2022-04-27 15:45                 ` Pablo Neira Ayuso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.