From: Jakub Sitnicki <jakub@cloudflare.com>
To: Martin KaFai Lau <kafai@fb.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, dccp@vger.kernel.org,
	kernel-team@cloudflare.com, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Gerrit Renker <gerrit@erg.abdn.ac.uk>,
	Jakub Kicinski <kuba@kernel.org>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>
Subject: Re: [PATCH bpf-next v2 00/17] Run a BPF program on socket lookup
Date: Wed, 13 May 2020 19:54:09 +0200
Message-ID: <87sgg3u3em.fsf@cloudflare.com>
In-Reply-To: <20200512163411.4ae2faoa6hwnrv3k@kafai-mbp.dhcp.thefacebook.com>

On Tue, May 12, 2020 at 06:34 PM CEST, Martin KaFai Lau wrote:
> On Tue, May 12, 2020 at 01:57:45PM +0200, Jakub Sitnicki wrote:
>> On Mon, May 11, 2020 at 09:45 PM CEST, Martin KaFai Lau wrote:
>> > On Mon, May 11, 2020 at 08:52:01PM +0200, Jakub Sitnicki wrote:

[...]

>> >> RX pps measured with `ifpps -d <dev> -t 1000 --csv --loop` for 60 seconds.
>> >>
>> >> | tcp4 SYN flood               | rx pps (mean ± sstdev) | Δ rx pps |
>> >> |------------------------------+------------------------+----------|
>> >> | 5.6.3 vanilla (baseline)     | 939,616 ± 0.5%         |        - |
>> >> | no SK_LOOKUP prog attached   | 929,275 ± 1.2%         |    -1.1% |
>> >> | with SK_LOOKUP prog attached | 918,582 ± 0.4%         |    -2.2% |
>> >>
>> >> | tcp6 SYN flood               | rx pps (mean ± sstdev) | Δ rx pps |
>> >> |------------------------------+------------------------+----------|
>> >> | 5.6.3 vanilla (baseline)     | 875,838 ± 0.5%         |        - |
>> >> | no SK_LOOKUP prog attached   | 872,005 ± 0.3%         |    -0.4% |
>> >> | with SK_LOOKUP prog attached | 856,250 ± 0.5%         |    -2.2% |
>> >>
>> >> | udp4 0-len flood             | rx pps (mean ± sstdev) | Δ rx pps |
>> >> |------------------------------+------------------------+----------|
>> >> | 5.6.3 vanilla (baseline)     | 2,738,662 ± 1.5%       |        - |
>> >> | no SK_LOOKUP prog attached   | 2,576,893 ± 1.0%       |    -5.9% |
>> >> | with SK_LOOKUP prog attached | 2,530,698 ± 1.0%       |    -7.6% |
>> >>
>> >> | udp6 0-len flood             | rx pps (mean ± sstdev) | Δ rx pps |
>> >> |------------------------------+------------------------+----------|
>> >> | 5.6.3 vanilla (baseline)     | 2,867,885 ± 1.4%       |        - |
>> >> | no SK_LOOKUP prog attached   | 2,646,875 ± 1.0%       |    -7.7% |
>> > What is causing this regression?
>> >
>>
>> I need to go back to archived perf.data and see if perf-annotate or
>> perf-diff provide any clues that will help me tell where CPU cycles are
>> going. Will get back to you on that.
>>
>> Wild guess is that for udp6 we're loading and copying more data to
>> populate v6 addresses in program context. See inet6_lookup_run_bpf
>> (patch 7).
> If that is the case,
> rcu_access_pointer(net->sk_lookup_prog) should be tested first before
> doing ctx initialization.

Coming back after looking for more hints in recorded perf.data.

`perf diff` between baseline and no-prog-attached shows:

# Event 'cycles'
#
# Baseline    Delta  Symbol
# ........  .......  ......................................
     4.63%   +0.07%  [k] udpv6_queue_rcv_one_skb
     3.88%   -0.38%  [k] __netif_receive_skb_core
     3.54%   +0.21%  [k] udp6_lib_lookup2
     3.01%   -0.42%  [k] 0xffffffffc04926cc
     2.69%   -0.10%  [k] mlx5e_skb_from_cqe_linear
     2.56%   -0.20%  [k] ipv6_gro_receive
     2.31%   -0.15%  [k] dev_gro_receive
     2.20%   -0.13%  [k] do_csum
     2.02%   -0.68%  [k] ip6_pol_route
     1.94%   +0.79%  [k] __udp6_lib_lookup

So __udp6_lib_lookup is where to look, as expected.

`perf annotate __udp6_lib_lookup` for no-prog-attached has a hot spot
right where we populate the context object:

         :      /* Lookup redirect from BPF */
         :      result = inet6_lookup_run_bpf(net, udptable->protocol,
    0.00 : ffffffff818530db:  mov    0x18(%r14),%edx
         : inet6_lookup_run_bpf():
         :      const struct in6_addr *saddr,
         :      __be16 sport,
         :      const struct in6_addr *daddr,
         :      unsigned short hnum)
         : {
         :      struct bpf_inet_lookup_kern ctx = {
         : inet6_hashtables.h:115
    1.27 : ffffffff818530df:  lea    -0x78(%rbp),%r9
    0.00 : ffffffff818530e3:  xor    %eax,%eax
    0.00 : ffffffff818530e5:  mov    $0x8,%ecx
    0.00 : ffffffff818530ea:  mov    %r9,%rdi
         : inet6_hashtables.h:115
   26.09 : ffffffff818530ed:  rep stos %rax,%es:(%rdi)
         : inet6_hashtables.h:115
    1.35 : ffffffff818530f0:  mov    $0xa,%eax
    0.00 : ffffffff818530f5:  mov    %bx,-0x60(%rbp)
    0.00 : ffffffff818530f9:  mov    %ax,-0x78(%rbp)
    0.00 : ffffffff818530fd:  mov    (%r15),%rax
    1.42 : ffffffff81853100:  mov    %dl,-0x76(%rbp)
    0.00 : ffffffff81853103:  mov    0x8(%r15),%rdx
    0.00 : ffffffff81853107:  mov    %r11w,-0x48(%rbp)
    0.00 : ffffffff8185310c:  mov    %rax,-0x70(%rbp)
    1.27 : ffffffff81853110:  mov    (%r12),%rax
    0.02 : ffffffff81853114:  mov    %rdx,-0x68(%rbp)
    0.00 : ffffffff81853118:  mov    0x8(%r12),%rdx
    0.02 : ffffffff8185311d:  mov    %rax,-0x58(%rbp)
    1.24 : ffffffff81853121:  mov    %rdx,-0x50(%rbp)
         : __read_once_size():
         :      })
         :
         :      static __always_inline
         :      void __read_once_size(const volatile void *p, void *res, int size)
         :      {
         :              __READ_ONCE_SIZE;
    0.05 : ffffffff81853125:  mov    0xd28(%r13),%rdx

Note, struct bpf_inet_lookup has been renamed to bpf_sk_lookup since then.
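To make the planned change concrete, here is a rough sketch of the pattern.
This is an untested user-space model, not the actual kernel diff: the struct
and function names only mirror the kernel ones, rcu_access_pointer() and the
BPF program invocation are stubbed out, and `ctx_was_initialized` exists
purely for demonstration. It shows the two ideas: bail out before the context
is zeroed when no program is attached, and store pointers to in6_addr{}
instead of copying the addresses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model of the planned v3 change -- NOT the kernel code. */

struct in6_addr { uint8_t s6_addr[16]; };

struct bpf_sk_lookup_kern {
	uint16_t family;
	uint16_t protocol;
	/* Keep pointers instead of copying 2 x 16 bytes of addresses. */
	const struct in6_addr *saddr;
	const struct in6_addr *daddr;
	uint16_t sport;
	uint16_t dport;
};

static void *sk_lookup_prog;		/* stand-in for net->sk_lookup_prog */
static bool ctx_was_initialized;	/* for demonstration only */

static int run_sk_lookup_prog(struct bpf_sk_lookup_kern *ctx)
{
	(void)ctx;
	return 1;	/* pretend the program returned SK_PASS */
}

static int inet6_lookup_run_bpf(const struct in6_addr *saddr, uint16_t sport,
				const struct in6_addr *daddr, uint16_t hnum)
{
	/* Test for an attached program *before* building the context, so
	 * the no-prog-attached path never pays for the rep stos zeroing
	 * seen in the annotation above. */
	if (!sk_lookup_prog)
		return 0;

	struct bpf_sk_lookup_kern ctx = {
		.family   = 10,	/* AF_INET6 */
		.protocol = 17,	/* IPPROTO_UDP */
		.saddr    = saddr,
		.daddr    = daddr,
		.sport    = sport,
		.dport    = hnum,
	};
	ctx_was_initialized = true;
	return run_sk_lookup_prog(&ctx);
}
```

With no program attached the function returns before `ctx` is ever touched,
which is exactly the cost the 26% rep stos sample points at.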
I'll switch to copying just the pointer to in6_addr{} and push the
context initialization after the test for net->sk_lookup_prog, like you
suggested.

I can post full output from perf-diff/annotate to some pastebin if you
would like to take a deeper look.

Thanks,
Jakub

>
>>
>> This makes me realize the copy is unnecessary, I could just store the
>> pointer to in6_addr{}. Will make this change in v3.
>>
>> As to why udp6 is taking a bigger hit than udp4 - comparing top 10 in
>> `perf report --no-children` shows that in our test setup, socket lookup
>> contributes less to CPU cycles on receive for udp4 than for udp6.
>>
>> * udp4 baseline (no children)
>>
>> # Overhead   Samples  Symbol
>> # ........  ........  ......................................
>> #
>>     8.11%     19429   [k] fib_table_lookup
>>     4.31%     10333   [k] udp_queue_rcv_one_skb
>>     3.75%      8991   [k] fib4_rule_action
>>     3.66%      8763   [k] __netif_receive_skb_core
>>     3.42%      8198   [k] fib_rules_lookup
>>     3.05%      7314   [k] fib4_rule_match
>>     2.71%      6507   [k] mlx5e_skb_from_cqe_linear
>>     2.58%      6192   [k] inet_gro_receive
>>     2.49%      5981   [k] __x86_indirect_thunk_rax
>>     2.36%      5656   [k] udp4_lib_lookup2
>>
>> * udp6 baseline (no children)
>>
>> # Overhead   Samples  Symbol
>> # ........  ........  ......................................
>> #
>>     4.63%     11100   [k] udpv6_queue_rcv_one_skb
>>     3.88%      9308   [k] __netif_receive_skb_core
>>     3.54%      8480   [k] udp6_lib_lookup2
>>     2.69%      6442   [k] mlx5e_skb_from_cqe_linear
>>     2.56%      6137   [k] ipv6_gro_receive
>>     2.31%      5540   [k] dev_gro_receive
>>     2.20%      5264   [k] do_csum
>>     2.02%      4835   [k] ip6_pol_route
>>     1.94%      4639   [k] __udp6_lib_lookup
>>     1.89%      4540   [k] selinux_socket_sock_rcv_skb
>>
>> Notice that __udp4_lib_lookup didn't even make the cut. That could
>> explain why adding instructions to __udp6_lib_lookup has more effect on
>> RX PPS.
>>
>> Frankly, that is something that surprised us, but we didn't have time to
>> investigate further, yet.
> The perf report should be able to annotate bpf prog also.
> e.g.
> may be part of it is because the bpf_prog itself is also dealing
> with a longer address?
>
>>
>> >> | with SK_LOOKUP prog attached | 2,520,474 ± 0.7%       |   -12.1% |
>> > This also looks very different from udp4.
>> >
>>
>> Thanks for the questions,
>> Jakub