bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] bpf: enable program stats
@ 2019-03-01 22:03 Guenter Roeck
  2019-03-01 22:17 ` Eric Dumazet
  2019-03-01 22:17 ` Eric Dumazet
  0 siblings, 2 replies; 4+ messages in thread
From: Guenter Roeck @ 2019-03-01 22:03 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Daniel Borkmann, netdev, bpf, linux-kernel

Hi,

On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote:
> JITed BPF programs are indistinguishable from kernel functions, but unlike
> kernel code BPF code can be changed often.
> Typical approach of "perf record" + "perf report" profiling and tuning of
> kernel code works just as well for BPF programs, but kernel code doesn't
> need to be monitored whereas BPF programs do.
> Users load and run large amount of BPF programs.
> These BPF stats allow tools monitor the usage of BPF on the server.
> The monitoring tools will turn sysctl kernel.bpf_stats_enabled
> on and off for few seconds to sample average cost of the programs.
> Aggregated data over hours and days will provide an insight into cost of BPF
> and alarms can trigger in case given program suddenly gets more expensive.
> 
> The cost of two sched_clock() per program invocation adds ~20 nsec.
> Fast BPF progs (like selftests/bpf/progs/test_pkt_access.c) will slow down
> from ~10 nsec to ~30 nsec.
> static_key minimizes the cost of the stats collection.
> There is no measurable difference before/after this patch
> with kernel.bpf_stats_enabled=0
> 

This patch causes my qemu tests for 'parisc' to crash. Reverting this patch
as well as "bpf: expose program stats via bpf_prog_info" fixes the problem.

Crash log and bisect results are attached. Bisect ends with the merge;
I identified the two patches manually.

I suspect that
	prog->aux->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags);
	...
	u64_stats_init(&prog->aux->stats->syncp);
may be wrong. At the very least it looks odd, and I don't find a similar use
of u64_stats_init() anywhere else in the kernel.

Guenter

---
Crash logs:

..
PDC_CHASSIS: Fault (1), CHASSIS  0
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
Kernel Fault: Code=18 (Data memory protection/unaligned access trap) at addr 10193fd0
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc8-next-20190301 #1
Hardware name: 9000/778/B160L
     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001000011111100001111 Not tainted
r00-03  00043f0f 1017ec10 10260078 2f83c380
r04-07  00000dc0 00810000 2f834620 10dbbb44
r08-11  10dcf000 00000002 10179528 10e4cc10
r12-15  10e4cc10 1013c010 00000008 f0018f04
r16-19  f001672e 00000001 00000028 1131d000
r20-23  00000000 11370fd8 00000018 00000006
r24-27  00000018 00000000 2f834620 10db9c10
r28-31  10193fc0 00000400 2f83c400 11370fc0
sr00-03  00000000 00000000 00000000 00000000
sr04-07  00000000 00000000 00000000 00000000
IASQ: 00000000 00000000 IAOQ: 10260088 1026008c
 IIR: 6b800020    ISR: 00000000  IOR: 10193fd0
 CPU:        0   CR30: 2f83c000 CR31: 00000000
 ORIG_R28: 00000000
 IAOQ[0]: bpf_prog_alloc+0x50/0x8c
 IAOQ[1]: bpf_prog_alloc+0x54/0x8c
 RP(r2): bpf_prog_alloc+0x40/0x8c
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
SeaBIOS wants SYSTEM HALT.

Note: The missing call is from ptp_classifier_init(), called from
sock_init(). The actual crash occurs in u64_stats_init().

---
Another crash log:

...
random: get_random_u32 called from bucket_table_alloc+0xa0/0x1e4 with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 1024 (order: 3, 49152 bytes)
xor: measuring software checksum speed
   8regs     :  1326.000 MB/sec
   8regs_prefetch:   719.000 MB/sec
   32regs    :   722.000 MB/sec
   32regs_prefetch:   690.000 MB/sec
xor: using function: 8regs (1326.000 MB/sec)
PDC_CHASSIS: Fault (1), CHASSIS  0
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
Kernel Fault: Code=18 (Data memory protection/unaligned access trap) at addr 10193fd0
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc8-next-20190301 #1
Hardware name: 9000/778/B160L
     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001100011111100001111 Not tainted
r00-03  00063f0f 1017ec10 10260078 2fc3c380
r04-07  00000dc0 00810000 2fc356e0 10dbbb44
r08-11  10dcf000 00000002 10179528 10e4cc10
r12-15  10e4cc10 1013c010 00000008 f0018f04
r16-19  f001672e 00000001 00000028 1131d000
r20-23  00000000 11370fd8 00000018 00000006
r24-27  00000018 00000000 2fc356e0 10db9c10
r28-31  10193fc0 00000400 2fc3c400 11370fc0
sr00-03  00000000 00000000 00000000 00000000
sr04-07  00000000 00000000 00000000 00000000
IASQ: 00000000 00000000 IAOQ: 10260088 1026008c
 IIR: 6b800020    ISR: 00000000  IOR: 10193fd0
 CPU:        0   CR30: 2fc3c000 CR31: 00000000
 ORIG_R28: 00000000
 IAOQ[0]: bpf_prog_alloc+0x50/0x8c
 IAOQ[1]: bpf_prog_alloc+0x54/0x8c
 RP(r2): bpf_prog_alloc+0x40/0x8c
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
SeaBIOS wants SYSTEM HALT.

---
# bad: [42fd8df9d1d9c83046226e771a4f5f3d05a706ff] Add linux-next specific files for 20190228
# good: [5908e6b738e3357af42c10e1183753c70a0117a9] Linux 5.0-rc8
git bisect start 'HEAD' 'v5.0-rc8'
# good: [7c3eb312dbd9653e6a932eec87ae55d0a0cc43b6] Merge remote-tracking branch 'net-next/master'
git bisect good 7c3eb312dbd9653e6a932eec87ae55d0a0cc43b6
# bad: [f7ae930a8cb9ca6429ac096c7fdac52560732524] Merge remote-tracking branch 'spi/for-next'
git bisect bad f7ae930a8cb9ca6429ac096c7fdac52560732524
# bad: [45821a15aebf0d27a37dfa9e8fc0ed4e6bfa9307] Merge remote-tracking branch 'drm/drm-next'
git bisect bad 45821a15aebf0d27a37dfa9e8fc0ed4e6bfa9307
# good: [148fb2e2e38769d99e3dfafdc03d06eeedcf7ad3] Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
git bisect good 148fb2e2e38769d99e3dfafdc03d06eeedcf7ad3
# good: [a5f2fafece141ef3509e686cea576366d55cabb6] Merge https://gitlab.freedesktop.org/drm/msm into drm-next
git bisect good a5f2fafece141ef3509e686cea576366d55cabb6
# bad: [07b71f8ab1b2714eef4f00eed8086fbeb30f85bf] Merge remote-tracking branch 'mtd/mtd/next'
git bisect bad 07b71f8ab1b2714eef4f00eed8086fbeb30f85bf
# bad: [3e23a64449722b73f941341f3a4b91a4963d3b0e] Merge remote-tracking branch 'wireless-drivers-next/master'
git bisect bad 3e23a64449722b73f941341f3a4b91a4963d3b0e
# bad: [236de6acd6dd91716cdc468e6ba3b413324d4064] Merge remote-tracking branch 'netfilter-next/master'
git bisect bad 236de6acd6dd91716cdc468e6ba3b413324d4064
# good: [d2e614cb0795d935aee879e47aab231247274f13] samples: bpf: fix: broken sample regarding removed function
git bisect good d2e614cb0795d935aee879e47aab231247274f13
# good: [03fe5efc4c9c8c7157092d0cbb5d1bbea629ce48] netfilter: nat: remove csum_update hook
git bisect good 03fe5efc4c9c8c7157092d0cbb5d1bbea629ce48
# good: [35acfbab6e5b229ea3c4807d1949c3ab8fda9818] netfilter: remove unneeded switch fall-through
git bisect good 35acfbab6e5b229ea3c4807d1949c3ab8fda9818
# good: [a01cbae57ec29b161d42ee1caa4ffffda5d519c2] netfilter: nft_set_hash: bogus element self comparison from deactivation path
git bisect good a01cbae57ec29b161d42ee1caa4ffffda5d519c2
# good: [123f89c8aab782854d8b27b1c01f3f2cff33c20d] netfilter: nft_set_hash: remove nft_hash_key()
git bisect good 123f89c8aab782854d8b27b1c01f3f2cff33c20d
# bad: [688a5cf30666a8145c3ef8da6d089e6a9cdc414d] Merge remote-tracking branch 'bpf-next/master'
git bisect bad 688a5cf30666a8145c3ef8da6d089e6a9cdc414d
# first bad commit: [688a5cf30666a8145c3ef8da6d089e6a9cdc414d] Merge remote-tracking branch 'bpf-next/master'


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] bpf: enable program stats
  2019-03-01 22:03 [PATCH] bpf: enable program stats Guenter Roeck
@ 2019-03-01 22:17 ` Eric Dumazet
  2019-03-01 22:17 ` Eric Dumazet
  1 sibling, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2019-03-01 22:17 UTC (permalink / raw)
  To: Guenter Roeck, Alexei Starovoitov
  Cc: Daniel Borkmann, netdev, bpf, linux-kernel



On 03/01/2019 02:03 PM, Guenter Roeck wrote:
> Hi,
> 
> On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote:
>> JITed BPF programs are indistinguishable from kernel functions, but unlike
>> kernel code BPF code can be changed often.
>> Typical approach of "perf record" + "perf report" profiling and tuning of
>> kernel code works just as well for BPF programs, but kernel code doesn't
>> need to be monitored whereas BPF programs do.
>> Users load and run large amount of BPF programs.
>> These BPF stats allow tools monitor the usage of BPF on the server.
>> The monitoring tools will turn sysctl kernel.bpf_stats_enabled
>> on and off for few seconds to sample average cost of the programs.
>> Aggregated data over hours and days will provide an insight into cost of BPF
>> and alarms can trigger in case given program suddenly gets more expensive.
>>
>> The cost of two sched_clock() per program invocation adds ~20 nsec.
>> Fast BPF progs (like selftests/bpf/progs/test_pkt_access.c) will slow down
>> from ~10 nsec to ~30 nsec.
>> static_key minimizes the cost of the stats collection.
>> There is no measurable difference before/after this patch
>> with kernel.bpf_stats_enabled=0
>>
> 
> This patch causes my qemu tests for 'parisc' to crash. Reverting this patch
> as well as "bpf: expose program stats via bpf_prog_info" fixes the problem.
> 
> Crash log and bisect results are attached. Bisect ends with the merge;
> I identified the two patches manually.
> 
> I suspect that
> 	prog->aux->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags);
> 	...
> 	u64_stats_init(&prog->aux->stats->syncp);
> may be wrong. At the very least it looks odd, and I don't find a similar use
> of u64_stats_init() anywhere else in the kernel.

Yes, a loop is needed there.

Something like :

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1c14c347f3cfe1f7c0cf8a7eccff8135b16df81f..3f08c257858e1570339cd64a6351824bcc332ee3 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -109,6 +109,7 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
 {
        gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
        struct bpf_prog *prog;
+       int cpu;
 
        prog = bpf_prog_alloc_no_stats(size, gfp_extra_flags);
        if (!prog)
@@ -121,7 +122,12 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
                return NULL;
        }
 
-       u64_stats_init(&prog->aux->stats->syncp);
+       for_each_possible_cpu(cpu) {
+               struct bpf_prog_stats *pstats;
+
+               pstats = per_cpu_ptr(prog->aux->stats, cpu);
+               u64_stats_init(&pstats->syncp);
+       }
        return prog;
 }
 EXPORT_SYMBOL_GPL(bpf_prog_alloc);


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] bpf: enable program stats
  2019-03-01 22:03 [PATCH] bpf: enable program stats Guenter Roeck
  2019-03-01 22:17 ` Eric Dumazet
@ 2019-03-01 22:17 ` Eric Dumazet
  2019-03-01 22:27   ` Guenter Roeck
  1 sibling, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2019-03-01 22:17 UTC (permalink / raw)
  To: Guenter Roeck, Alexei Starovoitov
  Cc: Daniel Borkmann, netdev, bpf, linux-kernel



On 03/01/2019 02:03 PM, Guenter Roeck wrote:
> Hi,
> 
> On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote:
>> JITed BPF programs are indistinguishable from kernel functions, but unlike
>> kernel code BPF code can be changed often.
>> Typical approach of "perf record" + "perf report" profiling and tuning of
>> kernel code works just as well for BPF programs, but kernel code doesn't
>> need to be monitored whereas BPF programs do.
>> Users load and run large amount of BPF programs.
>> These BPF stats allow tools monitor the usage of BPF on the server.
>> The monitoring tools will turn sysctl kernel.bpf_stats_enabled
>> on and off for few seconds to sample average cost of the programs.
>> Aggregated data over hours and days will provide an insight into cost of BPF
>> and alarms can trigger in case given program suddenly gets more expensive.
>>
>> The cost of two sched_clock() per program invocation adds ~20 nsec.
>> Fast BPF progs (like selftests/bpf/progs/test_pkt_access.c) will slow down
>> from ~10 nsec to ~30 nsec.
>> static_key minimizes the cost of the stats collection.
>> There is no measurable difference before/after this patch
>> with kernel.bpf_stats_enabled=0
>>
> 
> This patch causes my qemu tests for 'parisc' to crash. Reverting this patch
> as well as "bpf: expose program stats via bpf_prog_info" fixes the problem.
> 
> Crash log and bisect results are attached. Bisect ends with the merge;
> I identified the two patches manually.
> 
> I suspect that
> 	prog->aux->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags);
> 	...
> 	u64_stats_init(&prog->aux->stats->syncp);
> may be wrong. At the very least it looks odd, and I don't find a similar use
> of u64_stats_init() anywhere else in the kernel.

Yes, a loop is needed there.

Something like :

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1c14c347f3cfe1f7c0cf8a7eccff8135b16df81f..3f08c257858e1570339cd64a6351824bcc332ee3 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -109,6 +109,7 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
 {
        gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
        struct bpf_prog *prog;
+       int cpu;
 
        prog = bpf_prog_alloc_no_stats(size, gfp_extra_flags);
        if (!prog)
@@ -121,7 +122,12 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
                return NULL;
        }
 
-       u64_stats_init(&prog->aux->stats->syncp);
+       for_each_possible_cpu(cpu) {
+               struct bpf_prog_stats *pstats;
+
+               pstats = per_cpu_ptr(prog->aux->stats, cpu);
+               u64_stats_init(&pstats->syncp);
+       }
        return prog;
 }
 EXPORT_SYMBOL_GPL(bpf_prog_alloc);


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] bpf: enable program stats
  2019-03-01 22:17 ` Eric Dumazet
@ 2019-03-01 22:27   ` Guenter Roeck
  0 siblings, 0 replies; 4+ messages in thread
From: Guenter Roeck @ 2019-03-01 22:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexei Starovoitov, Daniel Borkmann, netdev, bpf, linux-kernel

On Fri, Mar 01, 2019 at 02:17:40PM -0800, Eric Dumazet wrote:
> 
> 
> On 03/01/2019 02:03 PM, Guenter Roeck wrote:
> > Hi,
> > 
> > On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote:
> >> JITed BPF programs are indistinguishable from kernel functions, but unlike
> >> kernel code BPF code can be changed often.
> >> Typical approach of "perf record" + "perf report" profiling and tuning of
> >> kernel code works just as well for BPF programs, but kernel code doesn't
> >> need to be monitored whereas BPF programs do.
> >> Users load and run large amount of BPF programs.
> >> These BPF stats allow tools monitor the usage of BPF on the server.
> >> The monitoring tools will turn sysctl kernel.bpf_stats_enabled
> >> on and off for few seconds to sample average cost of the programs.
> >> Aggregated data over hours and days will provide an insight into cost of BPF
> >> and alarms can trigger in case given program suddenly gets more expensive.
> >>
> >> The cost of two sched_clock() per program invocation adds ~20 nsec.
> >> Fast BPF progs (like selftests/bpf/progs/test_pkt_access.c) will slow down
> >> from ~10 nsec to ~30 nsec.
> >> static_key minimizes the cost of the stats collection.
> >> There is no measurable difference before/after this patch
> >> with kernel.bpf_stats_enabled=0
> >>
> > 
> > This patch causes my qemu tests for 'parisc' to crash. Reverting this patch
> > as well as "bpf: expose program stats via bpf_prog_info" fixes the problem.
> > 
> > Crash log and bisect results are attached. Bisect ends with the merge;
> > I identified the two patches manually.
> > 
> > I suspect that
> > 	prog->aux->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags);
> > 	...
> > 	u64_stats_init(&prog->aux->stats->syncp);
> > may be wrong. At the very least it looks odd, and I don't find a similar use
> > of u64_stats_init() anywhere else in the kernel.
> 
> Yes, a loop is needed there.
> 
> Something like :
> 
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 1c14c347f3cfe1f7c0cf8a7eccff8135b16df81f..3f08c257858e1570339cd64a6351824bcc332ee3 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -109,6 +109,7 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
>  {
>         gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags;
>         struct bpf_prog *prog;
> +       int cpu;
>  
>         prog = bpf_prog_alloc_no_stats(size, gfp_extra_flags);
>         if (!prog)
> @@ -121,7 +122,12 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
>                 return NULL;
>         }
>  
> -       u64_stats_init(&prog->aux->stats->syncp);
> +       for_each_possible_cpu(cpu) {
> +               struct bpf_prog_stats *pstats;
> +
> +               pstats = per_cpu_ptr(prog->aux->stats, cpu);
> +               u64_stats_init(&pstats->syncp);
> +       }
>         return prog;
>  }
>  EXPORT_SYMBOL_GPL(bpf_prog_alloc);
> 

Yes, that works, or at least my test no longer crashes after applying the
above patch. Feel free to add

Tested-by: Guenter Roeck <linux@roeck-us.net>

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-03-01 22:27 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-01 22:03 [PATCH] bpf: enable program stats Guenter Roeck
2019-03-01 22:17 ` Eric Dumazet
2019-03-01 22:17 ` Eric Dumazet
2019-03-01 22:27   ` Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).