linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: APPS Crash - Page fault Execute error: PC = bfqg_stats_update_idle_time+0x14/0x50 LR = bfq_dispatch_request+0x398/0x948
       [not found] <ade528d81fda9fe1d85b71ffb41c1aeb@codeaurora.org>
@ 2019-09-09 11:09 ` ppvk
  0 siblings, 0 replies; only message in thread
From: ppvk @ 2019-09-09 11:09 UTC (permalink / raw)
  To: paolo.valente; +Cc: linux-block, axboe

Hi Paolo,

Did you get a chance to take a look on this issue ?

Regards,
Pradeep

On 2019-09-06 21:09, ppvk@codeaurora.org wrote:
> Hi Paolo,
> 
> Good Evening!!
> 
> We started using BFQ IO Sched. from last 3 Months. it was awesome !!
> 
> Its been running great, but we just hit a "free-after-use" crash.
> We are running stability test (includes heavy IO, video playbacks etc)
> for 48~72hrs. We encountered below error. This crash is seen 2 to 3
> times during our stability testing.
> 
>    120.444910:   <6> binder: undelivered transaction 163882, process 
> died.
>    120.470254:   <6> binder: undelivered transaction 163953, process 
> died.
>    120.474841:   <6> Unable to handle kernel paging request at virtual
> address 006b6b6b6b6b7773
>    120.484651:   <6> Mem abort info:
>    120.487534:   <6>   ESR = 0x96000004
>    120.490680:   <6>   Exception class = DABT (current EL), IL = 32 
> bits
>    120.496760:   <6>   SET = 0, FnV = 0
>    120.499906:   <6>   EA = 0, S1PTW = 0
>    120.503145:   <6> Data abort info:
>    120.506111:   <6>   ISV = 0, ISS = 0x00000004
>    120.510056:   <6>   CM = 0, WnR = 0
>    120.513106:   <6> [006b6b6b6b6b7773] address between user and
> kernel address ranges
>    120.520437:   <6> Internal error: Oops: 96000004 [#1] PREEMPT SMP
>    120.522998:   <6> binder_alloc: 12672: binder_alloc_buf, no vma
>    120.526162:   <6> Modules linked in: wlan(O) machine_dlkm(O)
> wcd938x_slave_dlkm(O) wcd938x_dlkm(O) wcd9xxx_dlkm(O) mbhc_dlkm(O)
> tx_macro_dlkm(O) rx_macro_dlkm(O) va_macro_dlkm(O) wsa_macro_dlkm(O)
> swr_ctrl_dlkm(O) bolero_cdc_dlkm(O) wsa881x_dlkm(O) wcd_core_dlkm(O)
> stub_dlkm(O) hdmi_dlkm(O) swr_dlkm(O) pinctrl_lpi_dlkm(O) usf_dlkm(O)
> native_dlkm(O) platform_dlkm(O) q6_dlkm(O) adsp_loader_dlkm(O)
> apr_dlkm(O) snd_event_dlkm(O) q6_notifier_dlkm(O) q6_pdr_dlkm(O)
>    120.537723:   <6> binder: 1280:9939 transaction failed 29189/-3,
> size 80-0 line 3277
>    120.572951:   <6> Process kworker/4:1H (pid: 310, stack limit =
> 0xffffff8010670000)
>    120.572958:   <6> CPU: 4 PID: 310 Comm: kworker/4:1H Tainted: G S
>    W  O      4.19.66+ #1
>    120.572961:   <6> Hardware name: Qualcomm Technologies, Inc. Lito 
> MTP (DT)
>    120.572973:   <6> Workqueue: kblockd blk_mq_run_work_fn
>    120.572980:   <2> pstate: a0c00085 (NzCv daIf +PAN +UAO)
>    120.572988:   <2> pc : bfqg_stats_update_idle_time+0x14/0x50
>    120.572992:   <2> lr : bfq_dispatch_request+0x398/0x948
>    120.572995:   <2> sp : ffffff8010673bb0
>    120.572998:   <2> x29: ffffff8010673bc0 x28: 0000000000000001
>    120.573004:   <2> x27: 0000001c0c5e468f x26: fffffff6aa182a80
>    120.573011:   <2> x25: 0000000000000000 x24: fffffff6aa197a80
>    120.573018:   <2> x23: fffffff63feb4008 x22: 0000000000000001
>    120.573024:   <2> x21: fffffff6ab1eee08 x20: fffffff63feb4008
>    120.573030:   <2> x19: 6b6b6b6b6b6b6ad3 x18: 0000000000000004
>    120.573037:   <2> x17: 000000000000775f x16: 0000000000004e20
>    120.573043:   <2> x15: fffffff5dcd83d40 x14: 2faa9e9b959219ea
>    120.573049:   <2> x13: 0000000000000004 x12: 0000000089c9fa08
>    120.573056:   <2> x11: 2a768cc68edc0700 x10: 0000000000000000
>    120.573062:   <2> x9 : 0000000000000001 x8 : 6b6b6b6b6b6b6b6b
>    120.573069:   <2> x7 : ffffff943cfb1678 x6 : 0000000000000000
>    120.573075:   <2> x5 : 0000000000000080 x4 : 0000000000000001
>    120.573081:   <2> x3 : 0000000000000000 x2 : 0000000000000000
>    120.573087:   <2> x1 : 0000000000000000 x0 : 6b6b6b6b6b6b6ad3
>    120.573098:   <6>
>    :
>    :
>    121.185250:   <2> Call trace:
>    121.187772:   <2>  bfqg_stats_update_idle_time+0x14/0x50
>    121.192701:   <2>  bfq_dispatch_request+0x398/0x948
>    121.197188:   <2>  blk_mq_do_dispatch_sched+0x84/0x118
>    121.198271:   <6> CPU7: update max cpu_capacity 1024
>    121.206504:   <2>  blk_mq_sched_dispatch_requests+0x130/0x190
>    121.211874:   <2>  __blk_mq_run_hw_queue+0xcc/0x148
>    121.216359:   <2>  blk_mq_run_work_fn+0x24/0x30
>    121.220490:   <2>  process_one_work+0x328/0x6b0
>    121.224620:   <2>  worker_thread+0x330/0x4d0
>    121.228476:   <2>  kthread+0x128/0x138
>    121.231806:   <2>  ret_from_fork+0x10/0x1c
>    121.235484:   <6> Code: a9017bfd 910043fd aa0003f3 d503201f 
> (39728268)
>    121.241751:   <6> ---[ end trace 83177e232bbbf1f0 ]---
>    121.246514:   <6> Kernel panic - not syncing: Fatal exception
>    121.251896:   <6> SMP: stopping secondary CPUs
>    121.255938:   <6> CPU1: stopping
> 
> After doing code walk on bfq-iosched, i could see that in fn.
> bfq_dispatch_request()
> 
> in_serv_queue is assigned with bfqd->in_service_queue.
> 
> and after this, we are getting the request using 
> __bfq_dispatch_request() fn.
> 
> 
> In __bfq_dispatch_request() fn., we see that bfqd->in_service_queue is
> being updated via. __bfq_dispatch_request --> bfq_select_queue -->
> bfq_set_in_service_queue -->  __bfq_set_in_service_queue.
> I presumed that the in service queue time/budget got expired and it
> selected a new queue from a different service group.
> (If there are no more queues to be served, it could pick the new queue
> from it's next group of service tree.) during this updation, we are
> not updating the variable "in_serv_queue", we are passing same it to
> bfq_update_dispatch_stats() fn. and resulting in above crash while
> updating bfqg idle time stats.
> 
> 
> I also seen the comments written just above 
> bfgq_stats_update_idle_time() fn. as
> " Since the idle timer has been disabled, in_serv_queue contained some
> request when __bfq_dispatch_request
> was invoked above, which implies that rq was picked exactly from
> in_serv_queue. Thus in_serv_queue == bfqq,
> and is therefore guaranteed to exist because of the above arguments."
> 
> if it is so, can we suppose to use "bfqq" instead of "in_serv_queue"
> as bfqq is extracted from request "rq" and the "rq" is updated
> properly by __bfq_dispatch_request() fn. ?
> 
> Please correct me if my understanding was wrong.
> 
> Many thanks in advance !!
> 
> Best Regards,
> Pradeep

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2019-09-09 11:09 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <ade528d81fda9fe1d85b71ffb41c1aeb@codeaurora.org>
2019-09-09 11:09 ` APPS Crash - Page fault Execute error: PC = bfqg_stats_update_idle_time+0x14/0x50 LR = bfq_dispatch_request+0x398/0x948 ppvk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).