linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block: add missing block_bio_complete() tracepoint
@ 2012-01-29  9:41 Namhyung Kim
  2012-01-29 19:24 ` Tejun Heo
  2012-01-30  6:38 ` Namhyung Kim
  0 siblings, 2 replies; 16+ messages in thread
From: Namhyung Kim @ 2012-01-29  9:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, Tejun Heo, Steven Rostedt, dm-devel

The block_bio_complete() TP has been missed so long, so that bio-based
drivers haven't been able to trace its IO behavior. Add it.

In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.

>From now on, request-based drivers will also get BLK_TA_COMPLETEs for
all bio's in requests. This needs to be handled in userland properly.

Also remove external use of the TP in DM and unexport it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: dm-devel@redhat.com
---
* v2:
 - Merge previous patches into one as suggested by Tejun.
 - Drop BIO_COMPLETE_MASK. Now I think it should be handled in userland
   like other (maybe duplicated, in some sense) bio complete reports.

 block/blk-core.c             |    1 -
 drivers/md/dm.c              |    1 -
 fs/bio.c                     |    2 ++
 include/trace/events/block.h |    4 +++-
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1b4fd93af2c0..399c128f516c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -37,7 +37,6 @@
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
 EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
-EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete);
 
 DEFINE_IDA(blk_queue_ida);
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 4720f68f817e..01185fa0eb74 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -648,7 +648,6 @@ static void dec_pending(struct dm_io *io, int error)
 			queue_io(md, bio);
 		} else {
 			/* done with normal IO or empty flush */
-			trace_block_bio_complete(md->queue, bio, io_error);
 			bio_endio(bio, io_error);
 		}
 	}
diff --git a/fs/bio.c b/fs/bio.c
index b1fe82cf88cf..14c03eaf384e 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1447,6 +1447,8 @@ void bio_endio(struct bio *bio, int error)
 	else if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
 		error = -EIO;
 
+	trace_block_bio_complete(bdev_get_queue(bio->bi_bdev), bio, error);
+
 	if (bio->bi_end_io)
 		bio->bi_end_io(bio, error);
 }
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 05c5e61f0a7c..96955f4828b3 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -213,12 +213,14 @@ TRACE_EVENT(block_bio_bounce,
  * This tracepoint indicates there is no further work to do on this
  * block IO operation @bio.
  */
-TRACE_EVENT(block_bio_complete,
+TRACE_EVENT_CONDITION(block_bio_complete,
 
 	TP_PROTO(struct request_queue *q, struct bio *bio, int error),
 
 	TP_ARGS(q, bio, error),
 
+	TP_CONDITION(bio->bi_bdev != NULL),
+
 	TP_STRUCT__entry(
 		__field( dev_t,		dev		)
 		__field( sector_t,	sector		)
-- 
1.7.8.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-29  9:41 [PATCH] block: add missing block_bio_complete() tracepoint Namhyung Kim
@ 2012-01-29 19:24 ` Tejun Heo
  2012-01-30  1:44   ` Namhyung Kim
  2012-01-30  6:38 ` Namhyung Kim
  1 sibling, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2012-01-29 19:24 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

On Sun, Jan 29, 2012 at 06:41:33PM +0900, Namhyung Kim wrote:
> The block_bio_complete() TP has been missed so long, so that bio-based
> drivers haven't been able to trace its IO behavior. Add it.
> 
> In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
> Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.
> 
> From now on, request-based drivers will also get BLK_TA_COMPLETEs for
> all bio's in requests. This needs to be handled in userland properly.
> 
> Also remove external use of the TP in DM and unexport it.
> 
> Signed-off-by: Namhyung Kim <namhyung@gmail.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: dm-devel@redhat.com

I like the smiplicity change but do we know how we can filter this out
from userland?  Also, what's the reason not to do it from blktrace.c?
What would be the downside of doing that?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-29 19:24 ` Tejun Heo
@ 2012-01-30  1:44   ` Namhyung Kim
  2012-01-30  1:47     ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-01-30  1:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: dm-devel

Hello,

2012-01-30 4:24 AM, Tejun Heo wrote:
> On Sun, Jan 29, 2012 at 06:41:33 PM +0900, Namhyung Kim wrote:
>> The block_bio_complete() TP has been missed so long, so that bio-based
>> drivers haven't been able to trace its IO behavior. Add it.
>>
>> In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
>> Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.
>>
>>  From now on, request-based drivers will also get BLK_TA_COMPLETEs for
>> all bio's in requests. This needs to be handled in userland properly.
>>
>> Also remove external use of the TP in DM and unexport it.
>>
>> Signed-off-by: Namhyung Kim <namhyung@gmail.com>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: dm-devel@redhat.com
>
> I like the smiplicity change but do we know how we can filter this out
> from userland?  Also, what's the reason not to do it from blktrace.c?
> What would be the downside of doing that?
>
> Thanks.
>

The userland tool cannot distinguish bounced bio from original one at 
completion TP, but it can expect there will be a duplicated 
BLK_TA_COMPLETE as it sees BLK_TA_BOUNCE for the bio before.

Filtering it out from kernel side seems to hide a real information that 
(paranoid?) user might want to get, and it looks like providing "polcy 
not mechanism" IMHO. That's why I changed my mind finally.

I cannot think of the downside, anyway it's not a big deal, if you think 
it's wrong choice, I'm OK to change it again.


Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  1:44   ` Namhyung Kim
@ 2012-01-30  1:47     ` Tejun Heo
  2012-01-30  2:22       ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2012-01-30  1:47 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

On Mon, Jan 30, 2012 at 10:44:19AM +0900, Namhyung Kim wrote:
> The userland tool cannot distinguish bounced bio from original one
> at completion TP, but it can expect there will be a duplicated
> BLK_TA_COMPLETE as it sees BLK_TA_BOUNCE for the bio before.
> 
> Filtering it out from kernel side seems to hide a real information
> that (paranoid?) user might want to get, and it looks like providing
> "polcy not mechanism" IMHO. That's why I changed my mind finally.
> 
> I cannot think of the downside, anyway it's not a big deal, if you
> think it's wrong choice, I'm OK to change it again.

It's just that this patch as it stands will break the existing tools
and is likely to cause some amount of head scratching for blktrace
users upgrading to new kernel with existing userland, so yeah, I think
it should be filtered in kernel from blktrace.c.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  1:47     ` Tejun Heo
@ 2012-01-30  2:22       ` Namhyung Kim
  2012-01-30  2:30         ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-01-30  2:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

2012-01-30 10:47 AM, Tejun Heo wrote:
> On Mon, Jan 30, 2012 at 10:44:19 AM +0900, Namhyung Kim wrote:
>> The userland tool cannot distinguish bounced bio from original one
>> at completion TP, but it can expect there will be a duplicated
>> BLK_TA_COMPLETE as it sees BLK_TA_BOUNCE for the bio before.
>>
>> Filtering it out from kernel side seems to hide a real information
>> that (paranoid?) user might want to get, and it looks like providing
>> "polcy not mechanism" IMHO. That's why I changed my mind finally.
>>
>> I cannot think of the downside, anyway it's not a big deal, if you
>> think it's wrong choice, I'm OK to change it again.
>
> It's just that this patch as it stands will break the existing tools
> and is likely to cause some amount of head scratching for blktrace
> users upgrading to new kernel with existing userland, so yeah, I think
> it should be filtered in kernel from blktrace.c.
>
> Thanks.
>

Will it break blktrace? Looking at the code, not tested though, it will 
just add one more 'C' line for bounced bio, and that's it? The blktrace 
will get way more 'C' lines for normal request based devices and it 
needs to be handled anyway. Am I missing something?

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  2:22       ` Namhyung Kim
@ 2012-01-30  2:30         ` Tejun Heo
  2012-01-30  2:49           ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2012-01-30  2:30 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

On Mon, Jan 30, 2012 at 11:22:29AM +0900, Namhyung Kim wrote:
> >It's just that this patch as it stands will break the existing tools
> >and is likely to cause some amount of head scratching for blktrace
> >users upgrading to new kernel with existing userland, so yeah, I think
> >it should be filtered in kernel from blktrace.c.
> 
> Will it break blktrace? Looking at the code, not tested though, it
> will just add one more 'C' line for bounced bio, and that's it? The
> blktrace will get way more 'C' lines for normal request based
> devices and it needs to be handled anyway. Am I missing something?

Hmmm... maybe I'm confused.  Wouldn't it cause duplicate completions
for all rq based drivers too?

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  2:30         ` Tejun Heo
@ 2012-01-30  2:49           ` Namhyung Kim
  2012-01-30  2:53             ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-01-30  2:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

2012-01-30 11:30 AM, Tejun Heo wrote:
> On Mon, Jan 30, 2012 at 11:22:29AM +0900, Namhyung Kim wrote:
>>> It's just that this patch as it stands will break the existing tools
>>> and is likely to cause some amount of head scratching for blktrace
>>> users upgrading to new kernel with existing userland, so yeah, I think
>>> it should be filtered in kernel from blktrace.c.
>>
>> Will it break blktrace? Looking at the code, not tested though, it
>> will just add one more 'C' line for bounced bio, and that's it? The
>> blktrace will get way more 'C' lines for normal request based
>> devices and it needs to be handled anyway. Am I missing something?
>
> Hmmm... maybe I'm confused.  Wouldn't it cause duplicate completions
> for all rq based drivers too?
>

Yes, blktrace will see both of completions for rq and bio's it contains 
from now on, and I thought it's your intention? For the bounced bio's, 
it'll see one more completion per bio.

The users of older blktrace might get confused by this, thus changing 
the tool is inevitable regardless of the bounced bio's.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  2:49           ` Namhyung Kim
@ 2012-01-30  2:53             ` Tejun Heo
  2012-01-30  5:51               ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2012-01-30  2:53 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

Hello,

On Mon, Jan 30, 2012 at 11:49:43AM +0900, Namhyung Kim wrote:
> Yes, blktrace will see both of completions for rq and bio's it
> contains from now on, and I thought it's your intention? For the
> bounced bio's, it'll see one more completion per bio.

Umm... no, blktrace users shouldn't see confusing outputs.  My point
was that we should leave raw tracepoints as raw monitor points so that
they can provide raw data to different users and filter from
blktrace.c so that the existing users aren't disturbed.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  2:53             ` Tejun Heo
@ 2012-01-30  5:51               ` Namhyung Kim
  2012-01-30  5:54                 ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-01-30  5:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

Hi,

2012-01-30 11:53 AM, Tejun Heo wrote:
> Hello,
>
> On Mon, Jan 30, 2012 at 11:49:43 AM +0900, Namhyung Kim wrote:
>> Yes, blktrace will see both of completions for rq and bio's it
>> contains from now on, and I thought it's your intention? For the
>> bounced bio's, it'll see one more completion per bio.
>
> Umm... no, blktrace users shouldn't see confusing outputs.  My point
> was that we should leave raw tracepoints as raw monitor points so that
> they can provide raw data to different users and filter from
> blktrace.c so that the existing users aren't disturbed.
>

That means blktrace should not be changed, and not see duplicated 
completions for rq based drivers at all?

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  5:51               ` Namhyung Kim
@ 2012-01-30  5:54                 ` Tejun Heo
  2012-01-30  6:02                   ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2012-01-30  5:54 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

On Mon, Jan 30, 2012 at 02:51:43PM +0900, Namhyung Kim wrote:
> Hi,
> 
> 2012-01-30 11:53 AM, Tejun Heo wrote:
> >Hello,
> >
> >On Mon, Jan 30, 2012 at 11:49:43 AM +0900, Namhyung Kim wrote:
> >>Yes, blktrace will see both of completions for rq and bio's it
> >>contains from now on, and I thought it's your intention? For the
> >>bounced bio's, it'll see one more completion per bio.
> >
> >Umm... no, blktrace users shouldn't see confusing outputs.  My point
> >was that we should leave raw tracepoints as raw monitor points so that
> >they can provide raw data to different users and filter from
> >blktrace.c so that the existing users aren't disturbed.
> >
> 
> That means blktrace should not be changed, and not see duplicated
> completions for rq based drivers at all?

Yeap

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  5:54                 ` Tejun Heo
@ 2012-01-30  6:02                   ` Namhyung Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2012-01-30  6:02 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

2012-01-30 2:54 PM, Tejun Heo wrote:
> On Mon, Jan 30, 2012 at 02:51:43PM +0900, Namhyung Kim wrote:
>> Hi,
>>
>> 2012-01-30 11:53 AM, Tejun Heo wrote:
>>> Hello,
>>>
>>> On Mon, Jan 30, 2012 at 11:49:43 AM +0900, Namhyung Kim wrote:
>>>> Yes, blktrace will see both of completions for rq and bio's it
>>>> contains from now on, and I thought it's your intention? For the
>>>> bounced bio's, it'll see one more completion per bio.
>>>
>>> Umm... no, blktrace users shouldn't see confusing outputs.  My point
>>> was that we should leave raw tracepoints as raw monitor points so that
>>> they can provide raw data to different users and filter from
>>> blktrace.c so that the existing users aren't disturbed.
>>>
>>
>> That means blktrace should not be changed, and not see duplicated
>> completions for rq based drivers at all?
>
> Yeap
>

I see. I'll prepare the patch soon.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-29  9:41 [PATCH] block: add missing block_bio_complete() tracepoint Namhyung Kim
  2012-01-29 19:24 ` Tejun Heo
@ 2012-01-30  6:38 ` Namhyung Kim
  2012-01-30 17:05   ` Tejun Heo
  1 sibling, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-01-30  6:38 UTC (permalink / raw)
  Cc: Jens Axboe, linux-kernel, Tejun Heo, Steven Rostedt, dm-devel

2012-01-29 6:41 PM, Namhyung Kim wrote:
> The block_bio_complete() TP has been missed so long, so that bio-based
> drivers haven't been able to trace its IO behavior. Add it.
>
> In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
> Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.
>

Now I see that it seems TRACE_EVENT_CONDITION() can protect event 
tracing from such condition, but what about other users of the TP like 
blktrace? I think it'll still get NULL pointer dereference on 
bdev_get_queue() after the change, right? If so, convert to T_E_C() 
looks meaningless IMHO. Do I miss something?

Thanks,
Namhyung


>  From now on, request-based drivers will also get BLK_TA_COMPLETEs for
> all bio's in requests. This needs to be handled in userland properly.
>
> Also remove external use of the TP in DM and unexport it.
>
> Signed-off-by: Namhyung Kim <namhyung@gmail.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: dm-devel@redhat.com
> ---
> * v2:
>   - Merge previous patches into one as suggested by Tejun.
>   - Drop BIO_COMPLETE_MASK. Now I think it should be handled in userland
>     like other (maybe duplicated, in some sense) bio complete reports.
>
>   block/blk-core.c             |    1 -
>   drivers/md/dm.c              |    1 -
>   fs/bio.c                     |    2 ++
>   include/trace/events/block.h |    4 +++-
>   4 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 1b4fd93af2c0..399c128f516c 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -37,7 +37,6 @@
>
>   EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
>   EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
> -EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete);
>
>   DEFINE_IDA(blk_queue_ida);
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 4720f68f817e..01185fa0eb74 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -648,7 +648,6 @@ static void dec_pending(struct dm_io *io, int error)
>   			queue_io(md, bio);
>   		} else {
>   			/* done with normal IO or empty flush */
> -			trace_block_bio_complete(md->queue, bio, io_error);
>   			bio_endio(bio, io_error);
>   		}
>   	}
> diff --git a/fs/bio.c b/fs/bio.c
> index b1fe82cf88cf..14c03eaf384e 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -1447,6 +1447,8 @@ void bio_endio(struct bio *bio, int error)
>   	else if (!test_bit(BIO_UPTODATE,&bio->bi_flags))
>   		error = -EIO;
>
> +	trace_block_bio_complete(bdev_get_queue(bio->bi_bdev), bio, error);
> +
>   	if (bio->bi_end_io)
>   		bio->bi_end_io(bio, error);
>   }
> diff --git a/include/trace/events/block.h b/include/trace/events/block.h
> index 05c5e61f0a7c..96955f4828b3 100644
> --- a/include/trace/events/block.h
> +++ b/include/trace/events/block.h
> @@ -213,12 +213,14 @@ TRACE_EVENT(block_bio_bounce,
>    * This tracepoint indicates there is no further work to do on this
>    * block IO operation @bio.
>    */
> -TRACE_EVENT(block_bio_complete,
> +TRACE_EVENT_CONDITION(block_bio_complete,
>
>   	TP_PROTO(struct request_queue *q, struct bio *bio, int error),
>
>   	TP_ARGS(q, bio, error),
>
> +	TP_CONDITION(bio->bi_bdev != NULL),
> +
>   	TP_STRUCT__entry(
>   		__field( dev_t,		dev		)
>   		__field( sector_t,	sector		)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30  6:38 ` Namhyung Kim
@ 2012-01-30 17:05   ` Tejun Heo
  2012-01-31  6:30     ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2012-01-30 17:05 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

On Mon, Jan 30, 2012 at 03:38:57PM +0900, Namhyung Kim wrote:
> 2012-01-29 6:41 PM, Namhyung Kim wrote:
> >The block_bio_complete() TP has been missed so long, so that bio-based
> >drivers haven't been able to trace its IO behavior. Add it.
> >
> >In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
> >Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.
> >
> 
> Now I see that it seems TRACE_EVENT_CONDITION() can protect event
> tracing from such condition, but what about other users of the TP
> like blktrace? I think it'll still get NULL pointer dereference on
> bdev_get_queue() after the change, right? If so, convert to T_E_C()
> looks meaningless IMHO. Do I miss something?

Not really following, but the whole point of using
TRACE_EVENT_CONDITION() is avoiding the conditional jump when the TP
is disabled.  Whether the TP users need to test again / more isn't too
important.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-30 17:05   ` Tejun Heo
@ 2012-01-31  6:30     ` Namhyung Kim
  2012-01-31 10:39       ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-01-31  6:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

Hi,

2012-01-31 2:05 AM, Tejun Heo wrote:
> On Mon, Jan 30, 2012 at 03:38:57PM +0900, Namhyung Kim wrote:
>> 2012-01-29 6:41 PM, Namhyung Kim wrote:
>>> The block_bio_complete() TP has been missed so long, so that bio-based
>>> drivers haven't been able to trace its IO behavior. Add it.
>>>
>>> In some rare cases, such as loop_switch, @bio->bi_bdev can be NULL.
>>> Thus convert it to TRACE_EVENT_CONDITION() as Steven suggested.
>>>
>>
>> Now I see that it seems TRACE_EVENT_CONDITION() can protect event
>> tracing from such condition, but what about other users of the TP
>> like blktrace? I think it'll still get NULL pointer dereference on
>> bdev_get_queue() after the change, right? If so, convert to T_E_C()
>> looks meaningless IMHO. Do I miss something?
>
> Not really following, but the whole point of using
> TRACE_EVENT_CONDITION() is avoiding the conditional jump when the TP
> is disabled.  Whether the TP users need to test again / more isn't too
> important.
>
> Thanks.
>

Right, but the point is it could make a NULL pointer dereference during 
evaluation of the argument of the TP AFAICS. I'm not sure about the TP 
implementation though, I think I was wrong - T_E_C() cannot protect us 
from it because it happens just before jumping to the TP, right?

So I think we need a conditional jump (with the "likely" annotation) for 
this even when the TP is disabled.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-31  6:30     ` Namhyung Kim
@ 2012-01-31 10:39       ` Tejun Heo
  2012-02-01  2:18         ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2012-01-31 10:39 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

Hello,

On Mon, Jan 30, 2012 at 10:30 PM, Namhyung Kim <namhyung@gmail.com> wrote:
> Right, but the point is it could make a NULL pointer dereference during
> evaluation of the argument of the TP AFAICS. I'm not sure about the TP
> implementation though, I think I was wrong - T_E_C() cannot protect us from
> it because it happens just before jumping to the TP, right?
>
> So I think we need a conditional jump (with the "likely" annotation) for
> this even when the TP is disabled.

Hmmm... still not following. Where the said NULL dereference happen?
TEC conditional is equivalent to "if (COND) TP;".  If you don't use
TEC, it'll be "if (COND) if (TP enabled) TP;".  With TEC, it will be
"if (TP enabled) if (COND) TP;".  There's no other difference.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] block: add missing block_bio_complete() tracepoint
  2012-01-31 10:39       ` Tejun Heo
@ 2012-02-01  2:18         ` Namhyung Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2012-02-01  2:18 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-kernel, Steven Rostedt, dm-devel

Hi,

2012-01-31 7:39 PM, Tejun Heo wrote:
> Hello,
>
> On Mon, Jan 30, 2012 at 10:30 PM, Namhyung Kim<namhyung@gmail.com>  wrote:
>> Right, but the point is it could make a NULL pointer dereference during
>> evaluation of the argument of the TP AFAICS. I'm not sure about the TP
>> implementation though, I think I was wrong - T_E_C() cannot protect us from
>> it because it happens just before jumping to the TP, right?
>>
>> So I think we need a conditional jump (with the "likely" annotation) for
>> this even when the TP is disabled.
>
> Hmmm... still not following. Where the said NULL dereference happen?
> TEC conditional is equivalent to "if (COND) TP;".  If you don't use
> TEC, it'll be "if (COND) if (TP enabled) TP;".  With TEC, it will be
> "if (TP enabled) if (COND) TP;".  There's no other difference.
>
> Thanks.
>

I've made a quick investigation on TP implementation, and finally 
figured out what I was wrong - I thought the COND would be checkd in a 
probe, but it's not. Thanks for pointing it out.

However, for some reason, it seems gcc generated code that evaluates the 
arguments - bdev_get_queue() in this case - before checking the COND. 
Simple test module below caused a NULL pointer dereference when I used 
TRACE_EVENT_CONDITION(), but not for conditional jump:

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/bio.h>

static int __init init_mod(void)
{
         struct bio *bio = bio_alloc(GFP_KERNEL, 0);
         bio_endio(bio, 0);
         bio_put(bio);
         return 0;
}

static void __exit exit_mod(void)
{
}

module_init(init_mod);
module_exit(exit_mod);


Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-02-01  2:18 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-29  9:41 [PATCH] block: add missing block_bio_complete() tracepoint Namhyung Kim
2012-01-29 19:24 ` Tejun Heo
2012-01-30  1:44   ` Namhyung Kim
2012-01-30  1:47     ` Tejun Heo
2012-01-30  2:22       ` Namhyung Kim
2012-01-30  2:30         ` Tejun Heo
2012-01-30  2:49           ` Namhyung Kim
2012-01-30  2:53             ` Tejun Heo
2012-01-30  5:51               ` Namhyung Kim
2012-01-30  5:54                 ` Tejun Heo
2012-01-30  6:02                   ` Namhyung Kim
2012-01-30  6:38 ` Namhyung Kim
2012-01-30 17:05   ` Tejun Heo
2012-01-31  6:30     ` Namhyung Kim
2012-01-31 10:39       ` Tejun Heo
2012-02-01  2:18         ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).