linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] io_uring: Add to traces the req pointer when available
       [not found] <60ac946e.1c69fb81.5efc2.65deSMTPIN_ADDED_MISSING@mx.google.com>
@ 2021-05-25  8:21 ` Pavel Begunkov
  2021-05-25  8:33   ` Pavel Begunkov
                     ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Pavel Begunkov @ 2021-05-25  8:21 UTC (permalink / raw)
  To: Olivier Langlois, Jens Axboe, Steven Rostedt, Ingo Molnar,
	io-uring, linux-kernel

On 5/25/21 6:54 AM, Olivier Langlois wrote:
> The req pointer uniquely identify a specific request.
> Having it in traces can provide valuable insights that is not possible
> to have if the calling process is reusing the same user_data value.

How about hashing kernel pointers per discussed? Even if it's better
to have it done by tracing or something as you mentioned, there is no
such a thing at the moment, so should be done by hand.

Btw, I'd incline you to split it in two patches, a cleanup and one
adding req, because it's unreadable and hides the real change

> ---
>  fs/io_uring.c                   |  8 +--
>  include/trace/events/io_uring.h | 95 ++++++++++++++++++++++-----------
>  2 files changed, 67 insertions(+), 36 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 5f82954004f6..496588ca984c 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5059,7 +5059,7 @@ static void io_async_task_func(struct callback_head *cb)
>  	struct async_poll *apoll = req->apoll;
>  	struct io_ring_ctx *ctx = req->ctx;
>  
> -	trace_io_uring_task_run(req->ctx, req->opcode, req->user_data);
> +	trace_io_uring_task_run(req->ctx, req, req->opcode, req->user_data);
>  
>  	if (io_poll_rewait(req, &apoll->poll)) {
>  		spin_unlock_irq(&ctx->completion_lock);
> @@ -5192,8 +5192,8 @@ static bool io_arm_poll_handler(struct io_kiocb *req)
>  		return false;
>  	}
>  	spin_unlock_irq(&ctx->completion_lock);
> -	trace_io_uring_poll_arm(ctx, req->opcode, req->user_data, mask,
> -					apoll->poll.events);
> +	trace_io_uring_poll_arm(ctx, req, req->opcode, req->user_data,
> +				mask, apoll->poll.events);
>  	return true;
>  }
>  
> @@ -6578,7 +6578,7 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>  		goto fail_req;
>  
>  	/* don't need @sqe from now on */
> -	trace_io_uring_submit_sqe(ctx, req->opcode, req->user_data,
> +	trace_io_uring_submit_sqe(ctx, req, req->opcode, req->user_data,
>  				true, ctx->flags & IORING_SETUP_SQPOLL);
>  
>  	/*
> diff --git a/include/trace/events/io_uring.h b/include/trace/events/io_uring.h
> index abb8b24744fd..12861e98c08d 100644
> --- a/include/trace/events/io_uring.h
> +++ b/include/trace/events/io_uring.h
> @@ -129,7 +129,7 @@ TRACE_EVENT(io_uring_file_get,
>   *
>   * @ctx:	pointer to a ring context structure
>   * @hashed:	type of workqueue, hashed or normal
> - * @req:	pointer to a submitted request
> + * @req:	pointer to a queued request
>   * @work:	pointer to a submitted io_wq_work
>   *
>   * Allows to trace asynchronous work submission.
> @@ -142,9 +142,9 @@ TRACE_EVENT(io_uring_queue_async_work,
>  	TP_ARGS(ctx, rw, req, work, flags),
>  
>  	TP_STRUCT__entry (
> -		__field(  void *,				ctx		)
> -		__field(  int,					rw		)
> -		__field(  void *,				req		)
> +		__field(  void *,			ctx	)
> +		__field(  int,				rw	)
> +		__field(  void *,			req	)
>  		__field(  struct io_wq_work *,		work	)
>  		__field(  unsigned int,			flags	)
>  	),
> @@ -196,10 +196,10 @@ TRACE_EVENT(io_uring_defer,
>  
>  /**
>   * io_uring_link - called before the io_uring request added into link_list of
> - * 				   another request
> + *		   another request
>   *
> - * @ctx:			pointer to a ring context structure
> - * @req:			pointer to a linked request
> + * @ctx:		pointer to a ring context structure
> + * @req:		pointer to a linked request
>   * @target_req:		pointer to a previous request, that would contain @req
>   *
>   * Allows to track linked requests, to understand dependencies between requests
> @@ -212,8 +212,8 @@ TRACE_EVENT(io_uring_link,
>  	TP_ARGS(ctx, req, target_req),
>  
>  	TP_STRUCT__entry (
> -		__field(  void *,	ctx			)
> -		__field(  void *,	req			)
> +		__field(  void *,	ctx		)
> +		__field(  void *,	req		)
>  		__field(  void *,	target_req	)
>  	),
>  
> @@ -244,7 +244,7 @@ TRACE_EVENT(io_uring_cqring_wait,
>  	TP_ARGS(ctx, min_events),
>  
>  	TP_STRUCT__entry (
> -		__field(  void *,	ctx			)
> +		__field(  void *,	ctx		)
>  		__field(  int,		min_events	)
>  	),
>  
> @@ -272,7 +272,7 @@ TRACE_EVENT(io_uring_fail_link,
>  	TP_ARGS(req, link),
>  
>  	TP_STRUCT__entry (
> -		__field(  void *,	req		)
> +		__field(  void *,	req	)
>  		__field(  void *,	link	)
>  	),
>  
> @@ -314,15 +314,15 @@ TRACE_EVENT(io_uring_complete,
>  	),
>  
>  	TP_printk("ring %p, user_data 0x%llx, result %ld, cflags %x",
> -			  __entry->ctx, (unsigned long long)__entry->user_data,
> -			  __entry->res, __entry->cflags)
> +		   __entry->ctx, (unsigned long long)__entry->user_data,
> +		   __entry->res, __entry->cflags)
>  );
>  
> -
>  /**
>   * io_uring_submit_sqe - called before submitting one SQE
>   *
>   * @ctx:		pointer to a ring context structure
> + * @req:		pointer to a submitted request
>   * @opcode:		opcode of request
>   * @user_data:		user data associated with the request
>   * @force_nonblock:	whether a context blocking or not
> @@ -333,13 +333,14 @@ TRACE_EVENT(io_uring_complete,
>   */
>  TRACE_EVENT(io_uring_submit_sqe,
>  
> -	TP_PROTO(void *ctx, u8 opcode, u64 user_data, bool force_nonblock,
> -		 bool sq_thread),
> +	TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data,
> +		 bool force_nonblock, bool sq_thread),
>  
> -	TP_ARGS(ctx, opcode, user_data, force_nonblock, sq_thread),
> +	TP_ARGS(ctx, req, opcode, user_data, force_nonblock, sq_thread),
>  
>  	TP_STRUCT__entry (
>  		__field(  void *,	ctx		)
> +		__field(  void *,	req		)
>  		__field(  u8,		opcode		)
>  		__field(  u64,		user_data	)
>  		__field(  bool,		force_nonblock	)
> @@ -348,26 +349,42 @@ TRACE_EVENT(io_uring_submit_sqe,
>  
>  	TP_fast_assign(
>  		__entry->ctx		= ctx;
> +		__entry->req		= req;
>  		__entry->opcode		= opcode;
>  		__entry->user_data	= user_data;
>  		__entry->force_nonblock	= force_nonblock;
>  		__entry->sq_thread	= sq_thread;
>  	),
>  
> -	TP_printk("ring %p, op %d, data 0x%llx, non block %d, sq_thread %d",
> -			  __entry->ctx, __entry->opcode,
> -			  (unsigned long long) __entry->user_data,
> -			  __entry->force_nonblock, __entry->sq_thread)
> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, non block %d, "
> +		  "sq_thread %d",  __entry->ctx, __entry->req,
> +		  __entry->opcode, (unsigned long long)__entry->user_data,
> +		  __entry->force_nonblock, __entry->sq_thread)
>  );
>  
> +/*
> + * io_uring_poll_arm - called after arming a poll wait if successful
> + *
> + * @ctx:		pointer to a ring context structure
> + * @req:		pointer to the armed request
> + * @opcode:		opcode of request
> + * @user_data:		user data associated with the request
> + * @mask:		request poll events mask
> + * @events:		registered events of interest
> + *
> + * Allows to track which fds are waiting for and what are the events of
> + * interest.
> + */
>  TRACE_EVENT(io_uring_poll_arm,
>  
> -	TP_PROTO(void *ctx, u8 opcode, u64 user_data, int mask, int events),
> +	TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data,
> +		 int mask, int events),
>  
> -	TP_ARGS(ctx, opcode, user_data, mask, events),
> +	TP_ARGS(ctx, req, opcode, user_data, events, fd),
>  
>  	TP_STRUCT__entry (
>  		__field(  void *,	ctx		)
> +		__field(  void *,	req		)
>  		__field(  u8,		opcode		)
>  		__field(  u64,		user_data	)
>  		__field(  int,		mask		)
> @@ -376,16 +393,17 @@ TRACE_EVENT(io_uring_poll_arm,
>  
>  	TP_fast_assign(
>  		__entry->ctx		= ctx;
> +		__entry->req		= req;
>  		__entry->opcode		= opcode;
>  		__entry->user_data	= user_data;
>  		__entry->mask		= mask;
>  		__entry->events		= events;
>  	),
>  
> -	TP_printk("ring %p, op %d, data 0x%llx, mask 0x%x, events 0x%x",
> -			  __entry->ctx, __entry->opcode,
> -			  (unsigned long long) __entry->user_data,
> -			  __entry->mask, __entry->events)
> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, mask 0x%x, events 0x%x",
> +		  __entry->ctx, __entry->req, __entry->opcode,
> +		  (unsigned long long) __entry->user_data,
> +		  __entry->mask, __entry->events)
>  );
>  
>  TRACE_EVENT(io_uring_poll_wake,
> @@ -440,27 +458,40 @@ TRACE_EVENT(io_uring_task_add,
>  			  __entry->mask)
>  );
>  
> +/*
> + * io_uring_task_run - called when task_work_run() executes the poll events
> + *                     notification callbacks
> + *
> + * @ctx:		pointer to a ring context structure
> + * @req:		pointer to the armed request
> + * @opcode:		opcode of request
> + * @user_data:		user data associated with the request
> + *
> + * Allows to track when notified poll events are processed
> + */
>  TRACE_EVENT(io_uring_task_run,
>  
> -	TP_PROTO(void *ctx, u8 opcode, u64 user_data),
> +	TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data),
>  
> -	TP_ARGS(ctx, opcode, user_data),
> +	TP_ARGS(ctx, req, opcode, user_data),
>  
>  	TP_STRUCT__entry (
>  		__field(  void *,	ctx		)
> +		__field(  void *,	req		)
>  		__field(  u8,		opcode		)
>  		__field(  u64,		user_data	)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->ctx		= ctx;
> +		__entry->req		= req;
>  		__entry->opcode		= opcode;
>  		__entry->user_data	= user_data;
>  	),
>  
> -	TP_printk("ring %p, op %d, data 0x%llx",
> -			  __entry->ctx, __entry->opcode,
> -			  (unsigned long long) __entry->user_data)
> +	TP_printk("ring %p, req %p, op %d, data 0x%llx",
> +		  __entry->ctx, __entry->req, __entry->opcode,
> +		  (unsigned long long) __entry->user_data)
>  );
>  
>  #endif /* _TRACE_IO_URING_H */
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-25  8:21 ` [PATCH] io_uring: Add to traces the req pointer when available Pavel Begunkov
@ 2021-05-25  8:33   ` Pavel Begunkov
  2021-05-25 21:26     ` Olivier Langlois
  2021-05-25 21:29   ` Olivier Langlois
  2021-05-26 12:38   ` Stefan Metzmacher
  2 siblings, 1 reply; 15+ messages in thread
From: Pavel Begunkov @ 2021-05-25  8:33 UTC (permalink / raw)
  To: Olivier Langlois, Jens Axboe, Steven Rostedt, Ingo Molnar,
	io-uring, linux-kernel

On 5/25/21 9:21 AM, Pavel Begunkov wrote:
> On 5/25/21 6:54 AM, Olivier Langlois wrote:
>> The req pointer uniquely identify a specific request.
>> Having it in traces can provide valuable insights that is not possible
>> to have if the calling process is reusing the same user_data value.
> 
> How about hashing kernel pointers per discussed? Even if it's better
> to have it done by tracing or something as you mentioned, there is no
> such a thing at the moment, so should be done by hand.

Or do you mean that it's already the case? Can anyone
confirm if so?

> Btw, I'd incline you to split it in two patches, a cleanup and one
> adding req, because it's unreadable and hides the real change
> 
>> ---
>>  fs/io_uring.c                   |  8 +--
>>  include/trace/events/io_uring.h | 95 ++++++++++++++++++++++-----------
>>  2 files changed, 67 insertions(+), 36 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 5f82954004f6..496588ca984c 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -5059,7 +5059,7 @@ static void io_async_task_func(struct callback_head *cb)
>>  	struct async_poll *apoll = req->apoll;
>>  	struct io_ring_ctx *ctx = req->ctx;
>>  
>> -	trace_io_uring_task_run(req->ctx, req->opcode, req->user_data);
>> +	trace_io_uring_task_run(req->ctx, req, req->opcode, req->user_data);
>>  
>>  	if (io_poll_rewait(req, &apoll->poll)) {
>>  		spin_unlock_irq(&ctx->completion_lock);
>> @@ -5192,8 +5192,8 @@ static bool io_arm_poll_handler(struct io_kiocb *req)
>>  		return false;
>>  	}
>>  	spin_unlock_irq(&ctx->completion_lock);
>> -	trace_io_uring_poll_arm(ctx, req->opcode, req->user_data, mask,
>> -					apoll->poll.events);
>> +	trace_io_uring_poll_arm(ctx, req, req->opcode, req->user_data,
>> +				mask, apoll->poll.events);
>>  	return true;
>>  }
>>  
>> @@ -6578,7 +6578,7 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>>  		goto fail_req;
>>  
>>  	/* don't need @sqe from now on */
>> -	trace_io_uring_submit_sqe(ctx, req->opcode, req->user_data,
>> +	trace_io_uring_submit_sqe(ctx, req, req->opcode, req->user_data,
>>  				true, ctx->flags & IORING_SETUP_SQPOLL);
>>  
>>  	/*
>> diff --git a/include/trace/events/io_uring.h b/include/trace/events/io_uring.h
>> index abb8b24744fd..12861e98c08d 100644
>> --- a/include/trace/events/io_uring.h
>> +++ b/include/trace/events/io_uring.h
>> @@ -129,7 +129,7 @@ TRACE_EVENT(io_uring_file_get,
>>   *
>>   * @ctx:	pointer to a ring context structure
>>   * @hashed:	type of workqueue, hashed or normal
>> - * @req:	pointer to a submitted request
>> + * @req:	pointer to a queued request
>>   * @work:	pointer to a submitted io_wq_work
>>   *
>>   * Allows to trace asynchronous work submission.
>> @@ -142,9 +142,9 @@ TRACE_EVENT(io_uring_queue_async_work,
>>  	TP_ARGS(ctx, rw, req, work, flags),
>>  
>>  	TP_STRUCT__entry (
>> -		__field(  void *,				ctx		)
>> -		__field(  int,					rw		)
>> -		__field(  void *,				req		)
>> +		__field(  void *,			ctx	)
>> +		__field(  int,				rw	)
>> +		__field(  void *,			req	)
>>  		__field(  struct io_wq_work *,		work	)
>>  		__field(  unsigned int,			flags	)
>>  	),
>> @@ -196,10 +196,10 @@ TRACE_EVENT(io_uring_defer,
>>  
>>  /**
>>   * io_uring_link - called before the io_uring request added into link_list of
>> - * 				   another request
>> + *		   another request
>>   *
>> - * @ctx:			pointer to a ring context structure
>> - * @req:			pointer to a linked request
>> + * @ctx:		pointer to a ring context structure
>> + * @req:		pointer to a linked request
>>   * @target_req:		pointer to a previous request, that would contain @req
>>   *
>>   * Allows to track linked requests, to understand dependencies between requests
>> @@ -212,8 +212,8 @@ TRACE_EVENT(io_uring_link,
>>  	TP_ARGS(ctx, req, target_req),
>>  
>>  	TP_STRUCT__entry (
>> -		__field(  void *,	ctx			)
>> -		__field(  void *,	req			)
>> +		__field(  void *,	ctx		)
>> +		__field(  void *,	req		)
>>  		__field(  void *,	target_req	)
>>  	),
>>  
>> @@ -244,7 +244,7 @@ TRACE_EVENT(io_uring_cqring_wait,
>>  	TP_ARGS(ctx, min_events),
>>  
>>  	TP_STRUCT__entry (
>> -		__field(  void *,	ctx			)
>> +		__field(  void *,	ctx		)
>>  		__field(  int,		min_events	)
>>  	),
>>  
>> @@ -272,7 +272,7 @@ TRACE_EVENT(io_uring_fail_link,
>>  	TP_ARGS(req, link),
>>  
>>  	TP_STRUCT__entry (
>> -		__field(  void *,	req		)
>> +		__field(  void *,	req	)
>>  		__field(  void *,	link	)
>>  	),
>>  
>> @@ -314,15 +314,15 @@ TRACE_EVENT(io_uring_complete,
>>  	),
>>  
>>  	TP_printk("ring %p, user_data 0x%llx, result %ld, cflags %x",
>> -			  __entry->ctx, (unsigned long long)__entry->user_data,
>> -			  __entry->res, __entry->cflags)
>> +		   __entry->ctx, (unsigned long long)__entry->user_data,
>> +		   __entry->res, __entry->cflags)
>>  );
>>  
>> -
>>  /**
>>   * io_uring_submit_sqe - called before submitting one SQE
>>   *
>>   * @ctx:		pointer to a ring context structure
>> + * @req:		pointer to a submitted request
>>   * @opcode:		opcode of request
>>   * @user_data:		user data associated with the request
>>   * @force_nonblock:	whether a context blocking or not
>> @@ -333,13 +333,14 @@ TRACE_EVENT(io_uring_complete,
>>   */
>>  TRACE_EVENT(io_uring_submit_sqe,
>>  
>> -	TP_PROTO(void *ctx, u8 opcode, u64 user_data, bool force_nonblock,
>> -		 bool sq_thread),
>> +	TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data,
>> +		 bool force_nonblock, bool sq_thread),
>>  
>> -	TP_ARGS(ctx, opcode, user_data, force_nonblock, sq_thread),
>> +	TP_ARGS(ctx, req, opcode, user_data, force_nonblock, sq_thread),
>>  
>>  	TP_STRUCT__entry (
>>  		__field(  void *,	ctx		)
>> +		__field(  void *,	req		)
>>  		__field(  u8,		opcode		)
>>  		__field(  u64,		user_data	)
>>  		__field(  bool,		force_nonblock	)
>> @@ -348,26 +349,42 @@ TRACE_EVENT(io_uring_submit_sqe,
>>  
>>  	TP_fast_assign(
>>  		__entry->ctx		= ctx;
>> +		__entry->req		= req;
>>  		__entry->opcode		= opcode;
>>  		__entry->user_data	= user_data;
>>  		__entry->force_nonblock	= force_nonblock;
>>  		__entry->sq_thread	= sq_thread;
>>  	),
>>  
>> -	TP_printk("ring %p, op %d, data 0x%llx, non block %d, sq_thread %d",
>> -			  __entry->ctx, __entry->opcode,
>> -			  (unsigned long long) __entry->user_data,
>> -			  __entry->force_nonblock, __entry->sq_thread)
>> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, non block %d, "
>> +		  "sq_thread %d",  __entry->ctx, __entry->req,
>> +		  __entry->opcode, (unsigned long long)__entry->user_data,
>> +		  __entry->force_nonblock, __entry->sq_thread)
>>  );
>>  
>> +/*
>> + * io_uring_poll_arm - called after arming a poll wait if successful
>> + *
>> + * @ctx:		pointer to a ring context structure
>> + * @req:		pointer to the armed request
>> + * @opcode:		opcode of request
>> + * @user_data:		user data associated with the request
>> + * @mask:		request poll events mask
>> + * @events:		registered events of interest
>> + *
>> + * Allows to track which fds are waiting for and what are the events of
>> + * interest.
>> + */
>>  TRACE_EVENT(io_uring_poll_arm,
>>  
>> -	TP_PROTO(void *ctx, u8 opcode, u64 user_data, int mask, int events),
>> +	TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data,
>> +		 int mask, int events),
>>  
>> -	TP_ARGS(ctx, opcode, user_data, mask, events),
>> +	TP_ARGS(ctx, req, opcode, user_data, events, fd),
>>  
>>  	TP_STRUCT__entry (
>>  		__field(  void *,	ctx		)
>> +		__field(  void *,	req		)
>>  		__field(  u8,		opcode		)
>>  		__field(  u64,		user_data	)
>>  		__field(  int,		mask		)
>> @@ -376,16 +393,17 @@ TRACE_EVENT(io_uring_poll_arm,
>>  
>>  	TP_fast_assign(
>>  		__entry->ctx		= ctx;
>> +		__entry->req		= req;
>>  		__entry->opcode		= opcode;
>>  		__entry->user_data	= user_data;
>>  		__entry->mask		= mask;
>>  		__entry->events		= events;
>>  	),
>>  
>> -	TP_printk("ring %p, op %d, data 0x%llx, mask 0x%x, events 0x%x",
>> -			  __entry->ctx, __entry->opcode,
>> -			  (unsigned long long) __entry->user_data,
>> -			  __entry->mask, __entry->events)
>> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, mask 0x%x, events 0x%x",
>> +		  __entry->ctx, __entry->req, __entry->opcode,
>> +		  (unsigned long long) __entry->user_data,
>> +		  __entry->mask, __entry->events)
>>  );
>>  
>>  TRACE_EVENT(io_uring_poll_wake,
>> @@ -440,27 +458,40 @@ TRACE_EVENT(io_uring_task_add,
>>  			  __entry->mask)
>>  );
>>  
>> +/*
>> + * io_uring_task_run - called when task_work_run() executes the poll events
>> + *                     notification callbacks
>> + *
>> + * @ctx:		pointer to a ring context structure
>> + * @req:		pointer to the armed request
>> + * @opcode:		opcode of request
>> + * @user_data:		user data associated with the request
>> + *
>> + * Allows to track when notified poll events are processed
>> + */
>>  TRACE_EVENT(io_uring_task_run,
>>  
>> -	TP_PROTO(void *ctx, u8 opcode, u64 user_data),
>> +	TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data),
>>  
>> -	TP_ARGS(ctx, opcode, user_data),
>> +	TP_ARGS(ctx, req, opcode, user_data),
>>  
>>  	TP_STRUCT__entry (
>>  		__field(  void *,	ctx		)
>> +		__field(  void *,	req		)
>>  		__field(  u8,		opcode		)
>>  		__field(  u64,		user_data	)
>>  	),
>>  
>>  	TP_fast_assign(
>>  		__entry->ctx		= ctx;
>> +		__entry->req		= req;
>>  		__entry->opcode		= opcode;
>>  		__entry->user_data	= user_data;
>>  	),
>>  
>> -	TP_printk("ring %p, op %d, data 0x%llx",
>> -			  __entry->ctx, __entry->opcode,
>> -			  (unsigned long long) __entry->user_data)
>> +	TP_printk("ring %p, req %p, op %d, data 0x%llx",
>> +		  __entry->ctx, __entry->req, __entry->opcode,
>> +		  (unsigned long long) __entry->user_data)
>>  );
>>  
>>  #endif /* _TRACE_IO_URING_H */
>>
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-25  8:33   ` Pavel Begunkov
@ 2021-05-25 21:26     ` Olivier Langlois
  2021-05-25 21:48       ` Olivier Langlois
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier Langlois @ 2021-05-25 21:26 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe, Steven Rostedt, Ingo Molnar,
	io-uring, linux-kernel

Hi Pavel,

On Tue, 2021-05-25 at 09:33 +0100, Pavel Begunkov wrote:
> On 5/25/21 9:21 AM, Pavel Begunkov wrote:
> > On 5/25/21 6:54 AM, Olivier Langlois wrote:
> > > The req pointer uniquely identify a specific request.
> > > Having it in traces can provide valuable insights that is not
> > > possible
> > > to have if the calling process is reusing the same user_data value.
> > 
> > How about hashing kernel pointers per discussed? Even if it's better
> > to have it done by tracing or something as you mentioned, there is no
> > such a thing at the moment, so should be done by hand.
> 
> Or do you mean that it's already the case? Can anyone
> confirm if so?

I did consider your option but then I did some research on the pointer
hashing idea.

It turns out to be already addressed by the trace subsystem.

Here is what I have found:

whippet2 /sys/kernel/tracing # cat trace_options 
print-parent
nosym-offset
nosym-addr
noverbose
noraw
nohex
nobin
noblock
trace_printk
annotate
nouserstacktrace
nosym-userobj
noprintk-msg-only
context-info
nolatency-format
record-cmd
norecord-tgid
overwrite
nodisable_on_free
irq-info
markers
noevent-fork
nopause-on-trace
hash-ptr
function-trace
nofunction-fork
nodisplay-graph
nostacktrace
notest_nop_accept
notest_nop_refuse

hash-ptr option is enabled by default.

I am not 100% sure to understand why I am getting naked pointer values
when I am getting the traces with 'sudo perf':

  9287.369 test/625 io_uring:io_uring_task_run(ctx: 0xffff8fbf9a834800,
opcode: 22, user_data: 216454257090494477, result: 195)
  9287.386 test/625 io_uring:io_uring_task_run(ctx: 0xffff8fbf9a834800,
opcode: 22, user_data: 216454257090494477, result: 195)

but the pointers should be hashed by trace.

That would be nice if someone more knowledgeable about the tracing
system could jump in and comment about the hash-ptr option and tell
when it is applied and when it is not...



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-25  8:21 ` [PATCH] io_uring: Add to traces the req pointer when available Pavel Begunkov
  2021-05-25  8:33   ` Pavel Begunkov
@ 2021-05-25 21:29   ` Olivier Langlois
  2021-05-26 12:38   ` Stefan Metzmacher
  2 siblings, 0 replies; 15+ messages in thread
From: Olivier Langlois @ 2021-05-25 21:29 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe, Steven Rostedt, Ingo Molnar,
	io-uring, linux-kernel

On Tue, 2021-05-25 at 09:21 +0100, Pavel Begunkov wrote:
> 
> Btw, I'd incline you to split it in two patches, a cleanup and one
> adding req, because it's unreadable and hides the real change

good idea and I agree 100%. My git skills are a bit rusty but I am
going to take that as an opportunity to learn how to submit a patch
like a real kernel dev.

Give me a couple of hours and I'll send a corrected patch set.

Greetings,



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-25 21:26     ` Olivier Langlois
@ 2021-05-25 21:48       ` Olivier Langlois
  2021-05-25 22:28         ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier Langlois @ 2021-05-25 21:48 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe, Steven Rostedt, Ingo Molnar,
	io-uring, linux-kernel

On Tue, 2021-05-25 at 17:26 -0400, Olivier Langlois wrote:
> but the pointers should be hashed by trace.
> 
> That would be nice if someone more knowledgeable about the tracing
> system could jump in and comment about the hash-ptr option and tell
> when it is applied and when it is not...

My concern about hashing pointers directly in the io_uring code
directly. It is that by doing so will make it impossible for a
sufficiently priviledged user to get the raw pointer values without
reverting back the pointer hashing stuff.

that would not be the right way to address the security concern if the
tracing subsystem already hash them by default and is configurable to
display raw pointers if desired.

My issue is that I haven't been able to see hashed pointers output from
trace.

The only way that I know to get the traces is with 'sudo perf' and I
have never seen hashed pointers with it.

but the code is there... If you grep 'TRACE_ITER_HASH_PTR' in
kernel/trace/trace.c

I just haven't spent days in studying the tracing code to figure out
how it all works...



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-25 21:48       ` Olivier Langlois
@ 2021-05-25 22:28         ` Jens Axboe
  2021-05-26  8:28           ` Olivier Langlois
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2021-05-25 22:28 UTC (permalink / raw)
  To: Olivier Langlois, Pavel Begunkov, Steven Rostedt, Ingo Molnar,
	io-uring, linux-kernel

On 5/25/21 3:48 PM, Olivier Langlois wrote:
> On Tue, 2021-05-25 at 17:26 -0400, Olivier Langlois wrote:
>> but the pointers should be hashed by trace.
>>
>> That would be nice if someone more knowledgeable about the tracing
>> system could jump in and comment about the hash-ptr option and tell
>> when it is applied and when it is not...
> 
> My concern about hashing pointers directly in the io_uring code
> directly. It is that by doing so will make it impossible for a
> sufficiently priviledged user to get the raw pointer values without
> reverting back the pointer hashing stuff.
> 
> that would not be the right way to address the security concern if the
> tracing subsystem already hash them by default and is configurable to
> display raw pointers if desired.
> 
> My issue is that I haven't been able to see hashed pointers output from
> trace.

Just a quick guess, but does it rely on using %p to print the pointers?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-25 22:28         ` Jens Axboe
@ 2021-05-26  8:28           ` Olivier Langlois
  0 siblings, 0 replies; 15+ messages in thread
From: Olivier Langlois @ 2021-05-26  8:28 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, Steven Rostedt, Ingo Molnar,
	io-uring, linux-kernel

On Tue, 2021-05-25 at 16:28 -0600, Jens Axboe wrote:
> > My issue is that I haven't been able to see hashed pointers output
> > from trace.
> 
> Just a quick guess, but does it rely on using %p to print the pointers?
> 
My very limited understanding of how the trace subsystem works is that
by default, it doesn't use the provided TK_printk macro at all.

the kernel trace subsystem does format internally the passed parameters
before sending the output to a ring buffer (yes another ring!).

You can override this method through the tracing option to use printk
instead and when you do, this is where the TK_printk() macro is used.

Before I did realize that, this was making me scratch my head as to
why, I was getting a different format output. ie:

  9287.369 test/625 io_uring:io_uring_task_run(ctx: 0xffff8fbf9a834800,
opcode: 22, user_data: 216454257090494477, result: 195)
  9287.386 test/625 io_uring:io_uring_task_run(ctx: 0xffff8fbf9a834800,
opcode: 22, user_data: 216454257090494477, result: 195)

while the TK_printk macro is:
        TP_printk("ring %p, req %p, op %d, data 0x%llx",
                  __entry->ctx, __entry->req, __entry->opcode,
                  (unsigned long long) __entry->user_data)

The TK_printk macro is naming the ctx variable as 'ring', yet you still
get ctx in the trace output!

but the pointer hashing that it is supposed to do is a mystery to me...



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-25  8:21 ` [PATCH] io_uring: Add to traces the req pointer when available Pavel Begunkov
  2021-05-25  8:33   ` Pavel Begunkov
  2021-05-25 21:29   ` Olivier Langlois
@ 2021-05-26 12:38   ` Stefan Metzmacher
  2021-05-26 16:18     ` Olivier Langlois
  2 siblings, 1 reply; 15+ messages in thread
From: Stefan Metzmacher @ 2021-05-26 12:38 UTC (permalink / raw)
  To: Pavel Begunkov, Olivier Langlois, Jens Axboe, Steven Rostedt,
	Ingo Molnar, io-uring, linux-kernel

Hi,

>> @@ -333,13 +333,14 @@ TRACE_EVENT(io_uring_complete,
>>   */
>>  TRACE_EVENT(io_uring_submit_sqe,
>>  
>> -	TP_PROTO(void *ctx, u8 opcode, u64 user_data, bool force_nonblock,
>> -		 bool sq_thread),
>> +	TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data,
>> +		 bool force_nonblock, bool sq_thread),
>>  
>> -	TP_ARGS(ctx, opcode, user_data, force_nonblock, sq_thread),
>> +	TP_ARGS(ctx, req, opcode, user_data, force_nonblock, sq_thread),
>>  
>>  	TP_STRUCT__entry (
>>  		__field(  void *,	ctx		)
>> +		__field(  void *,	req		)
>>  		__field(  u8,		opcode		)
>>  		__field(  u64,		user_data	)
>>  		__field(  bool,		force_nonblock	)
>> @@ -348,26 +349,42 @@ TRACE_EVENT(io_uring_submit_sqe,
>>  
>>  	TP_fast_assign(
>>  		__entry->ctx		= ctx;
>> +		__entry->req		= req;
>>  		__entry->opcode		= opcode;
>>  		__entry->user_data	= user_data;
>>  		__entry->force_nonblock	= force_nonblock;
>>  		__entry->sq_thread	= sq_thread;
>>  	),
>>  
>> -	TP_printk("ring %p, op %d, data 0x%llx, non block %d, sq_thread %d",
>> -			  __entry->ctx, __entry->opcode,
>> -			  (unsigned long long) __entry->user_data,
>> -			  __entry->force_nonblock, __entry->sq_thread)
>> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, non block %d, "
>> +		  "sq_thread %d",  __entry->ctx, __entry->req,
>> +		  __entry->opcode, (unsigned long long)__entry->user_data,
>> +		  __entry->force_nonblock, __entry->sq_thread)
>>  );

If that gets changed, could be also include the personality id and flags here,
and maybe also translated the opcode and flags to human readable strings?

metze

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-26 12:38   ` Stefan Metzmacher
@ 2021-05-26 16:18     ` Olivier Langlois
  2021-05-26 17:19       ` Olivier Langlois
  2021-05-28 22:42       ` Steven Rostedt
  0 siblings, 2 replies; 15+ messages in thread
From: Olivier Langlois @ 2021-05-26 16:18 UTC (permalink / raw)
  To: Stefan Metzmacher, Pavel Begunkov, Jens Axboe, Steven Rostedt,
	Ingo Molnar, io-uring, linux-kernel

On Wed, 2021-05-26 at 14:38 +0200, Stefan Metzmacher wrote:
> Hi,
> 
> > > @@ -333,13 +333,14 @@ TRACE_EVENT(io_uring_complete,
> > >   */
> > >  TRACE_EVENT(io_uring_submit_sqe,
> > >  
> > > -       TP_PROTO(void *ctx, u8 opcode, u64 user_data, bool
> > > force_nonblock,
> > > -                bool sq_thread),
> > > +       TP_PROTO(void *ctx, void *req, u8 opcode, u64 user_data,
> > > +                bool force_nonblock, bool sq_thread),
> > >  
> > > -       TP_ARGS(ctx, opcode, user_data, force_nonblock,
> > > sq_thread),
> > > +       TP_ARGS(ctx, req, opcode, user_data, force_nonblock,
> > > sq_thread),
> > >  
> > >         TP_STRUCT__entry (
> > >                 __field(  void *,       ctx             )
> > > +               __field(  void *,       req             )
> > >                 __field(  u8,           opcode          )
> > >                 __field(  u64,          user_data       )
> > >                 __field(  bool,         force_nonblock  )
> > > @@ -348,26 +349,42 @@ TRACE_EVENT(io_uring_submit_sqe,
> > >  
> > >         TP_fast_assign(
> > >                 __entry->ctx            = ctx;
> > > +               __entry->req            = req;
> > >                 __entry->opcode         = opcode;
> > >                 __entry->user_data      = user_data;
> > >                 __entry->force_nonblock = force_nonblock;
> > >                 __entry->sq_thread      = sq_thread;
> > >         ),
> > >  
> > > -       TP_printk("ring %p, op %d, data 0x%llx, non block %d,
> > > sq_thread %d",
> > > -                         __entry->ctx, __entry->opcode,
> > > -                         (unsigned long long) __entry-
> > > >user_data,
> > > -                         __entry->force_nonblock, __entry-
> > > >sq_thread)
> > > +       TP_printk("ring %p, req %p, op %d, data 0x%llx, non block
> > > %d, "
> > > +                 "sq_thread %d",  __entry->ctx, __entry->req,
> > > +                 __entry->opcode, (unsigned long long)__entry-
> > > >user_data,
> > > +                 __entry->force_nonblock, __entry->sq_thread)
> > >  );
> 
> If that gets changed, could be also include the personality id and
> flags here,
> and maybe also translated the opcode and flags to human readable
> strings?
> 
If Jens and Pavel agrees that they would like to see this info in the
traces, I have no objection adding it.

Still waiting input from Steven Rostedt which I believe is the trace
system maintainer concerning the hash-ptr situation.

I did receive an auto-respond from him saying that he was in vacation
until May 28th...

Greetings,



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-26 16:18     ` Olivier Langlois
@ 2021-05-26 17:19       ` Olivier Langlois
  2021-05-29 12:18         ` Pavel Begunkov
  2021-05-28 22:42       ` Steven Rostedt
  1 sibling, 1 reply; 15+ messages in thread
From: Olivier Langlois @ 2021-05-26 17:19 UTC (permalink / raw)
  To: Stefan Metzmacher, Pavel Begunkov, Jens Axboe, Steven Rostedt,
	Ingo Molnar, io-uring, linux-kernel

On Wed, 2021-05-26 at 12:18 -0400, Olivier Langlois wrote:
> > 
> > If that gets changed, could be also include the personality id and
> > flags here,
> > and maybe also translated the opcode and flags to human readable
> > strings?
> > 
> If Jens and Pavel agrees that they would like to see this info in the
> traces, I have no objection adding it.
> 
I need to learn to think longer before replying...

opcode in readable string:
If Jens and Pavel agrees to it, easy to add

flags:
You have my support that it is indeed a very useful info to have in the
submit_sqe trace when debugging with traces

flags in readable string:
After thinking about it, I wouldn't do it. Converting a bitmask of
flags into a string isn't that complex but it isn't trivial neither.
This certainly adds a maintenance burden every time the flags would be
updated. I wouldn't want that burden on my shoulders.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-26 16:18     ` Olivier Langlois
  2021-05-26 17:19       ` Olivier Langlois
@ 2021-05-28 22:42       ` Steven Rostedt
  2021-05-29 12:30         ` Pavel Begunkov
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2021-05-28 22:42 UTC (permalink / raw)
  To: Olivier Langlois
  Cc: Stefan Metzmacher, Pavel Begunkov, Jens Axboe, Ingo Molnar,
	io-uring, linux-kernel

On Wed, 26 May 2021 12:18:37 -0400
Olivier Langlois <olivier@trillion01.com> wrote:

> > If that gets changed, could be also include the personality id and
> > flags here,
> > and maybe also translated the opcode and flags to human readable
> > strings?
> >   
> If Jens and Pavel agrees that they would like to see this info in the
> traces, I have no objection adding it.
> 
> Still waiting input from Steven Rostedt which I believe is the trace
> system maintainer concerning the hash-ptr situation.
> 
> I did receive an auto-respond from him saying that he was in vacation
> until May 28th...

Yep, I'm back now.

Here's how it works using your patch as an example:

>  	TP_fast_assign(
>  		__entry->ctx		= ctx;
> +		__entry->req		= req;

The "__entry" is a structure defined by TP_STRUCT__entry() that is located
on the ring buffer that can be read directly by user space (aka trace-cmd).
So yes, that value is never hashed, and one of the reasons that tracefs
requires root privilege to read it.

>  		__entry->opcode		= opcode;
>  		__entry->user_data	= user_data;
>  		__entry->force_nonblock	= force_nonblock;
>  		__entry->sq_thread	= sq_thread;
>  	),
>  
> -	TP_printk("ring %p, op %d, data 0x%llx, non block %d, sq_thread %d",
> -			  __entry->ctx, __entry->opcode,
> -			  (unsigned long long) __entry->user_data,
> -			  __entry->force_nonblock, __entry->sq_thread)
> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, non block %d, "
> +		  "sq_thread %d",  __entry->ctx, __entry->req,
> +		  __entry->opcode, (unsigned long long)__entry->user_data,
> +		  __entry->force_nonblock, __entry->sq_thread)
>  );

The TP_printk() macro *is* used when reading the "trace" or "trace_pipe"
file, and that uses vsnprintf() to process it. Which will hash the values
for %p (by default, because that's what it always did when vsnprintf()
started hashing values).

Masami Hiramatsu added the hash-ptr option (which I told him to be the
default as that was the behavior before that option was created), where the
use could turn off the hashing.

There's lots of trace events that expose the raw pointers when hash-ptr is
off or if the ring buffers are read via the trace_pip_raw interface.

What's special about these pointers to hash them before they are recorded?

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-26 17:19       ` Olivier Langlois
@ 2021-05-29 12:18         ` Pavel Begunkov
  0 siblings, 0 replies; 15+ messages in thread
From: Pavel Begunkov @ 2021-05-29 12:18 UTC (permalink / raw)
  To: Olivier Langlois, Stefan Metzmacher, Jens Axboe, Steven Rostedt,
	Ingo Molnar, io-uring, linux-kernel

On 5/26/21 6:19 PM, Olivier Langlois wrote:
> On Wed, 2021-05-26 at 12:18 -0400, Olivier Langlois wrote:
>>>
>>> If that gets changed, could be also include the personality id and
>>> flags here,
>>> and maybe also translated the opcode and flags to human readable
>>> strings?
>>>
>> If Jens and Pavel agrees that they would like to see this info in the
>> traces, I have no objection adding it.
>>
> I need to learn to think longer before replying...
> 
> opcode in readable string:
> If Jens and Pavel agrees to it, easy to add

Don't mind having them, may be useful, but let's leave stringification
to the userspace, values are well defined so shouldn't be a problem

liburing maybe?

> 
> flags:
> You have my support that it is indeed a very useful info to have in the
> submit_sqe trace when debugging with traces
> 
> flags in readable string:
> After thinking about it, I wouldn't do it. Converting a bitmask of
> flags into a string isn't that complex but it isn't trivial neither.
> This certainly adds a maintenance burden every time the flags would be
> updated. I wouldn't want that burden on my shoulders.
> 
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-28 22:42       ` Steven Rostedt
@ 2021-05-29 12:30         ` Pavel Begunkov
  2021-05-29 12:34           ` Pavel Begunkov
  2021-05-29 17:55           ` Steven Rostedt
  0 siblings, 2 replies; 15+ messages in thread
From: Pavel Begunkov @ 2021-05-29 12:30 UTC (permalink / raw)
  To: Steven Rostedt, Olivier Langlois
  Cc: Stefan Metzmacher, Jens Axboe, Ingo Molnar, io-uring, linux-kernel

On 5/28/21 11:42 PM, Steven Rostedt wrote:
> On Wed, 26 May 2021 12:18:37 -0400
> Olivier Langlois <olivier@trillion01.com> wrote:
> 
>>> If that gets changed, could be also include the personality id and
>>> flags here,
>>> and maybe also translated the opcode and flags to human readable
>>> strings?
>>>   
>> If Jens and Pavel agrees that they would like to see this info in the
>> traces, I have no objection adding it.
>>
>> Still waiting input from Steven Rostedt which I believe is the trace
>> system maintainer concerning the hash-ptr situation.
>>
>> I did receive an auto-respond from him saying that he was in vacation
>> until May 28th...
> 
> Yep, I'm back now.
> 
> Here's how it works using your patch as an example:
> 
>>  	TP_fast_assign(
>>  		__entry->ctx		= ctx;
>> +		__entry->req		= req;
> 
> The "__entry" is a structure defined by TP_STRUCT__entry() that is located
> on the ring buffer that can be read directly by user space (aka trace-cmd).
> So yes, that value is never hashed, and one of the reasons that tracefs
> requires root privilege to read it.
> 
>>  		__entry->opcode		= opcode;
>>  		__entry->user_data	= user_data;
>>  		__entry->force_nonblock	= force_nonblock;
>>  		__entry->sq_thread	= sq_thread;
>>  	),
>>  
>> -	TP_printk("ring %p, op %d, data 0x%llx, non block %d, sq_thread %d",
>> -			  __entry->ctx, __entry->opcode,
>> -			  (unsigned long long) __entry->user_data,
>> -			  __entry->force_nonblock, __entry->sq_thread)
>> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, non block %d, "
>> +		  "sq_thread %d",  __entry->ctx, __entry->req,
>> +		  __entry->opcode, (unsigned long long)__entry->user_data,
>> +		  __entry->force_nonblock, __entry->sq_thread)
>>  );
> 
> The TP_printk() macro *is* used when reading the "trace" or "trace_pipe"
> file, and that uses vsnprintf() to process it. Which will hash the values
> for %p (by default, because that's what it always did when vsnprintf()
> started hashing values).
> 
> Masami Hiramatsu added the hash-ptr option (which I told him to be the
> default as that was the behavior before that option was created), where the
> use could turn off the hashing.
> 
> There's lots of trace events that expose the raw pointers when hash-ptr is
> off or if the ring buffers are read via the trace_pip_raw interface.
> 
> What's special about these pointers to hash them before they are recorded?

io_uring offers all different operations and has internal request/memory
recycling, so it may be an easy vector of attack in case of some
vulnerabilities found, but nothing special. As that's the status quo,
I wouldn't care, let's put aside my concerns and print them raw.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-29 12:30         ` Pavel Begunkov
@ 2021-05-29 12:34           ` Pavel Begunkov
  2021-05-29 17:55           ` Steven Rostedt
  1 sibling, 0 replies; 15+ messages in thread
From: Pavel Begunkov @ 2021-05-29 12:34 UTC (permalink / raw)
  To: Steven Rostedt, Olivier Langlois
  Cc: Stefan Metzmacher, Jens Axboe, Ingo Molnar, io-uring, linux-kernel

On 5/29/21 1:30 PM, Pavel Begunkov wrote:
> On 5/28/21 11:42 PM, Steven Rostedt wrote:
>> On Wed, 26 May 2021 12:18:37 -0400
>> Olivier Langlois <olivier@trillion01.com> wrote:
>>
>>>> If that gets changed, could be also include the personality id and
>>>> flags here,
>>>> and maybe also translated the opcode and flags to human readable
>>>> strings?
>>>>   
>>> If Jens and Pavel agrees that they would like to see this info in the
>>> traces, I have no objection adding it.
>>>
>>> Still waiting input from Steven Rostedt which I believe is the trace
>>> system maintainer concerning the hash-ptr situation.
>>>
>>> I did receive an auto-respond from him saying that he was in vacation
>>> until May 28th...
>>
>> Yep, I'm back now.
>>
>> Here's how it works using your patch as an example:
>>
>>>  	TP_fast_assign(
>>>  		__entry->ctx		= ctx;
>>> +		__entry->req		= req;
>>
>> The "__entry" is a structure defined by TP_STRUCT__entry() that is located
>> on the ring buffer that can be read directly by user space (aka trace-cmd).
>> So yes, that value is never hashed, and one of the reasons that tracefs
>> requires root privilege to read it.
>>
>>>  		__entry->opcode		= opcode;
>>>  		__entry->user_data	= user_data;
>>>  		__entry->force_nonblock	= force_nonblock;
>>>  		__entry->sq_thread	= sq_thread;
>>>  	),
>>>  
>>> -	TP_printk("ring %p, op %d, data 0x%llx, non block %d, sq_thread %d",
>>> -			  __entry->ctx, __entry->opcode,
>>> -			  (unsigned long long) __entry->user_data,
>>> -			  __entry->force_nonblock, __entry->sq_thread)
>>> +	TP_printk("ring %p, req %p, op %d, data 0x%llx, non block %d, "
>>> +		  "sq_thread %d",  __entry->ctx, __entry->req,
>>> +		  __entry->opcode, (unsigned long long)__entry->user_data,
>>> +		  __entry->force_nonblock, __entry->sq_thread)
>>>  );
>>
>> The TP_printk() macro *is* used when reading the "trace" or "trace_pipe"
>> file, and that uses vsnprintf() to process it. Which will hash the values
>> for %p (by default, because that's what it always did when vsnprintf()
>> started hashing values).
>>
>> Masami Hiramatsu added the hash-ptr option (which I told him to be the
>> default as that was the behavior before that option was created), where the
>> use could turn off the hashing.
>>
>> There's lots of trace events that expose the raw pointers when hash-ptr is
>> off or if the ring buffers are read via the trace_pip_raw interface.
>>
>> What's special about these pointers to hash them before they are recorded?
> 
> io_uring offers all different operations and has internal request/memory
> recycling, so it may be an easy vector of attack in case of some
> vulnerabilities found, but nothing special. As that's the status quo,
> I wouldn't care, let's put aside my concerns and print them raw.

edit: not print obviously, have but have them raw in __entry

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] io_uring: Add to traces the req pointer when available
  2021-05-29 12:30         ` Pavel Begunkov
  2021-05-29 12:34           ` Pavel Begunkov
@ 2021-05-29 17:55           ` Steven Rostedt
  1 sibling, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2021-05-29 17:55 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: Olivier Langlois, Stefan Metzmacher, Jens Axboe, Ingo Molnar,
	io-uring, linux-kernel

On Sat, 29 May 2021 13:30:31 +0100
Pavel Begunkov <asml.silence@gmail.com> wrote:

> io_uring offers all different operations and has internal request/memory
> recycling, so it may be an easy vector of attack in case of some
> vulnerabilities found, but nothing special. As that's the status quo,
> I wouldn't care, let's put aside my concerns and print them raw.

There's a lot of information that the tracing subsystem gives that would
help several vectors of attack without the need for pointers. That's
why there's a lockdown method to disable all tracing. But the tracing
system is also good at finding out if your system has been compromised ;-)

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-05-29 17:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <60ac946e.1c69fb81.5efc2.65deSMTPIN_ADDED_MISSING@mx.google.com>
2021-05-25  8:21 ` [PATCH] io_uring: Add to traces the req pointer when available Pavel Begunkov
2021-05-25  8:33   ` Pavel Begunkov
2021-05-25 21:26     ` Olivier Langlois
2021-05-25 21:48       ` Olivier Langlois
2021-05-25 22:28         ` Jens Axboe
2021-05-26  8:28           ` Olivier Langlois
2021-05-25 21:29   ` Olivier Langlois
2021-05-26 12:38   ` Stefan Metzmacher
2021-05-26 16:18     ` Olivier Langlois
2021-05-26 17:19       ` Olivier Langlois
2021-05-29 12:18         ` Pavel Begunkov
2021-05-28 22:42       ` Steven Rostedt
2021-05-29 12:30         ` Pavel Begunkov
2021-05-29 12:34           ` Pavel Begunkov
2021-05-29 17:55           ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).