linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] habanalabs: improve error messages
@ 2019-03-28  7:13 Oded Gabbay
  2019-03-29 16:29 ` Greg KH
  0 siblings, 1 reply; 5+ messages in thread
From: Oded Gabbay @ 2019-03-28  7:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: gregkh

This patch improves two error messages to help the user to
better understand what error occurred.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
---
 drivers/misc/habanalabs/command_submission.c | 3 ++-
 drivers/misc/habanalabs/memory.c             | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
index f908643f871f..02c48da0b645 100644
--- a/drivers/misc/habanalabs/command_submission.c
+++ b/drivers/misc/habanalabs/command_submission.c
@@ -261,7 +261,8 @@ static void cs_timedout(struct work_struct *work)
 	ctx_asid = cs->ctx->asid;
 
 	/* TODO: add information about last signaled seq and last emitted seq */
-	dev_err(hdev->dev, "CS %d.%llu got stuck!\n", ctx_asid, cs->sequence);
+	dev_err(hdev->dev, "User %d command submission %llu got stuck!\n",
+		ctx_asid, cs->sequence);
 
 	cs_put(cs);
 
diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
index ce1fda40a8b8..39788b1cf8d0 100644
--- a/drivers/misc/habanalabs/memory.c
+++ b/drivers/misc/habanalabs/memory.c
@@ -109,7 +109,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
 							page_size);
 			if (!phys_pg_pack->pages[i]) {
 				dev_err(hdev->dev,
-					"ioctl failed to allocate page\n");
+					"Failed to allocate device memory (out of memory)\n");
 				rc = -ENOMEM;
 				goto page_err;
 			}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] habanalabs: improve error messages
  2019-03-28  7:13 [PATCH] habanalabs: improve error messages Oded Gabbay
@ 2019-03-29 16:29 ` Greg KH
  2019-03-29 16:39   ` Joe Perches
  2019-03-29 17:29   ` Oded Gabbay
  0 siblings, 2 replies; 5+ messages in thread
From: Greg KH @ 2019-03-29 16:29 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: linux-kernel

On Thu, Mar 28, 2019 at 09:13:13AM +0200, Oded Gabbay wrote:
> This patch improves two error messages to help the user to
> better understand what error occurred.
> 
> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
> ---
>  drivers/misc/habanalabs/command_submission.c | 3 ++-
>  drivers/misc/habanalabs/memory.c             | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
> index f908643f871f..02c48da0b645 100644
> --- a/drivers/misc/habanalabs/command_submission.c
> +++ b/drivers/misc/habanalabs/command_submission.c
> @@ -261,7 +261,8 @@ static void cs_timedout(struct work_struct *work)
>  	ctx_asid = cs->ctx->asid;
>  
>  	/* TODO: add information about last signaled seq and last emitted seq */
> -	dev_err(hdev->dev, "CS %d.%llu got stuck!\n", ctx_asid, cs->sequence);
> +	dev_err(hdev->dev, "User %d command submission %llu got stuck!\n",
> +		ctx_asid, cs->sequence);
>  
>  	cs_put(cs);
>  
> diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
> index ce1fda40a8b8..39788b1cf8d0 100644
> --- a/drivers/misc/habanalabs/memory.c
> +++ b/drivers/misc/habanalabs/memory.c
> @@ -109,7 +109,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
>  							page_size);
>  			if (!phys_pg_pack->pages[i]) {
>  				dev_err(hdev->dev,
> -					"ioctl failed to allocate page\n");
> +					"Failed to allocate device memory (out of memory)\n");

No need for a message at all here, right?  The core should have already
told you you had a problem.

greg k-h

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] habanalabs: improve error messages
  2019-03-29 16:29 ` Greg KH
@ 2019-03-29 16:39   ` Joe Perches
  2019-03-29 17:29   ` Oded Gabbay
  1 sibling, 0 replies; 5+ messages in thread
From: Joe Perches @ 2019-03-29 16:39 UTC (permalink / raw)
  To: Greg KH, Oded Gabbay; +Cc: linux-kernel

On Fri, 2019-03-29 at 17:29 +0100, Greg KH wrote:
> On Thu, Mar 28, 2019 at 09:13:13AM +0200, Oded Gabbay wrote:
> > This patch improves two error messages to help the user to
> > better understand what error occurred.
[]
> > diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
[]
> > @@ -109,7 +109,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
> >  							page_size);
> >  			if (!phys_pg_pack->pages[i]) {
> >  				dev_err(hdev->dev,
> > -					"ioctl failed to allocate page\n");
> > +					"Failed to allocate device memory (out of memory)\n");
> 
> No need for a message at all here, right?  The core should have already
> told you you had a problem.

I do not understand this code all that well, but I am uncertain
if all the possible pool allocation functions emit an OOM message
on failure.

Also, there is another pool allocation for the contiguous
allocation attempt that has a similar OOM in this function.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] habanalabs: improve error messages
  2019-03-29 16:29 ` Greg KH
  2019-03-29 16:39   ` Joe Perches
@ 2019-03-29 17:29   ` Oded Gabbay
  2019-03-29 18:50     ` Greg KH
  1 sibling, 1 reply; 5+ messages in thread
From: Oded Gabbay @ 2019-03-29 17:29 UTC (permalink / raw)
  To: Greg KH; +Cc: Linux-Kernel@Vger. Kernel. Org

On Fri, Mar 29, 2019 at 7:29 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Mar 28, 2019 at 09:13:13AM +0200, Oded Gabbay wrote:
> > This patch improves two error messages to help the user to
> > better understand what error occurred.
> >
> > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
> > ---
> >  drivers/misc/habanalabs/command_submission.c | 3 ++-
> >  drivers/misc/habanalabs/memory.c             | 2 +-
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
> > index f908643f871f..02c48da0b645 100644
> > --- a/drivers/misc/habanalabs/command_submission.c
> > +++ b/drivers/misc/habanalabs/command_submission.c
> > @@ -261,7 +261,8 @@ static void cs_timedout(struct work_struct *work)
> >       ctx_asid = cs->ctx->asid;
> >
> >       /* TODO: add information about last signaled seq and last emitted seq */
> > -     dev_err(hdev->dev, "CS %d.%llu got stuck!\n", ctx_asid, cs->sequence);
> > +     dev_err(hdev->dev, "User %d command submission %llu got stuck!\n",
> > +             ctx_asid, cs->sequence);
> >
> >       cs_put(cs);
> >
> > diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
> > index ce1fda40a8b8..39788b1cf8d0 100644
> > --- a/drivers/misc/habanalabs/memory.c
> > +++ b/drivers/misc/habanalabs/memory.c
> > @@ -109,7 +109,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
> >                                                       page_size);
> >                       if (!phys_pg_pack->pages[i]) {
> >                               dev_err(hdev->dev,
> > -                                     "ioctl failed to allocate page\n");
> > +                                     "Failed to allocate device memory (out of memory)\n");
>
> No need for a message at all here, right?  The core should have already
> told you you had a problem.
>
> greg k-h

No, I don't think so, because this function is called for allocating
memory in the device's DRAM. So we don't pass through the Linux core
code inside.
We use the generic genalloc module to implement the device's DRAM
physical page allocator, and AFAIK, you won't get any message in case
genalloc fails to find a free memory in its pool.
Even if genalloc prints something, I would prefer to display a more
meaningful message to the user in this case. This allocation is
directly requested by the user (its part of the IOCTL code) and he
should know how it failed, no ?

Oded

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] habanalabs: improve error messages
  2019-03-29 17:29   ` Oded Gabbay
@ 2019-03-29 18:50     ` Greg KH
  0 siblings, 0 replies; 5+ messages in thread
From: Greg KH @ 2019-03-29 18:50 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: Linux-Kernel@Vger. Kernel. Org

On Fri, Mar 29, 2019 at 08:29:08PM +0300, Oded Gabbay wrote:
> On Fri, Mar 29, 2019 at 7:29 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Mar 28, 2019 at 09:13:13AM +0200, Oded Gabbay wrote:
> > > This patch improves two error messages to help the user to
> > > better understand what error occurred.
> > >
> > > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
> > > ---
> > >  drivers/misc/habanalabs/command_submission.c | 3 ++-
> > >  drivers/misc/habanalabs/memory.c             | 2 +-
> > >  2 files changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
> > > index f908643f871f..02c48da0b645 100644
> > > --- a/drivers/misc/habanalabs/command_submission.c
> > > +++ b/drivers/misc/habanalabs/command_submission.c
> > > @@ -261,7 +261,8 @@ static void cs_timedout(struct work_struct *work)
> > >       ctx_asid = cs->ctx->asid;
> > >
> > >       /* TODO: add information about last signaled seq and last emitted seq */
> > > -     dev_err(hdev->dev, "CS %d.%llu got stuck!\n", ctx_asid, cs->sequence);
> > > +     dev_err(hdev->dev, "User %d command submission %llu got stuck!\n",
> > > +             ctx_asid, cs->sequence);
> > >
> > >       cs_put(cs);
> > >
> > > diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
> > > index ce1fda40a8b8..39788b1cf8d0 100644
> > > --- a/drivers/misc/habanalabs/memory.c
> > > +++ b/drivers/misc/habanalabs/memory.c
> > > @@ -109,7 +109,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
> > >                                                       page_size);
> > >                       if (!phys_pg_pack->pages[i]) {
> > >                               dev_err(hdev->dev,
> > > -                                     "ioctl failed to allocate page\n");
> > > +                                     "Failed to allocate device memory (out of memory)\n");
> >
> > No need for a message at all here, right?  The core should have already
> > told you you had a problem.
> >
> > greg k-h
> 
> No, I don't think so, because this function is called for allocating
> memory in the device's DRAM. So we don't pass through the Linux core
> code inside.
> We use the generic genalloc module to implement the device's DRAM
> physical page allocator, and AFAIK, you won't get any message in case
> genalloc fails to find a free memory in its pool.
> Even if genalloc prints something, I would prefer to display a more
> meaningful message to the user in this case. This allocation is
> directly requested by the user (its part of the IOCTL code) and he
> should know how it failed, no ?

Ok, that sounds reasonable, nevermind :)

greg k-h

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-03-29 18:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-28  7:13 [PATCH] habanalabs: improve error messages Oded Gabbay
2019-03-29 16:29 ` Greg KH
2019-03-29 16:39   ` Joe Perches
2019-03-29 17:29   ` Oded Gabbay
2019-03-29 18:50     ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).