Re: [RFC 4/4] ahci media error reporting

From: John Snow <jsnow@redhat.com>
To: Tony Asleson <tasleson@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel <qemu-devel@nongnu.org>,
	Qemu-block <qemu-block@nongnu.org>
Subject: Re: [RFC 4/4] ahci media error reporting
Date: Thu, 19 Sep 2019 16:43:57 -0400	[thread overview]
Message-ID: <df07a621-8515-2414-2f59-a7eb7eebd75b@redhat.com> (raw)
In-Reply-To: <20190919194847.18518-5-tasleson@redhat.com>

On 9/19/19 3:48 PM, Tony Asleson wrote:
> Initial attempt at returning a media error for ahci.  This is certainly
> wrong and needs serious improvement.
> 

Hi; I have the unfortunate distinction of being the AHCI maintainer.
Please CC me on future revisions and discussion if you are interacting
with my problem child.

Also remember to CC qemu-block.

> Signed-off-by: Tony Asleson <tasleson@redhat.com>
> ---
>  hw/ide/ahci.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index d45393c019..f487764106 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -36,6 +36,7 @@
>  #include "hw/ide/internal.h"
>  #include "hw/ide/pci.h"
>  #include "ahci_internal.h"
> +#include "block/error_inject.h"
>  
>  #include "trace.h"
>  
> @@ -999,6 +1000,22 @@ static void ncq_err(NCQTransferState *ncq_tfs)
>      ncq_tfs->used = 0;
>  }
>  
> +/*
> + * Figure out correct way to report media error, this is at best a guess
> + * and based on the output of linux kernel, not even remotely close.
> + */

Depends on what kind of error, exactly, you're trying to report, and at
what level. (AHCI, NCQ, SATA, ATA?)

Keep in mind that you've inserted an error path for SATA drives using
NCQ, but seemingly haven't touched SATA drives not using NCQ, or ATAPI
devices using either PATA/SATA, or ATA drives on PATA.

> +static void ncq_media_err(NCQTransferState *ncq_tfs, uint64_t err_sector)
> +{
> +    IDEState *ide_state = &ncq_tfs->drive->port.ifs[0];
> +
> +    ide_state->error = ECC_ERR;
> +    ide_state->status = READY_STAT | ERR_STAT;
> +    ncq_tfs->drive->port_regs.scr_err |= (1 << ncq_tfs->tag);
> +    ncq_tfs->lba = err_sector;
> +    qemu_sglist_destroy(&ncq_tfs->sglist);
> +    ncq_tfs->used = 0;
> +}
> +

If you are definitely very sure you only want an ide_state error
difference, you could just as well call ncq_err and then patch
ide_state->error.

(I am not sure that's what you want, but... maybe it is?)

I'd have to check -- because I can't say the AHCI emulator was designed
so much as congealed -- but you might need calls to ncq_finish.

usually, ncq_cb handles the return from any NCQ command and will call
ncq_err and ncq_finish as appropriate to tidy up the command.

It might be a mistake that execute_ncq_command issues ncq_err in the
`default` arm of the switch statement without a call to finish.

If we do call ncq_finish from this context I'm not sure if we want
block_acct_done here unconditionally. We may not have started a block
accounting operation if we never started a backend operation. Everything
else looks about right to me.

>  static void ncq_finish(NCQTransferState *ncq_tfs)
>  {
>      /* If we didn't error out, set our finished bit. Errored commands
> @@ -1065,6 +1082,8 @@ static void execute_ncq_command(NCQTransferState *ncq_tfs)
>  {
>      AHCIDevice *ad = ncq_tfs->drive;
>      IDEState *ide_state = &ad->port.ifs[0];
> +    uint64_t error_sector = 0;
> +    char device_id[32];
>      int port = ad->port_no;
>  
>      g_assert(is_ncq(ncq_tfs->cmd));
> @@ -1072,6 +1091,14 @@ static void execute_ncq_command(NCQTransferState *ncq_tfs)
>  
>      switch (ncq_tfs->cmd) {
>      case READ_FPDMA_QUEUED:
> +        sprintf(device_id, "%lu", ide_state->wwn);

This seems suspicious for your design in general, but I'd like to not
run sprintf to a buffer in the hotpath for NCQ.

If you need this, I'd do it when wwn is set and just grab it from the
state structure.

> +
> +        if (error_in_read(device_id, ncq_tfs->lba,
> +                ncq_tfs->sector_count, &error_sector)) {
> +            ncq_media_err(ncq_tfs, error_sector);
> +            return;
> +        }
> +

One of the downsides to trying to trigger read error injections
per-device instead of per-node is that now you have to goof around with
device-specific code everywhere.

I suppose from your cover letter you *WANT* device-specific error
exercising, which would necessitate a different design from blkdebug,
but it means you have to add support for the framework per-device and it
might be very tricky to get right.

>          trace_execute_ncq_command_read(ad->hba, port, ncq_tfs->tag,
>                                         ncq_tfs->sector_count, ncq_tfs->lba);
>          dma_acct_start(ide_state->blk, &ncq_tfs->acct,
>