All of lore.kernel.org
 help / color / mirror / Atom feed
From: James Bottomley <jejb@linux.vnet.ibm.com>
To: Ric Wheeler <rwheeler@redhat.com>, Hannes Reinecke <hare@suse.de>,
	Sumit Saxena <sumit.saxena@broadcom.com>,
	linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com, thenzl@redhat.com,
	kashyap.desai@broadcom.com, Christoph Hellwig <hch@infradead.org>,
	"Martin K. Petersen" <mkp@mkp.net>,
	Jeff Moyer <jmoyer@redhat.com>, Gris Ge <fge@redhat.com>,
	Ewan Milne <emilne@redhat.com>, Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
Date: Mon, 17 Oct 2016 09:20:05 -0700	[thread overview]
Message-ID: <1476721205.2734.12.camel@linux.vnet.ibm.com> (raw)
In-Reply-To: <9e5fbcef-bd72-71be-76f6-f41e26b5cb3c@redhat.com>

On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> > On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> > > megaraid_sas driver returns SYNCHRONIZE_CACHE command back to 
> > > SCSI layer without sending it to firmware as firmware takes care 
> > > of flushing cache.  This patch will change the driver behavior 
> > > wrt SYNCHRONIZE_CACHE handling. If underlying firmware has 
> > > support to handle the SYNCHORNIZE_CACHE, driver will send it for 
> > > firmware otherwise complete it back to SCSI layer with SUCCESS 
> > > immediately.  If  Firmware  handle SYNCHORNIZE_CACHE for both VD 
> > > and JBOD "canHandleSyncCache" bit in scratch pad register(offset 
> > > 0x00B4) will be set.
> > > 
> > > This behavior can be controlled via module parameter and user can 
> > > fallback to old behavior of returning SYNCHRONIZE_CACHE by driver 
> > > only without sending it to firmware.
> > > 
> > > Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> > > Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> > > ---
> > >   drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> > >   drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
> > >   drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> > >   drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> > >   4 files changed, 14 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> > > b/drivers/scsi/megaraid/megaraid_sas.h
> > > index ca86c88..43fd14f 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas.h
> > > +++ b/drivers/scsi/megaraid/megaraid_sas.h
> > > @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> > >   #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> > >   #define MR_MAX_MSIX_REG_ARRAY                   16
> > >   #define MR_RDPQ_MODE_OFFSET			0X00800000
> > > +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X0100000
> > > 0
> > > +
> > >   /*
> > >   * register set for both 1068 and 1078 controllers
> > >   * structure extended for 1078 registers
> > > @@ -2140,6 +2142,7 @@ struct megasas_instance {
> > >   	u8 is_imr;
> > >   	u8 is_rdpq;
> > >   	bool dev_handle;
> > > +	bool fw_sync_cache_support;
> > >   };
> > >   struct MR_LD_VF_MAP {
> > >   	u32 size;
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > index 092387f..a4a8e2f 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> > > MEGASAS_DEFAULT_CMD_TIMEOUT;
> > >   module_param(scmd_timeout, int, S_IRUGO);
> > >   MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s),
> > > default 90s. See megasas_reset_timer.");
> > >   
> > > +bool block_sync_cache;
> > > +module_param(block_sync_cache, bool, S_IRUGO);
> > > +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
> > > Default: 0(Send it to firmware)");
> > > +
> > >   MODULE_LICENSE("GPL");
> > >   MODULE_VERSION(MEGASAS_VERSION);
> > >   MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> > > @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
> > > *shost, struct scsi_cmnd *scmd)
> > >   		goto out_done;
> > >   	}
> > >   
> > > -	switch (scmd->cmnd[0]) {
> > > -	case SYNCHRONIZE_CACHE:
> > > -		/*
> > > -		 * FW takes care of flush cache on its own
> > > -		 * No need to send it down
> > > -		 */
> > > +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> > > +		(!instance->fw_sync_cache_support)) {
> > >   		scmd->result = DID_OK << 16;
> > >   		goto out_done;
> > > -	default:
> > > -		break;
> > >   	}
> > >   
> > >   	return instance->instancet
> > > ->build_and_issue_cmd(instance, scmd);
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > index 2159f6a..8237580 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> > > megasas_instance *instance)
> > >   		ret = 1;
> > >   		goto fail_fw_init;
> > >   	}
> > > +	if (!block_sync_cache)
> > > +		instance->fw_sync_cache_support = (scratch_pad_2
> > > &
> > > +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 :
> > > 0;
> > >   
> > >   	IOCInitMessage =
> > >   	  dma_alloc_coherent(&instance->pdev->dev,
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > index e3bee04..2857154 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > @@ -72,6 +72,8 @@
> > >   #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
> > >   #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
> > >   
> > > +extern bool block_sync_cache;
> > > +
> > >   /*
> > >    * Raid context flags
> > >    */
> > > 
> > Be extra careful with that.
> > 
> > SYNCHRONIZE_CACHE has (potentially) a global scope, as there might 
> > be an array-wide cache, and a cache flush would affect the entire 
> > cache. Linux is using SYNCHRONIZE_CACHE as a per device setting, ie 
> > it assumes that the effects of a cache flush is restricted to the 
> > device in question.
> > 
> > If this is _not_ the case I'd rather not enable it.
> > Have you checked that enabling this functionality doesn't have 
> > negative performance impact?
> > 
> > Cheers,
> > 
> > Hannes
> 
> This must go in - without this fix, there is no data integrity for
> any file system.

That's not what I get from the change log.  What it says to me is that
the caches are currently firmware managed.  Barring firmware bugs, that
means that we currently don't have any integrity issues.

> In effect, this driver by default has been throwing away 
> SYNCHRONIZE_CACHE commands even when acting in JBOD/non-RAID mode.

That's not what the changelog says.  It says the cache flushing has
been managed by the firmware only.  Meaning the firmware decides when
to flush the cache.  Barring firmware bugs, this should mean that
integrity is preserved in either mode.

> Of course, actually doing a SYNCHRONIZE_CACHE to drives will be 
> slower than throwing it away, but this is not optional.

What Hannes means is that we need to know that turning over cache
management to Linux is a) safe and b) not a performance regression. 
 Given that there aren't any integrity issues, that's a reasonable
request.

> We really need to have some ways to validate that our IO stack is 
> properly and safely configured.
> 
> I would love to see a couple of things:
> 
> * having T10 & T13 report the existence of a volatile write cache - 
> this is different than WCE set, some devices have a write cache and 
> are battery/flash backed.

That's the non-volatile cache log page.

James


> * having a robust tool to test over power failure/disconnect that our
> assumptions are true - any write is durable after a SYNCHRONIZE_CACHE 
> or CACHE_FLUSH_EXT is sent and ack'ed
> 
> Regards,
> 
> Ric
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


  parent reply	other threads:[~2016-10-17 16:20 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
2016-10-17 10:24 ` [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset Sumit Saxena
2016-10-17 11:29   ` Hannes Reinecke
2016-10-17 11:29     ` Hannes Reinecke
2016-10-17 12:43   ` Tomas Henzl
2016-10-17 10:24 ` [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade Sumit Saxena
2016-10-17 11:29   ` Hannes Reinecke
2016-10-17 12:50   ` Tomas Henzl
2016-10-17 10:24 ` [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach Sumit Saxena
2016-10-17 11:31   ` Hannes Reinecke
2016-10-17 12:19     ` Sumit Saxena
2016-10-17 10:24 ` [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware Sumit Saxena
2016-10-17 11:34   ` Hannes Reinecke
2016-10-17 12:13     ` Sumit Saxena
2016-10-17 13:01     ` Ric Wheeler
2016-10-17 13:31       ` Sumit Saxena
2016-10-17 15:55       ` Christoph Hellwig
2016-10-17 16:08         ` Ric Wheeler
2016-10-17 16:23           ` Ewan D. Milne
2016-10-17 16:20       ` James Bottomley [this message]
2016-10-17 16:27         ` Ric Wheeler
2016-10-17 17:19           ` James Bottomley
2016-10-17 17:30             ` Ric Wheeler
2016-10-17 17:36             ` Kashyap Desai
2016-10-17 17:51               ` James Bottomley
2016-10-18 13:56                 ` Tomas Henzl
2016-10-19  9:50                   ` Ching Huang
2016-10-19 12:55                     ` Tomas Henzl
2016-10-19 18:07                       ` Raghava Aditya Renukunta
2016-10-21 16:30                         ` Tomas Henzl
2016-10-25  2:02                       ` Martin K. Petersen
     [not found]                         ` <CAKiknE_MM88eVON1g_qy7wbb2fkxdAs6O0SRSzN-RbQqSvx1eA@mail.gmail.com>
2016-10-27  2:07                           ` Martin K. Petersen
2016-10-18 18:47                 ` Sumit Saxena
2016-10-17 16:29         ` Ric Wheeler
2016-10-17 13:13   ` Tomas Henzl
2016-10-17 13:28     ` Sumit Saxena
2016-10-17 13:57       ` Tomas Henzl
2016-10-17 14:25         ` Sumit Saxena
2016-10-18 13:07         ` Ric Wheeler
2016-10-18 13:22           ` Sumit Saxena
2016-10-17 10:24 ` [PATCH 5/7] megaraid_sas: driver version upgrade Sumit Saxena
2016-10-17 11:35   ` Hannes Reinecke
2016-10-17 12:20     ` Sumit Saxena
2016-10-17 10:24 ` [PATCH 6/7] MAINTAINERS: Update megaraid maintainers list Sumit Saxena
2016-10-17 11:37   ` Hannes Reinecke
2016-10-17 10:24 ` [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map Sumit Saxena
2016-10-17 11:37   ` Hannes Reinecke
2016-10-17 11:37     ` Hannes Reinecke
2016-10-17 13:23   ` Tomas Henzl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1476721205.2734.12.camel@linux.vnet.ibm.com \
    --to=jejb@linux.vnet.ibm.com \
    --cc=axboe@kernel.dk \
    --cc=emilne@redhat.com \
    --cc=fge@redhat.com \
    --cc=hare@suse.de \
    --cc=hch@infradead.org \
    --cc=jmoyer@redhat.com \
    --cc=kashyap.desai@broadcom.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=mkp@mkp.net \
    --cc=rwheeler@redhat.com \
    --cc=sumit.saxena@broadcom.com \
    --cc=thenzl@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.