* Stability of FILEIO as backing store?
From: Forza @ 2021-02-25 21:36 UTC
To: target-devel

Hi,

I have a weird issue when using a file as backing store with a Win2016 server as initiator.

Very often, if I reboot the Linux server, the disk image becomes corrupt so that Windows cannot even detect the GPT partition table on it. It can happen even if I shut down the Windows machine before I reboot the Linux server.

Initially I thought it would be the write cache, but I've disabled that with no benefit to this problem.

There are no errors in dmesg except initially when loading the target. Perhaps I'm doing something wrong when rebooting?

[ 71.583665] dev[0000000064b6f5d8]: Unable to change SE Device alua_support: alua_support has fixed value
[ 71.583676] dev[0000000064b6f5d8]: Unable to change SE Device alua_support: alua_support has fixed value
[ 71.583837] ignoring deprecated emulate_dpo attribute
[ 71.583874] ignoring deprecated emulate_fua_read attribute
[ 71.584553] dev[0000000064b6f5d8]: Unable to change SE Device pgr_support: pgr_support has fixed value
[ 71.584561] dev[0000000064b6f5d8]: Unable to change SE Device pgr_support: pgr_support has fixed value

The LIO target is running Fedora 33 Server with two Seagate Exos 10TB drives in Btrfs RAID-1 mode.

Are there any debugging options that would help?

Thanks for any advice.

~Forza
* Re: Stability of FILEIO as backing store?
From: Chaitanya Kulkarni @ 2021-02-25 22:09 UTC
To: Forza, target-devel

On 2/25/21 13:49, Forza wrote:
> Hi,
>
> I have a weird issue with using a file as backing store with a Win2016 server as initiator.
>
> Very often if I reboot the Linux server the disk image becomes corrupt so that Windows cannot even detect the gpt partition table on it. It can happen even if I shut down the Windows machine before I reboot the Linux server.

You first need to isolate the problem by running a data verification test with the loop transport on Linux, and make sure everything works there before you move on to the Windows initiator.

> Initially I thought it would be write cache. But I've disabled that with no benefit to this problem.

What steps did you take to conclude that it was not the write cache?

> There are no errors in dmesg except initially when loading the target. Perhaps I'm doing wrong when rebooting?
>
>
> [ 71.583665] dev[0000000064b6f5d8]: Unable to change SE Device alua_support: alua_support has fixed value
> [ 71.583676] dev[0000000064b6f5d8]: Unable to change SE Device alua_support: alua_support has fixed value
> [ 71.583837] ignoring deprecated emulate_dpo attribute
> [ 71.583874] ignoring deprecated emulate_fua_read attribute
> [ 71.584553] dev[0000000064b6f5d8]: Unable to change SE Device pgr_support: pgr_support has fixed value
> [ 71.584561] dev[0000000064b6f5d8]: Unable to change SE Device pgr_support: pgr_support has fixed value
>
>
> The LIO target is running Fedora 33 Server with two Seagate Exos 10TB in Btrfs RAID-1 mode.
>
> Are there any debugging options that would help?

One way to go about it is to turn on target tracing and examine the commands to see which command is failing, if the target has command-level tracing implemented. This will allow other developers to help you more.

Also, if there is a problem with the file system, you might want to run fsck before you establish the connection, to make sure you have not encountered file-system-level errors.

> Thanks for any advice.

Also, please mention which kernel version you are using so that other developers can help you more.

>
> ~Forza
>
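As an illustration of the kind of data verification test suggested above, here is a minimal sketch in Python. It is not an existing tool and not what anyone in the thread actually ran: the function names are made up for this example, and the path you pass in (for instance a block device exposed through the loop transport) is an assumption. It writes a deterministic, index-dependent pattern, flushes it to stable storage, and later reports which blocks no longer match.

```python
import hashlib
import os
from typing import List


def pattern_block(index: int, block_size: int) -> bytes:
    """Return a deterministic block whose content depends on its index,
    so that misplaced or stale blocks are detectable on read-back."""
    seed = hashlib.sha256(index.to_bytes(8, "little")).digest()
    reps = block_size // len(seed) + 1
    return (seed * reps)[:block_size]


def write_pattern(path: str, blocks: int, block_size: int = 4096) -> None:
    """Write `blocks` pattern blocks to `path` and force them to stable storage."""
    # Open an existing file or device in place; create the file otherwise.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        for i in range(blocks):
            f.write(pattern_block(i, block_size))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path: str, blocks: int, block_size: int = 4096) -> List[int]:
    """Return the indices of all blocks that do not match the expected pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(block_size) != pattern_block(i, block_size):
                bad.append(i)
    return bad
```

A possible workflow would be to call write_pattern() on the LUN as seen from a Linux initiator, reboot the target, and then call verify_pattern(); an empty list means every block survived. Note that this overwrites whatever is on the device, so it should only be pointed at a scratch LUN.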
* Re: Stability of FILEIO as backing store?
From: Chaitanya Kulkarni @ 2021-02-25 22:20 UTC
To: Forza, target-devel

On 2/25/21 14:12, Chaitanya Kulkarni wrote:
> On 2/25/21 13:49, Forza wrote:
>
>> The LIO target is running Fedora 33 Server with two Seagate Exos 10TB in Btrfs RAID-1 mode.
>>
>> Are there any debugging options that would help?

You can try the following patch to get more information about command completion errors:

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 93ea17cbad79..f4e6e1c18867 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -873,6 +873,11 @@ void target_complete_cmd(struct se_cmd *cmd, u8 scsi_status)
 	cmd->transport_state |= (CMD_T_COMPLETE | CMD_T_ACTIVE);
 	spin_unlock_irqrestore(&cmd->t_state_lock, flags);
 
+	if (!success)
+		pr_err("%s %d cmd->scsi_status 0x%x"
+		       "cmd->se_cmd_flags 0x%x\n", __func__, __LINE__,
+		       cmd->scsi_status, cmd->se_cmd_flags);
+
 	INIT_WORK(&cmd->work, success ? target_complete_ok_work :
 		  target_complete_failure_work);
 	queue_work_on(cmd->cpuid, target_completion_wq, &cmd->work);

> One way to go about it is to turn on the target tracing and examine the
> commands to
> see which command is failing if target has command level tracing
> implemented.
> This will allow other developers to help you more.
>
> Also if there a problem with the file system then you might want to run
> fsck before you
> establish the connection to make sure you have not encountered file
> system level errors.
>> Thanks for any advice.
> Also please mentioned that which kernel version you are using so that other
> developers can help you more.
>> ~Forza
>>
>
* Re: Stability of FILEIO as backing store?
From: michael.christie @ 2021-02-26 3:41 UTC
To: Forza, target-devel

On 2/25/21 3:36 PM, Forza wrote:
> Hi,
>
> I have a weird issue with using a file as backing store with a Win2016 server as initiator.
>
> Very often if I reboot the Linux server the disk image becomes corrupt so that Windows cannot even detect the gpt partition table on it. It can happen even if I shut down the Windows machine before I reboot the Linux server.
>
> Initially I thought I would be write cache. But I've disabled that with no benefit to this problem.
>

How are you disabling the write cache? What tools do you use? Is it targetcli or are you doing this manually via configfs?

What is the output of

cat /sys/kernel/config/target/core/fileio_$N/$name/info
cat /sys/kernel/config/target/core/fileio_$N/$name/attrib/write_cache

?

If you do a sync manually after shutting down windows does it help?

Are you accessing this from multiple windows machines at the same time?

What target driver are you using?
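The two configfs reads requested above can be collected for every FILEIO backstore at once. The following is a small sketch, assuming the configfs layout shown in the commands above and using the attribute name emulate_write_cache, which later messages in this thread confirm is the one that actually exists. The function name read_fileio_cache_state is hypothetical.

```python
from pathlib import Path

# configfs location used in the commands quoted in this thread
CONFIGFS_CORE = Path("/sys/kernel/config/target/core")


def read_fileio_cache_state(core: Path = CONFIGFS_CORE) -> dict:
    """Map each FILEIO backstore name to its info text and emulate_write_cache value."""
    state = {}
    for dev in sorted(core.glob("fileio_*/*")):
        # Skip plain files that sit next to the device directories.
        if not dev.is_dir():
            continue
        info = dev / "info"
        attr = dev / "attrib" / "emulate_write_cache"
        if not (info.is_file() and attr.is_file()):
            continue
        state[dev.name] = {
            # Collapse whitespace so the multi-line info text fits on one line.
            "info": " ".join(info.read_text().split()),
            "emulate_write_cache": attr.read_text().strip(),
        }
    return state


if __name__ == "__main__":
    for name, values in read_fileio_cache_state().items():
        print(name, values["emulate_write_cache"], values["info"])
```

Reading configfs normally requires root. The "Mode:" field inside the info text is the part to look at when checking whether the write cache is really off.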
* Re: Stability of FILEIO as backing store?
From: Forza @ 2021-02-26 7:19 UTC
To: michael.christie, target-devel

---- From: michael.christie@oracle.com -- Sent: 2021-02-26 - 04:41 ----

> On 2/25/21 3:36 PM, Forza wrote:
>> Hi,
>>
>> I have a weird issue with using a file as backing store with a Win2016 server as initiator.
>>
>> Very often if I reboot the Linux server the disk image becomes corrupt so that Windows cannot even detect the gpt partition table on it. It can happen even if I shut down the Windows machine before I reboot the Linux server.
>>
>> Initially I thought it would be write cache. But I've disabled that with no benefit to this problem.
>>
>
> How are you disabling the write cache? What tools do you use? Is
> it targetcli or are you doing this manually via configfs?

I am using targetcli, following the documentation at http://linux-iscsi.org/Doc/LIO%20Admin%20Manual.pdf

I created the backstore/fileio with write_back=false, but this might not have disabled the cache, since I see in the saveconfig.json that it is still set to true. So I guess this can still be an issue =(

>
> What is the output of
>
> cat /sys/kernel/config/target/core/fileio_$N/$name/info

Status: ACTIVATED Max Queue Depth: 128 SectorSize: 512 HwMaxSectors: 16384
TCM FILEIO ID: 0 File: /media/iscsi-tgt/dx_media_3.img Size: 429496729600 Mode: Buffered-WCE Async: 0

> cat /sys/kernel/config/target/core/fileio_$N/$name/attrib/write_cache

This does not exist. The current files are:

# grep . *
alua_support:1
block_size:512
emulate_3pc:1
emulate_caw:1
emulate_dpo:1
emulate_fua_read:1
emulate_fua_write:1
emulate_model_alias:1
emulate_pr:1
emulate_rest_reord:0
emulate_tas:1
emulate_tpu:0
emulate_tpws:0
emulate_ua_intlck_ctrl:0
emulate_write_cache:1
enforce_pr_isids:1
force_pr_aptpl:0
hw_block_size:512
hw_max_sectors:16384
hw_pi_prot_type:0
hw_queue_depth:128
is_nonrot:0
max_unmap_block_desc_count:1
max_unmap_lba_count:8192
max_write_same_len:4096
optimal_sectors:16384
pgr_support:1
pi_prot_format:0
pi_prot_type:0
pi_prot_verify:0
queue_depth:128
unmap_granularity:1
unmap_granularity_alignment:0
unmap_zeroes_data:0

>
> ?
>
> If you do a sync manually after shutting down windows does it help?

No.

>
> Are you accessing this from multiple windows machines at the same time?

Only one client.

> What target driver are you using?

I am using the FILEIO target.

Here is the output of saveconfig.json. Originally I had two exports (dxdep2, dxmedia2) but added two more to test different settings (block size and aio).

{ "fabric_modules": [], "storage_objects": [ { "aio": false, "alua_tpgs": [ { "alua_access_state": 0, "alua_access_status": 0, "alua_access_type": 3, "alua_support_active_nonoptimized": 1, "alua_support_active_optimized": 1, "alua_support_offline": 1, "alua_support_standby": 1, "alua_support_transitioning": 1, "alua_support_unavailable": 1, "alua_write_metadata": 0, "implicit_trans_secs": 0, "name": "default_tg_pt_gp", "nonop_delay_msecs": 100, "preferred": 0, "tg_pt_gp_id": 0, "trans_delay_msecs": 0 } ], "attributes": { "alua_support": 1, "block_size": 512, "emulate_3pc": 1, "emulate_caw": 1, "emulate_dpo": 1, "emulate_fua_read": 1, "emulate_fua_write": 1, "emulate_model_alias": 1, "emulate_pr": 1, "emulate_rest_reord": 0, "emulate_tas": 1, "emulate_tpu": 0, "emulate_tpws": 0, "emulate_ua_intlck_ctrl": 0, "emulate_write_cache": 1, "enforce_pr_isids": 1, "force_pr_aptpl": 0, "is_nonrot": 0, "max_unmap_block_desc_count": 1, "max_unmap_lba_count": 8192, "max_write_same_len": 4096, "optimal_sectors": 16384, "pgr_support": 1, "pi_prot_format": 0, "pi_prot_type": 0, "pi_prot_verify": 0, "queue_depth": 128, "unmap_granularity": 1, "unmap_granularity_alignment": 0, "unmap_zeroes_data": 0 }, "dev": "/media/iscsi-tgt/dx_media_3.img", "name": "dxmedia3", "plugin": "fileio", "size": 429496729600, "write_back": true, "wwn": "a53bd5cc-85d7-47dc-a4d2-9682e9a7b82a" }, { "aio": false, "alua_tpgs": [ { "alua_access_state": 0, "alua_access_status": 0, "alua_access_type": 3, "alua_support_active_nonoptimized": 1, "alua_support_active_optimized": 1, "alua_support_offline": 1, "alua_support_standby": 1, "alua_support_transitioning": 1, "alua_support_unavailable": 1, "alua_write_metadata": 0, "implicit_trans_secs": 0, "name": "default_tg_pt_gp", "nonop_delay_msecs": 100, "preferred": 0, "tg_pt_gp_id": 0, "trans_delay_msecs": 0 } ], "attributes": { "alua_support": 1, "block_size": 512, "emulate_3pc": 1, "emulate_caw": 1, "emulate_dpo": 1, "emulate_fua_read": 1, "emulate_fua_write": 1, 
"emulate_model_alias": 1, "emulate_pr": 1, "emulate_rest_reord": 0, "emulate_tas": 1, "emulate_tpu": 0, "emulate_tpws": 0, "emulate_ua_intlck_ctrl": 0, "emulate_write_cache": 1, "enforce_pr_isids": 1, "force_pr_aptpl": 0, "is_nonrot": 0, "max_unmap_block_desc_count": 1, "max_unmap_lba_count": 8192, "max_write_same_len": 4096, "optimal_sectors": 16384, "pgr_support": 1, "pi_prot_format": 0, "pi_prot_type": 0, "pi_prot_verify": 0, "queue_depth": 128, "unmap_granularity": 1, "unmap_granularity_alignment": 0, "unmap_zeroes_data": 0 }, "dev": "/media/iscsi-tgt/dx_dep_3.img", "name": "dxdep3", "plugin": "fileio", "size": 966367641600, "write_back": true, "wwn": "253f2cc0-209c-4a93-b110-9dc45e52229e" }, { "aio": true, "alua_tpgs": [ { "alua_access_state": 0, "alua_access_status": 0, "alua_access_type": 3, "alua_support_active_nonoptimized": 1, "alua_support_active_optimized": 1, "alua_support_offline": 1, "alua_support_standby": 1, "alua_support_transitioning": 1, "alua_support_unavailable": 1, "alua_write_metadata": 0, "implicit_trans_secs": 0, "name": "default_tg_pt_gp", "nonop_delay_msecs": 100, "preferred": 0, "tg_pt_gp_id": 0, "trans_delay_msecs": 0 } ], "attributes": { "alua_support": 1, "block_size": 4096, "emulate_3pc": 1, "emulate_caw": 1, "emulate_dpo": 1, "emulate_fua_read": 1, "emulate_fua_write": 1, "emulate_model_alias": 1, "emulate_pr": 1, "emulate_rest_reord": 0, "emulate_tas": 1, "emulate_tpu": 0, "emulate_tpws": 0, "emulate_ua_intlck_ctrl": 0, "emulate_write_cache": 1, "enforce_pr_isids": 1, "force_pr_aptpl": 0, "is_nonrot": 0, "max_unmap_block_desc_count": 1, "max_unmap_lba_count": 8192, "max_write_same_len": 4096, "optimal_sectors": 2048, "pgr_support": 1, "pi_prot_format": 0, "pi_prot_type": 0, "pi_prot_verify": 0, "queue_depth": 128, "unmap_granularity": 1, "unmap_granularity_alignment": 0, "unmap_zeroes_data": 0 }, "dev": "/media/iscsi-tgt/dx_media_2.img", "name": "dxmedia2", "plugin": "fileio", "size": 858993459200, "write_back": true, "wwn": 
"da09b66d-5b23-4540-ab4a-f00b03af294f" }, { "aio": true, "alua_tpgs": [ { "alua_access_state": 0, "alua_access_status": 0, "alua_access_type": 3, "alua_support_active_nonoptimized": 1, "alua_support_active_optimized": 1, "alua_support_offline": 1, "alua_support_standby": 1, "alua_support_transitioning": 1, "alua_support_unavailable": 1, "alua_write_metadata": 0, "implicit_trans_secs": 0, "name": "default_tg_pt_gp", "nonop_delay_msecs": 100, "preferred": 0, "tg_pt_gp_id": 0, "trans_delay_msecs": 0 } ], "attributes": { "alua_support": 1, "block_size": 4096, "emulate_3pc": 1, "emulate_caw": 1, "emulate_dpo": 1, "emulate_fua_read": 1, "emulate_fua_write": 1, "emulate_model_alias": 1, "emulate_pr": 1, "emulate_rest_reord": 0, "emulate_tas": 1, "emulate_tpu": 0, "emulate_tpws": 0, "emulate_ua_intlck_ctrl": 0, "emulate_write_cache": 1, "enforce_pr_isids": 1, "force_pr_aptpl": 0, "is_nonrot": 0, "max_unmap_block_desc_count": 1, "max_unmap_lba_count": 8192, "max_write_same_len": 4096, "optimal_sectors": 2048, "pgr_support": 1, "pi_prot_format": 0, "pi_prot_type": 0, "pi_prot_verify": 0, "queue_depth": 128, "unmap_granularity": 1, "unmap_granularity_alignment": 0, "unmap_zeroes_data": 0 }, "dev": "/media/iscsi-tgt/dx_dep_2.img", "name": "dxdep2", "plugin": "fileio", "size": 858993459200, "write_back": true, "wwn": "924a476a-482f-414e-8011-83909d8b3b6e" } ], "targets": [ { "fabric": "iscsi", "tpgs": [ { "attributes": { "authentication": 0, "cache_dynamic_acls": 1, "default_cmdsn_depth": 64, "default_erl": 0, "demo_mode_discovery": 1, "demo_mode_write_protect": 1, "fabric_prot_type": 0, "generate_node_acls": 0, "login_keys_workaround": 1, "login_timeout": 15, "netif_timeout": 2, "prod_mode_write_protect": 0, "t10_pi": 0, "tpg_enabled_sendtargets": 1 }, "chap_password": "{redacted}", "chap_userid": "iqn.1991-05.com.microsoft:{redacted}", "enable": true, "luns": [ { "alias": "0cc1743f36", "alua_tg_pt_gp_name": "default_tg_pt_gp", "index": 1, "storage_object": 
"/backstores/fileio/dxmedia3" }, { "alias": "3d56464288", "alua_tg_pt_gp_name": "default_tg_pt_gp", "index": 0, "storage_object": "/backstores/fileio/dxdep3" }, { "alias": "5f63b78f76", "alua_tg_pt_gp_name": "default_tg_pt_gp", "index": 3, "storage_object": "/backstores/fileio/dxmedia2" }, { "alias": "aa08fefd1d", "alua_tg_pt_gp_name": "default_tg_pt_gp", "index": 2, "storage_object": "/backstores/fileio/dxdep2" } ], "node_acls": [ { "attributes": { "dataout_timeout": 5, "dataout_timeout_retries": 5, "default_erl": 0, "nopin_response_timeout": 30, "nopin_timeout": 15, "random_datain_pdu_offsets": 0, "random_datain_seq_offsets": 0, "random_r2t_offsets": 0 }, "chap_mutual_password": "{redacted}", "chap_mutual_userid": "{redacted}", "chap_password": "{redacted}", "chap_userid": "iqn.1991-05.com.microsoft:{redacted}", "mapped_luns": [ { "alias": "a2cce5a7e4", "index": 1, "tpg_lun": 1, "write_protect": false }, { "alias": "5e0c5884fa", "index": 0, "tpg_lun": 0, "write_protect": false }, { "alias": "3fb61f1c01", "index": 3, "tpg_lun": 3, "write_protect": false }, { "alias": "ff50cee375", "index": 2, "tpg_lun": 2, "write_protect": false } ], "node_wwn": "iqn.1991-05.com.microsoft:{redacted}" } ], "parameters": { "AuthMethod": "CHAP,None", "DataDigest": "CRC32C", "DataPDUInOrder": "Yes", "DataSequenceInOrder": "Yes", "DefaultTime2Retain": "20", "DefaultTime2Wait": "4", "ErrorRecoveryLevel": "2", "FirstBurstLength": "65536", "HeaderDigest": "CRC32C", "IFMarkInt": "Reject", "IFMarker": "No", "ImmediateData": "Yes", "InitialR2T": "Yes", "MaxBurstLength": "262144", "MaxConnections": "2", "MaxOutstandingR2T": "1", "MaxRecvDataSegmentLength": "8192", "MaxXmitDataSegmentLength": "262144", "OFMarkInt": "Reject", "OFMarker": "No", "TargetAlias": "LIO Target" }, "portals": [ { "ip_address": "{redacted}", "iser": false, "offload": false, "port": 3260 } ], "tag": 1 } ], "wwn": "iqn.2020-02.{redacted}:san01" } ] } Thanks! /Forza ^ permalink raw reply [flat|nested] 8+ messages in thread
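Since the question in this message is whether write_back=false actually took effect, the relevant fields can be pulled out of a saveconfig.json dump programmatically. The sketch below assumes only the JSON keys visible in the dump above ("storage_objects", "plugin", "name", "write_back", "aio", "attributes", "emulate_write_cache"). The function name summarize_fileio and the file path in the usage part are assumptions for illustration.

```python
import json
from typing import Dict, List


def summarize_fileio(config: dict) -> List[Dict]:
    """List the caching-related settings of every fileio storage object."""
    summary = []
    for so in config.get("storage_objects", []):
        if so.get("plugin") != "fileio":
            continue
        attrs = so.get("attributes", {})
        summary.append({
            "name": so.get("name"),
            "write_back": so.get("write_back"),
            "aio": so.get("aio"),
            "emulate_write_cache": attrs.get("emulate_write_cache"),
        })
    return summary


if __name__ == "__main__":
    # Assumed location; adjust to wherever the saved configuration lives.
    with open("/etc/target/saveconfig.json") as f:
        for row in summarize_fileio(json.load(f)):
            print(row)
```

Run against the dump above, every storage object would be reported with write_back true and emulate_write_cache 1, and dxmedia2 and dxdep2 additionally with aio true.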
* Re: Stability of FILEIO as backing store?
From: Mike Christie @ 2021-02-26 16:12 UTC
To: Forza, target-devel

On 2/26/21 1:19 AM, Forza wrote:
>
>
> ---- From: michael.christie@oracle.com -- Sent: 2021-02-26 - 04:41 ----
>
>> On 2/25/21 3:36 PM, Forza wrote:
>>> Hi,
>>>
>>> I have a weird issue with using a file as backing store with a Win2016 server as initiator.
>>>
>>> Very often if I reboot the Linux server the disk image becomes corrupt so that Windows cannot even detect the gpt partition table on it. It can happen even if I shut down the Windows machine before I reboot the Linux server.
>>>
>>> Initially I thought it would be write cache. But I've disabled that with no benefit to this problem.
>>>
>>
>> How are you disabling the write cache? What tools do you use? Is
>> it targetcli or are you doing this manually via configfs?
>
> I am using targetcli, using the documentation http://linux-iscsi.org/Doc/LIO%20Admin%20Manual.pdf
> I created the backstore/fileio with write_back=false but this might not have disabled cache since I see in the saveconfig.json that it is still set to true. So I guess this can still be an issue =(
>
>
>>
>> What is the output of
>>
>> cat /sys/kernel/config/target/core/fileio_$N/$name/info
>
> Status: ACTIVATED Max Queue Depth: 128 SectorSize: 512 HwMaxSectors: 16384
> TCM FILEIO ID: 0 File: /media/iscsi-tgt/dx_media_3.img Size: 429496729600 Mode: Buffered-WCE Async: 0
>

I think you need to ask the people who maintain your tools (Fedora or upstream https://github.com/open-iscsi/targetcli-fb), because on upstream's master branch it looks like write_back=false should work, but above we see "Mode: Buffered-WCE" and below we see emulate_write_cache=1, like you mentioned.

>
>> cat /sys/kernel/config/target/core/fileio_$N/$name/attrib/write_cache
>
> This does not exist. The current files are:
>

Sorry, I meant emulate_write_cache.

> # grep . *
> alua_support:1
> block_size:512
> emulate_3pc:1
> emulate_caw:1
> emulate_dpo:1
> emulate_fua_read:1
> emulate_fua_write:1
> emulate_model_alias:1
> emulate_pr:1
> emulate_rest_reord:0
> emulate_tas:1
> emulate_tpu:0
> emulate_tpws:0
> emulate_ua_intlck_ctrl:0
> emulate_write_cache:1
> enforce_pr_isids:1
> force_pr_aptpl:0
> hw_block_size:512
> hw_max_sectors:16384
> hw_pi_prot_type:0
> hw_queue_depth:128
> is_nonrot:0
> max_unmap_block_desc_count:1
> max_unmap_lba_count:8192
> max_write_same_len:4096
> optimal_sectors:16384
> pgr_support:1
> pi_prot_format:0
> pi_prot_type:0
> pi_prot_verify:0
> queue_depth:128
> unmap_granularity:1
> unmap_granularity_alignment:0
> unmap_zeroes_data:0
* Re: Stability of FILEIO as backing store?
From: Forza @ 2021-03-01 15:38 UTC
To: Mike Christie, target-devel

---- From: Mike Christie <michael.christie@oracle.com> -- Sent: 2021-02-26 - 17:12 ----

> On 2/26/21 1:19 AM, Forza wrote:
>>
>>
>> ---- From: michael.christie@oracle.com -- Sent: 2021-02-26 - 04:41 ----
>>
>>> On 2/25/21 3:36 PM, Forza wrote:
>>>> Hi,
>>>>
>>>> I have a weird issue with using a file as backing store with a Win2016 server as initiator.
>>>>
>>>> Very often if I reboot the Linux server the disk image becomes corrupt so that Windows cannot even detect the gpt partition table on it. It can happen even if I shut down the Windows machine before I reboot the Linux server.
>>>>
>>>> Initially I thought it would be write cache. But I've disabled that with no benefit to this problem.
>>>>
>>>
>>> How are you disabling the write cache? What tools do you use? Is
>>> it targetcli or are you doing this manually via configfs?
>>
>> I am using targetcli, using the documentation http://linux-iscsi.org/Doc/LIO%20Admin%20Manual.pdf
>> I created the backstore/fileio with write_back=false but this might not have disabled cache since I see in the saveconfig.json that it is still set to true. So I guess this can still be an issue =(
>>
>>
>>>
>>> What is the output of
>>>
>>> cat /sys/kernel/config/target/core/fileio_$N/$name/info
>>
>> Status: ACTIVATED Max Queue Depth: 128 SectorSize: 512 HwMaxSectors: 16384
>> TCM FILEIO ID: 0 File: /media/iscsi-tgt/dx_media_3.img Size: 429496729600 Mode: Buffered-WCE Async: 0
>>
>
> I think you need to ask the people that maintain your tools (fedora or upstream
> https://github.com/open-iscsi/targetcli-fb), because for upstream's master branch
> it looks like doing write_back=false should work, but above we see "Mode: Buffered-WCE"
> and below we see emulate_write_cache=1 like you mentioned.

I think you might be right. In the end, as I needed to get this to work, I swapped Fedora for Ubuntu Server 20.04.2 LTS with the HWE kernel.

Since I changed to Ubuntu I have not had any issues. I've tested hard reboots and unclean shutdowns with no issue. Go figure...

Fedora automatically loads the saved config using "targetctl restore", while on Ubuntu automatic restore is not enabled, so I added a cron @reboot line to do "targetcli restoreconfig /etc/target/myconfig.json". Perhaps that avoids some race during boot? Are there any technical differences between "targetctl restore" and "targetcli restoreconfig saveconfig.json"?

There are some other changes that might be the most important. "emulate_write_cache" is now false, and the info file in configfs shows "Mode: O_DSYNC" instead of "Buffered-WCE".

Also, the "aio" attribute is different. Perhaps I set it manually in Fedora, but I cannot remember. It is not visible in /sys but is in the saveconfig.json.

Ubuntu:
"storage_objects": [
{
"aio": false,

Fedora:
"storage_objects": [
{
"aio": true,

Does "aio" mean Async I/O in this case? I could not find any documentation for this attribute. What implication would false vs. true have?

# grep . info
Status: ACTIVATED Max Queue Depth: 128 SectorSize: 512 HwMaxSectors: 16384
TCM FILEIO ID: 0 File: /media/iscsi-tgt/dx_media_3.img Size: 429496729600 Mode: O_DSYNC Async: 0

# grep . attrib/*
attrib/alua_support:1
attrib/block_size:512
attrib/emulate_3pc:1
attrib/emulate_caw:1
attrib/emulate_dpo:1
attrib/emulate_fua_read:1
attrib/emulate_fua_write:1
attrib/emulate_model_alias:1
attrib/emulate_pr:1
attrib/emulate_rest_reord:0
attrib/emulate_tas:1
attrib/emulate_tpu:0
attrib/emulate_tpws:0
attrib/emulate_ua_intlck_ctrl:0
attrib/emulate_write_cache:0
attrib/enforce_pr_isids:1
attrib/force_pr_aptpl:0
attrib/hw_block_size:512
attrib/hw_max_sectors:16384
attrib/hw_pi_prot_type:0
attrib/hw_queue_depth:128
attrib/is_nonrot:0
attrib/max_unmap_block_desc_count:1
attrib/max_unmap_lba_count:8192
attrib/max_write_same_len:4096
attrib/optimal_sectors:16384
attrib/pgr_support:1
attrib/pi_prot_format:0
attrib/pi_prot_type:0
attrib/pi_prot_verify:0
attrib/queue_depth:128
attrib/unmap_granularity:1
attrib/unmap_granularity_alignment:0
attrib/unmap_zeroes_data:0

>
>>
>>> cat /sys/kernel/config/target/core/fileio_$N/$name/attrib/write_cache
>>
>> This does not exist. The current files are:
>>
>
> Sorry, I meant emulate_write_cache.
>
>> # grep . *
>> alua_support:1
>> block_size:512
>> emulate_3pc:1
>> emulate_caw:1
>> emulate_dpo:1
>> emulate_fua_read:1
>> emulate_fua_write:1
>> emulate_model_alias:1
>> emulate_pr:1
>> emulate_rest_reord:0
>> emulate_tas:1
>> emulate_tpu:0
>> emulate_tpws:0
>> emulate_ua_intlck_ctrl:0
>> emulate_write_cache:1
>> enforce_pr_isids:1
>> force_pr_aptpl:0
>> hw_block_size:512
>> hw_max_sectors:16384
>> hw_pi_prot_type:0
>> hw_queue_depth:128
>> is_nonrot:0
>> max_unmap_block_desc_count:1
>> max_unmap_lba_count:8192
>> max_write_same_len:4096
>> optimal_sectors:16384
>> pgr_support:1
>> pi_prot_format:0
>> pi_prot_type:0
>> pi_prot_verify:0
>> queue_depth:128
>> unmap_granularity:1
>> unmap_granularity_alignment:0
>> unmap_zeroes_data:0

Thank you all for your input. I would still like to understand what wasn't right in the Fedora setup, so that I can learn and avoid this specific pitfall in the future.

Regards,
Forza
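Comparing the Fedora and Ubuntu attribute dumps in this message by eye is error-prone, so here is a small sketch that diffs two "grep . *" style dumps. It assumes only the "name:value" line format shown above, with an optional directory prefix such as "attrib/". The function names parse_attrib_dump and diff_attribs are made up for this example.

```python
from typing import Dict, Optional, Tuple


def parse_attrib_dump(text: str) -> Dict[str, str]:
    """Parse 'name:value' lines as printed by 'grep . *' into a dict."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        # Drop a leading directory such as 'attrib/' so both dumps use bare names.
        attrs[key.rsplit("/", 1)[-1]] = value.strip()
    return attrs


def diff_attribs(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (old_value, new_value)} for every attribute that differs."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }
```

Fed the two dumps from this message, the only attribute reported as different should be emulate_write_cache (1 on Fedora, 0 on Ubuntu), which matches the change from "Buffered-WCE" to "O_DSYNC" in the info output.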
* Re: Stability of FILEIO as backing store?
From: Maurizio Lombardi @ 2021-03-02 11:22 UTC
To: Forza, Mike Christie, target-devel

Hello,

I'm trying to reproduce the problem you reported on Fedora 33.

On 01. 03. 21 at 16:38, Forza wrote:
>
> Also, the "aio" attribute is different. Perhaps I set it manually in Fedora, but I cannot remember. It is not visible in /sys but is in the saveconfig.json.
> Ubuntu:
> "storage_objects": [
> {
> "aio": false,
> Fedora:
> "storage_objects": [
> {
> "aio": true,
>
> Does "aio" mean Async I/O in this case? I could not find any documentation for this attribute.

Yes, it means "async I/O", but I don't understand why it's set to true. Looking at the rtslib/targetcli sources, aio is set to false by default and it's not even possible to change it via targetcli. You likely changed it manually in the saveconfig.json file.

By the way, there is a possible race condition in kernel versions < v5.12-rc1 when async I/O is enabled: in the fd_execute_rw_aio() function, the bvec pointer is freed before the async command has completed. Might this be the reason behind the disk corruption?

This has been fixed with commit ecd7fba0ade1d6d8d49d320df9caf96922a376b2.

Maurizio
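To check whether a given host might be exposed to the race described above, one rough heuristic is to compare the running kernel release against 5.12. The sketch below is only a hint, under a loud assumption: it looks at the version string alone, so it cannot tell whether a distribution kernel older than 5.12 has had the fix backported. The function name kernel_predates_aio_fix is hypothetical.

```python
import platform
import re


def kernel_predates_aio_fix(release: str) -> bool:
    """Return True if the release string names a kernel older than 5.12.

    Assumption: only the major.minor prefix is examined, so backported
    fixes in distribution kernels are not detected.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return (int(match.group(1)), int(match.group(2))) < (5, 12)


if __name__ == "__main__":
    release = platform.release()
    if kernel_predates_aio_fix(release):
        print(release, "is older than 5.12; avoid aio=true on fileio unless the fix is backported")
    else:
        print(release, "is 5.12 or newer")
```

Combined with the "aio" value from saveconfig.json, this gives a quick indication of whether a setup matches the conditions Maurizio describes.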