From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Worley
Subject: Re: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
Date: Tue, 15 Sep 2009 11:01:02 -0600
Message-ID:
References: <4AAE909F.6030202@vlnb.net> <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To: <4AAFC794.7090205@vlnb.net>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: general-bounces@lists.openfabrics.org
Errors-To: general-bounces@lists.openfabrics.org
To: Vladislav Bolkhovitin
Cc: linux-rdma@vger.kernel.org, scst-devel , OpenIB
List-Id: linux-rdma@vger.kernel.org

On Tue, Sep 15, 2009 at 10:57 AM, Vladislav Bolkhovitin wrote:
> Chris Worley, on 09/15/2009 08:53 PM wrote:
>>
>> On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin wrote:
>>>
>>> Chris Worley, on 09/15/2009 07:50 PM wrote:
>>>>
>>>> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche wrote:
>>>>>
>>>>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley wrote:
>>>>>>
>>>>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin wrote:
>>>>>>>
>>>>>>> Chris Worley, on 09/11/2009 11:50 PM wrote:
>>>>>>>>
>>>>>>>> I've definitely removed the switch/firmware from being the cause.
>>>>>>>>
>>>>>>>> I'm thinking the reason you can't repeat the test may be latency
>>>>>>>> related.  We get ~50usecs average latency (on small block sizes),
>>>>>>>> which can't be achieved using regular SSD's (and rotating drives are
>>>>>>>> nowhere close).  Maybe a ramdisk would help repeat the issue.
>>>>>>>
>>>>>>> I think you should try to reproduce the problem with ramdisk or
>>>>>>> nullio. By so you will eliminate possible influence of the SSD backend.
>>>>>>
>>>>>> W/ 12GB RAM in the target, I created a 7GB ramdisk:
>>>>>>
>>>>>> mount -t ramfs -o size=7g ramfs /mnt/
>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000
>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk
>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices
>>>>>>
>>>>>> Then, on the initiator, I tested it... and it hung during sequential
>>>>>> 8KB block reads:
>>>>>>
>>>>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 --randrepeat=0 \
>>>>>>  --group_reporting --ioengine=libaio --filename=/dev/sde --name=test --loops=10000 --runtime=600
>>>>>>
>>>>>> Note that I was running the SM on the target this time too.
>>>>>
>>>>> Which Linux distro was installed on the initiator and on the target?
>>>>> And if applicable, which OFED version? Which kernel messages were
>>>>> logged by SRPT around the time the issue occurred (after having
>>>>> enabled SRPT logging first)?
>>>>
>>>> As logging hadn't helped this issue previously, I've not been enabling
>>>> it.  That plus the kernel hacks needed to invoke logging, it's not
>>>> worth enabling.
>>>>
>>>> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel.
>>>>
>>>> I couldn't get ramdisks working w/ SCST in RHEL5.2.  When running:
>>>>
>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk
>>>>
>>>> I get the error:
>>>>
>>>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required capabilities
>>>>
>>>> ... which doesn't occur in the Ubuntu kernel, so I've been unable to
>>>> test RHEL kernels w/ ramdisks.  In general, this problem occurs w/ 8KB
>>>> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks
>>>> w/ RHEL kernels.
>>>
>>> Use ramfs instead.
>>
>> Do you mean:
>>
>> mount -t ramfs -o size=7g ramfs /mnt/
>
> You should then create a file on it and use it.

That's what I'm doing, I believe.
From above:

>>>>>> mount -t ramfs -o size=7g ramfs /mnt/
>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000
>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk
>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices

... but the "open", on RHEL5.2 kernel 2.6.18-92.el5, generates the
following kernel messages:

dev_vdisk: Registering virtual FILEIO device ramdisk
scst: Processing thread started, PID 9629
scst: Processing thread started, PID 9630
scst: Processing thread started, PID 9631
scst: Processing thread started, PID 9632
scst: Processing thread started, PID 9633
dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required capabilities
scst: ***ERROR***: New device handler's vdisk attach() failed: -22
scst: Processing thread PID 9629 finished
scst: Processing thread PID 9630 finished
scst: Processing thread PID 9631 finished
scst: Processing thread PID 9632 finished
scst: Processing thread PID 9633 finished
scst: Failed to attach to virtual device ramdisk

Chris

>
>> ?
>>
>> That's what I'm doing.
>>
>> Chris
>>>>
>>>> Chris
>>>>>
>>>>> Bart.