* Issuing custom IOCTLs to SCSI LLD
@ 2016-05-09 9:27 Naveen
2016-05-11 4:18 ` Naveen
0 siblings, 1 reply; 10+ messages in thread
From: Naveen @ 2016-05-09 9:27 UTC (permalink / raw)
To: ceph-devel
Hi,
I'm new to Ceph and am trying to find out whether, on CentOS, there is a
way to issue IOCTLs or similar custom requests (other than fops requests
like OS-issued read/write requests) from the Ceph OSD/RBD to the
underlying SCSI LLD.
From an initial reading of the docs it looks like, in the I/O stack,
user/client-issued I/Os would reach the actual device as below:
Client I/O req
VFS
ceph fs
Ceph OSD daemon
RBD
SCSI midlayer
SCSI LLD (say mpt3sas)
Disk
Ref:
https://en.wikipedia.org/wiki/NVM_Express#/media/File:IO_stack_of_the_Linux_kernel.svg
My question is:
The SCSI LLD supports the read/write entry points for I/O requests
issued by the filesystem/block layer, but LLDs also support some custom
requests via IOCTLs. How can Ceph support issuing such an IOCTL request
to the device if a user issues one? Say, for example, power cycling the
drive. It could also be a pass-through request sent down to the device.
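A pass-through request of that kind is normally issued from user space through the SG_IO ioctl on an sg node. The sketch below is illustrative only: the /dev/sg0 path is an assumption, the struct mirrors sg_io_hdr from <scsi/sg.h>, and the command is only submitted if a device is actually present and accessible.

```python
import ctypes
import fcntl
import os

SG_IO = 0x2285        # ioctl number from <scsi/sg.h>
SG_DXFER_NONE = -1    # no data transfer (TEST UNIT READY carries none)

class SgIoHdr(ctypes.Structure):
    """ctypes mirror of struct sg_io_hdr from <scsi/sg.h>."""
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]

def build_test_unit_ready():
    """Prepare a 6-byte TEST UNIT READY CDB (opcode 0x00) and its header.
    Returns the sense buffer too so its memory stays referenced."""
    cdb = (ctypes.c_ubyte * 6)(0, 0, 0, 0, 0, 0)
    sense = (ctypes.c_ubyte * 32)()
    hdr = SgIoHdr()
    hdr.interface_id = ord("S")          # 'S' identifies the sg interface
    hdr.dxfer_direction = SG_DXFER_NONE
    hdr.cmd_len = len(cdb)
    hdr.mx_sb_len = len(sense)
    hdr.cmdp = ctypes.cast(cdb, ctypes.c_void_p)
    hdr.sbp = ctypes.cast(sense, ctypes.c_void_p)
    hdr.timeout = 5000                   # milliseconds
    return hdr, cdb, sense

if __name__ == "__main__":
    hdr, cdb, sense = build_test_unit_ready()
    dev = "/dev/sg0"                     # hypothetical sg node
    try:
        fd = os.open(dev, os.O_RDWR)
        try:
            fcntl.ioctl(fd, SG_IO, hdr)  # usually needs root privileges
            print("scsi status:", hdr.status)
        finally:
            os.close(fd)
    except OSError:
        # No device (or no permission): just show the prepared command
        print("prepared CDB, opcode:", cdb[0], "len:", hdr.cmd_len)
```

This is the user-space path a standalone HBA utility would take; nothing in the Ceph I/O path itself issues such requests.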
Appreciate the help and thanks in advance.
thanks & regards,
naveen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-09 9:27 Issuing custom IOCTLs to SCSI LLD Naveen
@ 2016-05-11 4:18 ` Naveen
2016-05-11 5:27 ` Christoph Hellwig
2016-05-11 12:41 ` Sage Weil
0 siblings, 2 replies; 10+ messages in thread
From: Naveen @ 2016-05-11 4:18 UTC (permalink / raw)
To: ceph-devel
Naveen <Naveen.Chandrasekaran <at> radisys.com> writes:
>
> Hi,
>
> I'm new to ceph and trying to find if on CentOS there is a way to issue
> IOCTL or similar custom requests (other than fops request like OS issued
> read/write requests) from ceph OSD/RBD to the underlying SCSI LLD.
> From initial reading of docs it looks like in the I/O stack user/client
> issued I/Os would reach the actual device as below:
>
> Client I/O req
> VFS
> ceph fs
> Ceph OSD daemon
> RBD
> SCSI midlayer
> SCSI LLD (say mpt3sas)
> Disk
>
> Ref:
> https://en.wikipedia.org/wiki/NVM_Express#/media/File:IO_stack_of_the_Linux_kernel.svg
>
> My question is:
> The SCSI LLD would support both read/write entry points for I/O
> requests issued by the filesystem/block I/O but they also support some
> custom requests using IOCTLs. So how can ceph support issuing of such
> IOCTL requests to the device if user issues such request. Say for
> example power cycling the drive etc. It can also be a passthro request
> down to the device.
>
> Appreciate the help and thanks in advance.
>
>
Hi,
Still waiting for a response. Can any of the experts clarify this?
thanks,
naveen
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-11 4:18 ` Naveen
@ 2016-05-11 5:27 ` Christoph Hellwig
2016-05-12 8:52 ` Naveen
2016-05-11 12:41 ` Sage Weil
1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2016-05-11 5:27 UTC (permalink / raw)
To: Naveen; +Cc: ceph-devel
Short answer is: don't do it. Use the SES interface for your example,
or ensure a proper API exists if the example was just made up.
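For readers unfamiliar with it, the SES interface is reachable from user space via the sg_ses utility from sg3_utils. A minimal sketch of driving it programmatically follows; the device path and slot index are made-up values, and the command is only executed where sg_ses is actually installed.

```python
import shutil
import subprocess

def ses_ident_cmd(enclosure_dev, slot, on=True):
    """Build an sg_ses invocation that sets or clears the IDENT (locate)
    element for one enclosure slot. Device path and slot are illustrative."""
    action = "--set=ident" if on else "--clear=ident"
    return ["sg_ses", "--index=%d" % slot, action, enclosure_dev]

cmd = ses_ident_cmd("/dev/sg1", 5)      # hypothetical enclosure node
print(" ".join(cmd))
if shutil.which("sg_ses"):              # run only where sg3_utils exists
    subprocess.run(cmd, check=False)    # needs the device and privileges
```

The same pattern applies to other SES elements (fault LEDs, power control), which is why SES is the sanctioned route for enclosure-level actions like the drive power-cycle example.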
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-11 4:18 ` Naveen
2016-05-11 5:27 ` Christoph Hellwig
@ 2016-05-11 12:41 ` Sage Weil
2016-05-12 9:32 ` Naveen
1 sibling, 1 reply; 10+ messages in thread
From: Sage Weil @ 2016-05-11 12:41 UTC (permalink / raw)
To: Naveen; +Cc: ceph-devel
On Wed, 11 May 2016, Naveen wrote:
> Naveen <Naveen.Chandrasekaran <at> radisys.com> writes:
> > Hi,
> >
> > I'm new to ceph and trying to find if on CentOS there is a way to issue
> > IOCTL or similar custom requests (other than fops request like OS issued
> > read/write requests) from ceph OSD/RBD to the underlying SCSI LLD.
> > From initial reading of docs it looks like in the I/O stack user/client
> > issued I/Os would reach the actual device as below:
> >
> > Client I/O req
> > VFS
> > ceph fs
> > Ceph OSD daemon
> > RBD
> > SCSI midlayer
> > SCSI LLD (say mpt3sas)
> > Disk
> >
> > Ref:
> > https://en.wikipedia.org/wiki/NVM_Express#/media/File:IO_stack_of_the_Linux_kernel.svg
> >
> > My question is:
> > The SCSI LLD would support both read/write entry points for I/O
> > requests issued by the filesystem/block I/O but they also support some
> > custom requests using IOCTLs. So how can ceph support issuing of such
> > IOCTL requests to the device if user issues such request. Say for
> > example power cycling the drive etc. It can also be a passthro request
> > down to the device.
Can you give an example of such an operation?
In general, operations are generalized at the librados level.
For example, in order to get write-same and cmpxchg block operations, we
added librados operations with similar semantics and implemented them there.
It is unlikely that passing a complex operation down to the SCSI layer
will work in unison with the other steps involved in committing
an operation (e.g., updating metadata indicating the object
version has changed).
sage
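The point about operations committing in unison can be made concrete with a toy model (not Ceph code): each mutation and its version-metadata update land in one atomic step, which is exactly what a raw SCSI command slipped in underneath the object layer would bypass.

```python
class ToyObjectStore:
    """Toy model only: each object carries data plus a version counter,
    and every write op bumps the version in the same step as the data."""

    def __init__(self):
        self.objects = {}  # name -> (data bytes, version)

    def write_op(self, name, data):
        _, version = self.objects.get(name, (b"", 0))
        # Data mutation and metadata update form one compound operation
        self.objects[name] = (data, version + 1)
        return version + 1

    def cmpxchg_op(self, name, expect, data):
        """Compare-and-exchange, the kind of op exposed at the librados
        level rather than pushed down to the SCSI layer."""
        current, version = self.objects.get(name, (b"", 0))
        if current != expect:
            return None  # comparison failed; nothing is committed
        self.objects[name] = (data, version + 1)
        return version + 1

store = ToyObjectStore()
store.write_op("obj", b"aaa")                            # version becomes 1
assert store.cmpxchg_op("obj", b"bbb", b"ccc") is None   # mismatch: no change
assert store.cmpxchg_op("obj", b"aaa", b"ccc") == 2      # matched: committed
print(store.objects["obj"])
```

A device-level command issued outside this path would change the bytes without the version bump, leaving the object metadata lying about its contents.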
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-11 5:27 ` Christoph Hellwig
@ 2016-05-12 8:52 ` Naveen
0 siblings, 0 replies; 10+ messages in thread
From: Naveen @ 2016-05-12 8:52 UTC (permalink / raw)
To: ceph-devel
Christoph Hellwig <hch <at> infradead.org> writes:
>
> Short answer is: don't do it. Use the SES interface for your example,
> or ensure a proper API exists if the example was just made up.
Thanks Christoph. Power cycling a drive was an example of how some of the
management IOCTLs provided by the SCSI LLD driver could be used. So your
suggestion is: don't use Ceph for this purpose (though it's possible), but
write, or use an existing, utility (if the HBA vendor provides one) that
issues IOCTLs to the SCSI/SAS LLD HBA driver directly?
A related question: if a device like /dev/sda[1..n] is assigned to an OSD
and is being used by Ceph, can another user application still open and
read/write the /dev/sda1 device directly? How does Ceph prevent such
scenarios?
thanks,
Naveen
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-11 12:41 ` Sage Weil
@ 2016-05-12 9:32 ` Naveen
2016-05-12 12:37 ` Sage Weil
0 siblings, 1 reply; 10+ messages in thread
From: Naveen @ 2016-05-12 9:32 UTC (permalink / raw)
To: ceph-devel
Sage Weil <sage <at> newdream.net> writes:
> > > My question is:
> > > The SCSI LLD would support both read/write entry points for I/O
> > > requests issued by the filesystem/block I/O but they also support some
> > > custom requests using IOCTLs. So how can ceph support issuing of such
> > > IOCTL requests to the device if user issues such request. Say for
> > > example power cycling the drive etc. It can also be a passthro request
> > > down to the device.
>
> Can you give an example of such an operation?
>
> In general, any operation is generalized at the librados level.
> For example, in order to get write same and cmpxchg block operations, we
> added librados operations with similar semantics and implement them there.
> It is unlikely that passing a complex operation down to the SCSI layer
> will work in unison with the other steps involved in committing
> an operation (e.g., updating metadata indicating the object
> version has changed).
>
> sage
>
Thanks for the response, Sage. Example IOCTL operations would be things
like downloading new firmware to the drive/HBA, task-management requests
like a hard reset or power cycling the drive, and issuing a SAS/SMP/STP
pass-through command to the drive for queries. All of these would have
to be initiated through Ceph (if supported), not by bypassing it.
I asked a related question in another post too: can a physical disk
(/dev/sda1) assigned to a Ceph OSD continue to be used by other apps in
the system to issue I/O directly via the /dev/sda1 interface? Does Ceph
prevent this, as such operations may corrupt data?
Thanks in advance,
naveen
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-12 9:32 ` Naveen
@ 2016-05-12 12:37 ` Sage Weil
2016-05-12 13:21 ` John Spray
0 siblings, 1 reply; 10+ messages in thread
From: Sage Weil @ 2016-05-12 12:37 UTC (permalink / raw)
To: Naveen; +Cc: ceph-devel
On Thu, 12 May 2016, Naveen wrote:
> Sage Weil <sage <at> newdream.net> writes:
>
> > > > My question is:
> > > > The SCSI LLD would support both read/write entry points for I/O
> > > > requests issued by the filesystem/block I/O but they also support some
> > > > custom requests using IOCTLs. So how can ceph support issuing of such
> > > > IOCTL requests to the device if user issues such request. Say for
> > > > example power cycling the drive etc. It can also be a passthro request
> > > > down to the device.
> >
> > Can you give an example of such an operation?
> >
> > In general, any operation is generalized at the librados level.
> > For example, in order to get write same and cmpxchg block operations, we
> > added librados operations with similar semantics and implement them there.
> > It is unlikely that passing a complex operation down to the SCSI layer
> > will work in unison with the other steps involved in committing
> > an operation (e.g., updating metadata indicating the object
> > version has changed).
> >
> > sage
>
> Thanks for the response Sage. Example IOCTL operations would be like
> downloading a new FW to the drive/HBA, task management requests like
> Hard reset, power cycling the drive, issue a SAS/SMP/STP pass thro
> command to the drive for querying etc. All these would have to be
> initiated and go through ceph (if supported) and not bypassing it.
Ah. I think these kinds of management functions should be performed while
the ceph-osd daemon for that drive is offline. We would probably want
some hardware management layer that coexists with ceph or that perhaps has
some minimal integration with the ceph osds to do this sort of thing.
It's not something that a client (user) would initiate, though.
> I asked a related question in another post too: Can a physical disk
> (/dev/sda1) assigned to a ceph OSD object be continued to used by other
> apps in the system to issue I/O directly via (/dev/sda1) interface? Does
> ceph prevent it, as such operations may corrupt data?
It depends on what privileges the other app has. If it's root or user
ceph, it can step all over the disk (and the rest of the system) and wreak
havoc. With the current backend, we are storing data as files, so you
could have other apps using other directories on the file system--this is
generally a bad idea for real deployments, though, as performance and disk
utilization will be unpredictable.
sage
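Ceph relies on Unix permissions here rather than on any exclusive claim of the device. A daemon that wanted to fence out other openers could additionally take an advisory lock on the node; here is a sketch of that pattern using flock, demonstrated on a regular file since locking a real /dev/sda1 needs the device and privileges.

```python
import fcntl
import os
import tempfile

# Stand-in for a device node; a real daemon would lock /dev/sda1 itself.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
path = tmp.name

owner = open(path)                    # the "daemon" claims the device
fcntl.flock(owner, fcntl.LOCK_EX)     # exclusive advisory lock

intruder = open(path)                 # a second app tries to claim it
try:
    fcntl.flock(intruder, fcntl.LOCK_EX | fcntl.LOCK_NB)
    claimed = True
except BlockingIOError:
    claimed = False                   # lock already held; claim refused

print("second claim succeeded:", claimed)
owner.close()
intruder.close()
os.unlink(path)
```

Note the lock is advisory: a root process that simply read()s or write()s the node is not stopped, which matches the point above that privileges are the real boundary.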
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-12 12:37 ` Sage Weil
@ 2016-05-12 13:21 ` John Spray
2016-05-12 13:38 ` Handzik, Joseph
0 siblings, 1 reply; 10+ messages in thread
From: John Spray @ 2016-05-12 13:21 UTC (permalink / raw)
To: Sage Weil; +Cc: Naveen, Ceph Development
On Thu, May 12, 2016 at 1:37 PM, Sage Weil <sage@newdream.net> wrote:
> On Thu, 12 May 2016, Naveen wrote:
>> Sage Weil <sage <at> newdream.net> writes:
>>
>> > > > My question is:
>> > > > The SCSI LLD would support both read/write entry points for I/O
>> > > > requests issued by the filesystem/block I/O but they also support some
>> > > > custom requests using IOCTLs. So how can ceph support issuing of such
>> > > > IOCTL requests to the device if user issues such request. Say for
>> > > > example power cycling the drive etc. It can also be a passthro request
>> > > > down to the device.
>> >
>> > Can you give an example of such an operation?
>> >
>> > In general, any operation is generalized at the librados level.
>> > For example, in order to get write same and cmpxchg block operations, we
>> > added librados operations with similar semantics and implement them there.
>> > It is unlikely that passing a complex operation down to the SCSI layer
>> > will work in unison with the other steps involved in committing
>> > an operation (e.g., updating metadata indicating the object
>> > version has changed).
>> >
>> > sage
>>
>> Thanks for the response Sage. Example IOCTL operations would be like
>> downloading a new FW to the drive/HBA, task management requests like
>> Hard reset, power cycling the drive, issue a SAS/SMP/STP pass thro
>> command to the drive for querying etc. All these would have to be
>> initiated and go through ceph (if supported) and not bypassing it.
>
> Ah. I think these kind of management functions should be performed while
> the ceph-osd daemon for that drive is offline. We would probably want
> some hardware management layer that coexists with ceph or that perhaps has
> some minimal integration with the ceph osds to do this sort of thing.
> It's not something that a client (user) would initiate, though.
This is all pretty relevant to Joe Handzik's stuff:
https://github.com/joehandzik/ceph/commits/wip-hw-mgmt-cli
http://www.spinics.net/lists/ceph-devel/msg30126.html
The idea there though is to enable passing libstoragemgmt calls
through the OSD, as opposed to arbitrary SCSI operations.
Although libstoragemgmt is fairly young, I'm a fan of the idea that we
could use it internally within Ceph, and then have the same tools/libs
used by out-of-ceph management platforms when they want to do
equivalent stuff while the OSD is offline.
John
>> I asked a related question in another post too: Can a physical disk
>> (/dev/sda1) assigned to a ceph OSD object be continued to used by other
>> apps in the system to issue I/O directly via (/dev/sda1) interface? Does
>> ceph prevent it, as such operations may corrupt data?
>
> It depends on what privileges the other app has. If it's root or user
> ceph, it can step all over the disk (and the rest of the system) and wreak
> havoc. With the current backend, we are storing data as files, so you
> could have other apps using other directories on the file system--this is
> generally a bad idea for real deployments, though, as performance and disk
> utilization will be unpredictable.
>
> sage
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-12 13:21 ` John Spray
@ 2016-05-12 13:38 ` Handzik, Joseph
2016-05-13 15:01 ` Naveen
0 siblings, 1 reply; 10+ messages in thread
From: Handzik, Joseph @ 2016-05-12 13:38 UTC (permalink / raw)
To: John Spray; +Cc: Sage Weil, Naveen, Ceph Development
I was intentionally staying quiet to hear other opinions, but I agree that the same mechanisms I'm building for LED operations could extend here (within Ceph if we want it to, but at the very least in libstoragemgmt).
Naveen, if you have interest in helping push this along I'd encourage you to voice your interest over on the libstoragemgmt GitHub: https://github.com/libstorage/libstoragemgmt
Joe
> On May 12, 2016, at 8:22 AM, John Spray <jspray@redhat.com> wrote:
>
>> On Thu, May 12, 2016 at 1:37 PM, Sage Weil <sage@newdream.net> wrote:
>>> On Thu, 12 May 2016, Naveen wrote:
>>> Sage Weil <sage <at> newdream.net> writes:
>>>
>>>
>>>>>> My question is:
>>>>>> The SCSI LLD would support both read /write entry points for I/O
>>>>>> requests issued by the filesystem/block I/O but they also support
>>> some
>>>>>> custom requests using IOCTLs. So how can ceph support issuing of
>>> such
>>>>>> IOCTL requests to the device if user issues such request. Say for
>>>>>> example power cycling the drive etc. It can also be a passthro
>>> request
>>>>>> down to the device.
>>>>
>>>> Can you give an example of such an operation?
>>>>
>>>> In general, any operation is generalized at the librados level.
>>>> For example, in order to get write same and cmpxchg block operations,
>>> we
>>>> added librados operations with similar semantics and implement them
>>> there.
>>>> It is unlikely that passing a complex operation down to the SCSI layer
>>>> will work in unison with the other steps involved in committing
>>>> an operation (e.g., updating metadata indicating the object
>>>> version has changed).
>>>>
>>>> sage
>>>>
>>>
>>> Thanks for the response Sage. Example IOCTL operations would be like
>>> downloading a new FW to the drive/HBA, task management requests like
>>> Hard reset, power cycling the drive, issue a SAS/SMP/STP pass thro
>>> command to the drive for querying etc. All these would have to be
>>> initiated and go through ceph (if supported) and not bypassing it.
>>
>> Ah. I think these kind of management functions should be performed while
>> the ceph-osd daemon for that drive is offline. We would probably want
>> some hardware management layer that coexists with ceph or that perhaps has
>> some minimal integration with the ceph osds to do this sort of thing.
>> It's not something that a client (user) would initiate, though.
>
> This is all pretty relevant to Joe Handzik's stuff:
> https://github.com/joehandzik/ceph/commits/wip-hw-mgmt-cli
> http://www.spinics.net/lists/ceph-devel/msg30126.html
>
> The idea there though is to enable passing libstoragemgmt calls
> through the OSD, as opposed to arbitrary SCSI operations.
>
> Although libstoragemgmt is fairly young, I'm a fan of the idea that we
> could use it internally within Ceph, and then have the same tools/libs
> used by out-of-ceph management platforms when they want to do
> equivalent stuff while the OSD is offline.
>
> John
>
>
>>> I asked a related question in another post too: Can a physical disk
>>> (/dev/sda1) assigned to a ceph OSD object be continued to used by other
>>> apps in the system to issue I/O directly via (/dev/sda1) interface? Does
>>> ceph prevent it, as such operations may corrupt data?
>>
>> It depends on what privileges the other app has. If it's root or user
>> ceph, it can step all over the disk (and the rest of the system) and wreak
>> havoc. With the current backend, we are storing data as files, so you
>> could have other apps using other directories on the file system--this is
>> generally a bad idea for real deployments, though, as performance and disk
>> utilization will be unpredictable.
>>
>> sage
* Re: Issuing custom IOCTLs to SCSI LLD
2016-05-12 13:38 ` Handzik, Joseph
@ 2016-05-13 15:01 ` Naveen
0 siblings, 0 replies; 10+ messages in thread
From: Naveen @ 2016-05-13 15:01 UTC (permalink / raw)
To: ceph-devel
Handzik, Joseph <joseph.t.handzik <at> hpe.com> writes:
>
> I was intentionally staying quiet to hear other opinions, but I agree
> that the same mechanisms I'm building for LED operations could extend
> here (within Ceph if we want it to, but at the very least in libstoragemgmt).
>
> Naveen, if you have interest in helping push this along I'd encourage
> you to voice your interest over on the libstoragemgmt GitHub:
> https://github.com/libstorage/libstoragemgmt
>
> Joe
>
> >
> > This is all pretty relevant to Joe Handzik's stuff:
> > https://github.com/joehandzik/ceph/commits/wip-hw-mgmt-cli
> > http://www.spinics.net/lists/ceph-devel/msg30126.html
> >
> > The idea there though is to enable passing libstoragemgmt calls
> > through the OSD, as opposed to arbitrary SCSI operations.
> >
> > Although libstoragemgmt is fairly young, I'm a fan of the idea that we
> > could use it internally within Ceph, and then have the same tools/libs
> > used by out-of-ceph management platforms when they want to do
> > equivalent stuff while the OSD is offline.
> >
> > John
> >
> > >>> I asked a related question in another post too: Can a physical disk
> > >>> (/dev/sda1) assigned to a ceph OSD object be continued to used by other
> > >>> apps in the system to issue I/O directly via (/dev/sda1) interface? Does
> > >>> ceph prevent it, as such operations may corrupt data?
> > >>
> > >> It depends on what privileges the other app has. If it's root or user
> > >> ceph, it can step all over the disk (and the rest of the system) and wreak
> > >> havoc. With the current backend, we are storing data as files, so you
> > >> could have other apps using other directories on the file system--this is
> > >> generally a bad idea for real deployments, though, as performance and disk
> > >> utilization will be unpredictable.
> > >>
> > >> sage
Thanks everyone for the inputs and clarifications. It looks like
libstoragemgmt is key to a lot of these features. I will explore it
next and raise any questions on the GitHub project.
thanks,
naveen
end of thread, other threads:[~2016-05-13 15:02 UTC | newest]
Thread overview: 10+ messages
2016-05-09 9:27 Issuing custom IOCTLs to SCSI LLD Naveen
2016-05-11 4:18 ` Naveen
2016-05-11 5:27 ` Christoph Hellwig
2016-05-12 8:52 ` Naveen
2016-05-11 12:41 ` Sage Weil
2016-05-12 9:32 ` Naveen
2016-05-12 12:37 ` Sage Weil
2016-05-12 13:21 ` John Spray
2016-05-12 13:38 ` Handzik, Joseph
2016-05-13 15:01 ` Naveen