* [GIT PULL] Ceph updates for 4.7-rc1 @ 2016-05-26 18:18 Sage Weil 2016-05-26 18:31 ` Linus Torvalds 2016-06-10 20:42 ` Arnd Bergmann 0 siblings, 2 replies; 14+ messages in thread From: Sage Weil @ 2016-05-26 18:18 UTC (permalink / raw) To: torvalds; +Cc: linux-kernel, ceph-devel Hi Linus, Please pull the following Ceph updates from git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus This changeset has a few main parts: * Ilya has finished a huge refactoring effort to sync up the client-side logic in libceph with the user-space client code, which has evolved significantly over the last couple years, with lots of additional behaviors (e.g., how requests are handled when cluster is full and transitions from full to non-full). This structure of the code is more closely aligned with userspace now such that it will be much easier to maintain going forward when behavior changes take place. There are some locking improvements bundled in as well. * Zheng adds multi-filesystem support (multiple namespaces within the same Ceph cluster) * Zheng has changed the readdir offsets and directory enumeration so that dentry offsets are hash-based and therefore stable across directory fragmentation events on the MDS. * Zheng has a smorgasbord of bug fixes across fs/ceph. Thanks! 
sage ---------------------------------------------------------------- Ilya Dryomov (40): rbd: get/put img_request in rbd_img_request_submit() libceph: make ceph_osdc_put_request() accept NULL libceph: grab snapc in ceph_osdc_alloc_request() libceph: move message allocation out of ceph_osdc_alloc_request() libceph: change how osd_op_reply message size is calculated libceph: variable-sized ceph_object_id rbd: use header_oid instead of header_name libceph: nuke unused fields and functions libceph: open-code remove_{all,old}_osds() libceph: DEFINE_RB_FUNCS macro libceph: fix ceph_eversion encoding libceph: rename ceph_oloc_oid_to_pg() libceph: ceph_osds, ceph_pg_to_up_acting_osds() libceph: rename ceph_calc_pg_primary() libceph: make pgid_cmp() global libceph: pi->min_size, pi->last_force_request_resend libceph: introduce ceph_osd_request_target, calc_target() libceph: switch to calc_target(), part 1 libceph: switch to calc_target(), part 2 libceph: drop msg argument from ceph_osdc_callback_t libceph: redo callbacks and factor out MOSDOpReply decoding libceph: move schedule_delayed_work() in ceph_osdc_init() libceph: schedule tick from ceph_osdc_init() libceph: allocate dummy osdmap in ceph_osdc_init() libceph: handle_one_map() libceph: osd_init() and osd_cleanup() libceph: allocate ceph_osd with GFP_NOFAIL libceph: protect osdc->osd_lru list with a spinlock libceph: a major OSD client update libceph: request_init() and request_release_checks() libceph: wait_request_timeout() rbd: rbd_dev_header_unwatch_sync() variant libceph, rbd: ceph_osd_linger_request, watch/notify v2 libceph: support for sending notifies libceph: support for checking on status of watch libceph: async MON client generic requests libceph: pool deletion detection libceph: take osdc->lock in osdmap_show() and dump flags in hex libceph: replace ceph_monc_request_next_osdmap() libceph: support for subscribing to "mdsmap.<id>" maps Yan, Zheng (30): ceph: multiple filesystem support ceph: 
CEPH_FEATURE_MDSENC support ceph: renew caps for read/write if mds session got killed. ceph: don't call truncate_pagecache in ceph_writepages_start ceph: don't show symlink target in debugfs/mdsc ceph: report mount root in session metadata ceph: use CEPH_MDS_OP_RMXATTR request to remove xattr ceph: search cache postion for dcache readdir ceph: remove unnecessary checks in __dcache_readdir ceph: simplify 'offset in frag' ceph: define struct for dir entry in readdir reply ceph: define 'end/complete' in readdir reply as bit flags ceph: record 'offset' for each entry of readdir result ceph: don't forbid marking directory complete after forward seek ceph: using hash value to compose dentry offset ceph: fix inode reference leak ceph: don't assume frag tree splits in mds reply are sorted ceph: fix dir_auth check in ceph_fill_dirfrag() ceph: keep leaf frag when updating fragtree ceph: improve fragtree change detection ceph: tolerate bad i_size for symlink inode ceph: block non-fatal signals for fault/page_mkwrite ceph: make fault/page_mkwrite return VM_FAULT_OOM for -ENOMEM ceph: handle -EAGAIN returned by ceph_update_writeable_page() libceph: make ceph_osdc_wait_request() uninterruptible ceph: make ceph_update_writeable_page() uninterruptible ceph: handle interrupted ceph_writepage() ceph: SetPageError() for writeback pages if writepages fails ceph: don't use truncate_pagecache() to invalidate read cache ceph: fix wake_up_session_cb() Zhang Zhuoyu (1): ceph: make logical calculation functions return bool drivers/block/rbd.c | 305 +-- fs/ceph/addr.c | 214 +- fs/ceph/cache.c | 2 +- fs/ceph/caps.c | 51 +- fs/ceph/debugfs.c | 2 +- fs/ceph/dir.c | 376 ++-- fs/ceph/file.c | 89 +- fs/ceph/inode.c | 159 +- fs/ceph/ioctl.c | 14 +- fs/ceph/mds_client.c | 140 +- fs/ceph/mds_client.h | 17 +- fs/ceph/mdsmap.c | 43 +- fs/ceph/super.c | 47 +- fs/ceph/super.h | 12 +- fs/ceph/xattr.c | 25 +- include/linux/ceph/ceph_frag.h | 4 +- include/linux/ceph/ceph_fs.h | 20 +- 
include/linux/ceph/decode.h | 2 +- include/linux/ceph/libceph.h | 57 + include/linux/ceph/mon_client.h | 23 +- include/linux/ceph/osd_client.h | 231 ++- include/linux/ceph/osdmap.h | 158 +- include/linux/ceph/rados.h | 34 +- net/ceph/ceph_common.c | 2 +- net/ceph/ceph_strings.c | 16 + net/ceph/debugfs.c | 147 +- net/ceph/mon_client.c | 393 ++-- net/ceph/osd_client.c | 4074 +++++++++++++++++++++++++-------------- net/ceph/osdmap.c | 651 +++++-- 29 files changed, 4779 insertions(+), 2529 deletions(-) ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 18:18 [GIT PULL] Ceph updates for 4.7-rc1 Sage Weil @ 2016-05-26 18:31 ` Linus Torvalds 2016-05-26 19:02 ` Sage Weil 2016-05-26 21:13 ` Linus Torvalds 2016-06-10 20:42 ` Arnd Bergmann 1 sibling, 2 replies; 14+ messages in thread From: Linus Torvalds @ 2016-05-26 18:31 UTC (permalink / raw) To: Sage Weil; +Cc: Linux Kernel Mailing List, ceph-devel On Thu, May 26, 2016 at 11:18 AM, Sage Weil <sweil@redhat.com> wrote: > > Please pull the following Ceph updates from Why was that branch rebased yesterday? What has been in next, if anything? And if something has been in next, why was _that_ not sent to me? Pulled and then immediately unpulled again. I'm fed up with people doing stupid things. Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 18:31 ` Linus Torvalds @ 2016-05-26 19:02 ` Sage Weil 2016-05-26 19:54 ` Linus Torvalds 2016-05-26 21:13 ` Linus Torvalds 1 sibling, 1 reply; 14+ messages in thread From: Sage Weil @ 2016-05-26 19:02 UTC (permalink / raw) To: Linus Torvalds; +Cc: Linux Kernel Mailing List, ceph-devel On Thu, 26 May 2016, Linus Torvalds wrote: > On Thu, May 26, 2016 at 11:18 AM, Sage Weil <sweil@redhat.com> wrote: > > > > Please pull the following Ceph updates from > > Why was that branch rebased yesterday? > > What has been in next, if anything? > > And if something has been in next, why was _that_ not sent to me? The branch was assembled in its current form yesterday and is included in today's -next: https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=e536030934aebf049fe6aaebc58dd37aeee21840 The same commit went through our internal testing last night, and we've been testing the code for the better part of a week internally. If you want it to bake longer in -next first, let us know. We're not causing merge conflicts, and there isn't -next-based ceph testing that I'm aware of going on outside of our own QA environment, so I'm not sure how valuable it is, but I'm happy to delay before sending a pull request if that's what you want to see. Thanks- sage ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 19:02 ` Sage Weil @ 2016-05-26 19:54 ` Linus Torvalds 2016-05-26 20:10 ` Al Viro 0 siblings, 1 reply; 14+ messages in thread From: Linus Torvalds @ 2016-05-26 19:54 UTC (permalink / raw) To: Sage Weil; +Cc: Linux Kernel Mailing List, ceph-devel On Thu, May 26, 2016 at 12:02 PM, Sage Weil <sweil@redhat.com> wrote: > > The branch was assembled in its current form yesterday and is included in > today's -next: So that's what I *don't* want, and it's not the point of "next". The next branch is about the *next* merge window. If it was about the current one, it would be called "linux-current". It's not just about merge conflicts and integration - it's also about showing that your code is ready. Having the branch being created the day before would be much more palatable if we were talking about the first few days of the merge window. As it is, we're in the latter part of the second week of the merge window, and I get a very strong feeling that it wasn't ready for the merge window, and that it was set up in a hurry because it was getting to the last two working days. And again, it's called the *merge* window, not the *development* window. It's not that the two weeks is where the work is supposed to be done, it's when things get merged. Right now I'm in the mode where I was planning to look at some of the older pulls that I didn't do earlier because I felt I wanted to take a second look (in particular, I have two DAX pull requests that I am getting back to). And I'm starting to already see pull requests for *fixes* for things I pulled earlier in the merge window. That makes me go "good, people are already using it and finding problems, we're starting to move from just integration to fixing things up". Having entirely new pull requests show up that haven't even been on my radar because they weren't in linux-next is annoying. 
And yes, I've started to become stricter about these issues, because it turns out that the whole linux-next process has worked fairly well. It allows developers to see what the potential problem spots can be, but it also allows me to see what I can expect during the merge window. And the "it's been in next for a while" really does give me the warm and fuzzies. Admittedly very few people actually *run* and actually test next kernels, but there's a couple who do, and even without that it does tend to get not just build coverage for odd configurations, but there are now several boot farms too, so it does add value. If your patch series had been "90% of it has been in -next since before the merge window even opened, and 10% are clearly new things that came in later", that would be one thing. That's perfectly normal - last-minute fixes etc, possibly for stuff that was _found_ because it was in linux-next. But it was *all* from the last day, and I don't see anything at all in my next tree (which I tend to fetch at the end of the first week, exactly to get a feel for "ok, so who/what am I still looking to get"). And yes, Ceph is pretty standalone and the boot farms likely don't trigger any ceph testing, and the build coverage is probably pretty configuration-independent (with architecture header file differences likely being the biggest issue, and unlikely to happen to bite *only* ceph), so you can always argue that linux-next isn't really all that big of a deal for Ceph at least wrt coverage. But that can be argued individually for almost _any_ subsystem. None of them are all that likely to have problems individually. It's just that once you have lots of different subsystems, not only do they interact, but enough of those "very unlikely to have problems" on their own tends to become "it's likely that _one_ of them ended up causing a problem anyway". Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 19:54 ` Linus Torvalds @ 2016-05-26 20:10 ` Al Viro 2016-05-26 21:18 ` Linus Torvalds 0 siblings, 1 reply; 14+ messages in thread From: Al Viro @ 2016-05-26 20:10 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sage Weil, Linux Kernel Mailing List, ceph-devel On Thu, May 26, 2016 at 12:54:27PM -0700, Linus Torvalds wrote: > Having entirely new pull requests show up that haven't even been on my > radar because they weren't in linux-next is annoying. How about the things like followups to earlier merges? I've got in #for-linus update D/f/directory-locking add down_write_killable_nested() restore killability of old mutex_lock_killable(&inode->i_mutex) users The first one probably should've been in the work.lookups merge, but the last two clearly depend upon down_write_killable() having been already merged... ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 20:10 ` Al Viro @ 2016-05-26 21:18 ` Linus Torvalds 0 siblings, 0 replies; 14+ messages in thread From: Linus Torvalds @ 2016-05-26 21:18 UTC (permalink / raw) To: Al Viro; +Cc: Sage Weil, Linux Kernel Mailing List, ceph-devel On Thu, May 26, 2016 at 1:10 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > How about the things like followups to earlier merges? Small and obvious follow-ups are fine. What I really hated about this ceph pull request was that it was multiple thousand lines of changes, with no previous work or warning, and effectively no time in linux-next. Your for-linus branch is three small commits that actually restore old functionality (well, one of them is documentation) that got removed by the big stuff you sent early (*). So not at all the same kind of thing, and not problematic at all. Linus (*) thanks, btw - you used to be one of the late people, now lately you've been one of the early ones. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 18:31 ` Linus Torvalds 2016-05-26 19:02 ` Sage Weil @ 2016-05-26 21:13 ` Linus Torvalds 2016-05-26 21:46 ` Sage Weil 1 sibling, 1 reply; 14+ messages in thread From: Linus Torvalds @ 2016-05-26 21:13 UTC (permalink / raw) To: Sage Weil; +Cc: Linux Kernel Mailing List, ceph-devel On Thu, May 26, 2016 at 11:31 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Pulled and then immediately unpulled again. .. and having thought it over, I ended up re-pulling again, so now it's going through my build test. Consider this discussion a strong encouragement to *not* do this in the future - sending me pull requests at the end of the merge window without them having been in linux-next is a no-no, unless those pull requests are small and trivial (or have fixes that I'd pull even outside the merge window, of course). Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 21:13 ` Linus Torvalds @ 2016-05-26 21:46 ` Sage Weil 2016-05-27 2:16 ` Linus Torvalds 0 siblings, 1 reply; 14+ messages in thread From: Sage Weil @ 2016-05-26 21:46 UTC (permalink / raw) To: Linus Torvalds; +Cc: Linux Kernel Mailing List, ceph-devel On Thu, 26 May 2016, Linus Torvalds wrote: > On Thu, May 26, 2016 at 11:31 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > Pulled and then immediately unpulled again. > > .. and having thought it over, I ended up re-pulling again, so now > it's going through my build test. > > Consider this discussion a strong encouragement to *not* do this in > the future - sending me pull requests at the end of the merge window > without them having been in linux-next is a no-no, unless those pull > requests are small and trivial (or have fixes that I'd pull even > outside the merge window, of course). Thank you! We'll be sure we include things in -next well beforehand next time around, especially if it's a big diff like this one. One point of clarification, though: in the past I've squashed down fixes discovered during testing if the branch hasn't hit a stable tree yet (e.g., your tree). AIUI this is(was?) standard procedure for things in -next. Do you want us to avoid squashing if we are creeping up on pull request time, or are you primarily interested in, say, seeing that what has been in -next for a while is substantially the same as what you pull, and has perhaps been there unmodified for at least a few days? Or would you rather see fixup patches if we identify issues in the last few days of testing? Thanks- sage ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 21:46 ` Sage Weil @ 2016-05-27 2:16 ` Linus Torvalds 0 siblings, 0 replies; 14+ messages in thread From: Linus Torvalds @ 2016-05-27 2:16 UTC (permalink / raw) To: Sage Weil; +Cc: Linux Kernel Mailing List, ceph-devel On Thu, May 26, 2016 at 2:46 PM, Sage Weil <sweil@redhat.com> wrote: > > One point of clarification, though: in the past I've squashed down fixes > discovered during testing if the branch hasn't hit a stable tree yet > (e.g., your tree). AIUI this is(was?) standard procedure for things in > -next. Yes, rebasing with good reason is acceptable for branches that don't have anybody else depend on them. And the "good reason" ends up being a judgement call. If the merge window hasn't even started yet, the "good" reason may not even be very great, and might be "oh, I screwed up the commit message, so let's make the history look good". If it's already inside the merge window, you should aim for having increasingly higher barriers to rebasing your tree, and strive to generally try to avoid it. If it's about something mostly cosmetic and the merge window has opened or is just about to, leave it be. On the other hand, if it's a really nasty problem and seriously will hurt people who try to bisect - even if you have fixed the problem then later in the history - you might choose to do it to not be in the situation that people who use "git bisect" to find another bug will then be left with data corruption or something like that because of a major bug in the middle of the development history. And in between those two extremes of "cosmetic" and "nasty data corruption bug" there is obviously a graduation of issues. There can't be any completely black-and-white rules. But the corollary to that is that if you really had a major bug that you felt had to be fixed not just at the tip, but going back, then you then shouldn't immediately send the end result to me. 
Because you just fixed something critical (by definition, if you chose to do it just before you would have wanted to send to me), so now you need to retest things. So rebasing isn't some absolute wrong thing. Sometimes rebasing is simply the right thing to do. For example, maybe I don't get the same commits that were in -next, but I would have still seen that the code was there. Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-05-26 18:18 [GIT PULL] Ceph updates for 4.7-rc1 Sage Weil 2016-05-26 18:31 ` Linus Torvalds @ 2016-06-10 20:42 ` Arnd Bergmann 2016-06-10 21:32 ` Linus Torvalds 1 sibling, 1 reply; 14+ messages in thread From: Arnd Bergmann @ 2016-06-10 20:42 UTC (permalink / raw) To: Sage Weil; +Cc: torvalds, linux-kernel, ceph-devel, Ilya Dryomov On Thursday, May 26, 2016 2:18:03 PM CEST Sage Weil wrote: > Hi Linus, > > Please pull the following Ceph updates from > > git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus > > This changeset has a few main parts: > > * Ilya has finished a huge refactoring effort to sync up the client-side > logic in libceph with the user-space client code, which has evolved > significantly over the last couple years, with lots of additional > behaviors (e.g., how requests are handled when cluster is full and > transitions from full to non-full). This structure of the code is more > closely aligned with userspace now such that it will be much easier to > maintain going forward when behavior changes take place. There are some > locking improvements bundled in as well. I'm getting a warning in some ARM randconfig build: WARNING: "ceph_monc_do_statfs" [fs/ceph/ceph.ko] has no CRC! I have bisected this down to this particular commit fcd00b68bbe: > ---------------------------------------------------------------- > Ilya Dryomov (40): > libceph: DEFINE_RB_FUNCS macro but I have no idea how the change relates to the symptom though. 
What I see is that this one exported symbol has a __crc of a different type from all the others:

$ nm net/ceph/mon_client.o | grep __crc
48c2e16e A __crc_ceph_monc_get_version
2360d633 A __crc_ceph_monc_get_version_async
0c50a10a A __crc_ceph_monc_got_map
         w __crc_ceph_monc_do_statfs
b63e5cf5 A __crc_ceph_monc_init
c4602476 A __crc_ceph_monc_open_session
5e6fd15f A __crc_ceph_monc_renew_subs
53a9ed7f A __crc_ceph_monc_stop
33a329fa A __crc_ceph_monc_validate_auth
34010ef9 A __crc_ceph_monc_wait_osdmap
29c50846 A __crc_ceph_monc_want_map

I only see this in some rare randconfig builds and have not figured out which options are required, but I have also been able to reproduce it on an x86 config and have uploaded the .config file to http://pastebin.com/raw/Dsrtfbcs

Arnd

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-06-10 20:42 ` Arnd Bergmann @ 2016-06-10 21:32 ` Linus Torvalds 2016-06-10 23:04 ` Arnd Bergmann ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Linus Torvalds @ 2016-06-10 21:32 UTC (permalink / raw) To: Arnd Bergmann Cc: Sage Weil, Linux Kernel Mailing List, ceph-devel, Ilya Dryomov On Fri, Jun 10, 2016 at 1:42 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > What I see is that this one exported symbol has a __crc of a different > type from all the others: > > $ nm net/ceph/mon_client.o | grep __crc > 48c2e16e A __crc_ceph_monc_get_version > 2360d633 A __crc_ceph_monc_get_version_async > 0c50a10a A __crc_ceph_monc_got_map > w __crc_ceph_monc_do_statfs A lower-case 'w' in a symbol list just means that it's a local weak symbol (with a upper-case 'A' meaning it's an absolute global). Afaik, that simply means that it never got resolved, and genksyms never generated that absolute value for it. As to _why_ that happens, that's more than I can guess. We've had problems with genksyms before, and it tends to be hard to debug. Is it 100% reliable for you? Because the most common problem has been issues with subtle build races, where just causing a re-build will fix it. Your config doesn't work for me, when I do cp ~/genksyms-config.txt .config make ARCH=i386 oldconfig I get something else than what you had. I tried with both current -git and the commit you pinpointed, so I don't know how you generated that config file.. Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
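For readers unfamiliar with the `w` entry discussed above, the following is a minimal user-space sketch of what an unresolved weak reference looks like from C. It is not kernel code: the symbol name `demo_crc_placeholder` and the helper `demo_symbol_resolved()` are hypothetical, and the sketch assumes GCC or Clang on an ELF target such as Linux. Nothing defines the symbol, so the reference stays weak and undefined, the link still succeeds, and the symbol's address is expected to compare equal to NULL at run time.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical symbol, declared weak and never defined anywhere in
 * this program. On ELF targets an undefined weak reference does not
 * cause a link error; its address simply resolves to zero.
 */
extern const char demo_crc_placeholder __attribute__((weak));

/* Returns nonzero only if some object file actually defined the symbol. */
static int demo_symbol_resolved(void)
{
	return &demo_crc_placeholder != NULL;
}
```

Compiling this to an object file and running `nm` on it should list the symbol with a lowercase weak type letter and no value, which is the same shape as the `__crc_ceph_monc_do_statfs` entry in the listing quoted above.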
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-06-10 21:32 ` Linus Torvalds @ 2016-06-10 23:04 ` Arnd Bergmann 2016-06-11 22:50 ` Arnd Bergmann 2016-06-13 13:06 ` Arnd Bergmann 2 siblings, 0 replies; 14+ messages in thread From: Arnd Bergmann @ 2016-06-10 23:04 UTC (permalink / raw) To: Linus Torvalds Cc: Sage Weil, Linux Kernel Mailing List, ceph-devel, Ilya Dryomov On Friday, June 10, 2016 2:32:21 PM CEST Linus Torvalds wrote: > On Fri, Jun 10, 2016 at 1:42 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > > > What I see is that this one exported symbol has a __crc of a different > > type from all the others: > > > > $ nm net/ceph/mon_client.o | grep __crc > > 48c2e16e A __crc_ceph_monc_get_version > > 2360d633 A __crc_ceph_monc_get_version_async > > 0c50a10a A __crc_ceph_monc_got_map > > w __crc_ceph_monc_do_statfs > > A lower-case 'w' in a symbol list just means that it's a local weak > symbol (with a upper-case 'A' meaning it's an absolute global). > > Afaik, that simply means that it never got resolved, and genksyms > never generated that absolute value for it. > > As to _why_ that happens, that's more than I can guess. We've had > problems with genksyms before, and it tends to be hard to debug. > > Is it 100% reliable for you? Because the most common problem has been > issues with subtle build races, where just causing a re-build will fix > it. In a few thousand randconfig builds, this is the only symbol I ever see the problem with, and I always see it with the same configurations after rebuilding dozens of times, including with different compiler versions (I only tried arm-gcc-4.9, arm-gcc-6.1 and x86-gcc-5.3, but that seems to cover a wide range). > Your config doesn't work for me, when I do > > cp ~/genksyms-config.txt .config > make ARCH=i386 oldconfig > > I get something else than what you had. I tried with both current -git > and the commit you pinpointed, so I don't know how you generated that > config file.. 
I had not tried building the entire kernel on x86, and indeed I don't see the warning there either, but I do see this one weak symbol in my configuration: arm-soc/obj-x86$ nm vmlinux | grep __crc | grep -w 'w' w __crc_ceph_monc_do_statfs Also, the .config I first uploaded was based on my randconfig tree that has some other changes to avoid all existing warnings. I've done a 'make oldconfig' on your current HEAD now and put that on http://pastebin.com/raw/EJBaG0FV just to make sure we have an identical configuration. The respective ARM .config file does produce the warning on mainline, and I've uploaded that to http://pastebin.com/3NRSFSdr I generated the x86 configuration by starting with this one and running 'make olddefconfig ARCH=x86' on it. I have found 15 more random configurations with the same symptom now, and about 50 configurations with CONFIG_CEPH_FS=m that don't show it, so I can mine the configurations for more hints next week to see what influences it. Arnd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-06-10 21:32 ` Linus Torvalds 2016-06-10 23:04 ` Arnd Bergmann @ 2016-06-11 22:50 ` Arnd Bergmann 2016-06-13 13:06 ` Arnd Bergmann 2 siblings, 0 replies; 14+ messages in thread From: Arnd Bergmann @ 2016-06-11 22:50 UTC (permalink / raw) To: Linus Torvalds Cc: Sage Weil, Linux Kernel Mailing List, ceph-devel, Ilya Dryomov, Michal Marek, Sam Ravnborg On Friday, June 10, 2016 2:32:21 PM CEST Linus Torvalds wrote: > On Fri, Jun 10, 2016 at 1:42 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > > > What I see is that this one exported symbol has a __crc of a different > > type from all the others: > > > > $ nm net/ceph/mon_client.o | grep __crc > > 48c2e16e A __crc_ceph_monc_get_version > > 2360d633 A __crc_ceph_monc_get_version_async > > 0c50a10a A __crc_ceph_monc_got_map > > w __crc_ceph_monc_do_statfs > > A lower-case 'w' in a symbol list just means that it's a local weak > symbol (with a upper-case 'A' meaning it's an absolute global). > > Afaik, that simply means that it never got resolved, and genksyms > never generated that absolute value for it. > > As to _why_ that happens, that's more than I can guess. We've had > problems with genksyms before, and it tends to be hard to debug. (Cc: Michal and Sam, who might understand this better) I still don't know what goes wrong, but the patch below fixes it. I have experimentally determined that the next EXPORT_SYMBOL() after the DEFINE_RB_FUNCS line in net/ceph/mon_client.c ends up without a checksum, and that adding a semicolon at the end of that line makes it work fine. However, there are other DEFINE_RB_FUNCS instances in net/ceph/osd_client.c that don't suffer from this problem, so I still have no clue why it helps, and we probably don't want to apply the patch unless we know what the problem is. 
Arnd

diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 37c38a7fb5c5..1ac468920495 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -478,7 +478,7 @@ out:
 /*
  * generic requests (currently statfs, mon_get_version)
  */
-DEFINE_RB_FUNCS(generic_request, struct ceph_mon_generic_request, tid, node)
+DEFINE_RB_FUNCS(generic_request, struct ceph_mon_generic_request, tid, node);
 
 static void release_generic_request(struct kref *kref)
 {

^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [GIT PULL] Ceph updates for 4.7-rc1 2016-06-10 21:32 ` Linus Torvalds 2016-06-10 23:04 ` Arnd Bergmann 2016-06-11 22:50 ` Arnd Bergmann @ 2016-06-13 13:06 ` Arnd Bergmann 2 siblings, 0 replies; 14+ messages in thread From: Arnd Bergmann @ 2016-06-13 13:06 UTC (permalink / raw) To: Linus Torvalds Cc: Sage Weil, Linux Kernel Mailing List, ceph-devel, Ilya Dryomov On Friday, June 10, 2016 2:32:21 PM CEST Linus Torvalds wrote: > On Fri, Jun 10, 2016 at 1:42 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > > > What I see is that this one exported symbol has a __crc of a different > > type from all the others: > > > > $ nm net/ceph/mon_client.o | grep __crc > > 48c2e16e A __crc_ceph_monc_get_version > > 2360d633 A __crc_ceph_monc_get_version_async > > 0c50a10a A __crc_ceph_monc_got_map > > w __crc_ceph_monc_do_statfs > > A lower-case 'w' in a symbol list just means that it's a local weak > symbol (with a upper-case 'A' meaning it's an absolute global). > > Afaik, that simply means that it never got resolved, and genksyms > never generated that absolute value for it. > > As to _why_ that happens, that's more than I can guess. We've had > problems with genksyms before, and it tends to be hard to debug. > > Is it 100% reliable for you? Because the most common problem has been > issues with subtle build races, where just causing a re-build will fix > it. > > Your config doesn't work for me, when I do > > cp ~/genksyms-config.txt .config > make ARCH=i386 oldconfig > > I get something else than what you had. I tried with both current -git > and the commit you pinpointed, so I don't know how you generated that > config file.. I've tracked it down to the use of the 'typeof(((type *)0)->keyfld)' expression in DEFINE_RB_LOOKUP() now, and sent a patch with subject "ceph: fix symbol versioning for ceph_monc_do_statfs" that works around it by rewriting that line in a way that genksyms understands. Arnd ^ permalink raw reply [flat|nested] 14+ messages in thread
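To make the `typeof(((type *)0)->keyfld)` idiom mentioned above concrete, here is a small user-space sketch, assuming GCC or Clang (it relies on the `__typeof__` extension). It is not the libceph macro and not the patch referred to in the message, which is not shown in this thread. All names (`struct demo_request`, `DEFINE_LOOKUP_NULLPTR`, `DEFINE_LOOKUP_EXTERN`, the `_key_dummy` suffix) are hypothetical, and a linear array scan stands in for the rbtree walk. The first macro derives the key parameter's type from a member access through a null pointer cast; the second obtains the same type from a member access on an object that is declared `extern` but never defined, which is one conceivable way to spell the type without the cast expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a structure keyed by a transaction id. */
struct demo_request {
	unsigned long long tid;
	int payload;
};

/*
 * Form 1: the key type comes from a member access through a null
 * pointer. The operand of __typeof__ is never evaluated, so no
 * dereference actually happens.
 */
#define DEFINE_LOOKUP_NULLPTR(name, type, keyfld)			\
static type *name(type *tbl, size_t n,					\
		  __typeof__(((type *)0)->keyfld) key)			\
{									\
	size_t i;							\
	for (i = 0; i < n; i++)						\
		if (tbl[i].keyfld == key)				\
			return &tbl[i];					\
	return NULL;							\
}

/*
 * Form 2: the same type, taken from a member of an object that is
 * only declared. Because it is used solely inside __typeof__, the
 * object never needs a definition and nothing references it at link
 * time.
 */
#define DEFINE_LOOKUP_EXTERN(name, type, keyfld)			\
extern type name##_key_dummy;						\
static type *name(type *tbl, size_t n,					\
		  __typeof__(name##_key_dummy.keyfld) key)		\
{									\
	size_t i;							\
	for (i = 0; i < n; i++)						\
		if (tbl[i].keyfld == key)				\
			return &tbl[i];					\
	return NULL;							\
}

DEFINE_LOOKUP_NULLPTR(lookup_nullptr, struct demo_request, tid)
DEFINE_LOOKUP_EXTERN(lookup_extern, struct demo_request, tid)
```

Both generated functions take an `unsigned long long` key and behave identically for the compiler; the difference is purely in how the parameter type is written in the source, which is the part a tool that parses declarations (as genksyms does) has to cope with.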