From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christopher Li
Subject: Re: [RFC] rationale for systematic elimination of OP_SYMADDR instructions
Date: Thu, 10 Aug 2017 11:01:22 -0400
Message-ID:
References: <20170309142044.96408-1-luc.vanoostenryck@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Return-path:
Received: from mail-yw0-f176.google.com ([209.85.161.176]:35906 "EHLO mail-yw0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752231AbdHJPB0 (ORCPT ); Thu, 10 Aug 2017 11:01:26 -0400
Received: by mail-yw0-f176.google.com with SMTP id u207so6242363ywc.3 for ; Thu, 10 Aug 2017 08:01:26 -0700 (PDT)
In-Reply-To:
Sender: linux-sparse-owner@vger.kernel.org
List-Id: linux-sparse@vger.kernel.org
To: Luc Van Oostenryck
Cc: Linux-Sparse , Linus Torvalds

On Wed, Apr 26, 2017 at 7:02 PM, Luc Van Oostenryck wrote:
>> Does the address of the symbol ever change inside a function?
>> I assume it does not change. If that is the case, can we skip the CSE
>> and replace all the symbol address reference to one OP_SYMADDR?
>>
>> For example, for each symbol access in the function we insert OP_SYMADDR
>> after the entry:
>> foo:
>>
>>     %r1 <- a
>>     %r2 <- b
>>     ...
>>
>> Then all reference of symbol address of "a" and "b" inside the function
>> foo will use %r1 and %r2. Notice that we still keep the OP_SYMADDR
>> instruction, just move to function entry.
>>
>> Is that illegal or bad?
>
> The address of a symbol will of course not change.
> So yes, all the OP_SYMADDR could move to the top of the function.
> It wouldn't be illegal and it could be advantageous in some cases.
> It would be bad, though, if these addresses are in fact not used
> (because of a conditional). I'm thinking to something like:
>     if (unlikely(some cond)) a++;
> Of course, doing so would also need a register to hold these addresses
> so pre-calculated. What if the function access a lot of symbols?

Sorry for replying to a very old email.
Still catching up on my backlog.

I think a function that accesses a lot of symbols will need more
OP_SYMADDR instructions: at least one per symbol, if you want to
have OP_SYMADDR at all. You can't do better than that.

> In my opinion, we should handle these OP_SYMADDR just like
> any other instructions (in other words: near where they are used).

I think in the current IR, how close the instruction is to its use is
not a big issue. If you want to keep it close and still have one
OP_SYMADDR per symbol, the OP_SYMADDR should be placed at the immediate
D(N), where N is the set of blocks that use the symbol. In other words,
place the OP_SYMADDR where the N blocks join in the dominator tree.
You can't do better than that. Please correct me if I am wrong.

I still think:

1) OP_SYMADDR can be generated from the symbol node, together with
   where to place it (immediate D(N)).
2) If one of the back ends needs them, go ahead and generate OP_SYMADDR.

However, I don't think the sparse checker needs to use OP_SYMADDR,
so it will be more optimal for the sparse checker to avoid OP_SYMADDR.

Am I missing something?

Chris
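
To make the placement rule above concrete, here is a minimal sketch of
"one OP_SYMADDR per symbol, placed where the using blocks join in the
dominator tree". It is not sparse code: the `idom` map, the block names
and the helper names are all hypothetical, and it assumes the immediate
dominators have already been computed and every block is reachable from
the entry.

```python
def dominator_chain(idom, block):
    """Return [block, idom(block), ..., entry]: every block dominating `block`."""
    chain = []
    while block is not None:
        chain.append(block)
        block = idom[block]
    return chain


def nearest_common_dominator(idom, use_blocks):
    """Deepest block that dominates every block in `use_blocks`."""
    use_blocks = list(use_blocks)
    if not use_blocks:
        raise ValueError("symbol has no uses")
    # Dominators of the first use, ordered from deepest to the entry block.
    candidates = dominator_chain(idom, use_blocks[0])
    for block in use_blocks[1:]:
        dominators = set(dominator_chain(idom, block))
        # Keep only blocks that also dominate this use; order is preserved.
        candidates = [b for b in candidates if b in dominators]
    # The entry block dominates everything, so the list is never empty here.
    return candidates[0]


def place_symaddr(idom, uses):
    """Map each symbol to the single block where its address would be computed."""
    return {sym: nearest_common_dominator(idom, blocks)
            for sym, blocks in uses.items()}


# Hypothetical CFG: entry -> cond; cond -> then, else; then, else -> join.
# Each block maps to its immediate dominator (None for the entry block).
IDOM = {"entry": None, "cond": "entry", "then": "cond",
        "else": "cond", "join": "cond"}

# Hypothetical uses: "a" only in the unlikely branch, "b" in both arms,
# "c" in the entry block and after the join.
USES = {"a": ["then"], "b": ["then", "else"], "c": ["entry", "join"]}

PLACEMENT = place_symaddr(IDOM, USES)
```

In this toy CFG, "a" stays in the conditional block, so its address is
never computed when the branch is not taken; "b" moves up to `cond`,
and "c" ends up in the entry block. Only a symbol that is used right at
the entry gets hoisted all the way to the top of the function.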