From mboxrd@z Thu Jan 1 00:00:00 1970
From: Scott Lurndal
Subject: Re: segmentation fault in numa_node_to_cpus_v1
Date: Mon, 1 Nov 2010 17:36:27 -0800
Message-ID: <20101102013627.GA1953@www.lurndal.org>
References: <20101101225942.GA21509@sgi.com>
Mime-Version: 1.0
Return-path:
Content-Disposition: inline
In-Reply-To: <20101101225942.GA21509@sgi.com>
Sender: linux-numa-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Cliff Wickman
Cc: Michael Spiegel, linux-numa@vger.kernel.org

On Mon, Nov 01, 2010 at 05:59:42PM -0500, Cliff Wickman wrote:
> On Mon, Nov 01, 2010 at 03:52:59PM -0400, Michael Spiegel wrote:
> > Hi,
> >
> > I'm trying to run the HotSpot Java VM on an SGI UV 1000 with 4096
> > cores. When I enable the NUMA-aware garbage collection algorithm, I
> > get a segmentation fault as the virtual machine is initializing. The
> > sigsegv is occurring at one of the memcpy's in numa_node_to_cpus_v1,
> > although I'm afraid I can't determine whether libnuma is being called
> > correctly or incorrectly. I am testing on a system that has numactl
> > 2.0.5.
> >
> > Thanks,
> > --Michael
>
> Hi Michael,
>
> I see that Scott Lundal gave you a possible fix.
> There were some important corrections added to the latest version, so
> if you could try building numactl/libnuma from numactl-2.0.6-rc3.tar.gz
> that would be an interesting test.
> (ftp://oss.sgi.com/www/projects/libnuma/download/)

Hi Cliff,

I suspect that Michael will find the same problem with the newest
numactl release.  The problem is that Oracle (in my case) and the JVM
(in Michael's case) don't use the 'dlvsym' function after dynamically
loading (dlopen) the libnuma library; they just use 'dlsym'.  Thus they
get the library's default version of 'numa_node_to_cpus' instead of the
version the JVM was coded for.  We've seen a seg fault when the wrong
version of numa_node_to_cpus is called, because of the signature change.

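To make the dlsym/dlvsym difference concrete, here is a minimal sketch of
what a v1 caller could do instead.  It assumes glibc (dlvsym is a GNU
extension).  The soname "libnuma.so.1" and the version node "libnuma_1.1"
are my assumptions and should be checked against the installed library
(objdump -T lists the versions).  The helper names resolve_symbol and
load_node_to_cpus_v1 are made up for illustration; this is not what
Oracle or HotSpot actually do.

```c
#define _GNU_SOURCE            /* dlvsym() is a GNU extension */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: look up `name` in an already dlopen()ed library.
 *   version == NULL -> dlsym():  whatever the library marks as the default
 *   version != NULL -> dlvsym(): exactly that version, or NULL if absent
 */
void *resolve_symbol(void *handle, const char *name, const char *version)
{
    assert(handle != NULL && name != NULL);
    return version ? dlvsym(handle, name, version) : dlsym(handle, name);
}

/* Old (v1) calling convention: caller supplies a raw buffer and its length. */
typedef int (*node_to_cpus_v1_fn)(int node, unsigned long *buffer, int bufferlen);

/*
 * Hypothetical loader for a caller written against the v1 interface.
 * ASSUMPTIONS: soname "libnuma.so.1" and version node "libnuma_1.1".
 * Returns NULL if the library or that symbol version is not available.
 */
node_to_cpus_v1_fn load_node_to_cpus_v1(void)
{
    node_to_cpus_v1_fn fn = NULL;
    void *lib = dlopen("libnuma.so.1", RTLD_NOW);
    if (lib == NULL)
        return NULL;

    void *sym = resolve_symbol(lib, "numa_node_to_cpus", "libnuma_1.1");
    if (sym == NULL) {
        dlclose(lib);
        return NULL;
    }
    /* POSIX-sanctioned way to turn dlsym()'s void * into a function pointer. */
    *(void **)(&fn) = sym;
    return fn;   /* library handle stays open for the life of the process */
}
```

A caller that passes NULL as the version gets the plain dlsym behaviour
described above, i.e. whichever version the library exports as its
default, regardless of which signature the caller was compiled against.
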
My patch just changes the symbol versioning to swap the _v1 and _v2
default.  Of course, a properly written application cannot use this
modified library, so it's best to load it with LD_LIBRARY_PATH rather
than replacing the system library.

scott