Del wrote:
Richard Megginson wrote:
> Del wrote:
>>
>> Hi,
>>
>> Following an earlier suggestion on this thread, I have tried to get FDS
>> running on a Fedora 7 box using the binary RPM from the download area
>> for Fedora Core 6.
>>
>> The directory server appears to run fine, but the admin server just
>> spews a torrent of log messages saying:
>>
>> [Wed Aug 08 18:00:07 2007] [notice] child pid 19260 exit signal Segmentation fault (11)
>> ... etc.
> I'm not sure what the problem is. I just downloaded FDS 1.0.4 for
> FC6 x86_64 and installed on a vmware instance of F7 x86_64. The F7
> system has the latest updates as of today. It works fine. I ran
> setup, just accepted the defaults, setup completed and started the
> admin server. I don't have java installed on the system, but I was
> able to use the web interface to run several of the CGIs. I have no
> problems.
>>
>> Has anyone else seen this, and can anyone offer any suggestions as to how
>> to get it going? It's quite tricky to run strace / gdb on the httpd binary,
>> as I only get as far as the fork, and it appears to be the httpd.worker
>> child processes that are dying.
> strace -f will follow forks (-ff to write each process output to
> separate files), and gdb has a mode to follow forks as well.
I've been working on this for some weeks now with no success.
I have one server which has been upgraded from FC6 to F7 and it works fine.
I have another server which is a new F7 install and it fails. Both are
32-bit x86 machines.
The strace -ff output shows this for each process on the machine where it
is failing:
open("tls/i686/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("tls/sse2/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("tls/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("i686/sse2/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("i686/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("sse2/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libnsl.so.1", O_RDONLY) = 30
read(30, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\341\211"..., 512) = 512
fstat64(30, {st_mode=S_IFREG|0755, st_size=109732, ...}) = 0
mmap2(NULL, 100296, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 30, 0) = 0x50a32000
mmap2(0x50a47000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 30, 0x14) = 0x50a47000
mmap2(0x50a49000, 6088, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x50a49000
close(30) = 0
mprotect(0x50a47000, 4096, PROT_READ) = 0
munmap(0xb7282000, 109394) = 0
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
geteuid32() = 0
futex(0x5defa564, FUTEX_WAKE, 2147483647) = 0
open("/etc/ldap.conf", O_RDONLY) = 30
fstat64(30, {st_mode=S_IFREG|0644, st_size=9020, ...}) = 0
fstat64(30, {st_mode=S_IFREG|0644, st_size=9020, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f5d000
read(30, "# @(#)$Id: ldap.conf,v 1.38 2006"..., 4096) = 4096
read(30, "assword ad\n\n# Use the OpenLDAP p"..., 4096) = 4096
read(30, " for 2.1 and later is \"yes\".\n#tl"..., 4096) = 828
read(30, "", 4096) = 0
close(30) = 0
munmap(0xb7f5d000, 4096) = 0
uname({sys="Linux", node="marvin.babel.office", ...}) = 0
open("/etc/hosts", O_RDONLY) = 30
fcntl64(30, F_GETFD) = 0
fcntl64(30, F_SETFD, FD_CLOEXEC) = 0
fstat64(30, {st_mode=S_IFREG|0644, st_size=278, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f5d000
read(30, "# Do not remove the following li"..., 4096) = 278
close(30) = 0
munmap(0xb7f5d000, 4096) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
So it looks like it's attempting a connection to the LDAP server somewhere
in nss_ldap, possibly looking up the current uid, and then looking in
/etc/hosts for the current host name.
What process is this strace from? ns-slapd? httpd.worker?
What user and group is the server running as? Does it have to make an
nss_ldap call to get these user IDs? If so, then this is likely the
problem.
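One way to test that theory outside the server is to repeat the same lookups by hand. The following is only a minimal sketch, assuming Python is available on the box. The function name check_lookups is made up for illustration, and the user name to pass in would be whatever user the admin server is configured to run as, which isn't stated in this thread. Both lookups go through NSS, so if nss_ldap itself is at fault this script may crash too, which would at least point away from the admin server.

```python
import os
import pwd
import socket


def check_lookups(user=None):
    """Repeat the lookups seen in the trace: a passwd lookup and a host lookup.

    Both calls go through NSS, so they exercise whatever sources
    /etc/nsswitch.conf lists. Returns a dict of results or error strings.
    """
    results = {}
    try:
        # With no user given, look up the current effective uid instead.
        entry = pwd.getpwnam(user) if user else pwd.getpwuid(os.geteuid())
        results["passwd"] = "%s uid=%d gid=%d" % (
            entry.pw_name, entry.pw_uid, entry.pw_gid)
    except KeyError as exc:
        results["passwd"] = "lookup failed: %s" % exc

    host = socket.gethostname()
    results["hostname"] = host
    try:
        # Resolve our own host name, as the server appears to do.
        results["address"] = socket.gethostbyname(host)
    except OSError as exc:
        results["address"] = "lookup failed: %s" % exc
    return results


if __name__ == "__main__":
    for key, value in sorted(check_lookups().items()):
        print("%-8s %s" % (key, value))
```

Running it as the same user as the server, on both the working and the failing machine, should show whether the two boxes answer these lookups differently.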
/etc/ldap.conf contains the IP addresses of both LDAP servers.
/etc/hosts in the current case looks like this:
127.0.0.1 localhost.localdomain localhost
192.168.110.52 marvin.babel.office marvin
192.168.110.42 fortytwo.babel.office fortytwo
All of these IP addresses are also mapped (and reverse mapped) in the
local DNS.
Everything else on these systems works normally -- internet access,
web browsing, sendmail, etc, all of the stuff that would normally use
/etc/hosts and/or DNS.
I've checked the systems over fairly extensively.
I can't think of why the admin server is failing at this point.
Anything I should go looking for next?
On the machine where the admin server is not failing, the strace output
looks completely different. It doesn't appear to be doing any NSS, DNS,
or /etc/hosts lookups at all.
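To pin down how the two traces differ, it may help to compare which files each process actually managed to open. This is a rough sketch rather than a finished tool. It assumes the per-process files written by strace -ff with one call per line and no pid prefix, and the helper names opened_files and only_in_first are invented here.

```python
import re

# Matches strace lines such as: open("/etc/hosts", O_RDONLY) = 30
OPEN_RE = re.compile(r'^open\("([^"]+)",[^)]*\)\s*=\s*(-?\d+)')


def opened_files(trace_text):
    """Return the paths of successful open() calls, in trace order."""
    paths = []
    for line in trace_text.splitlines():
        match = OPEN_RE.match(line.strip())
        # A negative return value (e.g. -1 ENOENT) means the open failed.
        if match and int(match.group(2)) >= 0:
            paths.append(match.group(1))
    return paths


def only_in_first(failing_trace, working_trace):
    """Paths opened in the failing trace but never in the working one."""
    working = set(opened_files(working_trace))
    return [p for p in opened_files(failing_trace) if p not in working]


if __name__ == "__main__":
    failing = (
        'open("libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)\n'
        'open("/lib/libnsl.so.1", O_RDONLY) = 30\n'
        'open("/etc/ldap.conf", O_RDONLY) = 30\n'
        'open("/etc/hosts", O_RDONLY) = 30\n'
    )
    print(opened_files(failing))
```

Feeding it the contents of one trace file from each machine and printing only_in_first(failing, working) would list files such as /etc/ldap.conf that only the failing process touches.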