2009-08-08

GHC 6.10.4 on RH4

Getting GHC 6.10.4 working on RHEL4 is a bit of a task. RH certainly provides no RPM for such an old version of their OS, and the Linux binary provided by the GHC team doesn't work with such an old libc. So I had to build it. Unfortunately, trying to build it from scratch (bootstrapping) failed with an error saying I needed a newer version of GNU make: it needs gmake 3.81, and RH4 has gmake 3.80. Since I don't have root access to the machine, I had to build make from source and install it to ~/opt. No surprises in building or installing make.
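
A build like this can be sketched roughly as follows. It is a minimal sketch rather than a transcript of the actual session: the tarball name make-3.81.tar.gz and the helper name build_local_make are assumptions, and the tarball is assumed to sit in the current directory.

```shell
# Install prefix: a directory the unprivileged user can write to.
PREFIX="$HOME/opt"

# Hypothetical helper: unpack, build and install GNU make 3.81 under $PREFIX.
# Assumes make-3.81.tar.gz (assumed file name) is in the current directory.
# The body runs in a subshell so the caller's working directory is unchanged.
build_local_make() (
    tar xzf make-3.81.tar.gz &&
    cd make-3.81 &&
    ./configure --prefix="$PREFIX" &&
    make &&
    make install
)

# Put the new make ahead of the system one for later builds.
PATH="$PREFIX/bin:$PATH"
export PATH
```

After calling build_local_make, running make --version should report 3.81 if the PATH change took effect.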

Back to GHC 6.10.4 for the bootstrapping. Now that there's a newer version of make, the configure step completes. But when make is run, it complains that it can't run '-Wall'. Clearly, the gcc executable never made it into the command line, so it's trying to execute the first parameter to gcc instead. Rather than try to figure out what was going wrong, I decided to find an older binary of GHC that would run on RH4. GHC 6.8 did not run, but 6.6 did. So I installed GHC 6.6, then built GHC 6.10.4 directly with that. The non-bootstrapping build of 6.10.4 worked without a problem.
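
Building with an existing compiler looks roughly like the sketch below. The location of the 6.6 install and the helper name are assumptions made for illustration; the 6.10.4 prefix mirrors the ~/opt/ghc-6.10.4 path that shows up in the transcripts further down. The --with-ghc option of GHC's configure script names the compiler used to build the new one.

```shell
# Assumed location of the working GHC 6.6 binary; adjust to the real install.
BOOT_GHC="$HOME/opt/ghc-6.6/bin/ghc"
# Install prefix for the new compiler (same layout as the later transcripts).
GHC_PREFIX="$HOME/opt/ghc-6.10.4"

# Hypothetical helper: run from the top of the unpacked ghc-6.10.4 source tree.
build_ghc_with_boot_compiler() {
    ./configure --prefix="$GHC_PREFIX" --with-ghc="$BOOT_GHC" &&
    make &&
    make install
}
```

The same helper can be reused for the later rebuild by pointing BOOT_GHC at the freshly installed 6.10.4.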

Next came the testsuite. There's not much point in having a compiler if I don't know it works. So I ran the testsuite, and tried building my own project while it was running. My own build hit a wall:
can't load .so/.DLL for: rt (/usr/lib/librt.so: symbol __librt_multiple_threads, version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference)
Many of the tests in the testsuite were failing as well.

With the help of the #ghc IRC channel, I found a known bug, and some pointers to possible solutions. It turns out that in RH4, glibc and the kernel have different linuxthread implementations. A program built to run with non-NPTL threads will give the above complaint if the system tries to run it with the NPTL libraries. The standard way of getting the system to pick the non-NPTL libraries is to set an environment variable:
LD_ASSUME_KERNEL=2.4.1
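
The variable can be set for a single command or exported for the whole session. Below is a small sketch of the per-command form; the wrapper name with_linuxthreads is made up for illustration.

```shell
# Hypothetical wrapper: run one command with LD_ASSUME_KERNEL set, leaving the
# environment of the surrounding shell untouched.
with_linuxthreads() {
    env LD_ASSUME_KERNEL=2.4.1 "$@"
}
```

With this defined, with_linuxthreads runhaskell Setup.lhs configure is equivalent to prefixing the command with the assignment by hand. Keeping the setting per-command avoids changing the thread library for every other program started from the same shell.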

This got rid of the pthread error, but I was faced with another one:
[x@bldrh4 HaskellRME]$ LD_ASSUME_KERNEL=2.4.1 runhaskell Setup.lhs configure
Configuring haskellrme-0.0...
Setup.lhs: ghc version >=6.4 is required but the version of
/home/x/opt/ghc-6.10.4/bin/ghc could not be determined.
[x@bldrh4 HaskellRME]$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.10.4


So, WTF? Just to be sure my new GHC was fully working, I rebuilt GHC with the 6.10.4 that I had just built. The compiler turned out to be just fine; it ran way faster than 6.6 and produced much smaller output. The rebuild took under 2 hours on this machine, whereas the first build with 6.6 took more than 3 hours, and the whole 6.10.4 directory came to only 1.2 GB after the rebuild, down from nearly 2 GB when built with 6.6. So there was no problem with my compiler, but I installed the rebuilt one anyway.

Now the testsuite runs with only two unexpected failures (break017(ghci) and ghcpkg02(normal)), but Cabal still can't seem to determine the version of GHC I have. Thanks to the patient help of the IRC channel again, I narrowed the problem down to something to do with multi-threading and spawning sub-processes. The problem, it appears, occurs when the sub-process exits too quickly for the parent. When that happens, the parent gets nothing, and in this case concludes that ghc has no version. Using runghc and -v3 would slow down the execution enough that it would sometimes work:
[x@bldrh4 HaskellRME]$ runghc Setup.lhs configure -v3
Configuring haskellrme-0.0...
Creating dist (and its parents)
searching for ghc in path.
found ghc at /home/x/opt/ghc-6.10.4/bin/ghc
("/home/x/opt/ghc-6.10.4/bin/ghc",["--numeric-version"])
/home/x/opt/ghc-6.10.4/bin/ghc is version 6.10.4
looking for package tool: ghc-pkg near compiler in
/home/x/opt/ghc-6.10.4/bin
found package tool in /home/x/opt/ghc-6.10.4/bin/ghc-pkg
("/home/x/opt/ghc-6.10.4/bin/ghc-pkg",["--version"])
/home/x/opt/ghc-6.10.4/bin/ghc-pkg is version 6.10.4
("/home/x/opt/ghc-6.10.4/bin/ghc",["--supported-languages"])
Setup.lhs: waitForProcess: does not exist (No child processes)

[x@bldrh4 HaskellRME]$ runghc Setup.lhs configure -v3
Configuring haskellrme-0.0...
Creating dist (and its parents)
searching for ghc in path.
found ghc at /home/x/opt/ghc-6.10.4/bin/ghc
("/home/x/opt/ghc-6.10.4/bin/ghc",["--numeric-version"])
/home/x/opt/ghc-6.10.4/bin/ghc is version 6.10.4
looking for package tool: ghc-pkg near compiler in
/home/x/opt/ghc-6.10.4/bin
found package tool in /home/x/opt/ghc-6.10.4/bin/ghc-pkg
("/home/x/opt/ghc-6.10.4/bin/ghc-pkg",["--version"])
/home/x/opt/ghc-6.10.4/bin/ghc-pkg is version 6.10.4
("/home/x/opt/ghc-6.10.4/bin/ghc",["--supported-languages"])
Reading installed packages...
("/home/x/opt/ghc-6.10.4/bin/ghc-pkg",["dump","--global"])
Setup.lhs: At least the following dependencies are missing:
regex-tdfa -any && -any


Just compiling Setup.lhs (with --make) resulted in a binary that always works. After that, it was just a matter of installing cabal-install, installing regex-tdfa, and then building my package with the Setup executable, all of which went without a hitch.
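
The compiled-Setup route can be sketched as below. The helper name is hypothetical, and the --user flag is an assumption: it tells Setup to consult the per-user package database, which is where cabal-install puts packages by default, so drop it if regex-tdfa was installed globally.

```shell
# Hypothetical helper: compile the setup script once, then drive the build with
# the resulting binary instead of going through runhaskell/runghc.
# Run from the package directory containing Setup.lhs, after the dependency
# has been installed (for example with: cabal install regex-tdfa).
build_with_compiled_setup() {
    ghc --make Setup.lhs -o Setup &&
    ./Setup configure --user &&
    ./Setup build
}
```

Because Setup is now an ordinary native executable, the configure step no longer goes through the interpreter that was losing the output of its child processes.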

So thanks to all the help, I now have a working RH4 GHC and working binaries for the ancient RH4.