AMD APP 2.6: Sad regression…

January 20, 2012

AMD’s OpenCL implementation, the AMD APP SDK (formerly AMD Stream SDK), seems to suffer from a regression in its latest incarnation (2.6).

Code crashes with:
../../../thread/semaphore.cpp:87: sem_wait() failed

A user reported it on AMD’s forum, but the forum is down for the moment due to a migration.

The reason is that the SDK links against, which in /usr/lib64/ is provided on my system by NVidia’s driver (x11-drivers/nvidia-drivers).

A workaround is to point the loader to media-libs/mesa’s in /usr/lib64/opengl/xorg-x11/lib/ using LD_LIBRARY_PATH:
$ LD_LIBRARY_PATH=/usr/lib64/opengl/xorg-x11/lib/ ./opencl_code_to_run_on_amdapp

Hopefully AMD will increase the packaging quality of their SDK…

Force a kernel panic

August 24, 2011

It might seem counterproductive, but sometimes it’s useful to trigger a kernel panic by hand.

A kernel panic is the Linux equivalent of the blue screen of death: a fatal error happened somewhere, the kernel does not know what to do, and it crashes. It can happen for many reasons, most commonly a bad boot configuration. In a previous post, I explained why I got a kernel panic and how to set up a fallback mechanism.

But how is one supposed to test this fallback mechanism? Wait until a panic happens? You could wait a long time! Searching for an answer, I came upon a blog post which gives the source code of a GPL kernel module. The kernel module is pretty simple: on initialization, it calls the panic() function. That’s it!

Here’s the panic.c file:


#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>

static int initKernelPanic(void) {
	printk(KERN_INFO "Panic time!\n");
	panic("panic module loaded"); // the message string is arbitrary
	return 0; // we'll never make it here
}

static void exitKernelPanic(void) {
	printk(KERN_INFO "How'd I get here?!?\n");
}

module_init( initKernelPanic );
module_exit( exitKernelPanic );

MODULE_LICENSE( "GPL" );
MODULE_AUTHOR( "Phil Dufault" );
MODULE_DESCRIPTION( "A kernel module to force a panic" );

And here’s the makefile:

obj-m += panic.o

# Make sure the whitespace before the "make" commands below is a real tab!
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Note that the package the blog post links to has a bad Makefile: you’ll have to replace the spaces with tabs, or use the one posted above.

To use it, simply load the module. WARNING: you’ll get an instant panic!

# insmod panic.ko

GRUB fallback

August 23, 2011

Often, clusters are located in specialized rooms: power is monitored, a UPS might protect them, cooling is efficient, etc.

Special rooms also mean special access. I am lucky: I can have physical access to the (front of the) machine, but I still need to walk across the whole campus to get there first. Less lucky ones might require permission to enter the room, or might not be allowed in at all.

In any case, how can you make sure a freshly updated kernel boots up fine? It often happens that a kernel is un-bootable. This can happen if you forget to compile the hard drive modules… This just happened to me:

VFS: Cannot open root device "sda5" or unknown-block(2,0)
Please append a correct "root=" boot option; here are the available partitions:
usb 7-1: new full speed USB device using ohci_hcd and address
0b0 1048575 sr0 driver: sr
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown block(0,0)

The problem is that the module for the disks is neither compiled into the kernel nor present in the initramfs. Note that the only “root” the kernel sees is the “sr0” device (the CD-ROM).

To fix this I had to go to the special room, manually reset the machine and boot into a known working kernel. Now I don’t want to try a fix and risk having to go there again.

After a bit of searching, I found out that GRUB can be set to fall back to different boot entries.

In the following, I assume a GRUB configuration like mine: at the top of the list (and as default, using “default 0”) sits a generic entry that boots a symbolic link to the latest kernel. New kernels are added afterwards, newest at the top and oldest at the bottom.

Here’s what you need to do:

  1. Add “panic=5” to every kernel line. This will make the kernel reboot the machine 5 seconds after a kernel panic.
  2. In your /boot/grub/menu.lst, change “default […]” to “default saved”. This will ensure grub selects the saved configuration by default.
  3. Add “fallback 1 2” after your “timeout […]” line. This gives the order of the fallbacks. Note that there must be existing corresponding entries for the fallbacks. I just tried “fallback 1 2 3 4 5 6” in a virtual machine but grub wouldn’t boot.
  4. After each GRUB entry (except the last), add “savedefault fallback”. This makes grub save the fallback entry as the new default, so a failed boot falls through to the next entry.
  5. After the last GRUB entry, add “savedefault”. I think this will prevent the last fallback in the list from being changed.
  6. Last, you need to tell GRUB that a boot was successful, since in this setup GRUB assumes every boot failed. Execute “grub-set-default 0” after a successful boot to reset the default boot to 0 (the first entry). Replace “0” with the entry you want by default.
    On Gentoo Linux with >=sys-apps/openrc-0.8.3-r1, create the file in /etc/local.d/ with a “.start” extension and make it executable:

    # echo "#!/bin/bash" > /etc/local.d/99_set_default_grub.start
    # echo "grub-set-default 0" >> /etc/local.d/99_set_default_grub.start
    # chmod +x /etc/local.d/99_set_default_grub.start

Here’s my /boot/grub/menu.lst for reference:

default saved
timeout 30
fallback 1 2

title Gentoo Linux (latest kernel)
root (hd0,0)
kernel /vmlinuz root=/dev/sda5 panic=5
savedefault fallback

title Gentoo Linux (2.6.32-gentoo-r34)
root (hd0,0)
kernel /vmlinuz-2.6.32-gentoo-r34 root=/dev/sda5 panic=5
savedefault fallback

title Gentoo Linux (2.6.37-gentoo-r4)
root (hd0,0)
kernel /vmlinuz-2.6.37-gentoo-r4 root=/dev/sda5 panic=5

That’s it! Now your remote machine can be rebooted to a new kernel without being afraid of a complete failure!

All this (except the “panic=5” part) comes from GRUB’s manual.

Tested on GRUB 0.97 (legacy)

Drive backup to image and mount

July 27, 2011

I often back up USB drives by copying their content using dd:

$ dd if=/dev/sdb of=backup.dd

This copies the whole content of the drive, NOT just a partition’s image. This means you can’t just mount it:

# mount -o loop backup.dd /mnt/backup
mount: you must specify the filesystem type

The trick is to find where the partition is located in the image file and pass that to mount as an offset. To get that offset, use fdisk:

$ fdisk -l backup.dd
Disk backup.dd: 221 MB, 221741056 bytes
20 heads, 51 sectors/track, 424 cylinders, total 433088 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x52f76374

Device             Boot      Start         End      Blocks   Id  System
backup.dd1                   496     2015231     1007368    6  FAT16

Note the “Start” column: it gives the partition’s offset in sectors. Multiply that value by 512 (512 bytes per sector) and pass the result to mount:

# mount -o loop,offset=$((512*496)) backup.dd /mnt/backup

Red Tape struggle

July 11, 2011

A couple of weeks ago, we ordered a workstation to be able to (finally) develop and run our OpenCL code. Up to now, we have been using the AMD Accelerated Parallel Processing (APP) SDK (formerly AMD Stream)[1] and, just lately, the Intel OpenCL SDK[2]. These are great tools, as they allow you to develop and run OpenCL code on the CPU. Not only do they provide printf() support in your OpenCL kernels, they can also accelerate the code: I saw a speedup of about 2× running my Molecular Dynamics (MD) code using OpenCL as opposed to OpenMP. I suspect the cause is that OpenCL lends itself to easy vectorization and cache-friendly code. You are “forced” to write your code in a way that optimizes execution time. Because branches are extremely expensive on a GPU, you are better off getting rid of them in your OpenCL code, which speeds up execution even on a CPU.

Even though OpenCL on the CPU is fast, I still want to run the code on a real GPU. This should give not only better performance but also better stability: both AMD’s and Intel’s implementations crash in the middle of a simulation, rendering that particular run useless.

So, I was able to convince my thesis supervisor to buy a new workstation to do GPGPU. A really nice desktop with three video cards: a basic one for the display, and two GTX 580s with 3 GB of RAM each: what a beast! The system was shipped last week, but I am now faced with a clear (and frustrating) example of red-tape stupidity and lack of flexibility.

Everything we buy has to be CSA approved, including computers. You see, when you buy a computer, there is a high risk of it becoming sentient. This is a known fact. The press is full of stories about computers starting to walk, attacking people and killing kittens. So it becomes clear that you need to protect yourself from these problems.

So to protect its poor employees from reckless machines and Skynet’s Judgment Day, the university requires all purchases to be CSA approved. Because of this, we now have two choices when buying computers: we can either pay a useless guy $400 to stick a CSA sticker on a box of metal, or we can go with the big players like Dell, IBM, HP, etc. (which probably have exclusivity agreements; follow the money!) and buy one of their crappy, over-priced and weak systems.

So we have a really nice and powerful computer sitting in a box, waiting to be salvaged by a kitten-saviour sticker that will prevent it from taking over the world. This is just outrageous. More and more, universities compete with each other for the best researchers, the best grants, the best students. Higher education institutions have become more business-like. Researchers need to travel, show off their research, attract students and money. A university will not hire a new professor who does research: it will hire a professor who can get (big) grants, hire students to do the research, and shine the university’s prestige around the world.

But every month we are faced with these ridiculous red-tape idiocies that just prevent us from doing our job. Getting paid here? They lost your contract, but no worries, you’ll receive a big amount in a month or two. Of course, I’ll stop eating for a month or two and then eat 200 meals in a day to compensate. The credit card company and my landlord will understand that too, of course. You want access to your own cluster? Be ready to wait two weeks for a single door to be opened by the IT “support”.

We should rename our institution the Inefficient University, where nothing gets done and nothing can be bought, but where at least our computers don’t become sentient and start a nuclear war. Because, you know, it has happened already.


Different profiles for different MPI and compiler configurations

September 28, 2010

To have different implementations of MPI (OpenMPI, MPICH, etc.) compiled with different compilers (gcc, icc, etc.) side by side, we’ll use empi. First, emerge it (it’s from the science overlay):

# emerge -avq empi

Make sure you install your compilers of choice. In this example I’ll use GCC and ICC. For instructions on ICC, see Gentoo’s wiki[1].

We’ll now install OpenMPI compiled with ICC. Edit the file /etc/portage/package.icc and add the following to it:

mpi-openmpi-intel/openmpi
The OpenMPI ebuild is from the Gentoo Science Overlay[2]. Make sure you add the package to /etc/portage/package.keywords. The “sys-cluster/openmpi” ebuild from portage will not work, since it does not use the mpi eclass; make sure you use the Science overlay’s one.

The following will create the profile “mpi-openmpi-intel” and emerge “=sys-cluster/openmpi-1.4.2-r1” into it. Since we edited the file /etc/portage/package.icc with the right content, the package will be compiled with icc:

# /usr/bin/empi --create --class mpi-openmpi-intel =sys-cluster/openmpi-1.4.2-r1

Do the same with a gcc compiled version:

# /usr/bin/empi --create --class mpi-openmpi-gcc =sys-cluster/openmpi-1.4.2-r1

Since the file /etc/portage/package.icc does not contain “mpi-openmpi-gcc”, the package will be compiled with gcc.

Using “eselect”, one can now choose which MPI implementation to use!

Code Listing 1.1: Selecting the mpi-openmpi class

$ eselect mpi set mpi-openmpi-intel
$ eselect mpi list
Available MPI classes:
  mpi-openmpi-intel               Enabled
  mpi-openmpi-gcc                --
$ source /etc/profile
$ eselect mpi list
Available MPI classes:
  mpi-openmpi-intel               Enabled, In Use
  mpi-openmpi-gcc                --
$ which mpirun 

And that’s it! 😀


Integrating Intel’s C/C++ Compiler (ICC) with portage

September 28, 2010

Add this at the end of your /etc/make.conf:

ICCCFLAGS="-O2 -xT -gcc"

Then create the file /etc/portage/bashrc to tell portage which compiler to use:

[ -r ${ROOT}/etc/portage/package.icc ] || return 0
while read -a target; do
  if [ "${target}" = "${CATEGORY}/${PN}" ]; then
    export OCC="icc"
    export OCXX="icpc"
    export CFLAGS=${ICCCFLAGS}
    if [ -r ${ROOT}/etc/portage/package.icc-cflags ]; then
      while read target flags; do
        if [ "${target}" = "${CATEGORY}/${PN}" ]; then
          export CFLAGS="$flags"
          export CXXFLAGS="$CFLAGS"
        fi
      done < ${ROOT}/etc/portage/package.icc-cflags
    fi
  fi
done < ${ROOT}/etc/portage/package.icc

if [ -r ${ROOT}/etc/portage/package.gcc-cflags ]; then
  while read target flags; do
    if [ "${target}" = "${CATEGORY}/${PN}" ]; then
      export CFLAGS="$flags"
      export CXXFLAGS="$CFLAGS"
    fi
  done < ${ROOT}/etc/portage/package.gcc-cflags
fi

if [ "${OCC}" != "" ]; then
  export CC_FOR_BUILD="${OCC}" # workaround the gcc detection function in toolchain-funcs.eclass
fi

That’s the first possibility for the file’s content, where the default is gcc and only the packages listed in package.icc are built with icc. The second option, which I’m not posting here, compiles with icc by default.

Now just add any package you want to compile with ICC to the file /etc/portage/package.icc. You can also use the file /etc/portage/package.icc-cflags to specify different flags for different packages. Both files take the same format as the other /etc/portage/* files: package_category/package_name <options>.
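For illustration, the two files could look like this (the package names and the per-package flags are just hypothetical examples; the flags reuse the ICCCFLAGS options from above):

```
# /etc/portage/package.icc -- packages to build with icc
sys-cluster/openmpi
sci-libs/fftw

# /etc/portage/package.icc-cflags -- package followed by its flags
sci-libs/fftw -O2 -xT
```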
Make sure you update your environment variables:

# env-update
# source /etc/profile

Read more on the Gentoo wiki:

Different packages profiles possible?

September 28, 2010

Our vendor thought it would be a good idea to compile MPI manually and install it to /usr/local instead of going through portage. Because of this, an upgrade to the InfiniBand driver and utilities broke it! Now I don’t even know which implementation they used! I think it’s MPICH… and I think it’s compiled with ICC.

This made me think: there is probably a better way of doing it. After a bit of googling, I found “empi”[1]. As I see it, it allows you to create “profiles” for different MPI implementations. I will then be able to compile many different ones (OpenMPI, MPICH, MVAPICH, etc.) with different compilers (gcc and icc). The user will be able to select the one he wants, and portage will manage everything.

So yes, it’s possible! I’ll post what I’ve done later.

Which package owns a file?

September 27, 2010

To find which package owns a file:

equery belongs /usr/lib64/

Strip package version from name

September 27, 2010

To remove a package version from a list of packages, use the regular expression “-[0-9].*”. In Kate, KDE’s powerful editor, search for the regex “-[0-9].*” (without quotes) and replace with nothing. The following list:

x11-apps/appres-1.0.1 (0)
x11-apps/bdftopcf-1.0.0 (0)
x11-apps/bitmap-1.0.3-r1 (0)
x11-apps/iceauth-1.0.2 (0)