
CUDA debugging problems on Ubuntu 13.04

Hi there,

Today I am going to discuss cuda-gdb problems. About a month ago, I ran into a problem where cuda-gdb was unable to debug kernel functions.

I had compiled my application according to the cuda-gdb manual:

nvcc -g -G -o main

Here the "-g" option generates debug information for the CPU (host) code and the "-G" option generates debug information for the GPU (device) code. As you can see, I was compiling my application by the rules.

Still, I was unable to debug my application. I did some research and found out that many others had faced the same problem. Here is an example:
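The command above does not name an input file, so here is a minimal sketch of a complete debug build. The names main.cu and main are assumptions for illustration, not the files from my project, and the function only prints the command when nvcc or the source file is missing.

```shell
# Sketch of a debug build. "main.cu" and "main" are hypothetical names.
debug_build() {
    src=$1
    out=$2
    if command -v nvcc >/dev/null 2>&1 && [ -f "$src" ]; then
        # -g: host (CPU) debug info, -G: device (GPU) debug info
        nvcc -g -G -o "$out" "$src"
    else
        # Compiler or source file missing: only show what would be run.
        echo "skipped: nvcc -g -G -o $out $src"
    fi
}

debug_build main.cu main
```

Keep in mind that "-G" also turns off most device code optimizations, so a debug build is expected to run slower than a release build.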

Cuda-gdb doesn't break and/or step into Kernels

I continued my research and discovered something extraordinary. I don't know who it was, but the packager of the NVIDIA driver (experimental-310 in the repository) had removed the debug libraries from the driver while packaging it. That was really surprising. My first reaction was to visit the NVIDIA download page and download the latest NVIDIA driver. As it turned out, there is no need to download the driver separately; I will explain this further on.

I installed the driver according to the instructions here and was very excited. I tested debugging immediately. Unfortunately, it didn't work, and I got the following error:

Cuda ELF image contains unknown ABI version: 7
/home/buildmeister/build/rel/gpgpu/toolkit/r4.1/debugger/cuda-gdb/7.2/gdb/cuda-tdep.c:1203: internal-error: cuda_get_bfd_abi_version: Assertion `CUDA_ELFOSABIV_16BIT <= abiv && abiv <= CUDA_ELFOSABIV_LATEST' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.

Again, I started doing some research and read somewhere that I had to recompile my CUDA Toolkit. I downloaded and installed the CUDA Toolkit according to the STEP II instructions here.

Actually, this worked only once, and then I had the same problem again. I did some more research and reached this link. It says that my driver version is too new for the toolkit :)
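Since the problem came down to a driver/toolkit mismatch, it helps to look at the installed versions side by side. This is a rough sketch, assuming the NVIDIA kernel module exposes /proc/driver/nvidia/version and that nvcc and cuda-gdb are on the PATH; it only prints the versions and does not decide whether they are compatible.

```shell
# Print the driver, toolkit and debugger versions that are currently installed.
report_versions() {
    if [ -r /proc/driver/nvidia/version ]; then
        echo "driver: $(head -n 1 /proc/driver/nvidia/version)"
    else
        echo "driver: NVIDIA kernel module not loaded"
    fi

    if command -v nvcc >/dev/null 2>&1; then
        echo "toolkit: $(nvcc --version | grep -i release)"
    else
        echo "toolkit: nvcc not found"
    fi

    if command -v cuda-gdb >/dev/null 2>&1; then
        echo "debugger: $(cuda-gdb --version | head -n 1)"
    else
        echo "debugger: cuda-gdb not found"
    fi
}

report_versions
```

Which driver versions a given toolkit supports has to be looked up in the release notes of that toolkit.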

Now I am downloading CUDA Toolkit 5.5 (which is a release candidate), and after I test it I will post the results here. I have now downloaded and installed the deb package. Unfortunately, it didn't work. I got the following error:

libkmod: ERROR ../libkmod/libkmod-module.c:791 kmod_module_insert_module: could not find module by name='nvidia_current'
ERROR: could not insert 'nvidia_current': Function not implemented
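An error like this suggests that no module called nvidia_current exists for the running kernel. A quick way to check, sketched here with the module name taken from the error message, is to list the loaded NVIDIA modules and ask modinfo whether it can find the module at all.

```shell
# Show loaded nvidia* modules and whether "nvidia_current" is installed
# for the running kernel. The module name comes from the error message.
check_nvidia_module() {
    mods=$(awk '$1 ~ /^nvidia/ {printf "%s ", $1}' /proc/modules 2>/dev/null)
    echo "loaded nvidia modules: ${mods:-none}"

    if command -v modinfo >/dev/null 2>&1 && modinfo nvidia_current >/dev/null 2>&1; then
        echo "nvidia_current: available for kernel $(uname -r)"
    else
        echo "nvidia_current: not found for kernel $(uname -r)"
    fi
}

check_nvidia_module
```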

Now I am downloading the self-extracting installer and will post the results. I have now downloaded the self-extracting package and installed both the CUDA 5.5 toolkit and the display driver. I compiled my CUDA code, then ran cuda-gdb to debug it. Finally, I did not experience any more problems, and debugging was successful.
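To quickly confirm that kernel breakpoints work after such a reinstall, something like the following non-interactive session can be used. It is only a sketch: ./main and my_kernel are hypothetical names for the debug binary and the kernel function, and the function does nothing except print a message when cuda-gdb or the binary is missing.

```shell
# Run cuda-gdb in batch mode: break at a kernel, run, and show where it stopped.
# "./main" and "my_kernel" are hypothetical names.
debug_kernel() {
    binary=$1
    kernel=$2
    if command -v cuda-gdb >/dev/null 2>&1 && [ -x "$binary" ]; then
        cuda-gdb -batch \
            -ex "break $kernel" \
            -ex "run" \
            -ex "info cuda kernels" \
            -ex "backtrace" \
            "$binary"
    else
        echo "skipped: cuda-gdb not found or $binary is not executable"
    fi
}

debug_kernel ./main my_kernel
```

If the breakpoint is hit, the backtrace should show a frame inside the kernel; if the program runs to the end without stopping, the debugger is still not able to break in device code.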

Have a nice day.

by zgrw on 2013-06-25 12:33:23