C / C++ Troubleshooting

IN PROGRESS

Most high-performance bioinformatics programs are written in C or C++.  Unfortunately, C and C++ code is some of the hardest code to debug.  If you have only programmed casually in Perl or Python, you will not have a good time.  Here are some tips to help you out, but you will most likely need someone with C / C++ programming experience and knowledge of the code to get you through it.

All Memory Related Issues:
For example:
Program Memory Usage Keeps Growing Over Time and Uses Up Your RAM
Program Crashes Due to GLIBC Corrupted Memory Errors

A memory leak is a bug in which a program keeps increasing its RAM usage over time without stabilizing.  C/C++ programs are very prone to memory leaks, because it is up to the programmer to request and release memory.  Memory leaks happen when the programmer requests memory but doesn’t release it when it is no longer needed.  Your options are:

a) find a machine with enough RAM to handle the program requirements for its entire run.

b) find out what causes the leak and avoid it or fix it.

You will need access to the source code for option b).  Ensure that you compile the code with debug flags.  You may have to edit the configure shell script or makefile to do this.  Make sure a -g flag is passed to the compiler so that debug symbols (extra information such as line numbers) are added to the binaries:

gcc -g ....

Run valgrind to tell you where the memory leak is.  Valgrind is a Unix command-line memory checking tool that detects many memory-related bugs, such as memory leaks, use of uninitialized variables, and access to invalid memory addresses.  You pass your program and its arguments to valgrind.  Valgrind then runs your program under its own supervision and tracks every memory allocation and access.  You will want to use a smaller test input, since valgrind slows down the execution of your program considerably.

Use this valgrind command to test for memory leaks and any other memory related issues:

valgrind --leak-check=full --track-origins=yes   --log-file=<log file location> <your program and its arguments>

--leak-check=full:  do a full memory leak check and output stack traces (functions + line numbers) for the allocations that leak

--track-origins=yes:  trace each use of an uninitialized value back to where that value came from

--log-file:  write all valgrind output to a log file.  Your program’s own output does not go in here.

Examine your valgrind output.

Search for “definitely lost” or “possibly lost”.  These entries indicate where your program is leaking memory.  The output below indicates that the function getClonesToRemove() in file c2am.c, on line 1283, reserves memory via calloc() and that this memory is never released with a corresponding free().  You can either fix it by editing the code and putting in the necessary free() calls, or figure out the workflows that lead to the memory leak and avoid them.

==22862== 5,664 bytes in 472 blocks are definitely lost in loss record 35 of 50
==22862==    at 0x4C29DB4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22862==    by 0x435E82: getClonesToRemove (c2am.c:1283)
==22862==    by 0x435206: Zsplit_ctgs_batch (c2am.c:1036)
==22862==    by 0x40E62A: Zbatch (batch.c:573)
==22862==    by 0x40BB96: main (fpp3.c:241)

Search for “Conditional jump or move depends on uninitialised value”.  These entries indicate where your code is reading a variable that has not been initialized yet.  The result depends on whatever value happens to be in the variable’s memory location at that moment, so the program’s behaviour is unpredictable and can change from run to run.

==22862== Conditional jump or move depends on uninitialised value(s)
==22862==    at 0x6C213B1: vfprintf (vfprintf.c:1630)
==22862==    by 0x6C47813: vsprintf (iovsprintf.c:43)
==22862==    by 0x6C29A06: sprintf (sprintf.c:34)
==22862==    by 0x43F564: AutoCtgMsg (editproj.c:57)
==22862==    by 0x46349C: CBlayout (cb_ok_contig.c:410)
==22862==    by 0x43587B: rebuildAndSplitContig (c2am.c:1131)
==22862==    by 0x43597F: tryCancelCloneSplitChimericContig (c2am.c:1168)
==22862==    by 0x435437: Zsplit_ctgs_batch (c2am.c:1067)
==22862==    by 0x40E62A: Zbatch (batch.c:573)
==22862==    by 0x40BB96: main (fpp3.c:241)
==22862==