Introduction

I am reading through Crafting Interpreters by Robert Nystrom, and one of the first exercises asks you to write and debug a doubly-linked list in C. I confess that I never learned how to use GDB, and on macOS I need to use the LLVM toolchain, which means using LLDB instead of GDB. In this post I will write about using LLDB to debug a simple C program.

Reference program

Throughout this post, I will reference a simple implementation of a doubly-linked list in C. Here is the single-file C program I will be debugging:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct LinkedListNode {
    struct LinkedListNode* next;
    struct LinkedListNode* prev;
    char* value;
} LinkedListNode;

LinkedListNode* initList(char* value) {
    LinkedListNode* head = (LinkedListNode*)malloc(sizeof(LinkedListNode));
    head->value = value;
    head->next = NULL;
    head->prev = NULL;
    return head;
}

int append(LinkedListNode* list, char* new) {
    LinkedListNode* current = list;
    int size = 1;
    while (current->next != NULL) {
        current = current->next;
        size++;
    }
    LinkedListNode* tail = initList(new);
    tail->prev = current;
    current->next = tail;
    return size + 1; // New length; size++ would return the old value.
}

void printList(LinkedListNode* list) {
    LinkedListNode* current = list;
    for(int index = 0; current != NULL; current = current->next, index++) {
        printf("[%d]: %s\n", index, current->value);
    }
}

LinkedListNode* find(LinkedListNode* list, char* value) {
    LinkedListNode* current = list;
    while (current != NULL && strcmp(current->value, value) != 0) {
        // printf("Comparing %s and %s\n", current->value, value);
        current = current->next;
    }
    if (current != NULL) {
        return current;
    }
    return NULL;
}

LinkedListNode* delete(LinkedListNode* list, char* value) {
    LinkedListNode* toDelete = find(list, value);
    if (toDelete != NULL) {
        if (toDelete->prev != NULL) {
            toDelete->prev->next = toDelete->next;
        }
        if (toDelete->next != NULL) {
            toDelete->next->prev = toDelete->prev;
        }
        // Reset the links only when a node was found; toDelete may be NULL.
        toDelete->prev = NULL;
        toDelete->next = NULL;
    }
    return toDelete;
}

int main(int argc, char* argv[]) {
    LinkedListNode* list = initList("A");
    append(list, "B");
    append(list, "C");
    append(list, "D");
    printList(list);
    printf("\n");
    LinkedListNode* c = find(list, "C");
    printList(c);
    printf("\n");
    delete(list, "B");
    printList(list);
}

And the Makefile that builds this:

# Thanks to Job Vranish (https://spin.atomicobject.com/2016/08/26/makefile-c-projects/)
TARGET_EXEC := llist

BUILD_DIR := ./build
SRC_DIRS := ./src

# Find all the C and C++ files we want to compile
# Note the single quotes around the * expressions. Make will incorrectly expand these otherwise.
SRCS := $(shell find $(SRC_DIRS) -name '*.cpp' -or -name '*.c' -or -name '*.s')

# String substitution for every C/C++ file.
# As an example, hello.cpp turns into ./build/hello.cpp.o
OBJS := $(SRCS:%=$(BUILD_DIR)/%.o)

# String substitution (suffix version without %).
# As an example, ./build/hello.cpp.o turns into ./build/hello.cpp.d
DEPS := $(OBJS:.o=.d)

# Every folder in ./src will need to be passed to GCC so that it can find header files
INC_DIRS := $(shell find $(SRC_DIRS) -type d)
# Add a prefix to INC_DIRS. So moduleA would become -ImoduleA. GCC understands this -I flag
INC_FLAGS := $(addprefix -I,$(INC_DIRS))

# The -MMD and -MP flags together generate Makefiles for us!
# These files will have .d instead of .o as the output.
CPPFLAGS := $(INC_FLAGS) -MMD -MP

CFLAGS := -g

# The final build step.
$(BUILD_DIR)/$(TARGET_EXEC): $(OBJS)
	$(CC) $(OBJS) -o $@ $(LDFLAGS)

# Build step for C source
$(BUILD_DIR)/%.c.o: %.c
	mkdir -p $(dir $@)
	$(CC) $(CPPFLAGS) $(CFLAGS) -c $< -o $@

# Build step for C++ source
$(BUILD_DIR)/%.cpp.o: %.cpp
	mkdir -p $(dir $@)
	$(CXX) $(CPPFLAGS) $(CXXFLAGS) -c $< -o $@


.PHONY: clean
clean:
	rm -r $(BUILD_DIR)

# Include the .d makefiles. The - at the front suppresses the errors of missing
# Makefiles. Initially, all the .d files will be missing, and we don't want those
# errors to show up.
-include $(DEPS) 

Your program must be compiled with debug symbols to work fully with LLDB. With Apple’s cc C compiler, passing the -g flag generates debug symbols; the Makefile above does this through CFLAGS.

Invoking LLDB

Starting simple, LLDB can be invoked by passing it an executable file. Compiling our sample code produces a llist executable, so to debug it you invoke lldb llist. This launches the debugger and drops you into a REPL prompt, waiting for a command.

The simplest thing to do is to run the program, which can be done by sending the run command:

> lldb llist
(lldb) target create "llist"
Current executable set to '/Users/gustavo/code/craftinginterpreters/prelude/build/llist' (x86_64).
(lldb) run
Process 7890 launched: '/Users/gustavo/code/craftinginterpreters/prelude/build/llist' (x86_64)
[0]: A
[1]: B
[2]: C
[3]: D

[0]: C
[1]: D

[0]: A
[1]: C
[2]: D
Process 7890 exited with status = 0 (0x00000000)

Here we can see that the printf statements from the C program work, and the process exits with the expected 0 status code. You can run the program again with run, and for a full list of commands you can run help at the debugger REPL.

Pretty neat!

LLDB and GDB both let you abbreviate a command to any unambiguous prefix, and both also predefine short aliases for common commands. This means that instead of typing run<Enter> you can type ru<Enter>, an unambiguous prefix, or just r<Enter>, which both debuggers define as a shortcut for run even though other commands also begin with r. While great for experienced users, terse commands force beginners to learn shortcuts at the same time as they learn the debugger. In this post I will use the full command names, but know that you can always use the shorter ones.
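
To make the prefix rule concrete, here is a minimal sketch of unambiguous prefix matching in C. The command table and the resolve_command function are hypothetical names made up for this illustration; this is not how LLDB or GDB actually implement command lookup.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, abbreviated command table; not LLDB's real command set. */
static const char *const COMMANDS[] = {
    "breakpoint", "finish", "next", "register", "run", "step"
};
static const size_t COMMAND_COUNT = sizeof(COMMANDS) / sizeof(COMMANDS[0]);

/*
 * Resolve `input` to a full command name.
 * Returns the command on an exact match or on a unique prefix match.
 * Returns NULL when nothing matches or when the prefix is ambiguous.
 */
const char *resolve_command(const char *input) {
    size_t len = strlen(input);
    const char *match = NULL;
    size_t candidates = 0;

    if (len == 0) {
        return NULL; /* an empty prefix matches everything, so reject it */
    }
    for (size_t i = 0; i < COMMAND_COUNT; i++) {
        if (strncmp(COMMANDS[i], input, len) != 0) {
            continue; /* input is not a prefix of this command */
        }
        if (COMMANDS[i][len] == '\0') {
            return COMMANDS[i]; /* exact match always wins */
        }
        match = COMMANDS[i];
        candidates++;
    }
    /* Accept the prefix only if exactly one command starts with it. */
    return candidates == 1 ? match : NULL;
}
```

With this table, resolve_command("ru") yields "run", while resolve_command("r") yields NULL because both run and register start with r. A real debugger can still accept such a short form by defining it explicitly as an alias.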

Setting breakpoints

Because we passed the -g flag to cc to generate debug symbols, LLDB can map addresses in our executable back to source files, line numbers, and variable names.

Let’s set a breakpoint right at the start of main, then run the code:

> lldb llist 
(lldb) target create "llist"
Current executable set to '/Users/gustavo/code/craftinginterpreters/prelude/build/llist' (x86_64).
(lldb) breakpoint set --name main
Breakpoint 1: where = llist`main + 15 at llist.c:67:28, address = 0x0000000100003ebf
(lldb) run
Process 2805 launched: '/Users/gustavo/code/craftinginterpreters/prelude/build/llist' (x86_64)
Process 2805 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100003ebf llist`main(argc=1, argv=0x00007ff7bfeff4b0) at llist.c:67:28
   64   }
   65  
   66   int main(int argc, char* argv[]) {
-> 67       LinkedListNode* list = initList("A");
   68       append(list, "B");
   69       append(list, "C");
   70       append(list, "D");
Target 0: (llist) stopped.
(lldb) 

Let’s break that down:

  1. Invoke LLDB with lldb llist
  2. Set a breakpoint at main with breakpoint set --name main
  3. Run the program with run
  4. Watch our breakpoint be hit at llist.c line 67 with a source code print out.

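Neat! A debugger should allow you to inspect the values of local variables, and you can do that with var, a short alias for frame variable. Within main, we should see the parameters of main and the 2 LinkedListNode* variables declared inside. Because execution stopped before line 67 ran, list and c have not been assigned yet, so the values shown for them are whatever happened to be in memory: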

(lldb) var
(int) argc = 1
(char **) argv = 0x00007ff7bfeff4b0
(LinkedListNode *) list = 0x000000010001a883
(LinkedListNode *) c = 0x00007ff7bfeff370

A debugger should allow you to step through your code line by line, and there are 2 commands to know:

  1. step: Executes the next source line, stepping into function calls.
  2. next: Executes the next source line, stepping over function calls.
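
If you want a smaller target to practice the two commands on, here is a minimal sketch with one function that calls another. The names square and sum_of_squares are hypothetical and are not part of llist.c; the comments describe where each command would be expected to stop.

```c
/* Hypothetical helpers for practicing step and next; not part of llist.c. */

/* Innermost call: using step on a line that calls square stops here. */
int square(int x) {
    return x * x;
}

/* A good place for a breakpoint: breakpoint set --name sum_of_squares */
int sum_of_squares(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        /*
         * Stopped on the line below, next runs square to completion and
         * stays in this function, while step stops inside square instead.
         */
        total += square(i);
    }
    return total;
}
```

To try it, call sum_of_squares(3) from a main of your own, compile with -g, and compare what step and next do on the line that calls square.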

Our breakpoint at main stopped right at a call to initList. Let’s step inside initList with step:

(lldb) step
Process 2805 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x0000000100003c5c llist`initList(value="A") at llist.c:12:45
   9    } LinkedListNode;
   10  
   11   LinkedListNode* initList(char* value) {
-> 12       LinkedListNode* head = (LinkedListNode*)malloc(sizeof(LinkedListNode));
   13       head->value = value;
   14       head->next = NULL;
   15       head->prev = NULL;
Target 0: (llist) stopped.

Now we’re inside of initList and we could set more breakpoints or inspect locals.

One final basic command is finish. It lets the current stack frame run to completion, then stops when execution returns to the caller’s frame and prints the return value, if there is one. This is a good command to run whenever you want to finish the current function call and return to debugging your caller.

If we call finish while we’re debugging initList, we’ll go back to debugging main:

(lldb) finish
Process 2805 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step out
Return value: (LinkedListNode *) $0 = 0x00000001003044a0

    frame #0: 0x0000000100003ecb llist`main(argc=1, argv=0x00007ff7bfeff4b0) at llist.c:67:21
   64   }
   65  
   66   int main(int argc, char* argv[]) {
-> 67       LinkedListNode* list = initList("A");
   68       append(list, "B");
   69       append(list, "C");
   70       append(list, "D");
Target 0: (llist) stopped.

Finally, you can always terminate the debugging session with exit.

Programmatically setting breakpoints

Think about the experience of using an IDE: you set breakpoints on lines, and the breakpoints move as you edit code so that they stay approximately with the line you originally chose. You can add several breakpoints, run the program, and the breakpoints persist across runs.

With a CLI debugger, you can achieve something similar by writing a file of commands that lldb executes on startup. For example, say you want to debug printList and want to avoid typing breakpoint set --name printList and run every time you invoke lldb. First, write a file containing the commands you want to run, one command per line:

breakpoint set --name printList
run

Then invoke lldb with the --source (short -s) option, passing the path to that file (here ../debug):

> lldb llist -s ../debug
(lldb) target create "llist"
Current executable set to '/Users/gustavo/code/craftinginterpreters/prelude/build/llist' (x86_64).
(lldb) command source -s 0 '../debug'
Executing commands in '/Users/gustavo/code/craftinginterpreters/prelude/debug'.
(lldb) breakpoint set --name printList
Breakpoint 1: where = llist`printList + 12 at llist.c:33:31, address = 0x0000000100003d2c
(lldb) run
Process 3648 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100003d2c llist`printList(list=0x00000001003041c0) at llist.c:33:31
   30   }
   31  
   32   void printList(LinkedListNode* list) {
-> 33       LinkedListNode* current = list;
   34       for(int index = 0; current != NULL; current = current->next, index++) {
   35           printf("[%d]: %s\n", index, current->value);
   36       }
Target 0: (llist) stopped.
Process 3648 launched: '/Users/gustavo/code/craftinginterpreters/prelude/build/llist' (x86_64)

This debugger command file therefore allows us to configure the debugger on launch with an arbitrarily complex command list, which means you can script the process of getting your program into the error condition you need to debug 😈.

Don’t confuse -s|--source with -S|--source-before-file! The latter runs the commands before your target executable is loaded, so commands that need a target, such as run, will fail!

Conclusion

This post wraps up a simple introduction to lldb, the LLVM counterpart to GNU’s gdb. We saw how to compile, run, and debug a simple C program with lldb. Then we saw how to script lldb to set breakpoints automatically. I am happy with lldb’s documentation and its UX, and I will use it to debug my C programs.

References

https://lldb.llvm.org/index.html