Linked List: sentinel and initialization issue

stgatilov · September 13, 2023, 2:10pm

Without a doubt, there are many ways to implement a doubly-linked list. But all of them are filled with corner cases and branches, except for the circular implementation with sentinel.

In this approach. we add a sentinel node, which is linked both to the start and to the end of the list, making the list circular. I firmly believe all the other approaches are a big waste of brainpower and CPU branch predictor =)

You can see my solution with this approach here: stgatilov's solution for Linked List in Zig on Exercism
(It has branches/cases only where the interface allows invalid operations, like popping from an empty list)

The issue is that the current testing framework penalizes this approach by assuming that LinkedList can be initialized using “default field initialization” only. The implementation with sentinel needs to assign a pointer to itself during its initialization, which is not possible even with the commonly used “return new object by value from init method” style of constructors.

I’m not sure what to suggest here, since apparently this implementation is a bit against the standard zig OOP style. I had to wrap the proper linked list into a lazily-initialized PImpl wrapper, I also had to move “len” field there, since testing code assumes it can access it directly.

stgatilov · September 13, 2023, 2:12pm

(here is the full solution code)

pub fn ProperLinkedList(comptime T: type) type {
    return struct {
        pub const Node = struct {
            prev: *Node = undefined,
            next: *Node = undefined,
            data: T,

            fn linkMyselfToMe(node: *Node) void {
                node.*.next = node;
                node.*.prev = node;
            }
            fn linkNeighborsToMe(node: *Node) void {
                node.next.prev = node;
                node.prev.next = node;
            }
            fn linkNeighborsToEachOther(node: *Node) void {
                node.next.prev = node.prev;
                node.prev.next = node.next;
            }

            pub fn addBefore(self: *Node, where: *Node) void {
                self.next = where;
                self.prev = where.prev;
                self.linkNeighborsToMe();
            }
            pub fn addAfter(self: *Node, where: *Node) void {
                self.next = where.next;
                self.prev = where;
                self.linkNeighborsToMe();
            }
            pub fn remove(self: *Node) void {
                self.linkNeighborsToEachOther();
                self.linkMyselfToMe();
            }
        };
    
        pub const Self: type = @This();

        // linked lists are awful without sentinel and circularity =)
        sentinel: Node = undefined,

        // I guess returning "this" from constructor by value is not always possible
        pub fn init(self: *Self) void {
            self.sentinel.prev = &self.sentinel;
            self.sentinel.next = &self.sentinel;
        }
            
        pub fn push(self: *Self, node: *Node) void {
            node.addBefore(&self.sentinel);
        }
        pub fn unshift(self: *Self, node: *Node) void {
            node.addAfter(&self.sentinel);
        }
        pub fn pop(self: *Self) ?*Node {
            const node = self.sentinel.prev;
            if (node == &self.sentinel)
                return null;
            node.remove();
            return node;
        }
        pub fn shift(self: *Self) ?*Node {
            const node = self.sentinel.next;
            if (node == &self.sentinel)
                return null;
            node.remove();
            return node;
        }

        // BEWARE: takes O(N) time!
        pub fn contains(self: *const Self, node: *Node) bool {
            var curr = self.sentinel.next;
            while (curr != &self.sentinel) : (curr = curr.next) {
                if (curr == node)
                    return true;
            }
            return false;
        }

        pub fn delete(_: *Self, node: *Node) void {
            node.remove();
        }
    };
}

// Testing code assumes that LinkedList can be initialized with default field initialization.
// But this is not the case for sentinel-based linked list.
// So this is a hacky PImpl wrapper for lazy initialization and direct "len" access.
pub fn LinkedList(comptime T: type) type {
    return struct {
        const Impl = ProperLinkedList(T);
        const Self = @This();
        pub const Node = Impl.Node;
    
        len: usize = 0,
        impl: ?Impl = null,

        fn ensureImpl(self: *Self) *Impl {
            if (self.impl == null) {
                self.impl = .{};
                self.impl.?.init();
            }
            return &self.impl.?;
        }

        pub fn push(self: *Self, node: *Node) void {
            self.ensureImpl().push(node);
            self.len += 1;
        }
        pub fn unshift(self: *Self, node: *Node) void {
            self.ensureImpl().unshift(node);
            self.len += 1;
        }
        pub fn pop(self: *Self) ?*Node {
            const res = self.ensureImpl().pop();
            if (res != null) self.len -= 1;
            return res;
        }
        pub fn shift(self: *Self) ?*Node {
            const res = self.ensureImpl().shift();
            if (res != null) self.len -= 1;
            return res;
        }
        pub fn delete(self: *Self, node: *Node) void {
            const impl = self.ensureImpl();
            const isOurs = impl.contains(node);
            if (isOurs) {
                impl.delete(node);
                self.len -= 1;
            }
        }
    };
}

IsaacG · September 13, 2023, 3:21pm

The link you shared is an invite for a private request, which, I think, can be accepted exactly once by one person to view your code. People can view your published solution here: stgatilov's solution for Linked List in Zig on Exercism

Why do you need a circular list vs having the last node point to null as the next?

jalih · September 13, 2023, 5:29pm

The basic use for a circular linked list with sentinel node would be efficient implementation of a sorted linked list. You would put a impossible value into sentinel node that is guaranteed to break the comparison loop. Now, as you iterate when you add stuff into linked list there is no need to test head/tail at all.

IsaacG · September 13, 2023, 5:48pm

Why would making it circular vs non-circular make it more efficient? The only difference is if the head/tail node contains a sentinel value which tells you to stop vs using a null at the tail telling you to stop.

jalih · September 13, 2023, 6:01pm

Probably in this context it won’t make a difference. It’s more efficient when implementing a sorted linked list as you don’t need to test for head/tail at all. With a really big linked list one comparison less for each node can make a difference.

siebenschlaefer · September 13, 2023, 6:06pm

Compare these two code snippets (pseudo-code):

# without sentinel
if to_be_deleted.prev:
    to_be_deleted.prev.next = to_be_deleted.next
else:
    list.first = to_be_deleted.next
if to_be_deleted.next:
    to_be_deleted.next.prev = to_be_deleted.prev
else:
    list.last = to_be_deleted.prev

# with sentinel
to_be_deleted.prev.next = to_be_deleted.next
to_be_deleted.next.prev = to_be_deleted.prev

By using the sentinel you have to handle fewer special cases.

(I’m not convinced by the argument about branchless being faster in this case because working with a linked list usually involves lots of cache misses and that is magnitudes slower than a conditional instruction.)

stgatilov · September 13, 2023, 6:42pm

Yes, the main profit is that you don’t need to check for head/tail/empty cases everywhere. Indeed, for a large list these checks are well-predicted and don’t slow things down much. But having to write all these checks is pretty annoying.