Linked Lists
Linked lists are a low-level data structure that are simple, lightweight, and flexible. They are commonly used to implement other data structures like stacks/queues or bucketing for hash tables. We use linked lists when the most important consideration is operations at the head and tail where we want fast prepends/appends/reads of these values.
Linked list nodes are stored in arbitrary memory slots, and each item in the linked list uses a to indicate the element that comes after it.
Very often for linked list interview questions, you are just given a head node as an input. This means to perform any operation, you need to start from the front and must iterate through the linked list at O(n) time, touching each node to find the next value. In this instance, you won't have access to a tail element, and the nodes won't have a pointer to access previous nodes.
Cheat Sheet
Big O Complexity | |
---|---|
lookup | O(n) |
append (w/ tail) | O(1) |
append (w/o tail) | O(n) |
prepend | O(1) |
insert | O(n) |
delete | O(n) |
space | O(n) |
Note: You only get constant time append
if a linked list provides a pointer to the tail node.
If you don't have a reference to the tail, it takes linear time because you would need to iterate through the entire linked list to arrive at the tail.
- Flexible Size / DistributedData does not need to be sequential in memory. We can arbitrarily add/remove items in a linked list as long as we have a way to point to the next item.
- OrderedLinked lists provide order by pointing to the `next` item in the linked list.
- Fast Insert / DeleteIf we have access to a node (typically the head or tail), we get fast inserts and deletes because the only thing that is required is updating a pointer.
- Slow LookupsThere is no direct access to data (no keys/indexes). We must iterate through the list sequentially to perform a lookup.
Linked Lists in Interviews
- Be comfortable taking an input head node of a singly linked list and iterating from there
- Always keep the runner technique in mind
- Understand how to manage multiple pointers
- Know the different linked list techniques and operations
Linked Lists Explained
The easiest way to think of a linked list is a train.
Each individual train car will have its data
which are the items or people that this train car holds.
This car will also have a reference to the previous and next car through its coupling,
but it has no knowledge of the train as a whole.
Singly Linked List
A singly linked list means that each node will point to the next
one in the list.
This means you can move forward through the list but not backward.
Doubly Linked List
Doubly linked lists have nodes that provide a pointer to the next
AND previous
node.
This means you can move both forward and backward.
If needed, you could reverse directions to reach a node behind the one you're at,
but with a singly linked list, you would have no way of reaching this node unless you started back at the head.
Linked List Nodes and Wrapper Objects
Very often in an interview question, you are simply given the head node of a singly linked list as an input to the question. This means to reach any other node, you must start iterating from the front.
Â
In practical implementations though, you likely have access to a linked list wrapper object that points at the head
and possibly tail
.
The wrapper object may also provide the size
of the linked list.
When you have access to a head
and tail
, you get O(1) operations at each end.
Â
Linked Lists vs Arrays
Linked lists and arrays appear to have a lot of similarities. You may be wondering why we would use a linked list over an array. We know arrays, we understand them, and we see them frequently.
As with all data structures, there are tradeoffs. Attributes that make an array useful in some cases, can become detrimental in other scenarios.
Arrays are very tightly structured. We enforce them to be sequential in memory with all the values indexed. This is great when we need to access values at a certain index, and it's easy for us as developers to grasp the composition of our array. The cost we pay to maintain arrays is that anytime we insert/delete inside the array, it must reindex the items and update their location in memory. As the array grows in size, our programming language underneath the hood must continue to allocate more memory for it. They must fit inside the RAM of the machine we're using, and we're bound to the structure of arrays and constraints they impose.
// 'h', 'e', 'l', 'l', 'o' as array vs linked list
// Array (in back-to-back memory slots)
Index 0 1 2 3 4
| | | | |
Memory address 418 419 420 421 422
___ ___ ___ ___ ___
Items "h" "e" "l" "l" "o"
// Linked list (assigned to random locations in memory and pointed to)
"h"
321 ---> "e"
981 ---> "l"
1421 ---> "l"
56 ---> "o"
717 ---> null
Linked lists on the other hand are a very loose data structure.
Since linked lists maintain order by simply pointing at the next item in the list,
they don't require any structure in memory.
When you have access to the head
and tail
, you get fast operations at both ends.
To add/remove items in the middle, you have to iterate O(n) to get there,
but the update itself only requires changing a pointer.
The items before and after it have no knowledge of the change and don't care — they simply manage their own pointers.
There is no mapping or way to find an element other than the reference to it from the preceding node. So to look up a desired item, we must start at the first element and walk through the linked list sequentially.
When our programming language creates a new node in the linked list, it allocates it to any random memory slot.
The node before it simply sets its next
attribute to this new node.
The next
attribute is a pointer to the following element in the sequence.
It's possible we may need fast access at another location in a linked list besides the head
or the tail
.
With a linked list, we can iterate to the desired spot and store a pointer to this node in a variable.
Then we'll get O(1) operations at this spot since we can access the item through a variable.
Storing multiple nodes in variables is a common technique in interviews.
Linked Lists in the Real World
- Browser back and forward buttons (doubly linked list where each page knows the one you visited before and after)
- Implementing stacks and queues (fast operations at the ends)
- Blockchain (distributed nodes are doubly linked lists that point to the one before and after)
- "Undo" button in your text editor
The pattern is that for all these implementations, they simply point to the item before and after the current one.
The Runner Technique
The runner technique is very commonly used in linked list problems. We use 2 pointers simultaneously to iterate through the linked list instead of just 1. The second pointer either starts before the first pointer, or both pointer start at the same time, and the second pointer jumps multiple nodes instead of touching each one. In either case, the second pointer will reach the end of the linked list faster which provides us additional details to solve a problem.
Let's look at an example of a runner that skips two nodes, which means it should reach the end twice as fast as our first pointer:
Â
Possible use cases:
- Finding the length of a linked list
- Determining the midpoint
- Find an intersection of linked lists
- Find or remove the kth element from the end
- Finding cycles in a linked list
- Finding sub-lists
- We don't have access to the
tail
but need the last node