A Neat XOR Trick
There's a neat trick to solve Advent of Code, day 6 in a single pass.
Looking over the solutions mega-thread, it seems like not many people discovered this trick; when I posted about it, people found it to be noteworthy.
Let's start with a brief summary of the problem: you're given an input string, e.g.
nznrnfrfntjfmvfwmzdfjlvtqnbhcprsg
Within this string, you want to find the first W
-character window in which
every character is unique.
For this example, with a 4-character window, this occurs here:
nznrnf [rfnt] jfmvfwmzdfjlvtqnbhcprsg
Naively, you can do this with three nested loops:
fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut unique = true;
for j in 0..window_size {
for k in (j + 1)..window_size {
if s[i + j] == s[i + k] {
unique = false;
}
}
}
if unique {
return i + window_size;
}
}
panic!("No unique window found");
}
With an input string of length N
and a window size W
, the
running time
of this implementation is O(N * W * W)
.
There's an obvious algorithmic optimization: instead of comparing each element
within the window, we could instead build a HashSet
and check its size:
fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut set = HashSet::new();
for j in 0..window_size {
set.insert(s[i + j]);
}
if set.len() == window_size {
return i + window_size;
}
}
panic!("No unique window found");
}
Insertion into the HashSet
is O(1)
, so our running time is down to O(N * W)
.
(Whether this is faster in practice is left as an exercise to the reader, since it adds allocation into the mix)
However, there's still room to improve, with a little trickery.
First, let's eliminate the HashSet
. We can consider each character as a
bitmask:
a = 10000000000000000000000000000000
b = 01000000000000000000000000000000
c = 00100000000000000000000000000000
d = 00010000000000000000000000000000
...etc
Given a set of characters, we can see which ones are present by combining
these bitmasks with or (which is written as |
):
a | b | c = 11100000000000000000000000000000
If we have a set of W
characters, the or of their bitmasks will have W
bits set if and only if those characters are unique.
a | b | c = 11100000000000000000000000000000
a | b | b = 11000000000000000000000000000000
Hopefully, this is intuitively obvious: each character can only activate one bit
in the result, so if we have W
bits set, then we have W
unique characters.
In Rust, we check how many bits are set with count_ones
.
With -C target-cpu=native
passed to the compiler,
it should compile down to 1-2 instructions:
We can use this bitmask to replace our HashSet
:
fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut set = 0u32;
for j in 0..window_size {
set |= 1 << (s[i + j] as u32 - 'a' as u32);
}
if set.count_ones() as usize == window_size {
return i + window_size;
}
}
panic!("No unique window found");
}
(and now there are zero allocations!)
The code is still O(N * W)
– because we have an outer and inner loop –
but it's prepared us for the final trick.
Consider a slight modification to our previous axiom:
If we have a set of W
characters, the xor
of their bitmasks will have W
bits set if and only if those characters are
unique.
a ^ b ^ c = 11100000000000000000000000000000
a ^ b ^ b = 10000000000000000000000000000000
This is less intuitively obvious: everyone that I've presented it to, including myself, has to convince themself that it's correct.
One reply on Reddit has a particularly good explanation
(edited to match the convention of using W
for window size):
This algorithm is just counting how many characters appear an odd number of times in a window of size
W
. That number will beW
if-and-only-if they're all unique!
Using xor
is more powerful than or
, because of one particular property:
a ^ b ^ a = b
Applying xor twice will turn a bit on, then back off.
In other words, we can maintain the bitset for a sliding window in a single
u32
, turning bits on as they enter the window and off as they leave.
This eliminates the inner loop from our previous function:
fn run(s: &[char], window_size: usize) -> usize {
let mut set = 0u32;
for i in 0..s.len() {
// Turn on bits as they enter the window
set ^= 1 << (s[i] as u32 - 'a' as u32);
// Turn off bits as they leave the window
if i >= window_size {
set ^= 1 << (s[i - window_size] as u32 - 'a' as u32);
}
// Check the current window and see if we're done
if set.count_ones() as usize == window_size {
return i + 1;
}
}
panic!("No unique window found");
}
With this last trick, we're down to O(N)
running time, with no dependence on
the window size!
I didn't do rigorous benchmarking, but
one reply
said that this trick sped up their code by almost 3x.
There's no moral to the story, other than "xor is cool". I hope everyone is enjoying Advent of Code!
Further reading
If you want to read about overengineered solutions to Advent of Code problems, I've got a few other blog posts in this vein: