- Day 1 - Calorie Counting
- Day 2 - Rock Paper Scissors
- Day 3 - Rucksack Reorganization
- Day 4 - Camp Cleanup
- Day 5 - Supply Stacks
- Day 6 - Tuning Trouble
- Day 7 - No Space Left On Device
- Day 8 - Treetop Tree House
- Day 9 - Rope Bridge
- Day 10 - Cathode-Ray Tube
- Day 11 - Monkey in the Middle
- Day 12 - Hill Climbing Algorithm
- Day 13 - Distress Signal
- Day 14 - Regolith Reservoir
- Day 15 - Beacon Exclusion Zone
- Day 16 - Proboscidea Volcanium
- Day 17 - Pyroclastic Flow: TODO
- Day 18 - Boiling Boulders
- Day 19 - Not Enough Minerals
- Day 20 - Grove Positioning System
- Day 21 - Monkey Math
- Day 22 - Monkey Map
- Day 23 - Unstable Diffusion: TODO
- Day 24 - Blizzard Basin
- Day 25 - Full of Hot Air
Problem statement — Complete solution — Back to top
Welcome to AoC 2022! First day, easiest problem: we are given a list of integers divided in chunks by empty lines, and we need to find the chunk whose integers have the greatest sum.
There are a couple of easy ways of parsing today's input, I took advantage of a
list comprehension coupled with the
map()
to do this in two lines.
First, we read the entire input and .split()
its content on
empty lines (i.e. two consecutive newline characters \n\n
):
fin = open(...)
chunks = fin.read().split('\n\n')
# ['1000\n2000\n3000', '4000', '5000\n6000']
To transform a single chunk from a single string into a list of integers we can
.split()
it again and then use tuple(map(int, chunk))
to turn it into a
tuple
of int
(we could also transform it into a
list
, but for things that should be immutable it's good practice to
use a tuple
). To do the same for all chunks we can either use
map()
again or a list comprehension. In this case, the
latter is easier to write:
chunks = [tuple(map(int, chunk.split())) for chunk in chunks]
# [(1000, 2000, 3000), (4000,), (5000, 6000)]
We end up with a list of tuples, and now we can search for the one whose sum is
the greatest. We can now map()
each of those tuples into the sum()
of its
elements and then use max()
to find the greatest.
best = max(map(sum, chunks))
print('Part 1:', best)
For the second part, we need to find the top 3 chunks with the greatest sums and then sum their sums.
While in general there are algorithms such as quickselect
that are able to find the Nth largest elements of an unsorted sequence in
linear time (i.e. O(n)), Python sadly does not provide any
such function in its standard library. However we're still very far from needing
such a level of optimization, and the list we are dealing with is pretty small.
The "naive" solution of .sort()
ing the chunks (O(nlogn)) and
then extracting the top 3 works just fine.
We pass reverse=True
to sort in descending order, and key=sum
to sort the
chunks according to the sum of their elements (the key=
function is applied to
each chunk before comparing it with others).
chunks.sort(key=sum, reverse=True)
best3 = sum(map(sum, chunks[:3]))
print('Part 2:', best3)
Problem statement — Complete solution — Back to top
As you may have guessed from the title, today we're playing probably one of the oldest games in the world. We are given a list of rounds of rock-paper-scissors composed of two letters, one for the first player and one for the second one (us), indicating for each round who made which choice. We have to score each round and compute the total score for us. Each round, a win is worth 6 points, a draw 3 and a loss 0. Additionally, we get 1 point if we played rock, 2 if we played paper and 3 if we played scissors.
The rules seem pretty simple. The only annoying thing is that rock, paper and
scissors are identified by ABC
for the first player and XYZ
for the second.
For any given round there are a total of 9 possible combinations of moves: AX
,
AY
, AZ
, BX
, ..., CZ
. Given the rules, the easiest solution is to
pre-compute all possible scores for the second player:
SCORES = (
1 + 3, # A (rock) vs X (rock) -> draw
2 + 6, # A (rock) vs Y (paper) -> win
3 + 0, # A (rock) vs Z (scissors) -> loss
1 + 0, # B (paper) vs X (rock) -> loss
2 + 3, # B (paper) vs Y (paper) -> draw
3 + 6, # B (paper) vs Z (scissors) -> win
1 + 6, # C (scissors) vs X (rock) -> win
2 + 0, # C (scissors) vs Y (paper) -> loss
3 + 3, # C (scissors) vs Z (scissors) -> draw
)
Now if we want to know the score for the round A Z
we can simply access
SCORES[2]
, for C Y
SCORES[7]
and so on. The index to use can be calculated
by simply transforming the 3 possible moves into integers: ABC
-> 036
and
XYZ
-> 012
and summing them up. To make the operation easier we can
translate the entire input at once using str.maketrans()
and str.translate()
.
table = str.maketrans('ABCXYZ', '036123')
with open(...):
data = fin.read().translate(table)
The [with
][py-with] statement used above is useful to auto-close the file once
done using it. We can also avoid decoding the input by opening it with
open(..., 'rb')
since we already know we'll only be reading simple ASCII
characters, but in that case we'd also need to use bytes.maketrans()
and
bytes.translate()
instead.
Now that we read and translated the input moves into integer values, we can
iterate on each line of input, split it, parse the two integers, and use them to
index SCORES
. As usual, map()
comes in handy.
score = 0
for line in data.splitlines():
a, b = map(int, line.split())
score += SCORES[a + b]
print('Part 1:', score)
Now we are told that the second letter does not actually represent the move of
the second player, but rather an instruction: X
means we should play a move
that makes us lose, Y
means we should draw, and Z
means we should win.
This doesn't change our algorithm one bit. The only thing that changes is the
move we need to play. For example, for A Y
the first player played "rock"
(A
), and since we should draw (Y
) we need to play rock too, thus the score
for the round would be 3 + 1
(draw + rock). All we need to do is declare a
second SCORES
tuple for the second part with different precomputed scores. We
can then use the same loop we already wrote to compute the scores for both parts
together.
SCORES1 = ... # SCORES from part 1, unchanged
SCORES2 = (
0 + 3, # A (rock) and X (lose) -> play scissors
3 + 1, # A (rock) and Y (draw) -> play rock
6 + 2, # A (rock) and Z (win) -> play paper
0 + 1, # B (paper) and X (lose) -> play rock
3 + 2, # B (paper) and Y (draw) -> play paper
6 + 3, # B (paper) and Z (win) -> play scissors
0 + 2, # C (scissors) and X (lose) -> play paper
3 + 3, # C (scissors) and Y (draw) -> play scissors
6 + 1, # C (scissors) and Z (win) -> play rock
)
score1 = score2 = 0
for line in data.splitlines():
a, b = map(int, line.split())
score1 += SCORES1[a + b]
score2 += SCORES2[a + b]
print('Part 1:', score1)
print('Part 2:', score2)
An alternative solution to using a lookup table for pre-calculated scores would be to "exploit" the nature of the scores themselves. It's easy to notice in the above lookup tables that we are always repeating the same numbers over and over, and in particular we are cycling among them depending on the move of the first player. Thus, given any two moves (appropriately converted to integers) we can calculate the score for the round using a simple closed-form mathematical formula using the modulo operator. I implemented a simplified version of this approach in my alternative solution.
Problem statement — Complete solution — Back to top
For today's problem, we are dealing with strings. We are given a list of strings of even length, one per line of input, composed of seemingly random uppercase and lowercase letters. For each input string, there is exactly one common letter that appears in both of its halves. Once found, we need to calculate its "priority" using a simple rule given in the problem statement. The answer we are looking for is the sum of all the priorities for such common letters.
First of all, let's write a short function to calculate the "priority" value of
a letter given the instructions in the problem statement. It's basically equal
to the letter's position in the English alphabet, plus 26 in case it's
uppercase. Knowing this, and given that we are dealing with ASCII
characters, the ord()
builtin comes in handy, as we can use
ord(x) - ord('a')
to calculate the position of the letter in the variable x
in the English alphabet. Note that for this to work we must have len(x) == 1
.
def prio(x):
if 'a' <= x <= 'z':
return ord(x) - ord('a') + 1
return ord(x) - ord('A') + 27
We can optimize the function a bit by pre-computing everything that is constant,
like for example ord('A') + 27
. Furthermore, since we are guaranteed to only
have to deal with lowercase and uppercase ASCII letters, the check for lowercase
can be simplified to x >= 'a'
, as lowercase letters come after uppercase ones.
def prio(x):
if x >= 'a':
return ord(x) - 96
return ord(x) - 38
We can also make use of a ternary operator if we wish to simplify things even further:
def prio(x):
return ord(x) - (96 if x >= 'a' else 38)
Now we are ready to solve the actual problem. We'll iterate over the input one line at a time and split each line in half. We can do this through slicing.
fin = open(...)
for line in fin:
mid = len(line) // 2
a, b = line[:mid], line[mid:]
A careful reader is going to notice one small thing in the above code: we are
iterating with for line in fin
to get lines of input, however, these lines
will still contain the trailing newline character (\n
), which will therefore
end up in b
. We are actually splitting the string in half correctly, since the
length is odd and doing odd // 2
is the same as doing (odd - 1) // 2
.
Nonetheless, we can strip the final newline from each line if we wish, using
map()
and str.rstrip()
:
for line in map(str.rstrip, fin):
# ...
Now that we have the two halves we can simply iterate over the first one and check which letter is present in the second one. When we find such a sletter, we calculate its priority and stop the search, moving on to the next line of input.
total = 0
for line in fin:
mid = len(line) // 2
a, b = line[:mid], line[mid:]
for letter in a:
if letter in b:
total += prio(letter)
break
print('Part 1:', total)
We still need to search for common letters, but this time among groups of 3 lines of input, and compute the same sum of priorities that we did before. For example, given the following input:
asdfgXjkl
qwXertyui
zxcvbnXab
lksjhagAa
Awuytiqwe
mmvxbbzAz
The first 3 lines of input all contain 'X'
, while the following 3 all contain
'A'
, so the total would be prio('X') + prio('A')
.
We can keep using the same input loop we wrote for part 1, accumulating lines
into a small temporary list (the current group). When the group list reaches a
length of 3
we will find the common letter among the lines in the group and
update the total priority for part 2.
group = []
total = group_total = 0
for line in fin:
# .. same code as part 1 ...
group.append(line)
if len(group) == 3:
a, b, c = group
group = []
for item in a:
if item in b and item in c:
group_total += prio(item)
break
print('Part 1:', total)
print('Part 2:', group_total)
Easy peasy! Daily puzzle solved once again.
Problem statement — Complete solution — Back to top
Our job today is rather straightforward. We are given a list of pairs of integer
ranges (in the form <start>-<end>,<start>-<end>
), and we want to count the
pairs for which one of the two ranges is fully contained within the other. A
range is fully contained within another even if they share one or both their
ends. For example, 2-3
is fully contained within 0-3
.
Let's start by parsing the input. Quite simple: .split()
each
input line on the comma (,
), split each of the two parts on the dash (-
),
and turn the numbers into int
s. We can do the integer conversion on the fly
using map
.
fin = open(...)
for line in fin:
a, b = line.strip().split(',')
a1, a2 = map(int, a.split('-'))
b1, b2 = map(int, b.split('-'))
Now for each line of input we know the start and end of the two ranges. There are a few different ways in which we could check if one range is within another, for example computing their overlap:
a1|------------|a2
b1|---------|b2
o1|-----|o2
overlap
If the length of the overlap is equal to the length of one of the two
ranges, then that range is fully contained within the other. Or, to put it in
other words, the extremes of the overlap (o1
and o2
) coincide with the
extremes of the inner range.
a1|--------------|a2
b1|-----|b2
o1|-----|o2
overlap
We can compute the extremes of the overlap by simply calculating the maximum
between the two range start values and the minimum between the two range end
values. For this purpose, we have the max()
and
min()
builtins.
full_overlap = 0
for line in fin:
a, b = line.strip().split(',')
a1, a2 = map(int, a.split('-'))
b1, b2 = map(int, b.split('-'))
o1 = max(a1, b1)
o2 = min(a2, b2)
if o1 == a1 and o2 == a2 or o1 == b1 and o2 == b2:
full_overlap += 1
print('Part 1:', full_overlap)
And as easy as that, we have our solution!
Another way full overlap could have been checked is through simple logic formulas:
for line in fin:
# ...
# b inside a a inside b
if a1 <= b1 and a2 >= b2 or b1 <= a1 and b2 >= a2:
full_overlap += 1
However, as we'll shortly see, computing the overlap also helps us solve part 2 effortlessly.
Now we want to count the number of range pairs that overlap in any way, either partially or fully (part 1).
We know that all the cases counted in part 1 will also count for part two, since if one range is fully contained within another, there is a full overlap. To also take into account partial overlaps, we can simply check the two extremes of the overlap we just calculated.
a1|------------|a2 | a1|--------|a2
b1|---------|b2 | b1|--|b2
o1|-----|o2 | |o2 o1|
overlap (o2 >= o1) | no overlap (o2 < o1)
As can be seen in the above example, if the end of the overlap (o2
) is after
the start (o1
) then we have a valid range and an overlap (partial or full)
must exist.
All of this simply means adding one check to our part 1 code, and since we know that a full overlap is a special case of a partial overlap, we can move the part 1 check inside the part 2 one.
overlap = full_overlap = 0
for line in fin:
# ... same as part 1 ...
if o2 >= o1:
overlap += 1
if o1 == a1 and o2 == a2 or o1 == b1 and o2 == b2:
full_overlap += 1
print('Part 1:', full_overlap)
print('Part 2:', overlap)
Et voilà! 8 out of 50 stars.
Problem statement — Complete solution — Back to top
Do you like simulations? Today we're gonna need to simulate a crane moving crates stacked on top of each other. We start with a few stacks of crates, given as input in the following form:
[D]
[N] [C]
[Z] [M] [P]
1 2 3
This represents the initial configuration of N stacks of crates (in the above
example we have N = 3, but in the real input N = 9). Following the initial
configuration is an empty line followed by a list of instructions, one per line,
of the form move <n> from <i> to <j>
, meaning "move the top n
crates from
stack i
to stack j
, one at a time.
After executing all instructions, we need to answer with a string of letters
representing the topmost crate of each stack in order. For example, if the above
configuration was the final one, we would answer NDP
.
Today's input is particularly annoying to parse: we are given stacks in fancy ASCII art columns, and we need to somehow turn each one into a string or list in order to work with it. The easiest way to approach this is probably to just read the entirety of the first few lines of input, stopping at the first empty line, and then transpose them to obtain a list of columns. In other words, do something like this:
+----------> ' [[ '
|+---------> ' NZ1'
||+--------> ' ]] '
|||+-------> ' '
||||+------> '[[[ '
|||||+-----> 'DCM2'
||||||+----> ...
|||||||
' [D] '
'[N] [C] '
'[Z] [M] [P]'
' 1 2 3 '
After reading the initial ASCII art and stopping at the first empty line:
fin = open(...)
raw = []
for line in fin:
if line == '\n':
break
raw.append(line)
# raw = [' [D] \n',
# '[N] [C] \n',
# '[Z] [M] [P]\n',
# ' 1 2 3 \n']
We can transpose the raw
list with the help of zip()
plus
an unpacking operator:
columns = list(zip(*raw))
# [(' ', '[', '[', ' ', '\n'),
# (' ', 'N', 'Z', '1', '\n'),
# ... ]
This seemingly esoteric single-line transposition works because zip()
yields
tuples consisting of the i-th element of each line in raw
, i.e. it effectively
yields columns.
We went from strings to tuples, but that's no problem for now. The next thing to
do is skip all the useless columns (those consisting of only spaces and square
brackets) and keep the rest, turning good columns into strings through
str.join()
and discarding leading whitespace with
str.lstrip()
.
Fortunately, all we need to identify good columns is
enumerate()
, skip the first column and then only keep
one every 4, which can be achieved using the modulo (%
) operator.
# Indexes in the instructions are 1-based, fill stacks[0] with some useless
# value so that later we can do stacks[i] instead of stacks[i - 1].
stacks = [None]
for i, column in enumerate(zip(*raw)):
if i % 4 == 1:
# column = (' ', 'N', 'Z', '1', '\n')
column = ''.join(column[:-1]) # -> ' NZ'
column = column.lstrip() # -> 'NZ'
stacks.append(column)
# Make a copy to use for part 2
original = stacks[:]
Now that we finally have the initial stacks parsed, let's also parse the
instructions. This is quite simple: iterate over input lines, split them and
extract the three numbers at positions 1
, 3
and 5
:
moves = []
for line in fin:
line = line.split()
moves.append((int(line[1]), int(line[3]), int(line[5])))
We have the instructions parsed, now let's simply follow them:
for n, i, j in moves:
for _ in range(n):
crate = stacks[i][0] # Extract top of stacks[i]
stacks[i] = stacks[1:] # Remove it from stacks[i]
stacks[j] = crate + stacks[j] # Add it to top of stacks[j]
We can optimize the above operation by extracting all n
crates at once and
reversing their order with crates[::-1]
, a common Python trick to reverse an
indexable iterable through slicing:
for n, i, j in moves:
crates = stacks[i][:n][::-1]
stacks[i] = stacks[i][n:]
stacks[j] = crates + stacks[j]
Finally, we can extract the topmost element of each stack using a simple
generator expression and .join()
the result into a
string:
top = ''.join(s[0] for s in stacks[1:]) # Skip stacks[0], which is None
print('Part 1:', top)
For part two, we need to follow the same list of instructions as part 1, but
this time moving all of the topmost n
crates from a given stack to another
at once, meaning that their final order on top of the second stack will not be
reversed.
Well, given the code we already wrote, this is really child's play. We can use
the same code as part 1, removing the reversing operation ([::-1]
):
# Restore initial state from the copy we made earlier
stacks = original
for n, i, j in moves:
crates = stacks[i][:n] # <- removed [::-1] from here
stacks[i] = stacks[i][n:]
stacks[j] = crates + stacks[j]
top = ''.join(s[0] for s in stacks[1:])
print('Part 2:', top)
Problem statement — Complete solution — Back to top
Today's problem feels like it could have been a day 1 problem. In fact, I believe it was even easiest than this year's day 1. We are given a long string of seemingly random characters as input, and we need to find the first group of 4 consecutive characters that are all different, called the "start-of-packet". Once found, our answer will be the 1-based index of the character immediately following the start-of-packet.
After reading the whole file as a string:
with open(...) as fin:
data = fin.read()
We can extract groups of 4 consecutive characters starting at index i
just
doing data[i:i + 4]
. To check whether they are all different, a quick and easy
way is to simply put them all inside a set
and calculate the size of the set:
if the size is 4, it means all 4 were different.
for i in range(len(data) - 4):
if len(set(data[i:i + 4])) == 4:
sop = i + 4
break
print('Part 1:', sop)
You technically do not need to build the entire set in order to perform this check: you can add the characters to the set one by one and stop at the first one that is already present. However, since we are talking about merely 4 characters, such an optimization would just be pointless.
We need to... do the same thing as before, only that we are looking for a "start-of-message" consisting of 14 different consecutive characters now.
Well, the code is the same as part 1, so let's just move it into a function. We also know for a fact that our "start-of-message" cannot appear before the "start-of-packet" (we do not have 4 consecutive different characters before the start-of-packet, let alone 14), so let's also give our function the ability to skip the start of the data for performance.
def find_start(data, header_len, start=0):
for i in range(start, len(data) - header_len):
if len(set(data[i:i + header_len])) == header_len:
return i + header_len
Here we go, we can now get both part 1 and part 2 answers using this function:
sop = find_start(data, 4)
som = find_start(data, 14, sop)
print('Part 1:', sop)
print('Part 2:', som)
Problem statement — Complete solution — Back to top
First challenging problem of the year today! We are dealing with a filesystem,
and we are given the output of a shell session where the only two commands used
are cd
to change directory and ls
to list the current directory's contents.
Each line of input that starts with $
indicates a command, and the following
lines that don't start with $
are the command's output.
Simply enough, only the ls
command generates output. Each directory can
contain other directories or files, and the ls
command prints out a list where
each directory name is preceded by the string dir
and each file name is
preceded by its size.
The total size of a directory is the sum of the sizes of the files it contains, either directly or inside other subdirectories. We are asked to find all directories with total size up to 100000 and sum all their sizes.
First of all, let's see an example input to help us understand what we're talking about:
$ cd /
$ ls
dir a
1000 b.txt
699 c.dat
$ cd a
$ ls
100 d
200 e
Given the above shell session, we can infer the contents of /
and /a
, and we
know that the filesystem looks like this:
/
├── a
│ ├── d, size 100
│ └── e, size 200
├── b.txt, size 1000
└── c.dat, size 699
The total size of the directory /a
is 100 + 200 = 300
, and the total size of
/
is 300 + 1000 + 699 = 1999
.
Since the size of a directory will need to be calculated recursively entering
all its descendant directories and finding all the files contained within them,
it's pretty obvious that we'll need to somehow reconstruct the structure of the
filesystem to fulfill our request. We can represent the filesystem as a
tree structure, where the root is /
.
Parsing the input line by line, one option would be to store the size of each file and the contents of each directory in a dictionary, for example like this:
fs = {
'/': ['a', 'b.txt', 'c.dat'],
'a': ['d', 'e'],
'd': 100,
'e': 200,
'b.txt': 1000,
'c.dat': 699,
}
However, this is not enough, because directory and file names are not unique.
There could be for example multiple directories, each containing a file with the
same name. Even worse, there could be one directory containing a file, and
another one containing a directory with the same name as the file! Indeed, we
should already know, the thing that uniquely identifies a file or directory in a
filesystem is not its name, but its path. The path to the file d
in the
above example would be /a/d
.
Given the input, and starting with an empty path, we should be able to keep
track of the current path by simply looking at the cd
commands that are
performed. For simplicity, we'll use a tuple
to represent a path instead of a
string of path components separated by slashes, meaning that the path /a/d
will be represented by the tuple ('/', 'a', 'd')
. This will make it easier to
add and remove path components as needed.
The output we want is a dictionary of the form {path: list_of_contents}
. Since
we do not care about file names at all, instead of storing their names we'll
simply store their size as an integer. Later, when iterating over the list of
contents of a given path, we'll know that anything that is an integer is the
size of a file. Furthermore, we also do not care about directories that we do
not explicitly enter through the cd
command. If some ls
command lists 100
directories, but we only enter one of them, that's the only one we care about.
We can therefore skip all the lines of ls
output that start with dir
. Our
input seems to be "well-formed", and seems to suggest that we actually enter
with cd
every single directory that is listed by a ls
command, but still.
We'll parse the input one line at a time, like so:
- Each time we encounter a
cd
command, we'll add a component to the current path. If we encounter the special component..
we'll remove the last component from the current path instead. - Each time we encounter a
ls
command, we'll start reading the following lines until the next command. Each line can either bedir <dirname>
or<size> <filename>
. We will completely skip lines starting withdir
, and for the others we'll only parse the file size as an integer, adding it to the list of the contents of the current directory.
As already said, we'll use a dictionary to keep track of the contents of each
path we encounter. More precisely, to make things easier, a
defaultdict
of list
comes in handy, so that we
can just do fs[path].append(...)
without worrying about path
not being
already present in fs
. A deque
(double-ended queue)
is also useful to process the input line by line while being able to peek at the
next line without consuming it, since we want to stop parsing the output of the
ls
command whenever we encounter a line starting with $
. We can peek the
first element of a deque
with d[0]
, and consume it by popping it with
d.popleft()
. The same could be done with a normal list through l.pop(0)
, but
the operation is much more expensive as it internally requires to move all
elements of the list back one position after removing the first one.
Here's a function that implements the above logic taking advantage of
defaultdict
and deque
:
from collections import deque, defaultdict
def parse_filesystem(fin):
lines = deque(fin)
fs = defaultdict(list)
path = ()
while lines:
line = lines.popleft().split() # ['$', 'cd', 'foo']
command = line[1]
args = line[2:] # ['foo'] or [] in case the command is `ls`
if command == 'ls':
# The `ls` command outputs a list of directory contents, keep going
# until we either run out of lines or the next line is a command
while lines and not lines[0].startswith('$'):
# Get the size of the file
size = lines.popleft().split()[0]
# Skip if not a file
if size == 'dir':
continue
# Add the size of the file to the contents of the current path
fs[path].append(int(size))
else:
# The `cd` command has no output, but changes the current path
if args[0] == '..':
# Discard last path component (go up one directory)
path = path[:-1]
else:
# Calculate path of the directory we are moving into
new_path = path + (args[0],)
# Add its path to the contents of the current directory
fs[path].append(new_path)
# Move into the new directory
path = new_path
return fs
The result of calling parse_filesystem()
using the example input we saw
earlier should be something like this:
with open(...) as fin:
fs = parse_filesystem(fin)
# fs = {
# () : [('/',)],
# ('/',) : [('/', 'a'), 1000, 699],
# ('/', 'a'): [100, 200]
# }
The extra empty tuple ()
is an artifact of the fact that we start with an
empty path (path = ()
), and the first command is cd /
, so effectively our
actual root is ()
, but it only contains ('/',)
so there isn't much
difference. If we wanted to avoid this, we could have started with
path = ('/',)
skipping the first command.
Technicalities aside, now we have a dictionary that represents the tree of our
filesystem. In order to calculate the size of a single directory we need to
traverse the tree starting at its path, and sum up any file sizes we find along
the way. The simplest way to do this is through a
depth-first search, which, given the data structure we have, can be
implemented as a recursive function in just a few lines of code. All we have to
do given a path is iterate over fs[path]
and check whether the current item is
an int
(size of a file) or another path representing a subdirectory. In the
first case, we'll just add the size to the total, while in the second case we'll
make a recursive call to determine the size of the subdirectory. The
isinstance
built-in function can be used to check for
int
s, as well as the type
built-in, it's more or less a
matter of taste.
def directory_size(fs, path):
size = 0
for subdir_or_size in fs[path]:
if isinstance(subdir_or_size, int):
# File, add size to total
size += subdir_or_size
else:
# Directory, recursively calculate size
size += directory_size(fs, subdir_or_size)
return size
Now we have all we need to solve the problem. We'll iterate over all the keys of
fs
(representing all the paths of the directories we know) and call
directory_size()
for each one of them, summing up the sizes that are less than
100000.
There is a small performance issue though: calling the directory_size()
function we just wrote for every single path is not optimal. It's a recursive
function that will traverse the whole tree starting from path
every time it's
called, however we only need a single traversal (starting at the root ('/',)
)
to kno the size of any directory. We can save this information into a dictionary
before returning from the function.
def directory_size(fs, path, output):
size = 0
for subdir_or_size in fs[path]:
if isinstance(subdir_or_size, int):
size += subdir_or_size
else:
size += directory_size(subdir_or_size)
output[path] = size
return size
This only traverses the entire fs
tree once and saves the size of each
directory into a dictionary of the form {path: size}
. After a single call
starting from the root, we'll know the size of any directory:
sizes = {}
directory_size(fs, ('/',), sizes)
small_dir_total = 0
for sz in sizes.values():
if sz <= 100000:
small_dir_total += sz
Alternativelt, we could also use memoization to cache the
results of directory_size()
calls, and keep calling the function regardless.
This is an easy to implement solution since Python 3 already provides us with a
decorator to implement memoization out of the box:
functools.lru_cache
. For those who are interested, I
explained memoization and the use of lru_cache
in more detail towards the end
of 2019 day 18 part 1.
@lru_cache(maxsize=None)
def directory_size(path):
size = 0
for subdir_or_size in fs[path]:
if isinstance(subdir_or_size, int):
size += subdir_or_size
else:
size += directory_size(subdir_or_size)
return size
Now no matter how many times we call directory_size()
, the calculation is only
performed once for any given path
on the first call and cached automatically
by lru_cache
to be reused for subsequent calls. This however
In any case, we finally have our solution:
small_dir_total = 0
for path in fs:
sz = directory_size(path)
if sz <= 100000:
small_dir_total += sz
print('Part 1:', small_dir_total)
As a bonus, this can also be rewritten as a one-liner with the help of
filter
, map
and a
lambda
:
small_dir_total = sum(filter(lambda s: s <= 100000, map(directory_size, fs)))
print('Part 1:', small_dir_total)
Though I would honestly avoid it for readability.
The bulk of the work is done, now we want to find a single directory to delete to free some space on the disk. The total available space is 70000000, and we need at least 30000000 of it free. We need to find the size of the smallest directory that can be deleted in order to end up with a free space of at least 30000000.
Well, we already have the code to do everything, all we need to do is add a
couple of variables and another if
inside the final for
loop used to
calculate the answer for part 1:
used = directory_size(('/',))
free = 70000000 - used
need = 30000000 - free
small_dir_total = 0
min_size_to_free = used
for path in fs:
sz = directory_size(path)
if sz <= 100000:
small_dir_total += sz
if sz >= need and sz < min_size_to_free:
min_size_to_free = sz
print('Part 1:', small_dir_total)
print('Part 2:', min_size_to_free)
Ta-dah! Thankfully an easier part 2 than part 1, and nothing too weird.
Problem statement — Complete solution — Back to top
The long-awaited grid of characters strikes back. This time we are dealing with
trees. We are given a grid of tree heights, each represented by a single digit
between 0
and 9
, and we need to count how many trees are visible from
outside the grid. A tree is considered visible from outside the grid if it is
taller than any other tree in at least one of the four cardinal directions
(looking north, south, east or west).
Obviously, all trees on the perimeter of the grid are visible from the outside, but what about the inner ones?
Let's start by parsing the input data. Transforming each digit of each line into
an int
seems natural, but it's unneeded. We are, as usual, dealing with plain
ASCII characters, and the byte values of the digits 0
through
9
in ASCII are ascending (from 0x30
to 0x39
). If we open the input file in
binary mode, we can simply use .splitlines()
and we'll
have a list of bytes
objects (one per line), and iterating over bytes
already yields integers. We could do the same by opening the file in text mode
(the default), since '1' > '0'
still holds for strings too, but decoding is
unneeded.
with open(..., 'rb') as fin:
grid = fin.read().splitlines()
As already said, all the trees on the perimeter of the grid are visible, and we
can pre-calculate this number to save time: it's simply H * 2 + W * 2 - 4
(the
- 4
is needed to avoid counting vertices twice). While we are at it, let's
also calculate a couple of useful values for later.
height, width = len(grid), len(grid[0])
maxr, maxc = height - 1, width - 1
visible = height * 2 + width * 2 - 4
We'll use two nested for
loops to iterate over the entire grid. Since we
already pre-computed the visibility of the perimeter of the grid, we can skip
it, which means starting our iterations from 1
and stopping right before
maxr
and maxc
(calculated above). We'll also keep the current row in a
variable, to avoid double-indexing the grid where possible.
for r in range(1, maxr):
row = grid[r]
for c in range(1, maxc):
tree = row[c]
...
To know whether a tree is visible from outside the grid on the east side we can
iterate the current row
starting from c + 1
up to its end, and check if our
tree
is higher than all of them. If not, there is no visibility from east.
# ... continues from above
visible_from_east = True
for t in row[c + 1:]:
if tree <= t:
visible_from_east = False
Let me take a second to make a small digression. It should be noted that in
general doing something like row[c + 1:]
is not that good of an idea, since in
Python slicing lists, tuples, strings, bytes or similar collections performs a
copy of the sliced values. Clearly a copy is unneeded if we don't need to
modify the data and only slows things down, more so if the operation is
performed on large collections or repeated a lot of times. This is a pretty
unfortunate design flaw of the language that we can get around in a couple
different ways, including using external libraries such as NumPy,
since by default slices of NumPy arrays return views
instead of performing copies. In our case, the input grid is 99 by 99, which is
a pretty small size, so we can get away with all this unnecessary copying to
keep the code more concise and easier to read, but be warned that in general
this isn't the case.
Back to the problem. The kind of loop we just wrote is exactly what the
all()
built-in function was invented for: using all()
together with a generator expression gives us a one-liner
equivalent to the above loop that is also easily readable:
visible_from_east = all(tree > t for t in row[c + 1:])
Analogously, we can do the same thing for west, south and north. The only hiccup
is that for south and north we don't have a nice column
variable to iterate
over, and we'll have to iterate over the grid
with an index instead of using
range()
.
visible_from_east = all(tree > t for t in row[c + 1:])
visible_from_west = all(tree > t for t in row[:c])
visible_from_south = all(tree > grid[i][c] for i in range(r + 1, len(grid)))
visible_from_north = all(tree > grid[i][c] for i in range(r - 1, -1, -1))
Now, in theory, a tree is visible from outside the grid if any of the above
variables is True
, so we are doing unneeded work. For example, if
visible_from_east
is True
, we don't need to calculate the rest, and the same
goes for the others. Since we are in a loop, we could use continue
to skip
right to the next iteration when any of the four is satisfied:
for r in range(1, maxr):
row = grid[r]
for c in range(1, maxc):
tree = row[c]
if all(tree > t for t in row[c + 1:]): # east
visible += 1
continue
if all(tree > t for t in row[:c]): # west
visible += 1
continue
# Same for south and north...
However, if we save the generators in a variable instead of feeding them to
all()
immediately we can also do this without duplicating much code with a
single if
statement, retaining the lazyness of the operation with a simple
or
, which only evaluates its right operand if the left one is False
.
for r in range(1, maxr):
row = grid[r]
for c in range(1, maxc):
tree = row[c]
east = (tree > t for t in row[c + 1:])
west = (tree > t for t in row[:c])
south = (tree > grid[i][c] for i in range(r + 1, len(grid)))
north = (tree > grid[i][c] for i in range(r - 1, -1, -1))
if all(east) or all(west) or all(south) or all(north):
visible += 1
print('Part 1:', visible)
For the second part we need to do a similar job, but this time we want to find out how far can we see from the top of a given tree in the four cardinal directions. Being on top of a tree, we can only see in a certain direction up until our view is blocked by a taller tree. For each tree, we need to calculate a score equal to the product of the viewing distances in the four directions. Out of all these scores, we must then find the highest.
As an example, given the following small grid:
8303730
9255124
3677320
The tree with height 5
in the middle of the grid can see 1
unit north, 3
units east, 1
unit south and 1
unit west (blocked by the other 5
). Its
score would then be 1 * 3 * 1 * 1 = 2
.
Calculating the view distance in each direction is pretty similar to what we
just did for part 1, except that we need to count how many grid cells we pass
before stopping because of a higher tree. Well, if we use plain and simple (but
also boring) for
loops instead of writing fancy generator expressions, we can
do this pretty easily.
For example, for east:
for r in range(1, maxr):
row = grid[r]
for c in range(1, maxc):
tree = row[c]
for i in range(c + 1, width):
if row[i] >= tree:
break
view_east = i - c
Doing the same for all directions, and adding the formula to calculate the score, we have the solution for part 2:
for r in range(1, maxr):
row = grid[r]
for c in range(1, maxc):
tree = row[c]
for east in range(c + 1, width):
if row[east] >= tree:
break
for west in range(c - 1, -1, -1):
if row[west] >= tree:
break
for south in range(r + 1, height):
if grid[south][c] >= tree:
break
for north in range(r - 1, -1, -1):
if grid[north][c] >= tree:
break
score = (east - c) * (c - west) * (south - r) * (r - north)
if score > best:
best = score
print('Part 2:', best)
Technically, the code for both parts could be easily joined together into the
same extended loop we just wrote for part 2, but I liked the concise part 1 code
using all()
with generator expressions, so I kept it in my final solution.
The algorithms I implemented for today's part 1 and part 2 scan the grid in four direction for every single grid cell. This means that in the worst case scenario we are potentially scanning the entire grid 4 times per single tree. For a square grid of side n there are n2 trees, thus the computational complexity of these algorithm is at worst O(4n2n2) = O(n4) operations.
That honestly doesn't sound very good. Indeed, there is a smarter solution, at least for part 1, which achieves the same result in O(n2) operations only:
- Scan the entire matrix once per direction.
- For each row/column, keep track of the tallest tree while scanning.
- Mark all the trees taller than the tallest as visible from outside the grid.
- As a bonus: stop scanning the current row/column when the tallest tree
encountered is a
9
.
I implemented this approach in my alternative solution for those that are curious.
For part 2, a similar logic can be used, keeping track of the last time a tree of each possible height was seen, and using such information to instantly compute the view distance from any encountered tree in the current scanning direction. This requires a bit more effort though, and I did not have time to re-implement my part 2 solution using this approach.
Problem statement — Complete solution — Back to top
2D grids again, huh? Today we're not really taking a grid as input, but moving
through one instead. Given a list of moves of the form <direction> <N>
(e.g.
R 5
) we need to simulate the movements of a small piece of rope by tracking
the position of its head and tail in a grid. The rope moves head first one step
at a time in any of the four cardinal directions, and the tail will always stay
"attached" to the head, meaning in one of the 8 neighboring cells. Whenever the
head makes a movement that separates it from the tail, the tail also moves
toward the head in the nearest cell that is directly above, below, left or right
of the head. We need to count the number of different positions the tail will
occupy while simulating all the moves in our input.
An example makes it easier to understand what is going on. Let's analyze the
movement of the head and tail, given that they start in the same spot and the
moves to perform are R 2
followed by U 2
:
... ... ... ... ..H
... -> ... -> ... -> ..H -> ..T
H.. TH. .TH .T. ...
As can be seen above, when the head (H
) moves up one cell, the tail (T
) is
still "attached" to it. However, the second movement of the head would cause the
tail to disconnect, thus it moves towards the head.
If we zoom in and break a single transition in two steps, the movement of the tail will be clearer:
. . . . . . . . .
. . . => . . . => . . .
T H . T-------H . T H
In the simple case where the head moves, but stays on the same horizontal or vertical line as the tail, the movement of the tail is straightforward: it wants to move towards the head following the segment depicted above, and so it simply needs to move one unit in the right direction.
The second, not-so-obvious case is when the head moves while not on the same horizontal/vertical line as the tail:
. . . . . H . . H
/
. . H => . . / . => . . T
/
. T . . T . . . .
Again, the instant after the head moves up, the tail becomes disconnected, and thus must move towards the head. The tail wants to follow the straight segment plotted above, thus it moves up one unit and right one unit, ending up right below the head.
To generalize: whenever the tail becomes disconnected, it will move in the direction "suggested" by the imaginary segment connecting it straight to the head. Thus, if this segment is parallel to the grid axes, the movement is a simple step (up, down, left or right), otherwise the movement will be a larger diagonal step in the same direction as the segment.
Oookay. Now that we (hopefully) understood how this dance of head and tail rope movements works, let's start coding. First of all, since for each input move we'll have to simulate all the steps, let's make things easier with a dictionary of deltas to apply to make a single step given a direction:
# {direction: (deltax, deltay)}
DELTA = {'U': (0, 1), 'D': (0, -1), 'L': (-1, 0), 'R': (1, 0)}
In order to keep track of head and tail positions we'll use 4 variables as two
2D coordinates. Parsing the input is only a matter of .split()
and conversion to int
, and the movement of the head can be done with an
addition after looking up the right delta on both axes using the dictionary we
just defined.
hx, hy = 0, 0
tx, ty = 0, 0
with open(...) as fin:
for line in fin:
direction, steps = line.split()
steps = int(steps)
for _ in range(steps):
deltax, deltay = DELTA[direction]
hx += deltax
hy += deltay
# Move the tail somehow...
Okay, now we are simulating the movement of the head, but we also need the tail
to follow along. We can represent the segment that the tail wants to follow as
2D vector, meaning another couple of variables: dx
and dy
, which can be
easily calculated:
dx = hx - tx
dy = hy - ty
Now, how do we know if the head moved too far from the tail? We can use Euclidean distance for that. The 8 possible cells surrounding the head will always be at an Euclidean distance of at most √(2) (square root of 2) units. That is, the four cells above, below, left and right will have a distance of 1 unit, while the other four cells will have a distance of √(12+12) = √(2) units. Thus, the tail needs to move and follow the head IFF √(dx2+dy2) > √(2). If we get rid of the square root by squaring both sides, we are left with the condition dx2+dy2 > 2.
How does the tail need to move? Well, following dx
and dy
. If dx
is
positive, it moves one unit to the right, while if it's negative it moves one
unit to the left. If dx
is zero it means that either the tail is right on top
of the head, or that it's on its same horizontal line: either way, it does not
need to move along the x axis (left or right). The same reasoning applies to
dy
.
Following these conclusions, and using a set
to keep track of all
the different coordinates visited by the tail, we have the complete solution:
hx, hy = 0, 0
tx, ty = 0, 0
seen = {(0, 0)}
with open(...) as fin:
for line in fin:
direction, steps = line.split()
steps = int(steps)
for _ in range(steps):
deltax, deltay = DELTA[direction]
hx += deltax
hy += deltay
dx = hx - tx
dy = hy - ty
if dx**2 + dy**2 > 2:
if dx != 0:
tx += 1 if dx > 0 else -1
if dy != 0:
ty += 1 if dy > 0 else -1
seen.add((tx, ty))
The use of the conditional expression 1 if dx > 0 else -1
could be simplified. After all, what we are doing is merely taking the sign of
dx
. As it turns out though, Python does not have a built-in for the sign of a
number, apparently because the authors of the language could not
agree on what such a function would return in special cases (e.g.
0
or NaN
). We can however write our own sign()
function to take care of
this, and make it return 0
in case the given value is 0
.
# Intuitive implementation, the first one you'd probably think of
def sign(x):
if x == 0:
return 0
return 1 if x > 0 else -1
# More clever, branchless implementation:
def sign(x):
return (x > 0) - (x < 0)
And now we have:
# ...
if dx**2 + dy**2 > 2:
tx += sign(dx)
ty += sign(dy)
# ...
In any case, after the main loop is done, we have all the different coordinates
visited by the tail in our seen
set, and the solution is one len()
away:
different_tail_positions = len(seen)
print('Part 1:', different_tail_positions)
Now our rope becomes longer. It's composed of 10 pieces in total (including the head). The rules to follow are the same as before, but now every piece follows the one in front of it. The tail is the last piece, and we still need to count the number of different positions occupied by it.
We can re-write our solution to keep track of a list
of 2D coordinates instead
of just four variables. Furthermore, we can solve both parts together, since
part 1 fundamentally asks us to keep track of the second piece of the rope, and
part 2 asks us to keep track of the tenth:
rope = [(0, 0)] * 10
seen1 = {(0, 0)}
seen9 = {(0, 0)}
The only thing that actually changes is that the logic used to move the tail for part 1 now needs to be applied 9 times (once per piece following the head). Each time, any given piece follows the one in front of it. Instead of going through each step again, I'll just let the code speak for itself with the help of a few comments:
with open(...) as fin:
for line in fin:
direction, steps = line.split()
steps = int(steps)
for _ in range(steps):
hx, hy = rope[0]
dx, dy = DELTA[direction]
# Move the head
rope[0] = hx + dx, hy + dy
for i in range(9):
# Consider this piece as the "head"
hx, hy = rope[i]
# And the one after it as the "tail"
tx, ty = rope[i + 1]
# Do the exact same thing we did for part 1
dx, dy = hx - tx, hy - ty
if dx**2 + dy**2 > 2:
tx += sign(dx)
ty += sign(dy)
rope[i + 1] = (tx, ty)
# Keep track of both the second and the last pieces
seen1.add(tuple(rope[1]))
seen9.add(tuple(rope[9]))
answer1 = len(seen1)
answer2 = len(seen9)
print('Part 1:', answer1)
print('Part 2:', answer2)
Problem statement — Complete solution — Back to top
It wouldn't be Advent of Code without some kind of virtual machine to implement!
Today's puzzle requires us to implement a VM that only has one register X
and
knows two instructions: noop
and addx n
. The first one has a duration of
1 cycle and does nothing, while the second one has a duration of 2 cycles and
adds the immediate value n
to X
after the two cycles have passed.
We start at the first cycle, and need to emulate a list of instructions. Then,
starting from cycle 20 and every 40 cycles after that (60, 100, 140, ...) we
need to calculate the product of the current value of X
and the current cycle
number. Summing up all these products together will give us the answer for part
1.
Seems quite simple. The only pitfall is that the addx
instruction takes 2
cycles instead of 1, and one of those could be the one we are interested in
(where we need to compute the product). Not a real problem: whenever we
encounter an addx
we'll just need to repeat the cycle increment twice and
check its value in between these two increments.
We don't need to do much input parsing, let's just get everything into a list
with .readlines()
. We could also wrap our main loop
inside a with
statement as we did yesterday and read the file one line
at a time if we want.
with open(...) as fin:
program = fin.readlines()
Now we'll start a simple loop for instr in program
, incrementing a cycles
counter at the start of each iteration. We'll skip the noop
instructions
completely as they do nothing else, while for addx
we'll need to increment the
counter a second time and perform the addition on the X
register.
total = 0
cycle = 1
x = 1
for instr in program:
cycle += 1
if instr.startswith('addx'):
# Check cycle and calculate product
# Increment again and perform the operation
cycle += 1
x += int(instr[5:])
# Check cycle and calculate product
In between each increment of the cycle counter, we need to check whether it is one of the wanted values and perform the multiplication. The values we have are all of the form 40n + 20 with n ≥ 0, so a simple modulus operation will suffice:
for instr in program:
cycle += 1
if instr.startswith('addx'):
if cycle % 40 == 20:
total += cycle * x
cycle += 1
x += int(instr[5:])
if cycle % 40 == 20:
total += cycle * x
print('Part 1:', total)
We are now told that this weird VM has a CRT monitor, and that each cycle a pixel is drawn. Each line on the monitor consists of 40 pixels, so a new line of pixels begins every 40 pixels drawn. This means that the second line of the monitor starts at the 41st pixel, the third line at the 81st, and so on.
Whether the pixel drawn each cycle is lit or not depends on the value of the X
register, which indicates the position of the left side of a 3-pixel wide,
1-pixel tall sprite. Whenever the pixel currently being drawn falls within this
sprite, the pixel is lit.
After emulating the display, a 40-by-6 image should appear, representing 8 capital letters: our part 2 answer. This seems fun, as we actually need to "draw" these pixels in our terminal to see the answer.
We'll implement part 2 within the same loop used for part 1. The only two additional things we need to do each cycle are:
- Check whether we need to advance to the next line. This again can be easily done with a modulo operation since the beginning of each line will be at 40n + 1 cycles (with n ≥ 0).
- Check whether we need to draw a lit pixel or not. This means checking whether
the column of the current pixel to draw falls within the "sprite". A row is
40 pixels wide, so the column within the current row is simply the cycle
counter modulo 40, and the sprite position within the current row is
represented by the
X
register. This means that a lit pixel is drawn any time the cycle counter modulo 40 is withinX
andX + 2
(extremes included).
It really takes 10 times the amount of lines to explain it than to write it. To
emulate the CRT monitor we'll use a simple string, where each character is a
pixel. We'll use '#'
to represent lit pixels, ' '
(a space) for dark
pixels and we'll add a newline ('\n'
) each time a new row is needed.
Here's the code to add to the main loop we already wrote:
+crt = ''
for instr in program:
+ if x <= cycle % 40 <= x + 2:
+ crt += '#'
+ else:
+ crt += ' '
+
cycle += 1
if instr.startswith('addx'):
if cycle % 40 == 20:
total += cycle * x
+ elif cycle % 40 == 1:
+ crt += '\n'
+
+ if x <= cycle % 40 <= x + 2:
+ crt += '#'
+ else:
+ crt += ' '
cycle += 1
x += int(instr[5:])
if cycle % 40 == 20:
total += cycle * x
+ elif cycle % 40 == 1:
+ crt += '\n'
The 3-way comparison x <= y <= z
is equivalent to
x <= y and y <= z
, but only evaluates y
once, and only evaluates z
if
x <= y
is not satisfied. This is a nice little feature that makes bound
checking in Python much cooler than in other languages.
The addition of new pixels to the crt
variable can also be done using a
tristate operator:
crt += '#' if x <= cycle % 40 <= x + 2 else ' '
Now all that's left to do is print the solution. We'll tell
[print()
][py-builtin-print] to add an extra newline between the prompt and the
answer using sep='\n'
so that the contents of crt
are printed nicely:
print('Part 2:', crt, sep='\n')
The output will look something like this:
Part 2:
### ## ## #### # # # # # ####
# # # # # # # # # # # # #
### # # # ### ## # #### ###
# # #### # # # # # # # #
# # # # # # # # # # # # #
### # # ## #### # # #### # # #
And here we go, 20/50 stars!
Problem statement — Complete solution — Back to top
Today's problem is more of a reading comprehension challenge than a real problem, at least for part 1. We need to simulate some sort of game happening between a number of monkeys. Each monkey has some specific attributes that are listed in our input:
- A list of items (positive integers) it currently holds.
- An operation to perform when inspecting an item: this is either an addition or a multiplication, of which the first operand is always the item value.
- The second operand of the operation: either a fixed integer or the item value again.
- A divisor to check the item value against after inspecting it and performing the operation on it: if the calculated value is divisible by this number, the item is passed to a given monkey, otherwise to another.
- Two indices of the monkeys to pass items to, as explained in the previous point.
The game to simulate happens in rounds, and each round every monkey plays one turn, consisting of the following operations:
-
Inspect the first item in the list of items, applying the operation to its value.
-
Divide the result by 3.
-
Check whether the obtained value is divisible by the divisor.
- If so, pass the item (with the new value) to a given monkey (the index of the monkey to pass to is a property of the monkey itself).
- Otherwise, pass it to another given monkey.
After simulating 20 rounds, we need to find the two monkeys who inspected the most items and calculate the product of the total number of inspections they performed.
Since we are dealing with complex objects with different properties to remember,
let's write a class to represent a single monkey. This class will hold all the
properties defined above. We don't really need to pass a code review, so we'll
assign all attributes after creating the class. For now, let's just define its
attributes with a __slots__
class variable. This allows for
faster attribute lookup and less memory usage, but the class will only be able
to hold the properties defined in __slots__
(which is fine by us).
class Monkey:
__slots__ = (
'items', 'op', 'op_value', 'divisor',
'pass_if_true', 'pass_if_false', 'inspections'
)
Now, for the input parsing... this is going to be painful. Here's the description of a single monkey in our input:
Monkey 0:
Starting items: 98, 97, 98, 55, 56, 72
Operation: new = old * 13
Test: divisible by 11
If true: throw to monkey 4
If false: throw to monkey 7
All monkey descriptions are divided by an empty line, so we can initially
.read()
the whole input file and .split()
the resulting string on empty lines to obtain a list of "raw" monkey
description strings:
with open(...) as fin:
raw_monkeys = fin.read().split('\n\n')
Now we can iterate over the strings in raw_monkeys
, split them into single
lines through .splitlines()
, and parse the properties on
each line. Since every line contains different text, a
regular expression to extract integers on each line is probably
the simplest way to go. We can create one with the re
module, and then use its
.findall()
method to extract a list of matches from a
string.
import re
regexp = re.compile(r'\d+') # r is for "raw string"
>>> regexp.findall("Hello 123 World 456, today is December 11th, 2022")
['123', '456', '11', '2022']
The item list contains one or more integers, so we can keep the matches as a
list and transform them to int
with map()
. Since we will
be removing/adding items to/from this list a lot, a
deque
from the collections
module
is a better choice than list
, as it provides fast insertion and deletion from
either of its ends (while list
does not).
Other lines contain only one integer, so we can simply do [0]
on the obtained
list of matches. The operation to perform (e.g. new = old * 13
) can be either
an addition or a multiplication. We can import those two as functions from the
operator
module: operator.add()
and operator.mul()
.
The only "annoying" line is the third, which can either contain an integer (e.g.
Operation: new = old * 13
) or not (e.g. Operation: new = old + old
): for
this, we'll have to first check if we have any match, and if not, it means the
operation uses old
as operand on both sides.
from collections import deque
from operator import add, mul
monkeys = []
for raw_monkey in raw_monkeys:
lines = raw_monkey.splitlines()
matches = regexp.findall(lines[2])
m = Monkey()
m.items = deque(map(int, regexp.findall(lines[1])))
m.op = add if '+' in lines[2] else mul
m.op_value = int(matches[0]) if matches else None
m.divisor = int(regexp.findall(lines[3])[0])
m.pass_if_true = int(regexp.findall(lines[4])[0])
m.pass_if_false = int(regexp.findall(lines[5])[0])
m.inspections = 0
monkeys.append(m)
As promised, we initialized all attributes of each instance of Monkey
from
outside the class. You would normally do this by passing all the values to the
class constructor (i.e. its __init__()
method), but there
are a lot of them and I did not want to write a function taking so many
parameters.
Now that we have a list of objects to work with, let's actually simulate the
game. To make things simpler, let's implement the basic "inspection" operation
that a monkey performs as a method of the Monkey
class:
class Monkey:
# ... unchanged ...
def inspect(self):
# Remove the first item from the items we have (we'll pass it on to
# another monkey anyway)
item = self.items.popleft()
if self.op_value is None:
# Operation to perform is `old + old` or `old * old`
return self.op(item, item)
# Operation to perform is `old + VALUE` or `old * VALUE`
return self.op(item, self.op_value)
The rules of each round, as explained in the numbered list a few paragraphs above, are pretty straightforward. Let's write a function to play a given number of rounds in a row and return the answer we want.
After simulating all the requested rounds, we can extract the .inspections
attribute of each monkey class using map()
plus a lambda
, or
better map()
plus attrgetter()
. Then, the top two
values can be extracted after sorting in descending order with
sorted()
(not a good idea in general if we only need the
top two, but we have a very small list, so it's fine).
from operator import attrgetter
def simulate(monkeys, n_rounds):
for _ in range(n_rounds):
for m in monkeys:
# This monkey will inspect all of its items in this round
m.inspections += len(m.items)
while m.items:
# Pop the first item and apply the needed operation to it, then
# divide by 3
item = m.inspect() // 3
# Check if the new value is divisible by m.divisor and pass it
# on to the correct monkey by appending it to its items
if item % m.divisor == 0:
monkeys[m.pass_if_true].items.append(item)
else:
monkeys[m.pass_if_false].items.append(item)
# Get the two highest total number of inspected items
a, b = sorted(map(attrgetter('inspections'), monkeys), reverse=True)[:2]
return a * b
All that's left to do is call the function we just wrote:
answer = simulate(monkeys, 20)
print('Part 1:', answer)
We need to do the same thing we did for part 1, but the rules change slightly: we no longer need to perform the division by 3 after a monkey "inspects" an item, and we need to simulate 10000 (ten thousand) rounds. This is interesting.
The first requirement is not a problem, we can just remove the // 3
from the
code of our simulate()
function. The second requirement is a bit of a pain
though. We are dealing with additions and multiplications, this means that the
value of an item can only ever increase, and doing 10000 rounds means doing a
lot of additions and multiplications... we will never be able to work with such
large numbers, at least not in a reasonably fast way.
Apart from the multiplication and addition done by each monkey when "inspecting"
an item, the only other operation we perform is a divisibility check (item % m.divisor == 0
). In other words, at the end of the day, the only thing that
matters about a given item value, is whether it can be divided by the divisor of
any given monkey. This is what "drives" the game in different directions since
it's used to choose the next monkey to pass an item to. We need a way to keep a
value small enough to be able to work with it, while still being able to check
whether it is divisible by a given set of integers (the divisors of each
monkey).
How do we do this? With a bit of modular arithmetic:
-
Clearly, if there only was one monkey, instead of
item
we could keep the much smalleritem % m.divisor
around. After all, we don't care about the real value, only about its divisibility bym.divisor
. The divisibility would then be preserved since congruence to a fixed modulus is preserved by addition and multiplication, which are the only two operations we perform. In other words,(x + a) % d == ((x % d) + (a % d)) % d
, or in mathematical terms a + b ≡ (a mod d) + (b mod d) (mod d). -
In the case of two monkeys with divisors
d
andq
, in order to preserve the divisibility by both, we only need to keepitem % (d * q)
around. This is somewhat intuitive since a number divisible byd * q
will also always be divisible by bothd
andq
separately. We have(x % (d*q)) % d == x % d
for any value ofq
, or in mathematical terms: x mod dq ≡ a (mod d) ⇔ x ≡ a (mod d) for any positve integer q. -
In the general case of multiple different divisors, we only need to keep item values modulo the product of all the divisors. This product is small enough and is also pre-computable and fixed through the entire simulation (since divisors are fixed). To be precise, we actually only need the least common multiple of the divisors.
Knowing all of the above, we can get our part 2 solution almost for free with
only a couple of modifications to our simulate()
function:
-def simulate(monkeys, n_rounds):
+def simulate(monkeys, n_rounds, part2=False):
+ if part2:
+ modulus = lcm(*map(attrgetter('divisor'), monkeys))
+
for _ in range(n_rounds):
for m in monkeys:
m.inspections += len(m.items)
while m.items:
- item = m.inspect() // 3
+ if part2:
+ item = m.inspect() % modulus
+ else:
+ item = m.inspect() // 3
if item % m.divisor == 0:
monkeys[m.pass_if_true].items.append(item)
else:
monkeys[m.pass_if_false].items.append(item)
a, b = sorted(map(attrgetter('inspections'), monkeys), reverse=True)[:2]
return a * b
The lcm()
function (used above to compute the least common
multiple of all the .divisor
s) is available in the math
module from Python 3.9
onwards. If we are on a lower Python version, we can implement it ourselves
using gcd()
, available in the math
module from Python 3.5
onwards.
# Python >= 3.5: from math import gcd
def gcd(a, b):
while b:
a, b = b, a % b
return a
# Python >= 3.9: from math import lcm
def lcm(*integers):
it = iter(integers)
res = next(it)
for x in it:
res = res * x // gcd(res, x)
return res
One last thing before computing the answer: since we are modifying the .items
of each monkey while simulating, we need to keep a copy of the original values
in order to perform the simulation again. We can do this with
copy.deepcopy()
.
We finally have a complete solution:
from copy import deepcopy
original = deepcopy(monkeys)
answer1 = simulate(monkeys, 20)
print('Part 1:', answer1)
answer2 = simulate(original, 10000, True)
print('Part 2:', answer2)
Smooth!
Problem statement — Complete solution — Back to top
Pathfinding on a grid! I was missing that on my AoC bingo card
this year :'). We are given a grid of letters ranging from a
to z
. The grid
represents a topographic map, and each letter is a
different level of elevation: a
is lower than b
, which is lower than c
,
and so on until z
. This grid also contains two uppercase letters, S
and E
,
which are respectively the start cell and the destination to reach. The start is
at elevation a
, and the destination is at elevation z
.
We need to find the minimum number of steps in the four cardinal directions that are needed to reach the destination from the start, with the constraint that we can only move to cells that have an elevation not higher than the current elevation plus one.
If you've ever done any problem similar to this, you already know that the answer to "what's the minimum distance from A to B on a grid" is always breadth-first search (BFS) as the steps to perform are unitary. All we need to do is parse the input grid and create and perform an adequate BFS on the grid, taking into account the rules to move from one cell to another.
Let's start with the BFS implementation right off the bat. We'll be working with
coordinates of the form (row, column)
. Given a cell's coordinates, we need to
explore all the reachable cells using BFS, avoiding neighboring cells that are
too high in elevation, and stopping whenever we reach the coordinates of the
destination. Here's a commented function that implements BFS, using a
deque
as queue and math.inf
to
indicate that no solution is found:
from collections import deque
from math import inf as INFINITY
def bfs(grid, src, dst):
h, w = len(grid), len(grid[0]) # pre-calculate height and width for simplicity
queue = deque([(0, src)]) # queue of tuples (distance_from_src, coords)
visited = set()
# While there are coordinates to visit
while queue:
# Get the first one in the queue
distance, rc = queue.popleft()
# If we reached the desination, return the distance traveled so far
if rc == dst:
return distance
# Skip already vsited coordinates
if rc not in visited:
visited.add(rc)
# For each neighboring cell that wasn't already visited
for n in get_neighbors(grid, r, c, h, w):
if n in visited:
continue
# Add it to the back of the queue with a distance equal to the
# current one plus 1
queue.append((distance + 1, n))
return INFINITY
So far, this is just plain and simple BFS. The get_neighbors()
used above will
contain the actual logic to solve the problem, let's write it as a
generator function that will yield
all the coordinates of
reachable neighboring cells according to the elevation rules. We will assume to
be working with a grid
where each cell contains integers representing
elevation values. Given a position (r, c)
, the maximum elevation reachable
will therefore grid[r][c] + 1
, and the neighboring cells in the 4 cardinal
directions will be at (r+1, c)
, (r-1, c)
, (r, c+1)
and (r, c-1)
.
def get_neighbors(grid, r, c, h, w): # get h, w as parameters for simplicity
max_el = grid[r][c] + 1
for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
if 0 <= nr < h and 0 <= nc < w: # ensure we are within the grid
if grid[nr][nc] <= max_el: # ensure this neighbor is reachable
yield nr, nc
We are almost done. Let's parse the input now: since we are working with ASCII
characters, we can do what we did for day 8 part 1 and open the input
file in binary mode ('rb'
) to get a free conversion from characters to
elevation values. The lines can be extracted as bytes
with
.splitlines()
after reading the whole file, and we can
then transform each line into a list of int
with map()
(pretty easy since iterating over bytes
already yields int
s).
with open(..., 'rb') as fin:
lines = fin.read().splitlines()
grid = list(map(list, lines))
A couple of constants to represent the source, the destination and the highest/lowest elevations will help us make the code more human readable:
# These will all be ints since iterating bytes yields ints
START, END, LOWEST, HIGHEST = b'SEaz'
Now we need to find the source and destination coordinates. A simple scan of the
grid with a double for
loop gets the job done, and I will also throw a couple
of enumerate()
in there for ease. Since our
get_neighbors()
function looks at grid cells assuming they all represent
elevation values, after finding the source and the destination, we will edit the
grid replacing their value with the actual elevation.
src = dst = None
for r, row in enumerate(grid):
for c, cell in enumerate(row):
if cell == START:
src = r, c
grid[r][c] = LOWEST
elif cell == END:
dst = r, c
grid[r][c] = HIGHEST
if src and dst:
break
Perfect, we've got all we need. The answer is one bfs()
call away:
min_dist = bfs(grid, src, dst)
print('Part 1:', min_dist)
For part 2, we are told that all cells at elevation a
are valid starting
points: we need to find the minimum possible distance to travel to get to the
destination (E
) starting from any a
.
The instinctive thing to do here is to just call bfs()
in a loop passing the
grid coordinates of any a
cell as source, and keep the minimum returned value.
We can do better though. Since we are using BFS on a graph where arcs have all
the same weight (i.e. all steps we make are one unit of distance), we know for a
fact that whenever we visit a given cell for the first time, we have also
traveled the shortest possible distance to do so. This means that if we perform
BFS backward from the destination, we can simply stop at the first cell with
elevation a
that we encounter, and we'll have found the minimum distance from
any a
to the destination.
There's a catch though. If we want to move backward from the destination (E
),
we also need to write a different get_neighbors()
function, since now the
check to perform on the elevation is the opposite: we can only move to
neighboring cells that have an elevation higher than or equal to the current
cell's elevation plus one.
We still need the original get_neighbors()
function. Let's split it into three
different functions: a generic neighbors4()
function that only returns the
coordinates of any valid neighboring cell, a second one that checks elevation
constraints assuming we are moving forward, and a third one that checks elevation constraints
assuming we are moving backward.
def neighbors4(grid, r, c, h, w):
for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
if 0 <= nr < h and 0 <= nc < w:
yield nr, nc
def neighbors_forward(grid, r, c, h, w):
max_el = grid[r][c] + 1
for nr, nc in neighbors4(grid, r, c, h, w):
if grid[nr][nc] <= max_el:
yield nr, nc
def neighbors_backward(grid, r, c, h, w):
min_el = grid[r][c] - 1
for nr, nc in neighbors4(grid, r, c, h, w):
if grid[nr][nc] >= min_el:
yield nr, nc
We can further simplify the last two functions using a filtering
generator expression of the form
x for x in iterable if <condition on x>
, which will only yield items that
satisfy a certain condition. Here it is:
def neighbors_forward(grid, r, c, h, w):
max_el = grid[r][c] + 1
neigh = neighbors4(grid, r, c, h, w)
yield from ((nr, nc) for nr, nc in neigh if grid[nr][nc] <= max_el)
def neighbors_backward(grid, r, c, h, w):
min_el = grid[r][c] - 1
neigh = neighbors4(grid, r, c, h, w)
yield from ((nr, nc) for nr, nc in neigh if grid[nr][nc] >= min_el)
The BFS code we wrote for part 1 only needs a couple of modifications to also work for part 2:
- Since we want to use two different functions to get cell neighbors, we can accept a function as an additional parameter.
- Since for part 2 we also want to stop whenever we visit a cell containing a
given value, we can alter the condition used to
return
to also check whethergrid[r][c] == dst
, and then pass the destination coordinates assrc
, and the valueLOWEST
asdst
. This won't break part 1 sincegrid[r][c] == dst
will never match whendst
is a tuple of coordinates.
-def bfs(grid, src, dst):
+def bfs(grid, src, dst, get_neighbors):
h, w = len(grid), len(grid[0])
queue = deque([(0, src)])
visited = set()
while queue:
distance, rc = queue.popleft()
- if rc == dst:
+ r, c = rc
+ if rc == dst or grid[r][c] == dst:
return distance
if rc not in visited:
visited.add(rc)
for n in get_neighbors(grid, r, c, h, w):
if n in visited:
continue
queue.append((distance + 1, n))
return INFINITY
And here we go, we now have the solution for both parts:
min_dist = bfs(grid, src, dst, neighbors_forward)
print('Part 1:', min_dist)
min_dist_from_any_a = bfs(grid, dst, LOWEST, neighbors_backward)
print('Part 2:', min_dist_from_any_a)
Problem statement — Complete solution — Back to top
Today we're dealing with nested lists of integers, called "packets" in the problem statement. We are given a list of packet pairs, and we need to compare the two packets in each pair to determine if they are in the correct order or not. In other words, we need to establish an order for all the given pairs.
The order of a pair of packets is based on some arbitrarily given rules that need to be applied recursively:
-
If both values are integers, the lower integer should come first. If they are equal, we don't know which one should come first.
-
If both values are lists, check the order between each pair of items in the lists. The first pair of items for which the order can be determined also determines the order of the two lists. Whichever list runs out of items before a decision can be made should come first. If instead no order can be determined for any of the pairs, and both lists have the same length, we don't know which list should come first.
-
If one of the values is a list and the other is an integer, transform the integer into a 1-item list (containing only said integer), and re-perform the check.
We need to find the sum of the (1-based) indices of the pairs of packets that are in the right order.
Let's start with an example to understand the rules. Let's say we have the following pair of packets:
[[1],[2,3,4],9]
[[1],4,0]
These are the checks that would be performed, and their outcome, according to the above rules:
- Compare
[[1],[2,3,4],9]
with[[1],4,0]
. Both are lists, so compare their items pairwise:- Compare
[1]
with[1]
:- Compare
1
with1
:- They are equal, order is unknown.
- Compare
- Compare
[2,3,4]
with4
. One is a list and one is an integer. Transform4
into[4]
and retry:- Compare
[2,3,4]
with[4]
:- Compare
2
with4
:2
is less than4
so2
should come first, and indeed it does. This means that the original packets were in the right order. Stop.
- Compare
- Compare
- Compare
As can be seen from the above example, Whenever we are able to make a decision
about the order of two items, at any nesting level, we should stop and
"propagate" that decision back to the original packets. It does not matter
what's in the rest of the packets. In fact, the two packets above also contained
a 9
and a 0
, which were never compared because we could determine the order
before getting to them.
As yesterday, we'll use a bottom-up approach. Let's implement a recursive
function to compare two packets. We'll assume to have two parameters a
and
b
, both of which can either be list
or int
, and apply the rules to
determine which one should come first. For simplicity, we'll make the function
return an integer: negative if a
should come before b
, positive if b
should come before a
, and zero otherwise. This is also known as a "cmp"
function (which actually was a built-in function in Python 2).
-
In case both parameters are
int
, we can simply returnb - a
. In fact,a - b
will be negative ifa < b
, positive ifa > b
and zero ifa == b
. -
In case only one is an
int
, we'll wrap it into a list and make a recursive call to re-perform the comparison. -
In case both parameters are
list
, to compare their items pairwise we can iterate over the lists withzip(a, b)
and make a recursive call for each pair of items. If any of the recursive calls we make is conclusive (i.e. the result is not0
), we are done. Otherwise, we'll keep going.If we run out of items to compare, we'll need to check the length of the two lists instead: the shorter one should come first (same logic as the one for two integers), so we can just return
len(a) - len(b)
.
To check whether a variable is an int
, we can either use
type(x)
or isinstance(x, int)
).
I'll go with the former.
def compare(a, b):
a_is_int = type(a) is int
b_is_int = type(b) is int
# Both integers.
if a_is_int and b_is_int:
return a - b
# One is an integer, wrap it into a list and re-check.
if a_is_int != b_is_int:
if a_is_int:
return compare([a], b)
else:
return compare(a, [b])
# Both lists, check their items pairwise, stop at the first conclusive result.
for x, y in zip(a, b):
res = compare(x, y)
if res != 0:
return res
# We ran out of items before coming to a conclusion. Now what matters is
# which list was longer (according to rule number 2).
return len(a) - len(b)
Okay, now to the input parsing. The first thing to do is read each packet in the
input as a string. We can skip empty lines delimiting packet pairs either
iterating over the input line by line or reading it all at once and using
.replace()
. Then, .splitlines()
will
give us a list of raw packets (strings) to later parse.
with open(...) as fin:
lines = fin.read().replace('\n\n', '\n').splitlines()
We chose to read all packets at once into a list in order to easily parse them
all at once with a single call to map()
, given a function
able to parse one packet.
As you probably already noticed, the packets are valid Python lists, meaning
that if we pasted them into a Python interpreter they would evaluate to actual
lists. There's a function that can do exactly this: eval()
.
It must be noted however that using eval()
for input parsing is in general a
really bad idea, as it can evaluate arbitrary Python code and therefore do
unexpected things. If the input comes from an untrusted source, we should never
eval()
it before making sure it's safe to do so, and we should avoid eval()
in general.
# Example of "unexpected things"
eval('__import__("os").system("arbitrary shell commands here")')
As it turns out though, our packets are also valid JSON objects,
and we have json.loads()
that parses a string representing a
JSON object into a Python object. JSON arrays ([...]
) are turned into list
and integers into int
, so this function is all we need to parse a packet
without worrying about evaluating random Python code. We could write an actual
parsing function ourselves without the use of eval()
or json.loads()
, but...
I am honestly too lazy for that, at least today. Also, why do it in 10 lines of
code when you can do it in one (said no sane programmer ever)?
Anyway... here goes nothing:
from json import loads
packets = list(map(loads, lines))
Now we can group our packets in pairs:
pairs = []
for i in range(0, len(packets), 2):
pairs.append(packets[i:i + 2])
Which can be simplified to a single generator expression:
pairs = (packets[i:i + 2] for i in range(0, len(packets), 2))
Note that the above does not actually create a list or tuple, but rather a
generator that will yield
pairs on the fly, and which is thus only iterable
once, which is enough for us.
Now we can finally iterate over the pairs and compare()
them. Since we need to
provide the sum of 1-based indices of correctly ordered pairs as the answer,
we'll also use enumearte()
.
answer = 0
for i, (a, b) in enumerate(packets, 1):
if compare(a, b) < 0: # a should come before b
answer += i
print('Part 1:', answer)
All the above loop is doing is calculating a sum, so we can also re-write the
entire thing using sum()
plus a filtered generator
expression (x for x in ... if <condition>
):
answer = sum(i for i, (a, b) in enumerate(pairs, 1) if compare(a, b) > 0)
print('Part 1:', answer)
Now that we have a "primitive" to establish the order between two packets, why
not use it to sort all the packets? That's exactly what we are asked to do,
after adding to our packets two "divider" packets: [[2]]
and [[6]]
. Once
sorted, we need to find the product of the two 1-based indices of the divider
packets.
Well. This is easy. We already have a comparison function, and we have
list.sort()
. The only hiccup is that list.sort()
does not
take a comparison function... it used to in Python 2 with the cmp=
argument,
but in Python 3 that was replaced with key=
, which only takes a single value
and is expected to somehow transform it.
Transforming a cmp
function into a key
function and vice-versa is not really
straightforward. Thankfully, Python developers thought about this for us:
functools.cmp_to_key()
is a function that takes a
cmp
-style function (taking two arguments and returning an integer) and
magically transforms it into a key
-style function (taking one argument and
returning an arbitrary object). The way the conversion is performed is actually
quite interesting and clever, you can read about it in this Stack Overflow
post if curious.
Okay, let's do this:
packets.extend(([[2]], [[6]]))
packets.sort(key=cmp_to_key(compare))
Now we can use .index()
to find the indices of the dividers
and multiply them together. We also know that the second divider ([[6]]
) must
come after the first as per the ordering rules, so we can pass a second
argument to .index()
(the index to start from) to skip already checked items.
answer = packets.index([[2]]) + 1
answer *= packets.index([[6]], answer) + 1
print('Part 2:', answer)
Like a walk in the park!
Problem statement — Complete solution — Back to top
Today we are dealing with a cave that fills with sand. As the problem statement remings us, a similar (although more intricate) concept was explored in 2018 day 17 (of which unfortunately I did not write a walkthrough for... yet).
The cave we are in is a 2D grid, and our input is a list of segments in the form
ax,ay -> bx,by -> ...
, which represent sequances of vertical and horizontal
segments of cave cells occupied by rocks. Units of sand, each occupying one grid
cell, start pouring into the cave one at a time from x=500,y=0
(y=0
is the
top) and fall down the cave obeying four simple rules:
- If the cell immediately below is free, the unit of sand falls one step straight down.
- If instead that cell is occupied, it falls one step diagonally, one cell down and one to the left.
- If instead that cell is occupied, it falls one step diagonally, one cell down and one to the right.
- Otherwise (the three cells under the unit of sand are all occupied), it stops falling and settles down where it is.
Our cave is infinitely deep (y
does not have an upper bound), but there is a
limited number of cells occupied by rocks (our input) forming platforms that
sand can rest on. At some point though, sand will inevitably fill all the
platforms it is able to reach, at which point the next unit that is poured will
start falling down indefinitely into the void. We want to know how many units
of sand will fall and settle down before this happens.
To make the above rules a little bit clearer, here's a simple diagram:
...s...
..213..
A unit of sand (s
above) first tries to fall to the cell labeled 1
; if
that's occupied it tries 2
; if that's also occupied it tries 3
. If all of them
are occupied, it settles down where it is. In other words, a unit of sand can
only settle down above the middle of a 3-cell platform (whether it is sand,
rock, or a mix of the two). More examples that can help better understand the
problem are provided in the problem statement, so I won't go
deeper with the explanation and dive directly into the solution.
It's tempting to create a matrix as a list
of list
to represent our 2D cave.
However, we don't know its size before parsing the entire input, and even then,
it could potentially need expansion due to the sand being poured in the cave
(spoiler, we'll be dealing with a lot more sand in part 2). Instead, we'll use a
set
of points (i.e. tuples of the form (x, y)
), which is a lot simpler to
manage: if a point is in the set, that cave cell will be considered occupied.
cave = set() # Any point (x, y) in this set is occupied (by either a rock or a sand unit)
Let's parse the input. Each line consists of a list of 2D coordinates separated
by ' -> '
, and each point has its x and y components separated by a comma.
Parsing a single line is therefore only a matter of a couple of
.split()
operations:
fin = open(...)
for line in fin:
points = []
for p in line.split(' -> '):
x, y = p.split(',')
points.append(int(x), int(y))
The internal loop above can be further simplified with the use of
map()
and a couple lambda
functions:
for line in fin:
points = line.split(' -> ')
points = map(lambda p: p.split(','), points)
points = map(lambda p: (int(p[0]), int(p[1])), points)
Note that we did not actually need to create a list
of points for each parsed
line, we can keep the generator returned by map()
as we don't need to iterate
on the points more than once.
Each pair ax,ay -> bx,by
in the lists of points we get on each line indicates
a segment of cells occupied by rocks. These segments are always either vertical
or horizontal, never diagonal. In the first case all the cells with x == ax
(== bx
) and y
from ay
to by
are occupied by rocks, while in the second
case all the cells with y == ay
(== by
) and x
from ax
to bx
are
occupied with rocks. Therefore, to add rocks to our cave, we need to do
something like this for each pair of points on each line:
if ax == bx: # vertical segment
for x in range(ax, bx + 1):
cave.add((x, ay))
else: # horizontal segment
for y in range(ay, by + 1):
cave.add((ax, y))
There's a problem though: we have no guarantee on the order of the coordinates,
so we don't know if ax < bx
or ay < by
. We need to distinguish between the
two cases and iterate over a different range()
accordingly. Let's write a function to do that:
def autorange(start, end):
if start > end:
return range(start, end - 1, -1)
return range(start, end + 1)
# Or, alternatively (this range always does steps of +1)
def autorange(start, end):
return range(min(start, end), max(start, end) + 1)
For simplicity, we can now also create a range2d()
generator function that takes two points and uses
autorange()
to yield
all the points of the horizontal or vertical segment
connecting them:
def range2d(a, b):
ax, ay = a
bx, by = b
if ax == bx:
# yield all points in the vertical segment
for y in autorange(ay, by):
yield ax, y
else:
# yield all points in the horizontal segment
for x in autorange(ax, bx):
yield x, ay
Now we can add rocks to our cave. To iterate on points pairwise we'll just have
two variables prev
and cur
, and initialize prev
with the first point
yielded returned by the map()
generator using next()
.
The .update()
method of set
can take any iterable and add
all its elements to the set, which means we can avoid explicitly looping over
the points yielded by range2d(prev, cur)
.
for line in fin:
points = line.split(' -> ')
points = map(lambda p: p.split(','), points)
points = map(lambda p: (int(p[0]), int(p[1])), points)
prev = next(points)
for cur in points:
cave.update(range2d(cur, prev))
prev = cur
In Python >= 3.10, we have itertools.pairwise()
that
can be used to achieve the same thing:
for line in fin:
# ...
for prev, cur in pairwise(points):
cave.update(range2d(cur, prev))
Or, even more concisely, with the help of
itertools.starmap()
to unpack the pairs:
for line in fin:
# ...
cave.update(*starmap(range2d, pairwise(points)))
Before we can simulate the sand pouring into the cave, we know that at some
point sand will fill all the rock platforms in the cave and start falling down
to oblivion, we need to detect this in order to avoid ending up in an infinite
loop. We can do this by pre-calculating the y
coordinate of the deepest rock
in the cave and making sure to never exceed it. This means iterating over all
the points initially in the cave
set and finding the max y
, which we can do
in one line with a simple max()
plus map()
+ lambda
,
or better map()
+ itemgetter()
.
from operator import itemgetter
maxy = max(map(itemgetter(1), cave))
Good, now let's start pouring some sand into this cave. The rules are quite
simple, it's only a matter of following them. Let's write a function to pour a
single unit of sand down the cave from the source ((500, 0)
), adding the final
coordinates to our cave
set if the unit settles down. Since we'll need to call
this function multiple times in a loop, let's also make it return True
if the
sand unit settles and False
otherwise, so we'll know when to stop pouring.
def pour_sand(cave, maxy):
x, y = 500, 0
while y < maxy:
# Can this unit of sand fall directly down?
if (x, y + 1) not in cave:
y = y + 1
continue
# Can it fall one unit down and one to the left (diagonally)?
if (x - 1, y + 1) not in cave:
x, y = x - 1, y + 1
continue
# Can it fall one unit down and one to the right (diagonally)?
if (x + 1, y + 1) not in cave:
x, y = x + 1, y + 1
continue
# This unit of sand cannot fall anymore, so it settles.
cave.add((x, y))
return True
# This unit will keep falling down the infinitely deep cave without settling
return False
We are doing quite a number of repeated calculations with those x
and y
coordinates. Since we are performing the same fundamental operation
(if ... in cave
) 3 times each iteration of the while
loop, we can simplify
things with a for
:
def pour_sand(cave, maxy):
x, y = 500, 0
while y < maxy:
newy = y + 1
for newx in (x, x - 1, x + 1):
if (newx, newy) not in cave:
x, y = newx, newy
break
else:
cave.add((x, y))
return True
return False
Yep, in Python for ... else
is a thing. The else
block is
only entered if no break
is performed during the loop, meaning that our sand
unit could not continue falling down. All we need to do now is call
pour_sand()
repeatedly until it returns False
, counting the number of sand
units poured.
sand = 0
while pour_sand(cave, maxy):
sand += 1
print('Part 1:', sand)
Now we also need to account for sand that overflows the rock platforms (after
filling them all). The cave does actually have a bottom: exactly two units under
the lowest piece of rock lies a flat horizontal and infinitely wide surface. Any
sand unit that previously overflowed will now settle and pile up because of this
horizontal floor. At some point, the pile of sand will become so high that it
will obstruct the source (at 500,0
). We want to know exactly how many units of
sand will it take for this to happen.
There are a few ways to solve this second part:
-
The simplest: keep doing what we did for part 1 and pour a single unit of sand, this time accounting for the cave floor, until some unit settles right at
500,0
. -
Using breadth-first search. We can in fact find the solution in a much faster way using BFS without having to simulate every single sand unit. If we stop and think about it for a moment, the problem we are facing now is merely identifying all grid cells that can be reached by sand units starting from the source. A simple BFS on the entire cave, starting from the source and exploring for each cell the three cells below it, will effectively explore all the cells reachable by sand.
-
Using depth-first search. Analogously, DFS is just another way of exploring all the cells reachable by sand in the cave. However, while BFS is only useful to solve part 2 due to the order in which cave cells would be explored, DFS could also be used to solve part 1. This is because DFS allows us to more closely emulate the behavior of the falling sand units, with the opportunity to stop and backtrack whenever a unit overflows and falls under the lowest rock (
maxy
we calculated in part 1).In fact, the
pour_sand()
function we wrote for part 1 is exploring a single path to an unique final grid cell in a depth-first manner every time it is called. If the path until a certain sand unit settles is 100 steps long, calling the function a second time means re-traveling along 99 of those steps! To avoid this, each time a unit of sand settles, we should in theory be able to backtrack going back to all the previous positios in the path and find the first one where we can either settle another unit of sand or from which we can start exploring a different sub-path.
Admittedly, I was tired enough this morning (AoC happens at 6 AM in Italy) to
just choose option 1 above and call it a day. However, as it turns out,
posting your solution to Reddit saying you know there's
some optimization to make but you are lazy to implement it, actually attracts
other smart and helpful programmers that at the end of the day convince you to
optimize your solution. I'm going to go with option number 3 above, scrapping
the pour_sand()
function we wrote for part 1 in favor of a simpler, concise
recursive DFS implementation.
Now that we are familiar with the problem, the logic should be quite clear in our mind. DFS is very easy to implement recursively, and the Wikipedia page linked above has an example implementation that looks a lot like Python code.
There is however a small difference between the "standard" DFS algorithm and the
one we need to write now: normally a "node" would be marked as visited as soon
as it is reached (i.e. as the first operation in the function), before visiting
its neighbors. In our case though, "visiting" a given cell does not
necessarily mean settling a unit of sand there! Whether or not that's the case
can only be decided after making sure that the three cells under it are
occupied. This means that marking a unit of sand as settled (adding its
coordinates to our grid
) is in fact the last thing we'll do before returning
from the function, after first recursively exploring the 3 cells immediately
below.
Let's first implement the function with only part 1 in mind. If the above
paragraph is clear, there isn't much more to say. Our new pour_sand()
function
still directly adds coordinates to cave
, returning True
when a unit settles
at the given coordinates, and False
otherwise.
def pour_sand(cave, maxy, x=500, y=0):
# We are overflowing to the bottom of the cave!
# This unit of sand will not settle anywhere, stop.
if y == maxy:
return False
newy = y + 1
# Check the 3 cells below this one: straight down, down & left, down & right
for newx in (x, x - 1, x + 1):
# If this cell (newx, newy) is not already occupied...
if (newx, newy) not in cave:
# ... will another unit of sand settle here?
if not pour_sand(cave, maxy, newx, newy):
# If not, the current unit cannot settle at (x, y).
return False
# All three cells below this unit of sand are occupied (otherwise we would
# have already returned False), so it can settle here.
cave.add((x, y))
return True
Now we can modify the function to also take into account the bottom of the cave
for part 2. We know the cave floor is a horizontal line at depth maxy + 2
,
which means that sand will only ever settle at most at maxy + 1
. The
addition of a boolean floor
parameter indicating whether the cave floor is
present or not is enough to adapt our function to solve both parts:
- In case
floor=False
(part 1), we need to give up ify == maxy
, which is exactly the check we are making at the beginning of the function above. - In case
floor=True
(part 2), we can actually exceedmaxy
, so we can skip the initial check. Additionally, since we know that every single cell at depthmaxy + 2
will be occupied, if we find ourselves at somey > maxy
, we already know that the current unit will settle here (the only value ofy
higher thanmaxy
that we can get ismaxy + 1
, as we'll mark settled any unit that gets there.
Here's the new version of the function:
def pour_sand(cave, maxy, floor=False, x=500, y=0):
# For part 2 (floor=True), this check is not needed
if not floor and y == maxy:
return False
# For part 1 (floor=False) we always need to do a recursive check of the
# three cells below us. However, for part 2 (floor=True) we only need to do
# this when y <= maxy. At y = maxy + 1 we can skip the check.
if not floor or y <= maxy:
newy = y + 1
for newx in (x, x - 1, x + 1):
if (newx, newy) not in cave:
if not pour_sand(cave, maxy, floor, newx, newy):
return False
cave.add((x, y))
return True
Now we can solve both parts with single call to pour_sand()
. To know the
number of settled sand units we can simply calculate the size of the cave
set
before pouring the sand (when it only includes rocks) and after.
rocks = len(cave)
pour_sand(cave, maxy)
sand = len(cave) - rocks
print('Part 1:', sand)
pour_sand(cave, maxy, True)
sand = len(cave) - rocks
print('Part 2:', sand)
This was a fun one! Let's continue our journey...
Problem statement — Complete solution — Back to top
Hmm bacon, yummy! Err, nope, sorry, it's beacons we have to deal with today.
We are in 2D space, dealing with integer coordinates only, and measuring
distances using Manhattan geometry. We are given a list of
sensors as input: each sensor sits at a given x,y
point in space, and knows
the position of its closest beacon, which is also given.
For each sensor in our input, there cannot be any other beacons at a Manhattan
distance lower than the distance of the closest one, however, there could very
well be beacons farther away. Using this observation, we need to determine how
many integer coordinates with y=2000000
are invalid beacon positions.
Let's start by parsing the input. Each line is a rather long string from which
we should only extract four integers. Similarly to what we did for
day 11, to make parsing simple, we can use the re
module and a
regular expression that matches digits optionally preceded by a
minus sign, transforming each match into an integer with
map()
.
We do not really care about the actual position of any beacon, but rather about the distance from the sensor that detected them, so we'll just calculate the Manhattan distance right away and save it for later.
import re
def manhattan(ax, ay, bx, by):
return abs(ax - bx) + abs(ay - by)
exp = re.compile(r'-?\d+')
sensors = []
with open(...) as fin:
for line in fin:
sx, sy, bx, by = map(int, exp.findall(line))
sensors.append((sx, sy, manhattan(sx, sy, bx, by)))
The Manhattan distance we just calculated for each of the sensor we are dealing
is effectively their detection radius. Since we know the coordinates of each
sensor and the ones of its closest beacon, we also know that there must be no
other beacon at a distance lower than this detection radius. Since we are
working with Manhattan geometry, the area of detection will not be shaped like a
circle, but rather like a diamond (or well, a square rotated 45 degrees). For
example, a sensor at 3,0
that can see the closest beacon at 4,2
will have a
detection radius of 3
:
3 #
2 ##B
1 #####
Y 0 ###S###
-1 #####
-2 ###
-3 #
0123456
X
In the above example, apart from the point marked as B
, we can be sure there
is no other beacon within the detection radius of the sensor, meaning no beacon
can be in any of the points marked as #
.
We are only interested in a single line at y=2000000
though, and we want to
know how many spots are impossible positions for beacons. We can visualize this
as follows:
. .
... ...
..... .....
....... . . ...
. ......... ... ... .
... ....... ..... .....
..... ..... ....... .......
y=200000: ####### ### ##### #########
..... . ... ...........
... . . .........
. ... .......
..... .....
... ...
. .
Out of all the diamonds "projected" by each sensor, some of them will intersect
with the y=200000
line, and some will not. For those that do, we can easily
calculate the x
coordinate of the start and end of the intersection. If the
y
coordinate of the sensor is exactly 200000
, the length is exactly double
the intersection radius: from sx - r
to sx + r
. Otherwise, it will decrease
linearly with the distance from the y=200000
line on both sides, and we'll
have a segment from sx - r + dist
to sx + r - dist
.
Let's write a function to detect whether the diamond projected by a given sensor
intersects with y=200000
and if so, also calculate its extremes:
def diamond_segment(sx, sy, r):
dist = abs(2000000 - sy)
if dist <= r:
return (sx - r + dist, sx + r - dist)
return None
We can now call the above function for all sensors, and collect the results that
are not None
. This will give us a "raw" list of segments of the form
(start, end)
where we know no beacon can be present. We can use
itertools.starmap()
to unpack the arguments correctly
(as we parsed the sensors into tuples of 3 elements). Then, we can filter those
that are None
using the filter()
built-in.
from itertools import starmap
segments = list(filter(None, starmap(diamond_segment, sensors)))
Cool. All that's left to do is sum up all the segment lengths, right? Wrong! There's a small problem. We could be dealing with overlapping segments. Consider for example the following scenario:
.
...
.....
. .......
...........
...........
y=200000: ######X####
..... ...
... .
.
In the above example, simply summing up the length of each segment, we would be
over-counting the point marked as X
because the segments from two diamonds
overlap there.
To avoid such a situation, we can detect and merge overlapping and contiguous segments into larger ones. We can do this by sorting the segments by their start and then iterating over them, "accumulating" them into a larger segment until we find a segment that is detached (i.e., it has start after the end of the current merged segment).
Sorting is just a matter of calling the sorted()
built-in. Then we can get an iterator over the sorted sequence with
iter()
, use next()
to extract the
first segment, and finally iterate over the rest to join them. Each time a new
"detached" segment is encountered, we can add the length of the current merged
segment to the total, update the current start and end, and keep going. Let's
write a function to do all of this.
def invalid_spots(sensors):
segments = iter(sorted(filter(None, starmap(diamond_segment, sensors))))
start, end = next(segments) # Current "merged" segment
total = 0
for s, e in segments:
if s > end + 1:
# This segment is detached from the current one: add current length
# to the total and use this as new initial segment to merge into.
total += end - start + 1
start, end = s, e
else:
# This segment overlaps with the current one: keep its end and keep going.
end = max(end, e)
# Account for the last segment
total += end - start + 1
return total
All that's left to do is call the above function!
answer = invalid_spots(sensors)
print('Part 1:', answer)
But wait, what? The returned value seems wrong! Indeed, there is still one
thing we did not consider. If there already is any beacon detected on the
y=200000
line, we should not count its spot as invalid. We could simply
subtract the number of beacons with y=200000
from the total, but this won't
work, as multiple sensors could be detecting the same beacon, and we would be
over-counting them. Instead, we can use a set
of x
coordinates to
exclude for all the beacons that are on the target line.
Let's modify the initial input-parsing code to also create this set
:
exp = re.compile(r'-?\d+')
sensors = []
+beacons_on_target = set()
with advent.get_input() as fin:
for line in fin:
sx, sy, bx, by = map(int, exp.findall(line))
sensors.append((sx, sy, manhattan(sx, sy, bx, by)))
+
+ if by == 2000000:
+ beacons_on_target.add(bx)
And we can finally calculate the correct answer:
answer = invalid_spots(sensors) - len(beacons_on_target)
print('Part 1:', answer)
For part 2 our task becomes harder: we are told there is exactly one valid
spot for a beacon with X and Y coordinates between 0 and 4000000, and we
need to find it. Our answer will then be 4000000 * x + y
.
Technically, we could implement a very simple soluion by modifying the
diamond_segment()
function we wrote for part 1 and search from y=0
to
y=4000000
until we find a line where we have exactly 4000000
invalid points.
However, that's going to be just a little bit slow... so we better think of
another approach.
Given what we discussed above for part 1, it should be clear enough that wherever the magical spot we are looking for might be, it must be outside the detection diamonds of all our sensors. There are however a few additional observations we can make:
- Since there is only one such spot, this also means that it will be surrounded by invalid spots belonging to some detection diamonds.
- Therefore, we can deduce that the spot we are looking for must be "touching" the border of one or more detection diamonds.
- Therefore, such a spot must be located at a distance of
radius + 1
from more than one diamond.
Given the above observations, we can use the following strategy:
- For each detection diamond, find the linear equation of the four lines that
are just outside its perimeter, at
radius + 1
(exactly one more unit away than the real radius). These will be 4: one per side. - For every possible pair of diamonds, try intersecting those lines. If there is any intersection, that point is a candidate for the magic spot we are looking for.
- Test any such candidate to check whether it is within the [0, 4000000] range for both X and Y: the first one that does is the one we want.
The reason I am talking about linear equation above is
that those are what is usually needed to find an intersection between two lines
in space. For a given "diamond", the linear equations for its four sides can be
easily calculated with the help of some pen and paper. Assuming that the center
(i.e. the sensor) is at 0,0
and the radius is r
, we have 4 lines that form a
45-degree angle with both axes, with pretty simple equations.
Here's an ASCII-art to illustrate this, with sides labeled A
, B
, C
, D
in
clockwise order for reference:
y = x + r + 1 A/\B y = -x + r + 1
/ \
\ /
y = -x - (r + 1) D\/C y = x - (r + 1)
For a sensor at an arbitrary sx,sy
position, we only need to apply a
linear translation.
y - sy = x - sx + r + 1 A/\B y - sy = -(x - sx) + r + 1
/ \
\ /
y - sy = -(x - sx) - (r + 1) D\/C y - sy = x - sx - (r + 1)
Now, how do we plan on "storing" these equations? Let's look at them more closely:
- Side A: y = x - sx + sy + r + 1
- Side B: y = -x + sx + sy + r + 1
- Side C: y = x - sx + sy - r - 1
- Side D: y = -x + sx + sy - r - 1
The parts we care about are those following the x variable in the above
equations. They are constants that depend on the sensor position (sx,sy
) and
its detection radius. We can easily calculate them for every sensor. Let's write
a generator function to do this:
def sides(sx, sy, r):
r += 1
yield -sx + sy + r # a/\b a: y = x - sx + sy + r + 1
yield sx + sy + r # / \ b: y = -x + sx + sy + r + 1
yield -sx + sy - r # \ / c: y = x - sx + sy - r - 1
yield sx + sy - r # d\/c d: y = -x + sx + sy - r - 1
Now let's say we want to intersect the A
side of a diamond with the B
side
of a second one, we will have the following equations (where the constants
a1 and b2 have already been calculated using the above
function):
- Side A of diamond 1: y = x + a1
- Side B of diamond 2: y = -x + b2
The intersection can be calculated as follows:
- x + a1 = -x + b2
- 2x = b2 - a1
- x = (b2 - a1) / 2
- y = x + a1
We can now write a function to calculate the intersection given the two linear equation constants:
def intersect(a, b):
x = (b - a) // 2
return x, x + a
Using the above function, given any pair of diamonds, we can now calculate all
the possible intersections between pairs of their sides (or well, actually the
sides at radius + 1
). We saw side A
intersected with side B
above. The
situation is analogous if we want to intersect side A
and D
, side C
and
B
, or side C
and D
: the only thing that changes is the order of the
operands in the subtraction. Additionally, it does not make sense to intersect
sides that are parallel to each other (e.g., A
and C
) because such an
intersection will either give infinite solutions (the two lines coincide) or no
solutions at all.
Let's write another generator function for this. It will take two sensors, extract the linear equation constants for the sides of their detection diamonds, and calculate all possible candidate points as intersections of perpendicular sides:
def candidates(s1, s2):
a1, b1, c1, d1 = sides(*s1)
a2, b2, c2, d2 = sides(*s2)
yield intersect(a1, b2)
yield intersect(a1, d2)
yield intersect(c1, b2)
yield intersect(c1, d2)
yield intersect(a2, b1)
yield intersect(a2, d1)
yield intersect(c2, b1)
yield intersect(c2, d1)
We have almost all we need to solve the problem. The final function we'll write
will find the magic spot for the missing (undetected) beacon by computing all
possible candidate points as intersections between parallel sides of any pair of
diamonds and checking their validity. To check the validity of a candidate
point, we need to ensure its coordinates are within 0
and 4000000
, and that
it is outside of all the detection diamonds of our sensors.
We'll use itertools.combinations()
to iterate
over unique pairs of sensors. If a candidate point has coordinate within range,
we'll check that the manhattan distance between the point and any sensor is
greater than its detection radius, which can be done with
all()
plus a generator expression. The
first candidate satisfying these conditions is the magic beacon spot we are
looking for.
from itertools import combinations
def missing_beacon(sensors):
# For all unique pairs of different sensors
for s1, s2 in combinations(sensors, 2):
# Intersect the perpendicular sides of their detection diamonds (at radius + 1)
for x, y in candidates(s1, s2):
# Ensure the solution is within range
if x < 0 or x > 4000000 or y < 0 or y > 4000000:
continue
# Ensure the solution is outside the detection diamond of all sensors
if all(manhattan(sx, sy, x, y) > r for sx, sy, r in sensors):
return x, y
Finally, we can calculate the answer:
x, y = missing_beacon(sensors)
answer = x * 4000000 + y
print('Part 2:', answer)
Problem statement — Complete solution — Back to top
We are inside a volcano that is about to erupt, and we want to stop it. We can
control the internal pressure of the volcano by means of a series of pressure
relief valves connected together in a network of pipes. All
valves start closed, and each one will release a given amount of pressure per
minute if opened. Opening a valve takes 1 minute, and traveling from a valve to
another that is connected to it also takes 1 minute. We initially start at the
valve with identifier AA
and have 30 minutes until the volcano erupts, and we
want to maximize the total amount of pressure to relieve during this time.
It's clear from the problem statement that what we are dealing with is an undirected graph, where the nodes are valves and the edges are the pipes connecting them. What we essentially want to do is find the best "path" of actions that take at most 30 mintues and give the best total pressure release possible.
Let's parse the input into a graph structure then. I will use a dictionary of
lists for this purpose. A defaultdict
comes in
handy as usual in these cases, so that we don't have to check whether a key is
already present or not. The lines of input are a bit annoying to parse, but if
we simply .split()
each one on whitespace one we'll notice
that the second field contains the name of a valve, and the fifth contains its
"pressure released per minute". Fields from the 10th onwards contain other
valves connected to the first one: we can use map()
with a
lambda
to easily extract those by .strip()
ping
away commas where needed.
rates = {}
graph = defaultdict(list)
with open(...) as fin:
for fields in map(str.split, fin):
src = fields[1] # name of this valve
dsts = list(map(lambda x: x.rstrip(','), fields[9:])) # valves connected to it
rate = int(fields[4][5:-1]) # pressure release rate (per minute)
rates[src] = rate
for dst in dsts:
graph[src].append(dst)
The graph
dictionary we just created is of the form {valve: [<connected valves>...]}
, so doing graph['AA']
will give us the list of graph nodes
adjacent to the valve named 'AA'
. Additionally, the rates
dictionary, kept
separate from graph
for simplicity, will tell us what's the pressure release
rate of any valve.
There are several ways to solve the problem: most notably, this problem is a good candidate for a dynamic programming. This is the solution I initially had in mind (but miserably failed to implement for whatever reason) and also the one that a lot of other people used. The first instinctive thought however when reading such a problem would be: can we brute-force the solution? Well, as it turns out, yes, almost, if we are smart about it.
The first thing to usually do to brute-force something is understand what's the
space of the solutions we are looking to brute-force. We can represent a
solution as a dictionary of the form {valve_id: time_of_release}
, which
contains all the valves we decided to open and the time at which they were
opened. Better yet, {valve_id: time_remaining}
where time_remaining
is the
amount of minutes remaining before the volcano erupts: this is much nicer since
the total pressure release of a valve can then simply be computed as
rates[valve_id] * time_remaining
. Summing this up for all the valves in a
given solution will yield the total pressure release of the solution (i.e. the
value whose maximum we are looking for).
Let's implement a function to calculate the score of a solution:
def score(rates, chosen_valves):
tot = 0
for valve, time_left in chosen_valves.items():
tot += rates[valve] * time_left
return tot
How can we enumerate all the possible solutions (i.e. choices of valves to open and when)? In order to choose a set of valves, we need to also make sure we do not exceed the 30 minutes time limit, and therefore the path taken to choose a given set of valves to open is very important. Clearly, it doesn't make sense to loop around and waste time, so ideally we would want to follow the shortest possible path between any pair of valves.
Say we want to choose valves AA
and XX
, but they are not directly
connected to each other: we'd need to find the shortest path from AA
to XX
.
In order to enumerate all valid solutions we will need to perform this action an
awful lot of times! Thankfully, there's an algorithm that can help us. The
Floyd-Warshall algorithm does exactly what we need: it
calculates the minimum distance between any possible pair of nodes of a given
graph. In our case, since all the arcs of the graph have the same weight, this
would be equivalent to performing a BFS scan starting from every
single node and saving all the distances found, so that could also be
implemented instead.
I'll be using standard Floyd-Warshall, implemented as a function returning a
dictionary of dictionaries of the form {a: {b: dist, ...}, ...}
. I'm not going
to copy-paste my implementation here as it's nothing special (check it out in
the full solution if you need), let's just say that we have a
floyd_warshall()
function that returns what we need.
distance = floyd_warshall(graph)
# distance['AA']['BB'] -> minimum distance from valve AA to BB
The minimum distance from one valve to another represents the time it takes to travel between the two. Now that we have all the distances between any pair of valves, it's easy to know how much it'd cost to choose an arbitrary valve at any point in time.
Now we can start enumerating all the possible solutions as choices of valves
that respect the timing constraint. One simple way to do it is through
depth-first search, starting from valve AA
.
- Start at
AA
with30
minutes remaining and no valves chosen. - For each of the valves we did not already choose: try choosing it. Choosing a valve means spending some time traveling to it (which we already calculated for all pairs of valves), plus 1 minute to open it.
- Make a recursive call with the updated remaining time, valves to choose from and valves chosen. Save the returned choices to return them later.
As already said, we'll keep track of the valves chosen to be released through a
dictionary {valve: time_of_release}
, and return value of our function will be
a list of dictionaries representing the currently chosen valves along with all
the other possibilities of chosen valves explored by deeper recursive calls.
Here's the commented code to do this:
def solutions(distance, rates, valves, time=30, cur='AA', chosen={}):
res = [chosen]
# We can't reach any other valve in less than 2m, as it would take minimum
# 1m to reach it plus 1m to open it, and therefore it'd be stay open for 0m.
if time < 2:
return res
# For all the valves we can currently choose
for nxt in valves:
# Choosing this valve will take distance[cur][nxt] to reach it, plus 1m to open it
new_time = time - (distance[cur][nxt] + 1)
# Choose this valve, it will stay open exactly for new_time (i.e. the time
# we have now minus the time it takes to reach and open it).
new_chosen = chosen | {nxt: new_time}
# The new valves to choose from will not include this one
new_valves = valves - {nxt}
# Collect all possible choices having taken this valve
res += solutions(distance, rates, new_valves, new_time, nxt, new_chosen)
return res
Now the above solutions()
function will return all possible combinations of
valves that we can open in under 30 minutes, along with the amount of time each
one will stay open. There are a couple of optimizations to be made though.
First of all, let's convert it into a generator. Lists are
slow to work with, especially when you have a lot of them. This is very simple,
all we need to do is yield
each possible choice as soon as we get ahold of it
instead of accumulating them in a list:
def solutions(distance, rates, valves, time=30, cur='AA', chosen={}):
yield chosen
if time < 2:
return
for nxt in valves:
new_time = time - (distance[cur][nxt] + 1)
new_chosen = chosen | {nxt: new_time}
new_valves = valves - {nxt}
yield from solutions(distance, rates, new_valves, new_time, nxt, new_chosen)
valves = set(graph.keys())
for choice in solutions(distance, rates, valves):
pass # calculate max score
Now, the above works fine, however it's very slow. The function does not even
seem to terminate soon. Indeed, even for the small example input we have, it
generates over 9 million possible solutions. We can reduce this number by a
lot if we notice one very important thing: there seem to be a lot of valves
with a pressure release rate of zero in our input. It is pointless to open
any of them, since they will contribute nothing and only make us waste time to
reach them. We can therefore avoid them. If we pass a valves
set that excludes
them, the number of possible choices for each recursive call (for nxt in valves
) will be greatly reduced, which will in turn exponentially decrease the
total number of solutions returned.
We can use filter
and the rates in the rates
dictionary
to exclude useless valves, and re-perform the call:
good = set(filter(rates.get, graph.keys()))
for choice in solutions(distance, rates, good):
pass # calculate max score
And finally, another important optimization to make is avoiding unnecessary
recursive calls when we already know the remaining time we are passing is not
enough to open any other valve. This means moving the time < 2
check inside
the loop:
def solutions(distance, rates, valves, time=30, cur='AA', chosen={}):
- yield chosen
-
- if time < 2:
- return
-
for nxt in valves:
new_time = time - (distance[cur][nxt] + 1)
+ if new_time < 2:
+ continue
+
new_chosen = chosen | {nxt: new_time}
new_valves = valves - {nxt}
yield from solutions(distance, rates, new_valves, new_time, nxt, new_chosen)
+
+ yield chosen
We have all we need to get the answer now:
best = 0
for choice in solutions(distance, rates, valves):
cur = score(rates, choice)
if cur > best:
best = cur
And since all we are doing is calculating a maximum, we can use max()
plus a
generator expression for it:
best = max(score(rates, s) for s in solutions(distance, rates, good))
print('Part 1:', best)
For the second part of today's problem, we are told that now we work in two at the same problem. Both "players" move together, and the total time is now 26 minutes instead of 30. The answer we need to find is the same.
Okay... clearly the fact that we have two players means that they are going to open different valves, since a valve cannot be opened twice. Therefore, at least in theory, all we would need to do is try every possible combination of two solutions that do not have valves in common, more or less like this:
sols = list(solutions(distance, rates, good, 26))
best = 0
for s1 in sols:
for s2 in sols:
if ...: # s1 and s2 do not have valves in common
cur = score(rates, s1) + score(rates, s2)
if cur > best:
best = cur
However... len(sols)
for our input is 18676
, and a double nested for
loop
means 186762 = ~350 million solutions to test (half of that in
theory, since we do not care about the order of s1
and s2
, but still). That
is... a bit too much. We need to somehow find another simplification.
There are a lot of solutions, but we know that only some among them will achieve the maximum score. What about among all the solutions that involve the same valves? Well, given a set of valves to open, there will be a lot of different possible orders in which to open them... however, analogously, only some among them will yield the maximum score, therefore only a single maximum score per subset of chosen valves is possible.
We can iterate over the solutions for a single player once and pre-calculate the
maximum score possible for any given set of valves to open (regardless of the
order in which they are opened). Let's do this using a dictionary, more
precisely a defaultdict
of int
for commodity. The "key" to remember will be
the set of valves in a given solution. We cannot use a set
as key since it is
mutable, but we can use a frezenset
(the immutable variant of a set
).
maxscore = defaultdict(int)
for solution in solutions(distance, rates, good, 26):
k = frozenset(solution)
s = score(rates, solution)
if s > maxscore[k]:
maxscore[k] = s
Now maxscore[set_of_valves]
will hold the maximum possible score attainable by
opening set_of_valves
, irrespective of the order they are opened. The size of
this dictionary is a lot smaller than the number of different solutions returned
by solutions()
:
print(len(list(solutions(distance, rates, good, 26)))) # 18676
print(len(maxscore)) # 2342
Iterating over all possible pairs of solutions, we can then discard those that have common keys (since as we already said both players cannot open the same valve), and calculate the maximum of the scores of the remaining ones:
best = 0
for s1, score1 in maxscore.items():
for s2, score2 in maxscore.items():
if len(s1 & s2) == 0: # a & b == intersection of the sets a and b
continue
cur = score1 + score2
if cur > best:
best = cur
We are doing a bit too many iterations as we don't care about the order of a
and b
. We can use itertools.combinations()
to
iterate over all possible pairs without taking the order into account:
from itertools import combinations
best = 0
for (s1, score1), (s2, score2) in combinations(maxscore.items(), 2):
if len(s1 & s2) == 0:
continue
cur = score1 + score2
if cur > best:
best = cur
print('Part 2:', best)
Problem statement — Complete solution — Back to top
Today we are dealing with 3D space and cubes. We are given a list of x,y,z
coordinates, each one representing a unit cube (1x1x1) in space, and we want to
find out how many faces in total do not touch other cubes.
Let's start by quickly parsing the input. We have lines of integers separated by
commas, so just .split()
them and map()
them to int
. We want to keep a count of cube faces, so let's add the
coordinates of each cube we scan to a dictionary of the form
{coords: free_faces}
, initially filled with all 6
.
cubes = {}
with open(...) as fin:
for line in fin:
coords = tuple(map(int, line.split(',')))
cubes[coords] = 6
A unit cube at x,y,z
can be in contact with any of its immediate neighbors, at
x±1,y,z
, x,y±1,z
and x,y,z±1
. Let's write a
generator function to calculate and yield
the coordinates of
the neighbors of a cube. We only need to go forward one step in any of the 6
possible directions:
def neighbors(x, y, z):
yield (x + 1, y , z )
yield (x - 1, y , z )
yield (x , y + 1, z )
yield (x , y - 1, z )
yield (x , y , z + 1)
yield (x , y , z - 1)
For any given unit cube, we can know how many faces touch other cubes by simply
checking how many of the 6 neighbors were present in our input. For each
present neighbor, we will have exactly one face touching it, and therefore we
can decrement our initial count of 6
free faces. Since the function we
just wrote takes 3 arguments, we can use unpack (*
) each
cube's coordinates when calling it.
for c in cubes:
for n in neighbors(*c):
if n in cubes:
cubes[c] -= 1
The internal for
is only performing a sum over a condition, so we can turn it
into a call to sum()
plus a
generator expression:
for c in cubes:
cubes[c] -= sum(n in cubes for n in neighbors(*c))
Finally, summing up all the values in cubes
will yield the total number of
"free" faces that are not in contact with other cubes:
surface = sum(cubes.values())
print('Part 1:', surface)
What we were actually trying to compute in part 1 was the total external surface area of the cubes we have, counting connected cubes together as a single 3D shape. However, the naïve approach of counting how many faces are "free" (not touching other cubes) is not enough, as we can have a structure of cubes that completely surrounds an empty portion of space, and the free faces in the inside should not be counted. This is what we need to do for part 2: figure out how many faces we over-counted and calculate the real external surface area.
This is intimidating at first, but things will get much clearer with a few
examples. For simplicity, we can reason in 2D, and then apply the same concepts
to 3D. Take for example the following structures where #
means we have a
square there and .
means empty space:
A B C D
..#.. .###. ##### #####
.#.#. #..#. #...# #.#.#
..#.. #.#.. .##.# ###.#
.#... ..... ..###
We have:
-
Structure A above composed of 4 squares, of which none of them touch each other on any side, so we have a total of 16 free sides. However, these 4 squares completely surround an empty square, and therefore we can distinguish between 4 internal sides and 12 external sides. The external perimeter is 12.
-
Similarly, for B, we have a "hole" of 3 contiguous squares. The total number of free sides is 23, but the external perimeter is 16.
-
Structure C does not have an internal perimeter, as it does not completely surround any region of 2D space.
-
Structure D has two "holes" and traps two different regions of space on its inside. It has an internal perimeter of 10 and an external perimeter of 18.
Now it should be clear what we are dealing with. The same concepts apply in 3D space where we have surface areas instead of perimeters, it's just harder to visualize. Given a solid structure of unit cubes, some regions of spaces could be trapped inside it, and therefore not contribute to the external surface area.
How can we detect these pockets of empty space? The simplest solution is to check every single unit of space in the 3D bounding box that includes all the unit cubes in our input. The 3D bounding box (you could actually call it a "bounding cube") we are looking for will have coordinates spanning from the minimum to the maximum coordinates we have one on each axis. These can be calculated easily:
minx = miny = minz = float('inf')
maxx = maxy = maxz = 0
for x, y, z in cubes:
minx, maxx = min(x, minx), max(x, maxx)
miny, maxy = min(y, miny), max(y, maxy)
minz, maxz = min(z, minz), max(z, maxz)
To simplify any bound checks to perform later, we can create a
range
object for each axis, and then use x in rangex
instead of minx <= x <= maxx
(which is exactly the check made when doing
x in rangex
):
rangex = range(minx, maxx + 1)
rangey = range(miny, maxy + 1)
rangez = range(minz, maxz + 1)
For each unit cube of free space inside the bounding box, we can try "escaping" the bounding box by visiting any other free unit cubes of space that we can reach (i.e. coordinates that are not in our input). This can be done through either BFS or DFS. If we manage to escape the bounding box, it means that we were not inside an internal "pocket" of space. On the other hand, if we cannot escape, it means we found such a pocket, and we are surrounded by non-empty unit cubes. To count the number of "internal" faces we can perform our exploration and increment a counter whenever we are prevented from moving in any of the 6 directions by a non-empty unit cube.
Let's implement an escape()
function that does exactly this, using iterative
DFS with a deque
as queue. We'll keep track of the
number of "touched" internal faces with a counter, and return 0
if we manage
to escape, otherwise the counter.
def escape(cubes, src, rangex, rangey, rangez):
seen = set()
queue = deque([src])
faces_touched = 0
while queue:
p = queue.pop()
if p in seen:
continue
seen.add(p)
x, y, z, = p
# Did we escape the bounding box?
if x not in rangex or y not in rangey or z not in rangez:
# If so, we are not trapped in an internal pocket
return 0, seen
# Try exploring all 6 directions
for n in neighbors(x, y, z):
# If we are blocked in this direction, it means we touched an internal face
if n in cubes:
faces_touched += 1
else:
# Otherwise, keep going
if n not in seen:
queue.append(n)
# We ran out of free space to visit, which means we are trapped in an internal pocket
return faces_touched, seen
You may have noticed that the above function also returns a seen
set of
visited coordinates. This is because we need to keep track of all the visited
coordinates of free space: if an exploration finds an internal pocket starting
from a given point in space, we do not need to perform a second redundant
exploration for any of the points we have visited during the first. Most
importantly, doing so would also make us double-count the number of internal
faces we touch, and we don't want that!
Now that we have a function able to identify and count internal faces, we can
call it for every empty unit cube inside our bounding box, and accumulate the
results. We need three nested for
loops to iterate over all possible x,y,z
coordinates, so we can use itertools.product()
to
make our life easier.
Let's get our part 2 answer:
allseen = set()
for c in product(rangex, rangey, rangez):
if c not in cubes: # this is an empty unit cube of space
if c not in allseen: # this is not in an internal pocket we already found
touched, seen = escape(cubes, c, rangex, rangey, rangez)
surface -= touched
allseen |= seen
print('Part 2:', surface)
Problem statement — Complete solution — Back to top
For today's problem we are building robots to mine some minerals... interesting. There are 4 mineral types: ore, clay, obsidian and geodes. Additionally, there are also 4 robot types, each of which can mine the corresponding mineral. We are given a list of blueprints: each blueprint tells us which minerals (and how many units) are needed to build a robot of a given type. We start with only one ore-mining robot, and we have 24 minutes of time. Each robot we have can mine one unit of its mineral per minute, and building a new robot takes 1 minute. We need to figure out the maximum number of geodes we can mine with each of the given blueprints, and calculate the best score among the blueprints we have, which is given by the blueprint id multiplied by the maximum number of geodes achievable with it.
Let's get input parsing out of the way. Each line of input is a quite long
sentence containing some positive integer quantities. We know all sentences are
formatted in the same way, so we can just extract all the integers on each line
with a regular expression using .findall()
, and
accumulate these numbers in a list:
import re
exp = re.compile(r'\d+')
blueprints = []
with open(...) as fin:
for line in fin:
blueprints.append(tuple(map(int, exp.findall(line))))
We are dealing with a search problem. Given four things, we can theoretically solve any search problem:
- A consistent definition of a "state".
- A start state.
- A way to find successors of a given state (i.e. transition from one state to any other possible state).
- A way to check if a state is a goal.
Let's see what we have for our problem:
-
First, we need to define what exactly is a state. In our case, a state could be represented by the current time left, the amount of resources we have of each type, and the number of robots we have of each type. If at any given time two different sequences of choices bring us to the same state, we know both those choice sequences are equivalent, and the reachable states from now on are the same.
-
Given the above, for our start state we have
24
minutes left,0
of each material and0
of each robot type, except for1
ore-mining robot. -
A transition from one state to another can be of two different kinds:
- Wait one minute without building any robot, and let the current robots we have mine resources.
- Build one robot of some type (assuming we have resources to do so), while also having pre-existing ones mine resources. There are 4 different transitions of this kind (one per robot type).
-
A goal state is any state where the time left is
0
, and its "score" is the number of geodes we have in such a state.
States can be seen as nodes in a graph, actions that change state can be seen as edges of the graph, and goal states can be seen as nodes with no outgoing edges. A search problem can therefore be solved using different graph exploration techniques like BFS, DFS, A-star, and so on.
Let's go with iterative DFS, which is pretty simple to implement. We'll write a
search()
function that takes the parameters of a blueprint as input and
explores the state space finding the best goal states it comes across. As with
normal DFS, it's pointless to visit the same state multiple times, so we'll keep
a set of visited
states to avoid this. As usual, we'll use a
deque
for the queue of states to visit.
As said above, each state will be a tuple
of the following form:
(time_left, ore, clay, obsidian, geodes, robots_ore, robots_clay, robots_obsidian, robots_geode)
def search(blueprint):
(rore_cost, # Cost in ore to build an ore-mining robot.
rclay_cost, # Cost in ore to build a clay-mining robot.
robs_cost_ore, # Cost in ore to build an obsidian-mining robot.
robs_cost_clay, # Cost in clay to build an obsidian-mining robot.
rgeo_cost_ore, # Cost in ore to build a geode-mining robot.
rgeo_cost_obs # Cost in obsidian to build a geode-mining robot.
) = blueprint
time = 24
best = 0 # Best number of geodes we are able to collect.
visited = set() # Visited states.
# The "frontier" of our search, containing states to explore next.
# In the initial state we only have 1 ore-mining robot.
q = deque([(time, 0, 0, 0, 0, 1, 0, 0, 0)])
while q:
time, ore, clay, obs, geo, rore, rclay, robs, rgeo = state = q.pop()
if state in visited:
continue
visited.add(state)
# Each robot we have mines 1 resource of its type, taking 1 minute.
newore = ore + rore
newclay = clay + rclay
newobs = obs + robs
newgeo = geo + rgeo
time -= 1
# If we run out of time, we reached a "goal" state. Update the best
# number of geodes we were able to mine.
if time == 0:
best = max(best, newgeo)
continue
# Following are the possible actions (transitions) to take...
# We can always just spend one minute only mining without building any robot.
q.append((time, newore, newclay, newobs, newgeo, rore, rclay, robs, rgeo))
# If we have enough materials for a geode-mining robot, we could also build that.
if obs >= rgeo_cost_obs and ore >= rgeo_cost_ore:
q.append((
time,
newore - rgeo_cost_ore,
newclay,
newobs - rgeo_cost_obs,
newgeo,
rore, rclay, robs, rgeo + 1
))
# If we have enough materials for an obsidian-mining robot, we could also build that.
if clay >= robs_cost_clay and ore >= robs_cost_ore:
q.append((
time,
newore - robs_cost_ore,
newclay - robs_cost_clay,
newobs,
newgeo,
rore, rclay, robs + 1, rgeo
))
# If we have enough materials for a clay-mining robot, we could also build that.
if ore >= rclay_cost:
q.append((
time,
newore - rclay_cost,
newclay,
newobs,
newgeo,
rore, rclay + 1, robs, rgeo
))
# If we have enough materials for an ore-mining robot, we could also build that.
if ore >= rore_cost:
q.append((
time,
newore - rore_cost,
newclay,
newobs,
newgeo,
rore + 1, rclay, robs, rgeo
))
return best
The above DFS search is theoretically correct... however, if we try to run it, we'll notice that it does take a while to finish. A relatively long while. This is because the search space that we are trying to explore is currently very large. Taking a look above, for every single state we branch out 5 times. We need to somehow reduce the search space. This is one of the most important aspects of a search problem. It's rather straightforward to code a DFS search on any given search problem once you have identified states and transitions, the real work is optimizing it, finding a way to prune away paths that lead to sub-optimal goals as much as possible and as early as possible.
A lot of smart people have listed different assumptions and optimizations that can be applied while searching in today's Reddit solution megathread, some of which are rather straightforward. I'm going to apply some here, until we reach a state where our program is efficient enough.
The first thing to notice is that we can set an upper limit to the number of robots to build. If the maximum amount of ore we need to build a robot is N, then we'll never need more than N ore-mining robots. Having N is enough to build the most expensive robot (ore-wise) every single minute. The same can be applied to other minerals: we don't need more clay robots than the amount of clay needed for obsidian robots, and we don't need more obsidian robots than the amount of obsidian needed for geode robots.
We can calculate these maximum needed amounts of minerals at the start of our function:
def search(blueprint):
# ...
max_ore_needed = max(rore_cost, rclay_cost, robs_cost_ore, rgeo_cost_ore)
max_clay_needed = robs_cost_clay
max_obs_needed = rgeo_cost_obs
# ...
Before trying to build a robot of any kind, let's stop and ask ourselves: do
we really need to? If not, we can avoid it and prune the branch. This simple
optimization cuts down the search space by a considerable amount. In terms of
code, it means adding a condition to the if
for each branch. The only branch
that does not need this is the one for building geode-mining robots, as we
always want the most we can.
def search(blueprint):
# ... unchanged ...
if clay >= robs_cost_clay and ore >= robs_cost_ore:
# Avoid building more obsidian robots than the max obsidian per minute needed.
if obs < max_obs_needed:
q.append((...)) # unchanged
if ore >= rclay_cost:
# Avoid building more clay robots than the max clay per minute needed.
if rclay < max_clay_needed:
q.append((...)) # unchanged
if ore >= rore_cost:
# Avoid building more ore robots than the max ore per minute needed.
if rore < max_ore_needed:
q.append((...)) # unchanged
return best
Of course, those if
conditions can be merged for simplicity:
def search(blueprint):
# ... unchanged ...
if obs < max_obs_needed and clay >= robs_cost_clay and ore >= robs_cost_ore:
q.append((...))
if rclay < max_clay_needed and ore >= rclay_cost:
q.append((...))
if rore < max_ore_needed and ore >= rore_cost:
q.append((...))
return best
This already cuts the execution time in half (at least on the example input), but we can keep improving.
Let's now ask ourselves: when does it make sense to wait for some minerals to be mined? Currently, our code always tries to explore a new state with a transition that just waits one minute of time to only mine. Exactly here:
# We can always just spend one minute only mining without building any robot.
q.append((time, newore, newclay, newobs, newgeo, rore, rclay, robs, rgeo))
This can sometimes be a waste of time. Similarly to what we discussed for the previous optimization above, there is a certain maximum amount of any given mineral type above which we'll never need to go. That is the maximum amount needed to build one robot per minute using that mineral. It only makes sense to wait for some mineral if we have less than the maximum needed and if we also have at least one robot to mine it.
We already calculated these maximum needed amounts. Let's also use them to perform this check and avoid spending one turn only mining if needed. The above line becomes:
# Does it make sense to wait for a resource? I.E. do I have less than
# the max needed and do I also have robots to produce it?
if (robs and obs < max_obs_needed) or (rclay and clay < max_clay_needed) or ore < max_ore_needed:
# If so, we can also try just spending one minute only mining without building any robot.
q.append((time, newore, newclay, newobs, newgeo, rore, rclay, robs, rgeo))
A little faster, nice, let's keep going!
The third thing to notice is that we can estimate how good of a state we are
in, and prune away any paths passing through sub-optimal states. In the
best-case scenario (assuming we have robots and resources), we can build exactly
1 geode-mining robot per minute, and keep all the other geode-mining robots
mining. For example, if we find ourselves at time=3
minutes left, and we have
0
geodes and 3
geode-mining robots, in the best-case scenario (building one
more geode-mining robot per minute) we can mine another 3 + 2 + 1 = 6
geodes.
The same reasoning can be applied to all minerals.
In general, if we find ourselves with t time left, n minerals of a given type already mined, and r mining robots for that mineral, in the best-case scenario we can get a maximum of n + r + (r + 1) + (r + 2) + ... + (r + t) minerals, which can also be written (grouping r) as n + (t + 1)r + 1 + 2 + 3 + ... + t. That last part consisting of the sum of the first t natural numbers is exactly the t-th (lol) triangular number, so the formula can be further simplified to n + (t + 1)r + t(t + 1)/2.
Let's write a small helper function to calculate the maximum number of minerals of a given type achievable in the best-case scenario:
def best_case_scenario(initial_amount, robots, t):
return initial_amount + robots * (t + 1) + t * (t + 1) // 2
We can now use it to perform three optimizations of the same kind any time we visit a new state:
- If the amount of geodes achievable in the best-case scenario is lower than
the current
best
we have, we can discard the state and any of its successors. - If the amount of obsidian achievable in the best-case scenario is lower than
the amount needed to build a geode robot, we know we'll never be able to
build geode robots anymore, so we can calculate the final score as
newgeodes + rgeo * time
and avoid exploring any further. - Likewise, this also applies to "ore", as geode robots need both ore and obsidian to be built.
def search(blueprint):
# ... unchanged ...
while q:
# ... unchanged ...
if time == 0:
best = max(best, newgeo)
continue
# If we can't mine more geodes in the best-case scenario, bail out.
if best_case_scenario(newgeo, rgeo, time) < best:
continue
# If we can't mine enough obsidian to build new geode robots even in the
# best-case scenario, we already know how many geodes we'll be able to get.
if best_case_scenario(newobs, robs, time) < rgeo_cost_obs:
best = max(best, newgeo + rgeo * time)
continue
# Likewise for ore.
if best_case_scenario(newore, rore, time) < rgeo_cost_ore:
best = max(best, newgeo + rgeo * time)
continue
# ... unchanged ...
We are getting there. There is only one last optimization I will make, which is also the one that improves our solution the most. Credit goes to u/Coffee_Doggo for figuring this out and commenting about it.
Fourth and final thing to notice: if we ever find ourselves with the ability to build a robot of a given type, and the previous minute we also had the chance to build it, but decided to not build anything instead... doing it now doesn't make much sense. We should have done it earlier! We can throw away this option.
The above will require us to keep track of which robots we were able to build at any given state and pass that information to the next state. We can do it by adding a simple list to our queue, containing numeric IDs:
ORE, CLAY, OBS, GEO = range(4)
Here are the modifications that we need to apply:
def search(blueprint):
# ...
# Add another element here, the list of robots we could have built in the
# previous step, but decided not to build instead.
q = deque([(time, 0, 0, 0, 0, 1, 0, 0, 0, [])])
while q:
tmp = q.pop()
# This last list we added is not part of the state.
state = tmp[:-1]
if state in visited:
continue
visited.add(state)
time, ore, clay, obs, geo, rore, rclay, robs, rgeo, did_not_build = tmp
# ... unchanged ...
can_build = []
if obs >= rgeo_cost_obs and ore >= rgeo_cost_ore:
# Did we have the chance to build this robot type in the previous iteration
# but decided to not build anything instead? If so, we just wasted precious
# time, it's pointless to do it now that we are late, the result is inevitably
# going to be worse.
if GEO not in did_not_build:
# Remember that we could have built a geode robot.
can_build.append(GEO)
# Pass along an empyy list, we built a robot so we don't have a list
# of robots that we "could have built" but didn't.
q.append((..., []))
if robs < max_obs_needed and clay >= robs_cost_clay and ore >= robs_cost_ore:
# Likewise.
if OBS not in did_not_build:
can_build.append(OBS)
q.append((..., []))
if rclay < max_clay_needed and ore >= rclay_cost:
# Likewise.
if CLAY not in did_not_build:
can_build.append(CLAY)
q.append((..., []))
if rore < max_ore_needed and ore >= rore_cost:
# Likewise.
if ORE not in did_not_build:
can_build.append(ORE)
q.append((..., []))
# If we can also "wait" without building, pass along the list of robots
# we could whave built, but decided to not build instead.
if (robs and obs < max_obs_needed) or (rclay and clay < max_clay_needed) or ore < max_ore_needed:
q.append((..., can_build))
return best
NOTE: in my complete solution I use a bitmask instead of a list to remember the
robots that could have been built. I simply represent "none" as 0, and any other
robot kind as a power of 2 (1, 2, 4, 8). Then I can bitwise or robot kinds
together. A value of 0b1111
for example means all robot kinds could have been
built.
We can finally run the above function for each blueprint and collect the results:
total = 0
for bid, *blueprint in blueprints:
total += bid * search(blueprint)
Pardon me, but I cannot resist turning that loop into a
sum()
+ generator expression:
total = sum(bid * search(b) for bid, *b in blueprints)
print('Part 1:', total)
We are now told to only consider the first 3 blueprints, calculate the maximum amount of geodes achievable with them in 32 minutes, and then calculate the product of these three numbers.
Thankfully, all those optimizations we made in part one were worth it! We have
a reasonably fast function, and we can simply move our time
variable in the
function signature to be passed as a parameter:
-def search(blueprint):
+def search(blueprint, time=24):
rore_cost, rclay_cost, robs_cost_ore, robs_cost_clay, rgeo_cost_ore, rgeo_cost_obs = blueprint
max_ore_needed = max(rore_cost, rclay_cost, robs_cost_ore, rgeo_cost_ore)
max_clay_needed = robs_cost_clay
max_obs_needed = rgeo_cost_obs
- time = 24
best = 0
...
All that's left to do is get our second star:
total = 1
for bid, *blueprint in blueprints[:3]:
total *= search(blueprint, 32)
Again, the above is simplifiable into a call to math.prod()
plus a generator expression:
total = prod(search(b, 32) for _, *b in blueprints[:3])
print('Part 2:', total)
Woah. Tough day, but definitely an entertaining one!
Problem statement — Complete solution — Back to top
Simple problem today. We are given a list of integers, and we need to "shuffle"
them moving each one around according to the same rule. Each number, in the
order they are originally given, needs to be moved in the list a number of
steps forward or backward equal to its value (negatives numbers move backward).
The input list is cyclic, so moving forward past its end will mean wrapping back
to the first spot. After performing this operation on every single number, we
need to locate the number 0
and find the sum of the 1000th, 2000th and 3000th
number after it (remembering that the list is cyclic).
We mostly have to follow the rules here. The only problem with the algorithm we need to apply is that the mixing operation must be done for each number in the original order we are given, but the operation itself moves numbers around: we need to somehow keep track of them. It would be easy to do if all the numbers were different, but we have duplicates in our input, and we need to track duplicates separately. This can be done in different ways, but the simplest one is to "wrap" each number up in an object that stores additional information to help us unequivocally identify it.
The wrapper object could be anything, for example, a tuple of the form
(original_index, number)
would suffice, as the original index will be unique
for each number. This is what most people used today as it's intuitive and also
one of the first things that come to mind. I implemented this approach too,
the code is here, however, I'm going to use a
wrapper class
, which in my opinion makes things a bit more intuitive and is
also slightly faster using CPython.
By default, in Python all instances of a class are different from each other
when compared, unless they implement special
__eq__()
or __ne__()
methods. We're going to define a class
that only holds one property (the number we want to wrap) and implements no
methods except the __init__()
constructor. Since we're going
to use this class a lot, listing the property name in the
__slots__
class variable will speed things up a bit (follow
the link to understand why).
class Number:
__slots__ = 'value'
def __init__(self, value):
self.value = int(value)
Now let's parse the input. It's going to be very easy as we have one integer per
line, just map()
everything to a Number
, which will also
take care of str
to int
conversion (self.value = int(value)
), and store
all the objects in a tuple
. To mix stuff around, we can later copy the content
of said tuple
into a list
, which will allow us to remove and insert elements
as we please, while still iterating over the elements in the original order.
with open(...) as fin:
order = tuple(map(Number, fin))
Onto the actual mixing, let's write a function for it. The rules to follow are simple. For every number in the original order:
- Find its index in the list using
.index()
. - Remove it with
.pop(index)
. - Calculate its new index and re-insert it with
.insert()
.
Remember: the list we are dealing with is actually cyclic, so if we exceed the
length we should wrap back around. This can be easily done with a modulo (%
)
operation: index % len(numbers)
will always return an index between 0
and
len(numbers) - 1
, wrapping back to 0
when index == len(numbers)
. What's
even cooler is that this will also work out of the box with negative values,
given the way Python's modulo is defined.
Following the steps above, let's write a function to do the mixing and calculate
the final result. After mixing the numbers one by one according to their
original order, we'll simply scan the list to find the number whose .value
is
0
and then calculate the sum of the 1000th, 2000th and 3000th number past it.
def mix(order):
# Copy everything into a list to be able to move things around.
numbers = list(order)
for num in order:
# Find the number.
i = numbers.index(num)
# Remove it.
numbers.pop(i)
# Calculate its new position.
i = (i + num.value) % len(numbers)
# Re-insert it.
numbers.insert(i, num)
for zero_idx, num in enumerate(numbers):
if num.value == 0:
break
res = numbers[(zero_idx + 1000) % len(numbers)].value
res += numbers[(zero_idx + 2000) % len(numbers)].value
res += numbers[(zero_idx + 3000) % len(numbers)].value
return res
A little digression. It's worth noting that we are doing a lot of .pop()
and .insert()
operations on a list
. This is usually slow and requires
re-scanning the entire list to fix the position of any elements past the
insertion point. In fact, these operations take O(n) time. Another way of
solving the problem would be using collections.deque
,
which has fast pop and append operations on both ends, and exposes a
.rotate()
method to cyclically rotate elements around n
spots in O(n).
The deque
data structure was used by a lot of people, and I also wrote
an alternative solution using it. Even if at a first
glance it seems like it should be a faster solution, it really is not. This is
because we would still be doing 3 operations with complexity O(n) (one
.index()
and two .rotate()
), which is fundamentally the same we are already
doing with our list (one .index()
, one .pop()
and one .insert()
). On top
of that, since under the hood
deque
is implemented as a doubly-linked list, the .index()
method needs to traverse a linked list of pointers, which is an inherently
slower operation compared to scanning a contiguous chunk of memory holding the
elements.
Now, getting back to the problem, we are basically done! All we need to do is call the function to get our answer:
answer = mix(order)
print('Part 1:', answer)
The numbers we had now change. Every number needs to be multiplied by
811589153
before doing anything. Then, we need to do the same thing we did for
part 1, but this time the mixing happens 10 times in a row.
Well, we already have the mix()
function ready. We can effortlessly add a
parameter to specify the number of times to mix, and wrap the mixing code in one
more for
loop:
def mix(order, times=1): # Added times= parameter.
numbers = list(order)
for _ in range(times): # Mix as many times as needed.
for num in order:
i = numbers.index(num)
numbers.pop(i)
i = (i + num.value) % len(numbers)
numbers.insert(i, num)
for zero_idx, num in enumerate(numbers):
if num.value == 0:
break
res = numbers[(zero_idx + 1000) % len(numbers)].value
res += numbers[(zero_idx + 2000) % len(numbers)].value
res += numbers[(zero_idx + 3000) % len(numbers)].value
return res
Now we can now modify the values of our numbers and run the algorithm again:
for num in order:
num.value *= 811589153
answer = mix(order, 10)
print('Part 2:', answer)
As simple as that!
Problem statement — Complete solution — Back to top
Note: my solution for part 2 of today's problem relies on the fact that the input is a well-formed binary expression tree. This is not explicitly stated in the problem statement, and is an assumption I (and many others) made to solve the problem with a rather simple algorithm.
Today we are dealing with a binary expression tree. We
are given a list of variables of the form name: value
, where each name
is a
string, while each value
can either be an expression to compute on two other
variables (a <op> b
), or an integer. We need to calculate the value of root
.
Let's start with input parsing. As already anticipated, we are dealing with a
tree, and root
represents the root of the tree. We aren't explicitly told that
this is a tree (i.e. all nodes appear on the right at most once), we could
actually be dealing with a graph where multiple nodes can have edges pointing to
the same node. Nonetheless, as long as there are no loops, we are safe and the
expression result can be easily computed. As a side note, the inputs for today's
problem do actually represent trees.
Each node can either be a binary expression node referencing other two nodes or
a leaf (a simple integer). We'll create a dictionary of the form {name: value}
where value
is either an integer or a 3-element list of the form
[name1, op, name2]
. The way our input is given, creating this structure is
trivial. All we need to do is .strip()
and
.split()
each line in the correct way. First split on ': '
to get the left and right side, then check if the right side is an integer
(using e.g. .isdigit()
) to decide whether to parse it or
.split()
again.
T = {}
with open(...) as fin:
for line in fin:
a, b = line.strip().split(': ')
T[a] = int(b) if b.isdigit() else b.split()
Our tree T
now contains all the information we need to calculate the value of
the expression. Starting at 'root'
, we can perform a simple
depth-first traversal of the tree: if we encounter a "value" node we
can just return the value; while if we encounter an "expresison" node we can
first find the left and right operands' values recursively, then perform the
needed operation and return its result.
To make our life easier, we can use a dict
as map for the operations. We need
to perform addition (+
), subtraction (-
), multiplication (*
) and division
(/
). The operator
module has all these in the form of
functions taking two parameters. For the division, we'll use
floordiv
(equivalent to the //
operator).
from operator import add, sub, mul, floordiv
OPMAP = {
'+': add,
'*': mul,
'-': sub,
'/': floordiv
}
To distinguish whether we are dealing with a value or expression node, we can
use isinstance()
to check whether T[node]
is a
list
or an int
. We will treat T
as a global variable for simplicity, but
it could also be passed as parameter.
def calc(node):
value = T[node]
if not isinstance(node, list):
# This is a value node, just return the value.
return value
# This is an expression node.
l, op, r = value
# Compute left and right value.
lvalue = calc(l)
rvalue = calc(r)
# Perform operation and return result.
return OPMAP[op](lvalue, rvalue)
Since our input seems to be a well-formed tree, we're only going to visit each
node exactly once. If that was not the case though, and our input was an acyclic
graph, we could optimize the above function with the help of
@lru_cache()
to memoize node
values and avoid wasting time computing them multiple times.
To get our answer we can now simply call the function on 'root'
:
answer = calc('root')
print('Part 1:', answer)
Now we are told that the 'root'
expression node is actually an equality check
between its left and right operand. Furthermore, we do not know the value of the
value node 'humn'
: we need to find a value for it such that the equality check
at 'root'
is satisfied.
The complexity of this second part of the problem depends on the form of the two
expressions on the left and right side of the root equality check. We know that
we are looking for exactly one solution, so we cannot be dealing with
polynomials of degree higher than 1, meaning that we cannot find an expression
node of the form humn * humn
, or in general A * b
where both A
and B
include humn
in their subtrees.
There's a small problem though: nobody technically guarantees that we cannot
encounter expression nodes of the form humn + humn
(or A + B
where either of
the two leads to humn
) or similar. Encountering one of these would mean that
the input does not form a tree, but a graph, which would make the problem harder
to solve. Since as already stated in part 1, the inputs of today's problem are
all well-formed binary expression trees, we will assume that something like this
cannot happen. For this reason, I wrote an is_tree()
check for this assumption
in my complete solution.
What is happening now is that we essentially have one side of the tree at root
with an unknown value, but we know it must match the value of the other side.
Let's replace T['humn']
with None
, and modify the calc()
function to tell
us which side has unknown value: whenever we encounter a value that is None
or
an expression where one of the two operands has value None
, we'll return
None
to propagate it back to the top. While we are at it, let's also decorate
the function with @lru_cache()
since we're going to
call it multiple times for the same nodes now.
+@lru_cache()
def calc(node):
value = T[node]
if not isinstance(value, list):
return value
l, op, r = value
lvalue = calc(l)
rvalue = calc(r)
+
+ if lvalue is None or rvalue is None:
+ return Noner
return OPMAP[op](lvalue, rvalue)
Note: since we have decorated calc
with lru_cache
, we need to remember
calling calc.cache_clear()
after using it for part 1, otherwise the cache will
hold the wrong values for part 2.
Now we can check which side of the root
node has value None
, and we'll know
that its expected value should be the one of the other side:
T['humn'] = None
l, _, r = T['root']
lvalue = calc(l)
rvalue = calc(r)
if lvalue is None:
# 'humn' is on this side of the tree.
# The expected value of this side of the tree is rvalue.
...
else:
# 'humn' is on this side of the tree.
# The expected value of this side of the tree is lvalue.
...
To solve the problem and find the unknown value of humn
, we can navigate the
tree recursively exploring the node whose value is unknown, keeping track of
what value it should have:
- If we are at
humn
, we just found its value: it's the expected value we are carrying around. - Otherwise we must be at some expression node, where one of the two child nodes (left or right) has unknown value. We know the expected value of the current node. We can compute the expected value of the node with unknown value performing the opposite of the expression.
For example, assume that initially for root
we have lvalue = None
and
rvalue = 123
: we know that humn
is somewhere on the left of root
, and that
the value of the left node should be 123
. We explore the left node, and find
an expression with operation +
, lvalue = 100
and rvalue = None
. We know
that lvalue + rvalue == 123
, and therefore 100 + rvalue == 123
. We can now
perform the opposite operation to find out that
rvalue == 123 - lvalue == 123 - 100 == 23
. Now we can continue on the right
node knowing its expected value is 23
.
Each time we explore a new node, we can compute the new expected value of either
of the two children as follows (let's call the expected value E
for
conciseness):
- If the operation is
+
: we knowE = L + R
, soL = E - R
andR = E - L
. - If the operation is
*
: we knowE = L * R
, soL = E / R
andR = E / L
. - If the operation is
-
: we knowE = L - R
, soL = E + R
andR = L - E
. - If the operation is
/
: we knowE = L / R
, soL = E * R
andR = L / E
.
Notice how in the above list we have two expressions that stand out. While in
general we have L_or_R = E <inverse_op> R_or_L
, for -
and /
we have
different formulas to compute L
and R
. This is because these operations are
not commutative.
Nonetheless, we can build another dict
to use as mapping for the operations to
perform in the same way we did for part 1. Each function will take E
as first
aprameter and the other side as second parameter. This time though we'll also
need to know if the value that we want to calculate is the left or the right: we
can do this with a bool
. We can mostly still use the functions from the
operator
module, except for calculating the right side of /
and -
.
# Calculate R or L knowing that E = L op R.
# True = need to calculate L given E and R as parameters.
# False = need to calculate R given E and L as parameters.
REV_OPMAP = {
('+', True ): sub,
('+', False): sub,
('*', True ): floordiv,
('*', False): floordiv,
('-', True ): add,
('-', False): lambda x, l: l - x,
('/', True ): mul,
('/', False): lambda x, l: l // x,
}
We can now apply the logic explained above to recursively follow the side of the
tree where our "unknown value" (humn
) is located and update its expected value
along the way:
def find_value(node, expected):
if node == 'humn':
# We reached the node we wanted and we now know its value.
return expected
l, op, r = T[node]
lvalue = calc(l)
rvalue = calc(r)
# We know expected = L op R...
if lvalue is None:
# Left side has unknown value, calculate its expected value and keep going.
expected = REV_OPMAP[op, True](expected, rvalue)
return find_value(l, expected)
# Right side has unknown value, calculate its expected value and keep going.
expected = REV_OPMAP[op, False](expected, lvalue)
return find_value(r, expected)
We can now find the answer we were looking for:
T['humn'] = None
l, _, r = T['root']
lvalue = calc(l)
rvalue = calc(r)
if lvalue is None:
answer = find_value(l, rvalue)
else:
answer = find_value(r, lvalue)
Since root
is L == R
, we could also re-write it as L - R
, knowing that the
resoult should be 0
, and simplify the above logic:
T['humn'] = None
T['root'][1] = '-'
answer = find_value('root', 0)
print('Part 2:', answer)
Problem statement — Complete solution — Back to top
Ah, moving inside a 2D ASCII grid yet again. What's different this time? Let's see...
We are given a 2D grid as input, which however looks kind of weird, it's not the
usual rectangle, but instead, it has multiple rectangular sections connected
together. There are walls (#
) and free cells (.
) we can walk on. Other than
the map, we are given a list of movement instructions in the form of a single,
long string. Instructions are of two types: turning 90 degrees in place (L
or
R
) or moving forward a given number of steps. Whenever we bump into a wall, we
need to stop advancing, and whenever we fall off of the grid we wrap around and
continue on the opposite side (i.e. if there isn't a wall immediately stopping
us there).
We start facing right, and we need to determine our final position and facing direction after following all the instructions. The answer to provide is a linear combination of our row, column and facing direction expressed as a number: 1000r + 4c + d.
The task looks simple enough. The only annoying part is the wrapping around, but that can be solved quite easily once we understand how to represent the grid.
Our input looks something like this (scaled down):
.#...#..
...#....
..#...#.
##..
.#.#
.#..
........
#.#...#.
#....#..
.#..
.#..
....
14R42L38R43L9L2L14L11...
We can only walk on free cells (.
), but not on empty ones (spaces), and
coordinates start from the top-left at row 1, column 1. We are initially on row
1 on the first free cell (.
). The final line of input contains the movement
instructions to follow. They are all weirdly compressed together, so we'll
somehow need to separate them later.
As the problem statement mentions, directions can be considered integers. Let's create some global constants for directions, cell types and row/column deltas to apply to move in a given direction:
WALL, FREE, EMPTY = '#. '
RIGHT, DOWN, LEFT, UP = range(4)
DIRMAP = [
(0, 1), # moving right
(1, 0), # moving down
(0, -1), # moving left
(-1, 0), # moving up
]
For now, let's parse this weirdly shaped grid into a list of strings. We can
just read the entire input and use .splitlines()
, then
extract the last line containing movement instructions and throw away the empty
one.
grid = []
with open(...) as fin:
grid = fin.read().splitlines()
moves = grid[-1] # Take last line with movement instructions.
grid = grid[:-2] # Throw away last two lines.
Because of the shape of the grid, not all lines of input are of the same length,
and in fact, the top part of the grid is way wider than the bottom one. To make
our life a lot easier, we can just take the maximum width and turn the grid into
an actual rectangle, expanding shorter rows with empty spaces through
.ljust()
:
HEIGHT, WIDTH = len(grid), max(map(len, grid))
for i in range(HEIGHT):
grid[i] = grid[i].ljust(WIDTH, EMPTY)
Additionally, we will need an easy way to detect whether we are going out of
bounds (and therefore need to wrap around). To avoid a bunch of annoying
bound-checking if
statements, we can simply add two rows of empty cells at the
top and at the bottom of the grid, as well as two columns of empty cells on the
left and the right. This also makes the actual grid start at grid[1][x]
(where
x
is the column of the leftmost free cell in the top row) instead of
grid[0][x]
, which is nice since the coordinate system used in the problem
statement considers the top-left as row 1, column 1.
for i in range(HEIGHT):
# Add two empty columns left and right.
grid[i] = EMPTY + grid[i].ljust(WIDTH, EMPTY) + EMPTY
HEIGHT += 2
WIDTH += 2
# Add two empty rows at the top and at the bottom.
grid = [EMPTY * WIDTH] + grid + [EMPTY * WIDTH]
Now, in order to iterate over the movement instructions we'll need to separate
turning instructions (L
, R
) from advancement instructions (numbers). We can
do this with a silly hack: add spaces around any L
and R
using
.replace()
, and then split the whole thing. We can then use
map()
with a lambda to transform integers into
actual int
s as needed:
moves = moves.replace('R', ' R ').replace('L', ' L ').split()
moves = tuple(map(lambda m: m if m in 'LR' else int(m), moves))
To simulate walking around the grid, we'll need to handle wrapping around it. Let's write a function to simulate one single step, and handle the wrapping.
The way our grid is shaped, wrapping around going in a certain direction means
scanning the opposite side of our grid until we find the first cell that is not
empty. We can do this with a simple for
loop, for example, wrapping from the
right side back to the left side while going right:
for c in range(WIDTH):
if grid[r][c] != EMPTY:
break
newc = c
The above for
loop scans the grid until it finds a cell that satisfies a
specific condition. With the help of a filtered
generator expression and the next()
built-in, this can be simplified to a single line of code:
newc = next(c for c in range(WIDTH) if grid[r][c] != EMPTY)
We can apply the above logic for all four directions. Here's the final function:
def step(r, c, direction):
dr, dc = DIRMAP[direction]
r += dr
c += dc
if grid[r][c] == EMPTY:
# We fell off the edge of the grid, we need to wrap to the opposite side.
if direction == RIGHT:
c = next(c for c in range(WIDTH) if grid[r][c] != EMPTY)
elif direction == LEFT:
c = next(c for c in range(WIDTH - 1, -1, -1) if grid[r][c] != EMPTY)
elif direction == DOWN:
r = next(r for r in range(HEIGHT) if grid[r][c] != EMPTY)
else:
r = next(r for r in range(HEIGHT - 1, -1, -1) if grid[r][c] != EMPTY)
return r, c
Since there's only one unique opposite cell for each border cell of our grid, we
if we want we can also cache the results through
lru_cache()
by moving the wrapping logic inside four
different functions, for example like this:
from functools import lru_cache
@lru_cache()
def leftmost_col(r):
return next(c for c in range(WIDTH) if grid[r][c] != EMPTY)
Now that we have a function to perform a single step in a given direction, we
can write a function to follow all the movement instructions in order. The only
thing we have to worry about is stopping at walls. The cell to start from will
be the leftmost free cell (.
) on row 1
, so we can use
.index()
to find the column.
We'll be encoding directions as an integer modulo 4
, that is from 0
to 3
:
each time we turn left (L
) we'll decrement the number, and each time we turn
right (R
) we'll increment it. When moving forward, we'll advance one step()
at a time until we either finish moving forward or hit a wall.
def walk(grid, moves):
r = 1
c = grid[1].index(FREE)
d = RIGHT
for move in moves:
if move == 'L':
d = (d - 1) % 4
elif move == 'R':
d = (d + 1) % 4
else:
for _ in range(move):
newr, newc = step(r, c, d)
# Stop if we hit a wall.
if grid[newr][newc] == WALL:
break
# Otherwise just keep going.
r, c = newr, newc
# Calculate the final answer.
return 1000 * r + 4 * c + d
All that's left to do is call the function we just wrote, which alredy calculates
the final result as 1000 * r + 4 * c + d
for us:
answer = walk(grid, moves)
print('Part 1:', answer)
For the second part of the problem, we are told that the weirdly-shaped grid we are working with is actually the net of a 3D cube (i.e. an "unfolded" cube laid out on a plane). The steps we need to make now take place on the outer surface of the cube, so wrapping around happens completely differently. The result we need to provide is the same as before: considering the grid we have, we want the value of 1000r + 4r + d given the final row (r), column (c) and facing direction (d).
"Falling off" of an edge now is a completely different story. It's very hard to visualize with only ASCII art, or even with just pen and paper, so I'd recommend drawing the net, cropping and folding an actual paper cube with uniquely numbered faces (from 1 to 6). This is what I did while originally solving the problem, and I doubt I would have been able to solve it without it.
This is the numbering we'll use to identify faces:
+----+----+
| 1 | 2 |
| | |
+----+----+
| 3 |
| |
+----+----+
| 4 | 5 |
| | |
+----+----+
| 6 |
| |
+----+
Now, considering that the above is the 2D net of a 3D cube, the funny thing about wrapping around the edges of the cube is that we can end up on an entirely different face, facing a different direction (as seen from the 2D net perspective). For example, if we reach the right edge of face 2 going right, we'll end up on the right edge of face 5 going left! Even weirder, if we reach the top edge of face 1 going up, we'll end up on the left edge of face 6 going right!
This is extremely counter-intuitive to reason about on the 2D net of our cube, but here's a diagram to illustrate it:
x<<<<<<<<<<<x
v ^
v +--^-+----+
v | 1^ | 2 |
v | | >>>>x
v +----+----+ v
v | 3 | v
v | | v
v +----+----+ v
v | 4 | 5 | v
v | | <<<<<<<<<x
v +----+----+
v | 6 |
x>>>>> |
+----+
If we had to solve a more generalized version of the problem, we would also have to deal with the fact that a 3D cube can actually be represented by 11 different 2D nets (layed out in different shapes). Thankfully, we are given a fixed net layout, and thus we know which edges of which faces are adjacent to one another. If this wasn't the case, we would also need to detect that and act accordingly, figuring out the wrapping rules for any possible net.
In any case, after building and inspecting our own paper cube with faces labeled
as above, the wrapping becomes much simpler to understand. In terms of code,
we'll need a face()
function to understand on which face we are standing and
a new step()
function to apply the wrapping rules correctly.
Our faces are 50x50, and we already know the minimum and maximum row/column for each of them:
1 51 101 151
| | | |
| +----+----+--- 1
| | 1 | 2 |
| | | |
| +----+----+--- 51
| | 3 |
| | |
+----+----+-------- 101
| 4 | 5 |
| | |
+----+----+-------- 151
| 6 |
| |
+----+------------- 201
Accounting for the fact that we added one additional row/column on each side, we
can build a simple global tuple
of constants to identify the bounds of our six
faces:
FACES = (
# rmin, rmax, cmin, cmax
( 1 , 50 , 51 , 100 ),
( 1 , 50 , 101 , 150 ),
( 51 , 100 , 51 , 100 ),
( 101 , 150 , 1 , 50 ),
( 101 , 150 , 51 , 100 ),
( 151 , 200 , 1 , 50 )
)
Given a row and column, we can now identify the face by iterating over the above tuple and checking the coordinates against the bounds.
Another important thing to remember is that whenever we need to wrap around and switch faces, we will need to keep the correct offset on the edge of the new face. For example, if we are on face 1 going up, we'll wrap to the left edge of face 6. On this edge, the distance from the top of face 6 will be the same as the previous distance from the left of face 1.
x<<<<<<<<<<<<<<<<<<x
v x<<<<<<<<<x ^
v v +^----^+
v v |^ ^|
v v | 1 |
v v | |
v v +------+
v v
v v +------+
v x>>> |
v | 6 |
x>>>>>>> |
+------+
Similarly, if we are on face 1 going left, we'll wrap to the left edge of face 4: this time the distance from the top of face 4 will be the same as the previous distance from the bottom of face 1.
+------+
x<<<<<<<<<<<<<< |
v | 1 |
v x<<<<<<<<<< |
v v +------+
v v
v v +------+
v x>>> |
v | 4 |
x>>>>>>> |
+------+
To take the above into account, in our face()
function, in addition to
identifying the face we are on, we'll also calculate the relative row and
column within the face. This will later be useful to calculate the exact
position on the edge of the new face after wrapping.
To identify the face, we'll iterate over the previously created FACE
global
tuple, using enumerate()
to start counting from 1
:
def face(r, c):
for face_id, (rmin, rmax, cmin, cmax) in enumerate(FACES, 1):
if rmin <= r <= rmax and cmin <= c <= cmax:
return face_id, r - rmin + 1, c - cmin + 1
Now we'll need a new step()
function, let's call it step_cube()
. First,
we'll calculate the new position as if we don't need to wrap. If the new
position is not an empty cell we're done, otherwise we'll need to wrap around.
Using the face()
function, we can identify the current face and calcualte the
relative position within it. Then, the logic to apply is pretty boring and
repetitive, but the reasoning is the same as explained above for face 1. I'm not
going to copy-paste the entire function here as it's quite long, but you can
check it out in my complete solution.
def step_cube(r, c, direction):
dr, dc = DIRMAP[direction]
newr = r + dr
newc = c + dc
if grid[newr][newc] != EMPTY:
return newr, newc, direction
# We fell off an edge, we need to wrap to another face...
face_id, fr, fc = face(r, c)
# fr = face-relative row (from the top left of the face)
# fc = face-relative column (from the top left of the face)
if face_id == 1:
if direction == UP:
# We'll end up on face 6 going right, the new row will be 150 + the
# distance from the left of face 1 (i.e. fr), and the new column is
# always 1.
return fc + 150, 1, RIGHT
# direction == LEFT -> we'll end up on face 4 going right, the new row
# will be 100 + the distance from the *bottom* of face 1 (i.e. 51 - fr),
# and the new column is always 1.
return (51 - fr) + 100, 1, RIGHT
if face_id == 2:
...
# ... and so on for all the other faces...
As you may have noticed above, the new step_cube()
function also returns the
new direction to move in, since changing face also changes the direction we are
facing from the perspective of our 2D net. We can generalize our existing
walk()
function to take the step function to use as an argument with a couple
of modifications:
-def walk(grid, moves):
+def walk(grid, moves, step_func=step):
r = 1
c = grid[1].index(FREE)
d = RIGHT
for move in moves:
if move == 'L':
d = (d - 1) % 4
elif move == 'R':
d = (d + 1) % 4
else:
for _ in range(move):
- newr, newc = step(r, c, d)
+ newr, newc, newd = step_func(r, c, d)
if grid[newr][newc] == WALL:
break
- r, c = newr, newc
+ r, c, d = newr, newc, newd
return 1000 * r + 4 * c + d
Given the way we modified walk()
we also need to return a direction from our
original step()
function, so let's do that:
def step(r, c, direction):
# ...
- return r, c
+ return r, c, direction
The final solution is again only one function call away:
answer = walk(grid, moves, step_cube)
print('Part 2:', answer)
Problem statement — Complete solution — Back to top
Today's problem involves pathfinding, but with a twist. We are in a rectangular
grid surrounded by walls (#
), where each cell is either free or represents a
blizzard (<
, >
, ^
, v
). Given a start (2nd column of the first row) and
a destination (second-to-last column of the last row), we need to navigate the
grid being careful not to step in a blizzard or into a wall.
Problem is, each instant of time, every blizzard moves forward one step in the
direction it is facing (i.e. >
moves to the right). Whenever a blizzard
reaches one of the four walls of the grid, it wraps around the same row/column,
continuing from the opposite side. Furthermore, a single cell can contain more
than one blizzard (each going in a different direction) at any given time.
Each instant of time we can either move in one of the 4 cardinal directions (north, south, east, west) or wait in place. We need to calculate the minimum amount amount of time needed to reach the destination avoiding blizzards.
Let's start by parsing the input into a grid: after reading the entire input,
.splitlines()
is all we need.
with open(...) as fin:
grid = fin.read().splitlines()
Now, if it weren't for the added difficulty of the moving blizzards, today's
solution would be standard BFS on a grid. However, we'll need to
simulate the evolution of the blizzard in time. Keeping around the actual grid
and editing its content each iteration is slow and error-prone. The simplest way
to represent cells occupied by a blizzard would be a set()
of coordinates so
that we can easily check whether a certain position contains any blizzard with
(r, c) in blizzard
. However, this is not enough as we also need to keep track
of the direction each blizzard is moving. To do this, we can use a dictionary of
the form {coords: list_of_directions}
: each cell (at a given coordinate) can
contain more than one blizzard at a given time, so this is an easy way to track
both the number of blizzards present and all their directions.
A minimal example can help understand how multiple blizzards going in different directions can end up overlapping on the same cell:
#.##### #.##### #.##### #.#####
#.>.<.# ==> #..2..# ==> #.<.>.# ==> #<...># ==> ...
#####.# #####.# #####.# #####.#
To easily advance each blizzard in the right direction, we can remember
directions as tuples of the form (delta_r, delta_c)
. Let's create a map that
associates each character representing a blizzard with one of these:
DIRMAP = {
'>': ( 0, 1),
'<': ( 0, -1),
'v': ( 1, 0),
'^': (-1, 0),
}
We know that each blizzard will wrap around and continue from the opposite side
of the row/column once it reaches a wall: if we discard all walls and only
consider the internal rectangle (enclosed in the walls) as the grid to work on,
the first row and column will start at 0
, and wrapping around can later be
handled with a modulo operation, e.g. (row + 1) % height
.
01234567
0 #.###### 012345
1 #>>.<^<# 0 >>.<^<
2 #.<..<<# =====> 1 .<..<<
3 #>v.><># 2 >v.><>
5 ######.#
Actual grid Simplified grid
Let's now iterate over the grid, ignoring the first and last row and the first
and last column of each row (which we know are all going to be walls). For each
blizzard we encounter, we'll remember its position and direction. The
enumerate()
built-in comes in handy here.
bliz = {}
for r, row in enumerate(grid[1:-1]):
for c, cell in enumerate(row[1:-1]):
if cell in DIRMAP:
# There's a blizzard here (< > ^ v). Its directoin (delta_r, delta_c)
# is given by the DIRMAP we created above.
bliz[r, c] = [DIRMAP[cell]]
Since we discarded the external walls, the actual height and width of the internal rectangle are 2 units less than the original grid's height and width. The starting position is now at the first column of the row immediately above the first one, and the destination is at the last column of the row immediately below the last one. Let's calculate these values for later:
height, width = len(grid) - 2, len(grid[0]) - 2
src, dst = (-1, 0), (height, width - 1)
Now into the actual guts of the problem: let's write a function to "evolve" the
current blizzards moving each of them one step forward, and returning a new
dictionary representing the new state of the blizzards. Each key in the bliz
dictionary we just created represents the coordinates of a cell containing one
or more blizzards, and their directions are listed in the associated value (a
list
).
We can iterate over the .items()
of the dictionary, applying
the deltas listed in the directions for each coordinate. A
defaultdict
of list
makes the whole operation
more convenient.
from collections import defaultdict
def evolve_blizzard(bliz, width, height):
newbliz = defaultdict(list)
for (r, c), directions in bliz.items():
for dr, dc in directions:
newr = (r + dr) % height
newc = (c + dc) % width
newbliz[newr, newc].append((dr, dc))
return newbliz
Given the coordinates of our position and current the state of the blizzards, we can now calculate all the possible neighboring coordinates we can move to. These will be the ones inside the bounds of our simplified grid and not overlapping with any blizzard. In addition to those, we can also stand still if a blizzard does not hit us. Finally, one last thing to consider is that our destination is at the last column and one row below the last row, which is technically out of bounds: we can special-case this check.
Let's write a generator function to yield
all valid
coordinates we can move to given the current position.
def neighbors(bliz, r, c, height, width):
# Right above the destination? Even if out of bounds, that's also a valid position to move to.
if r == height - 1 and c == width - 1:
yield height, c
# For each of the 4 cardinal directions...
for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
rr, cc = r + dr, c + dc
# ... check if we are in bounds and if there is NO blizzard here...
if 0 <= rr < height and 0 <= cc < width and (rr, cc) not in bliz:
# ... if so, we can move here.
yield rr, cc
# We can also stand still if no blizzard hits us.
if (r, c) not in bliz:
yield r, c
Now for the heart of the problem: as anticipated, we can perform an iterative breadth-first search from the starting position until we reach the destination, tracking time instead of distance. This works because each "step" we take is unitary, meaning it advances the time of exactly 1 unit. Contrary to a "traditional" BFS though, in this case we can re-visit cells more than once. In fact, merely standing still means only advancing the time, visiting the same cell two times in a row. We can therefore avoid keeping track of visited cells, but we still need to keep track of where we are.
We'll keep track of all the possible positions we can be at the current time
using a set
. A list
or another similar dat structure would still work, but
we'd have to deal with unneeded duplicates, and a set
takes care of those for
us. Initially (time = 0
) this will only include the start position. We'll then
advance one step forward in time each iteration, evolving the blizzards and
figuring out the new reachable positions given by applying the neighbors
function to all the current positions we are tracking. If we ever find the
destination among those, we can stop and return the current time.
def bfs(bliz, src, dst, height, width):
positions = {src}
time = 0
# While the destination is not reached.
while dst not in positions:
# Advance time and evolve blizzards moving them around.
time += 1
bliz = evolve_blizzard(bliz, width, height)
# For each possible position we are tracking, calculate the next valid
# positions, and add them to a new set.
new_positions = set()
for pos in positions:
neighs = neighbors(bliz, *pos, height, width)
new_positions.update(neighs)
# Track these new positions in the next iteration.
positions = new_position
return time
We can also refactor the above inner loop into an advance()
generator function
that takes a set of positions and yields any new valid position:
def advance(bliz, positions, height, width):
for pos in positions:
yield from neighbors(bliz, *pos, height, width)
This makes the body of our bfs()
function slimmer:
def bfs(bliz, src, dst, height, width):
positions = {src}
time = 0
while dst not in positions:
time += 1
bliz = evolve_blizzard(bliz, width, height)
positions = set(advance(bliz, positions, height, width))
return time
All that's left to do is call the above function with the right parameters:
time = bfs(bliz, src, dst, height, width)
print('Part 1:', time)
After reaching the destination, we need to go back to the start, and then go back to the destination again. The end goal is still to figure out the minimum amount of time needed, but the path is now longer.
No big deal, we already have almost all we need, we only need to make a few
modifications to our functions. First of all, since now we can also go back to
the starting position, we need to special-case that too in the neighbors()
function:
def neighbors(bliz, r, c, height, width):
- if r == height - 1 and c == width - 1:
+ if r == 0 and c == 0:
+ yield -1, 0
+ elif r == height - 1 and c == width - 1:
yield height, c
# ... unchanged ...
Secondly, we now need to remember the final state of all the blizzards after the
first search. We can just return that along with the time in our bfs()
function:
def bfs(bliz, src, dst, height, width):
# ... unchanged ...
- return time
+ return time, bliz
After solving part 1, we'll overwrite the bliz
variable with the new state:
time1, bliz = bfs(bliz, src, dst, height, width)
print('Part 1:', time1)
And for part 2, we'll now perform two more bfs()
calls: one to go back to
src
from dst
, and one to go back to dst
again. The total time taken will
will be the sum of the 3 times.
time2, blitz = bfs(bliz, dst, src, height, width)
time3, _ = bfs(bliz, src, dst, height, width)
total_time = time1 + time2 + time3
print('Part 2:', total_time)
Problem statement — Complete solution — Back to top
Last problem of the year, and a fairly simple one thankfully. We are given a
list of numbers in a weird base-5 representation called "SNAFU"
(LOL at this name choice). The digits used are 210-=
, where -
means -1 and =
means -2. We need to sum all these numbers, and calculate
the total in the same base (i.e. using these digits).
The first thing we'll need is a function to convert from SNAFU (weird base 5) to
base 10. It's fairly straightforward: given a number, iterate over its digits,
convert them to their actual values, and multiply by the appropriate power of 5,
accumulating the result. For simplicity, let's use a dict
to map SNAFU digits
to their actual values:
VALUES = {'=': -2, '-': -1, '0': 0, '1': 1, '2': 2}
Then, for the actual conversion, we can either start with 50 (1) or and iterate over the digits in reverse order, or start with 5L-1 (where L is the total number of digits) and iterate normally. I chose the latter.
def base10(n):
power = 5 ** (len(n) - 1)
res = 0
for digit in n:
res += VALUES[digit] * power
power //= 5
return res
We can now convert every number in our input to base 10 and calculate their sum. It's just a question of iterating and stripping input lines:
total = 0
with open(...) as fin:
for line in fin:
total += base10(line.strip())
Or more concisely, using map()
+ sum()
,
with the help of .splitlines()
to automatically trim away
newlines:
with open(...) as fin:
total = sum(map(base10, fin.read().split()))
The opposite conversion, from base 10 to SNAFU, is a little bit trickier. We
cannot do a "normal" base conversion as we are used to (we are, right?), because
we have negative digits (-
, =
). However, we can start with the code for a
normal conversion and see how we can get there.
This is what a normal base 10 to base 5 conversion would look like:
def base10_to_base5(n):
res = ''
while n:
n, digit = divmod(n, 5)
res += str(digit)
return res[::-1]
The divmod()
built-in used above simply calculates both
quotient and remainder of a division, it's the same as
n, digit = n // 5, n % 5
.
Now, the problem with the above conversion algorithm is that we will get digits
higher than 2
, which are not contemplated in SNAFU notation and cannot be
represented. However, 3
can be seen as 5 - 2
and 4
can be seen as 5 - 1
.
In fact, we can represent 3
and 4
with two SNAFU digits: 3
is 1=
(1×51 - 2×50) and 4
is 1-
(1×51 -
1×50). This also works for higher powers: 3×57 =
1x58 - 2×57; 4×59 = 1x510 -
1×59. Therefore, for any digit higher than 2
, we can increment
the next digit of the number, and then "borrow" from it.
Given the above, the function to convert a number from base 10 to SNAFU can be written as follows:
def snafu(n):
res = ''
while n:
n, digit = divmod(n, 5)
if digit > 2:
# Add 1 more of the next (higher) power of 5
n += 1
# Borrow from it:
if digit == 3:
res += '=' # Take 2 x current power of 5.
else: # digit == 4
res += '-' # Take 1 x current power of 5.
else:
res += str(digit)
return res[::-1]
Compacting and simplifying the above logic, we have:
def snafu(n):
res = ''
while n:
n, digit = divmod(n, 5)
res += '012=-'[digit]
n += digit > 2
return res[::-1]
All that's left to do is convert the total we previously calculated:
answer = snafu(total)
print('Part 1:', answer)
And as usual, no part 2 for day 25. Merry Christmas!
Copyright © 2022 Marco Bonelli. This document is licensed under the Creative Commons BY-NC-SA 4.0 license.