I haven’t participated in Advent of Code before, but I’ve always been curious.
What is advent of code?
It’s an Advent calendar for programmers: you get 25 daily challenges starting on December 1st. Caveat: each day’s challenge has two parts, and you have to solve the first part to unlock the second 🙂
Day 1 Challenge – Part 1
On the first day, the task is to count how many times a value is bigger than its predecessor. We get some sample data:
199 N/A
200 bigger
208 bigger
210 bigger
200 smaller
207 bigger
240 bigger
269 bigger
260 smaller
263 bigger
Counting the values marked bigger gives seven.
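The counting rule is easy to check on the sample before touching the real file. A minimal sketch in plain Python; the names `sample` and `increases` are made up for this illustration:

```python
# Sample measurements from the puzzle description.
sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]

# Pair every value with its successor and count the pairs that go up.
increases = sum(1 for prev, curr in zip(sample, sample[1:]) if curr > prev)
print(increases)  # 7
```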
The actual data contains 2000 rows. This isn’t exactly big data, but I wanted to dust off my pandas skills, so here we go:
Let’s look at the data
import pandas as pd
df = pd.read_csv("./aoc_day_01_data.txt", header=None)
df.describe
With the read_csv() function we read our data file into a data frame. It’s important to pass header=None; otherwise pandas treats the first row as the column header.
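Without parentheses, df.describe is not called, so no summary statistics are computed. The notebook shows the bound method instead, and its repr happens to include the frame itself: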
<bound method NDFrame.describe of          0
0      159
1      158
2      174
3      196
4      197
...    ...
1995  8538
1996  8543
1997  8545
1998  8557
1999  8568

[2000 rows x 1 columns]>
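If the goal is a quick look at the data rather than at the method object, something like the following may be more useful. It is only a sketch: it uses the ten sample values in place of the real file, and the variable names are illustrative.

```python
import pandas as pd

# Ten sample values standing in for the real file; the single column is named 0.
sample = pd.DataFrame([199, 200, 208, 210, 200, 207, 240, 269, 260, 263])

print(sample.shape)   # (rows, columns)
print(sample.head())  # first five rows

# Calling describe() with parentheses computes summary statistics.
summary = sample.describe()
print(summary)
```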
Because we want to reference the column by name, we add a column header:
df.columns = ["original"]
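To see what header=None changes, here is a small sketch that reads three made-up rows from an in-memory string instead of the data file. It also shows that read_csv accepts a names argument, so the column could be labelled in the same call; that is an alternative, not what the notebook does.

```python
import io
import pandas as pd

raw = "199\n200\n208\n"  # three made-up rows standing in for the data file

# Default: the first row is consumed as the column header.
with_header = pd.read_csv(io.StringIO(raw))
print(len(with_header), list(with_header.columns))  # 2 ['199']

# header=None keeps all rows; names= labels the column in the same call.
named = pd.read_csv(io.StringIO(raw), header=None, names=["original"])
print(len(named), list(named.columns))  # 3 ['original']
```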
To compare the nth cell with its (n+1)th neighbour, we add a new column that holds the same values shifted up by one row:
df['shifted'] = df['original'].shift(-1)
The output looks like this:
index | original | shifted |
---|---|---|
0 | 159 | 158.0 |
1 | 158 | 174.0 |
2 | 174 | 196.0 |
3 | 196 | 197.0 |
4 | 197 | 194.0 |
… | … | … |
1995 | 8538 | 8543.0 |
1996 | 8543 | 8545.0 |
1997 | 8545 | 8557.0 |
1998 | 8557 | 8568.0 |
1999 | 8568 | NaN |
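The same step on the ten sample values shows why the shifted column turns into floats: the last row has no successor, so pandas fills it with NaN, and NaN only fits in a float column. A small sketch (the frame name `sample` is illustrative):

```python
import pandas as pd

sample = pd.DataFrame({"original": [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]})

# shift(-1) moves every value up one row, so row n holds the value of row n+1.
sample["shifted"] = sample["original"].shift(-1)

print(sample["shifted"].dtype)         # float64, because NaN needs a float column
print(sample["shifted"].iloc[0])       # 200.0
print(sample["shifted"].isna().sum())  # 1, the last row has no successor
```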
We add another column where we place the value True when the value from the current row in the shifted column is bigger than in the original column:
df['increased'] = (df['shifted'] > df['original'])
Now it starts to look like the sample data from the introduction:
index | original | shifted | increased |
---|---|---|---|
0 | 159 | 158.0 | False |
1 | 158 | 174.0 | True |
2 | 174 | 196.0 | True |
3 | 196 | 197.0 | True |
4 | 197 | 194.0 | False |
… | … | … | … |
1995 | 8538 | 8543.0 | True |
1996 | 8543 | 8545.0 | True |
1997 | 8545 | 8557.0 | True |
1998 | 8557 | 8568.0 | True |
1999 | 8568 | NaN | False |
The last thing we have to do is count how many times True occurs:
true_count = df['increased'].sum()
which gives us 1583.
This feels like a bit of a hack because it relies on True counting as 1 and False as 0, although Python and pandas do guarantee that behaviour.
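For what it’s worth, the True-is-1 behaviour is part of the language: bool is a subclass of int. A tiny plain-Python check, with a made-up list of flags:

```python
# bool is a subclass of int in Python, so True and False behave like 1 and 0.
print(isinstance(True, int))  # True
print(True + True)            # 2

flags = [False, True, True, True, False]
total = sum(flags)
print(total)  # 3
```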
A more elegant solution is to use value_counts:
df['increased'].value_counts(dropna=False)
Now the output is:
True     1583
False     417
Name: increased, dtype: int64
And 1583 is the number we are looking for. This earned us our first gold star and unlocked the second part of the challenge.
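As an aside, the shift-and-compare steps can probably be collapsed into one expression with Series.diff(), which subtracts the previous row from each row. This is an alternative sketch on the sample data, not what the notebook above does:

```python
import pandas as pd

sample = pd.Series([199, 200, 208, 210, 200, 207, 240, 269, 260, 263])

# diff() is NaN for the first row; NaN > 0 evaluates to False, so it is not counted.
increases = int((sample.diff() > 0).sum())
print(increases)  # 7
```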
Part 2
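The second part is a bit more challenging: we have to sum each sliding window of three adjacent values and count how often a window’s sum is bigger than the sum of the window before it. Consecutive windows overlap, as the letters in the sample show: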
199 A
200 A B
208 A B C
210 B C D
200 E C D
207 E F D
240 E F G
269 F G H
260 G H
263 H
I created a new notebook and started, as in part 1, by reading the data and naming the first column:
import pandas as pd
df = pd.read_csv("./aoc_day_01_data.txt", header=None)
df.columns = ["original"]
To add the sum of three values to the row of the first value, we use the following code:
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=3)
df["rolling_sum"] = df.original.rolling(window=indexer).sum()
This demonstrates the power of pandas once more: sliding-window functions are built in!
The rest is the same as in part one: shift, compare and count.
df['shifted_rs'] = df['rolling_sum'].shift(-1)
df['increased_rs'] = (df['shifted_rs'] > df['rolling_sum'])
true_count = df['increased_rs'].sum()
true_count
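To sanity-check the whole part 2 pipeline, here it is once more on the ten sample values, where the window sums are small enough to verify by hand. This is a sketch with an illustrative frame name. The rows at the very end have no complete window, but they cannot produce a spurious increase, so the count is unaffected.

```python
import pandas as pd

sample = pd.DataFrame({"original": [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]})

# Forward-looking window: each row gets the sum of itself and the next two rows.
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=3)
sample["rolling_sum"] = sample["original"].rolling(window=indexer).sum()

# Shift, compare and count, exactly as in part 1.
sample["shifted_rs"] = sample["rolling_sum"].shift(-1)
sample["increased_rs"] = sample["shifted_rs"] > sample["rolling_sum"]
increases = int(sample["increased_rs"].sum())

print(sample["rolling_sum"].iloc[0])  # 607.0
print(increases)                      # 5
```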
As a little finger exercise, I did the same with vanilla Python:
# Read one integer per line.
data = []
with open("./aoc_day_01_test_data.txt") as f:
    for line in f:
        data.append(int(line.rstrip()))

# Sum every window of three consecutive values.
triplet_sums = []
for i, v in enumerate(data):
    if i < (len(data) - 2):
        triplet_sum = data[i] + data[i+1] + data[i+2]
        triplet_sums.append(triplet_sum)
print(triplet_sums)

# Count the sums that are larger than the sum before them.
sums_larger_than_previous_sums = 0
for i, v in enumerate(triplet_sums):
    if i < (len(triplet_sums) - 1):
        if triplet_sums[i] < triplet_sums[i+1]:
            sums_larger_than_previous_sums += 1
print(sums_larger_than_previous_sums)
It works, but it is less elegant.
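A possible shortcut, offered as an aside and not what the code above does: two consecutive windows share their two middle values, so the later sum is bigger exactly when the value entering the window is bigger than the value leaving it. Under that observation the sums never need to be computed. A sketch on the sample data:

```python
sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]

# Window i covers sample[i:i+3], window i+1 covers sample[i+1:i+4].
# Their sums differ only by sample[i+3] - sample[i].
increases = sum(1 for leaving, entering in zip(sample, sample[3:]) if entering > leaving)
print(increases)  # 5
```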
Stay tuned for more!