2021 - Advent of Code - Day 1

I haven’t participated in Advent of Code before, but I’ve always been curious.

What is Advent of Code?

It’s an Advent calendar for programmers. Starting December 1st you get one challenge per day, 25 in total. Caveat: each challenge has two parts, and you have to solve the first part to unlock the second 🙂

Day 1 Challenge – Part 1

On the first day, the first task is to count how many times a value is bigger than its predecessor. They give us some sample data:

199 N/A
200 bigger
208 bigger
210 bigger
200 smaller
207 bigger
240 bigger
269 bigger
260 smaller
263 bigger

Counting the rows marked “bigger” gives seven.
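The counting rule is small enough to check in plain Python before reaching for a library. A minimal sketch using the ten sample values above; the helper name count_increases is made up for illustration:

```python
# Sample values from the puzzle description.
sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]


def count_increases(values):
    """Count how often a value is larger than the one directly before it."""
    # zip pairs each value with its successor; the first value has no
    # predecessor, so it is never counted.
    return sum(1 for previous, current in zip(values, values[1:]) if current > previous)


print(count_increases(sample))  # 7
```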

The actual data contains 2000 rows. This isn’t exactly big data, but I wanted to dust off my pandas skills, so here we go:

Let’s look at the data

import pandas as pd

df = pd.read_csv("./aoc_day_01_data.txt", header=None)
df

With the read_csv() function we can read our data file into a data frame. It’s important to pass header=None; otherwise pandas assumes the first row is a column header.

Evaluating df in the notebook gives us:

         0
0      159
1      158
2      174
3      196
4      197
...    ...
1995  8538
1996  8543
1997  8545
1998  8557
1999  8568

[2000 rows x 1 columns]

Because we want to reference the column by name, we add a column header:

df.columns = ["original"]
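To see what header=None changes, and how the column name could be set in the same call, here is a small sketch. It reads from an in-memory string instead of the puzzle file, so the three values are purely illustrative; names is a regular read_csv parameter.

```python
import io

import pandas as pd

raw = "199\n200\n208\n"  # illustrative stand-in for the puzzle input file

# Default behaviour: the first row is consumed as the column header.
with_header = pd.read_csv(io.StringIO(raw))
print(len(with_header))  # 2 rows, the value 199 became the column name

# header=None keeps every row as data; names labels the column right away.
named = pd.read_csv(io.StringIO(raw), header=None, names=["original"])
print(len(named))  # 3
print(list(named.columns))  # ['original']
```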

To compare the nth cell with its (n+1)th neighbour, we add a new column with the values shifted up by one row:

df['shifted'] = df['original'].shift(-1)

The output looks like this:

	original	shifted
0	159	158.0
1	158	174.0
2	174	196.0
3	196	197.0
4	197	194.0
…	…	…
1995	8538	8543.0
1996	8543	8545.0
1997	8545	8557.0
1998	8557	8568.0
1999	8568	NaN
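The decimal points and the NaN in the last row both come from how shift works. A short sketch on four of the sample values shows the effect in isolation:

```python
import pandas as pd

df = pd.DataFrame({"original": [199, 200, 208, 210]})

# shift(-1) moves every value one row up, so each row sees its successor.
df["shifted"] = df["original"].shift(-1)

print(df["shifted"].tolist())  # [200.0, 208.0, 210.0, nan]

# The last row has no successor and becomes NaN. NaN is a float, so the
# shifted column is stored as float64 even though the input was integer.
print(df["shifted"].dtype)  # float64
```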

We add another column where we place the value True when the value from the current row in the shifted column is bigger than in the original column:

df['increased'] = (df['shifted'] > df['original'])

Now it starts to look like the sample data from the introduction:

	original	shifted	increased
0	159	158.0	False
1	158	174.0	True
2	174	196.0	True
3	196	197.0	True
4	197	194.0	False
…	…	…	…
1995	8538	8543.0	True
1996	8543	8545.0	True
1997	8545	8557.0	True
1998	8557	8568.0	True
1999	8568	NaN	False
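The False in the last row needs no special handling: any ordering comparison with NaN evaluates to False, so the row without a successor never counts as an increase. A tiny check, using a single made-up row:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"original": [263.0], "shifted": [np.nan]})

# Comparisons involving NaN are False in both directions.
print(np.nan > 263)  # False
print(np.nan < 263)  # False
print((df["shifted"] > df["original"]).tolist())  # [False]
```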

The last thing we have to do is count how many times True occurs:

true_count = df['increased'].sum()

This gives us 1583.

This can feel like a bit of a hack because it relies on True counting as 1 and False as 0.
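That equivalence is part of Python itself, since bool is a subclass of int, so summing booleans is safe to rely on. A quick check with made-up flag values:

```python
import pandas as pd

flags = pd.Series([True, False, True, True])

# bool is a subclass of int, so True == 1 and False == 0 by definition.
print(isinstance(True, int))  # True
print(True + True)  # 2

# Summing a boolean Series therefore counts its True entries.
print(flags.sum())  # 3
```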

A more elegant solution is to use value_counts:

df['increased'].value_counts(dropna=False)

Now the output is:

True     1583
False     417
Name: increased, dtype: int64
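value_counts returns a Series indexed by the distinct values, so the number of True entries can be pulled out by label instead of being read off the printout. A small sketch with made-up data standing in for the increased column:

```python
import pandas as pd

increased = pd.Series([False, True, True, True, False], name="increased")

counts = increased.value_counts(dropna=False)

# Look up the count stored under the label True. get() returns the
# fallback 0 instead of raising a KeyError if no True is present.
true_count = counts.get(True, 0)
print(true_count)  # 3
```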

And 1583 is the number we are looking for. This earned us our first gold star and unlocked the second part of the challenge:

Part 2

The second part is a bit more challenging because we have to sum up three adjacent values and compare each sum with the sum of the next three-value window, which starts one row later:

199  A       
200  A B     
208  A B C   
210    B C D
200  E   C D
207  E F   D
240  E F G
269    F G H
260      G H
263        H
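One observation the diagram suggests: windows A and B share the values 200 and 208, so comparing their sums boils down to comparing the value that enters the window (210) with the one that leaves it (199). The following sketch is not part of the original solution; it checks this shortcut against the explicit window sums on the sample values:

```python
sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]

# Explicit approach: build every three-value window sum, then compare neighbours.
window_sums = [sum(sample[i:i + 3]) for i in range(len(sample) - 2)]
explicit = sum(1 for a, b in zip(window_sums, window_sums[1:]) if b > a)

# Shortcut: adjacent windows share two values, so only the value entering
# the window and the value leaving it (three positions apart) matter.
shortcut = sum(1 for old, new in zip(sample, sample[3:]) if new > old)

print(window_sums)  # [607, 618, 618, 617, 647, 716, 769, 792]
print(explicit, shortcut)  # 5 5
```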

I created a new notebook and, as in part 1, started by reading the data and naming the first column:

import pandas as pd

df = pd.read_csv("./aoc_day_01_data.txt", header=None)
df.columns = ["original"]

To add the sum of three values to the row of the first value, we use the following code:

indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=3)
df["rolling_sum"] = df.original.rolling(window=indexer).sum()
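To make the forward-looking window concrete, here is a sketch on the ten sample values. It passes min_periods=3 explicitly, which is an addition not found in the code above, so that the incomplete windows at the end are NaN regardless of the library default.

```python
import pandas as pd

sample = pd.DataFrame({"original": [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]})

# Each row receives the sum of itself and the two rows that follow it.
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=3)
sample["rolling_sum"] = sample["original"].rolling(window=indexer, min_periods=3).sum()
print(sample["rolling_sum"].tolist())

# For comparison: a plain integer window looks backwards, so the same sums
# land on the last row of each window instead of the first.
trailing = sample["original"].rolling(3).sum()
print(trailing.tolist())
```

Both variants produce the same sequence of sums, only aligned to different rows, so either placement works for counting increases.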

This demonstrates the power of pandas once more: it has built-in sliding window functions!

The rest is the same as in part 1: shift, compare and count.

df['shifted_rs'] = df['rolling_sum'].shift(-1)
df['increased_rs'] = (df['shifted_rs'] > df['rolling_sum'])
true_count = df['increased_rs'].sum()
true_count
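The shift-and-compare pair could also be expressed with diff, which subtracts the previous row from the current one. A minimal sketch, assuming the eight window sums of the sample data as input rather than the real puzzle file:

```python
import pandas as pd

# Three-value window sums of the sample data from the puzzle description.
sums = pd.Series([607, 618, 618, 617, 647, 716, 769, 792], name="rolling_sum")

# diff() yields NaN for the first row (no predecessor); NaN > 0 is False,
# so only genuine increases are counted.
increases = int((sums.diff() > 0).sum())
print(increases)  # 5
```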

As a little finger exercise (Fingerübung), I did the same in vanilla Python:

# Read one integer per line from the input file.
data = []
with open("./aoc_day_01_test_data.txt") as f:
    for line in f:
        data.append(int(line.rstrip()))

# Sum every run of three consecutive values.
triplet_sums = []
for i in range(len(data) - 2):
    triplet_sum = data[i] + data[i+1] + data[i+2]
    triplet_sums.append(triplet_sum)
print(triplet_sums)

# Count how often a sum is larger than the one before it.
sums_larger_than_previous_sums = 0
for i in range(len(triplet_sums) - 1):
    if triplet_sums[i] < triplet_sums[i+1]:
        sums_larger_than_previous_sums += 1

print(sums_larger_than_previous_sums)

Which works but is less elegant.

Stay tuned for more!
