Advent of Code 2020

Advent of Code is a series of small programming challenges, released daily throughout December in the run-up to Christmas. Part 1 of the challenge is given first. On its successful completion, Part 2 is revealed. The challenges are designed to be solved in any programming language. I will be using R.

There will no doubt be a wide variety of ways to solve these problems. I’m going to go with the first thing I think of that gets the right answer. In most cases, I expect that there will be more concise and efficient solutions. Most of the time I’m working in R, it’s within the tidyverse, so I imagine that framework will feature heavily below.

Each participant gets different input data, so my numerical solutions may be different from others. If you’re not signed up for Advent of Code yourself, but want to follow along with my data, you can download it at from the data links at the beginning of each day’s section. The links in the day section headers take you to challenge on the Advent of Code page.

Day 1: Report Repair

My day 1 data

Part 1: Two numbers

The challenge is to find two numbers from a list that sum to 2020, then to report their product.

expand.grid() creates a data frame from all combinations of the supplied vectors. Since the vectors are the same, each pair is duplicated. In this case the two numbers in the list that sum to 2020 are 704 and 1316, and we have one row with 704 as Var1 and one with 704 as Var2. slice(1) takes the first occurrence of the pair.

Toggle the code

library(dplyr)

expenses <- readLines("data/AoC_day1.txt") %>%
  as.numeric()

expand.grid(expenses, expenses) %>% 
  mutate(sum = Var1 + Var2) %>%
  filter(sum == 2020) %>%
  mutate(prod = Var1 * Var2) %>%
  slice(1) %>%
  pull(prod)

[1] 926464

Part 2: Three numbers

The follow-up challenge is the same but with three numbers. I went with essentially the same code but it’s notably slower. There are a lot of repeated calculations here: each triplet appears six times in the table.

Toggle the code

expand.grid(expenses, expenses, expenses) %>% 
  mutate(sum = Var1 + Var2 + Var3) %>%
  filter(sum == 2020) %>%
  mutate(prod = Var1 * Var2 * Var3) %>%
  slice(1) %>%
  pull(prod)

[1] 65656536

Day 2: Password Philosophy

My day 2 data

Part 1: Number of letters

We need to find how many passwords are valid according to their policy. The policies and passwords are given as follows:

1-3 a: abcde
1-3 b: cdefg
2-9 c: ccccccccc

Each line gives the password policy and then the password. The password policy indicates the lowest and highest number of times a given letter must appear for the password to be valid. For example, 1-3 a means that the password must contain a at least 1 time and at most 3 times.

Toggle the code

library(dplyr)
library(tidyr)
library(stringr)

First load the libraries we’ll need. We then read in the data and use tidyr functions to separate out the parts of the policy and the password, making sure to convert the columns to numeric as appropriate:

Toggle the code

passwords <- readr::read_tsv("data/AoC_day2.txt", col_names = FALSE) %>%
  separate(X1, c("policy", "password"), sep = ":") %>%
  separate(policy, c("count", "letter"), sep = " ") %>%
  separate(count, c("min", "max")) %>%
  mutate(min = as.integer(min),
         max = as.integer(max))

Next, we use the stringr function str_count() to count how many times the given letter appears in the password, and conditional logic to check whether it is repeated within the specified number of times. Because TRUE has a numeric value of 1 and FALSE has a numeric value of 0, we can sum the resulting column to get a count of how many passwords are valid according to their policies.

Toggle the code

passwords %>%
  mutate(count = str_count(password, letter)) %>%
  mutate(password_in_policy = if_else(count >= min & count <= max, TRUE, FALSE)) %>%
  summarise(correct = sum(password_in_policy)) %>%
  pull(correct)

[1] 625

Part 2: Position of letters

Now the policy is interpreted differently. Each policy actually describes two positions in the password, where 1 means the first character, 2 means the second character, and so on. Exactly one of these positions must contain the given letter. How many are valid now?

There were a couple of gotchas here. When I used separate() in the previous part, I had inadvertently left a leading whitespace in front of the password, something that was messing up my indexing with str_sub. Using str_trim() first cleared that up. Also, we need exactly one of the positions to match. | is an inclusive or. We need xor() for exclusive or instead.

Toggle the code

passwords %>%
  mutate(password = str_trim(password)) %>%
  mutate(pos1_letter = str_sub(password, min, min),
         pos2_letter = str_sub(password, max, max)) %>%
  mutate(match_one = xor(pos1_letter == letter, pos2_letter == letter)) %>%
  summarise(correct = sum(match_one)) %>%
  pull(correct)

[1] 391

Day 3: Toboggan Trajectory

My day 3 data

Part 1: Encountering trees

Starting at the top left corner of the map, how many trees (“#”) do we encounter, going at a trajectory of 3 right and 1 down?

First, read in the data and save it into a matrix. My method here feels really hack-y. I’m sure there must be a better approach.

Toggle the code

library(dplyr)

tree_map <- readr::read_tsv("data/AoC_day3.txt", col_names = FALSE)

num_col <- tree_map %>%
  mutate(length = str_length(X1)) %>%
  slice(1) %>%
  pull(length)

tree_vec <- tree_map %>%
  mutate(X1 = strsplit(X1, split = character(0), fixed = TRUE)) %>%
  pull(X1) %>%
  unlist()

tree_mat <- matrix(tree_vec, ncol = num_col, byrow = TRUE)

Now work my way across and down the matrix, using the %% modulo operator to loop round where necessary. The -1 and +1 in the line ((y + right - 1) %% num_col) + 1 is a hack to get round the fact that, for num_col columns, the modulo runs from 0 to num_col - 1, but the column indexes for our matrix run from 1 to num_col.

Toggle the code

right <- 3
down <- 1

num_rows <- nrow(tree_mat)
num_col <- ncol(tree_mat)

# start counting trees encountered
trees <- 0

# start square
x <- 1
y <- 1
  
while (x <= num_rows) {
  
  # cat("row: ", x, "col: ", y, "\n")
  
  if (tree_mat[x,y] == "#") trees <- trees + 1
  
  x <- x + down
  y <- ((y + right - 1) %% num_col) + 1
  
}

trees

[1] 299

Part 2: Checking further slopes

We now need to check several other trajectories, and multiply together the number of trees we find, so we wrap the Part 1 code into a function.

Toggle the code

slope_check <- function(tree_mat, right, down) {
  
  num_rows <- nrow(tree_mat)
  num_col <- ncol(tree_mat)

  # start counting trees encountered
  trees <- 0

  # start square
  x <- 1
  y <- 1
  
  while (x <= num_rows) {
  
    if (tree_mat[x,y] == "#") trees <- trees + 1
  
    x <- x + down
    y <- ((y + right - 1) %% num_col) + 1
  
  }
  trees
}

prod(slope_check(tree_mat, 1, 1),
     slope_check(tree_mat, 3, 1),
     slope_check(tree_mat, 5, 1),
     slope_check(tree_mat, 7, 1),
     slope_check(tree_mat, 1, 2))

[1] 3621285278

Day 4: Passport Processing

My day 4 data

Part 1: Complete passports

Toggle the code

library(dplyr)
library(tidyr)

Using readr::read_tsv() off the bat removes the blank lines, making it impossible to identify the different passports, but reading in the data via readLines() then converting as_tibble() preserves them, and then allows us to use tidyverse functions for the remaining tidying. cumsum() on a logical vectors takes advantage of FALSE having a numeric value of zero and TRUE having a numeric value of one.

Toggle the code

passports <- readLines("data/AoC_day4.txt") %>%
  as_tibble() %>%
  separate_rows(value, sep = " ") %>%
  mutate(new_passport = value == "") %>%
  mutate(ID = cumsum(new_passport) + 1) %>%
  filter(!new_passport) %>%
  select(-new_passport) %>%
  separate(value, c("key", "value"), sep = ":") %>%
  relocate(ID)

Our data is now in three columns, with ID, key and value, so now we need to find the number of passports with all seven fields once cid is excluded:

Toggle the code

passports %>%
  filter(key != "cid") %>%
  count(ID) %>%
  filter(n == 7) %>%
  nrow()

[1] 210

Part 2: Valid passports

Now we need to add data validation checks:

byr (Birth Year) - four digits; at least 1920 and at most 2002.
iyr (Issue Year) - four digits; at least 2010 and at most 2020.
eyr (Expiration Year) - four digits; at least 2020 and at most 2030.
hgt (Height) - a number followed by either cm or in:
- If cm, the number must be at least 150 and at most 193.
- If in, the number must be at least 59 and at most 76.
hcl (Hair Color) - a # followed by exactly six characters 0-9 or a-f.
ecl (Eye Color) - exactly one of: amb blu brn gry grn hzl oth.
pid (Passport ID) - a nine-digit number, including leading zeroes.
cid (Country ID) - ignored, missing or not.

Ignoring the cid field, we narrow down on passports that at least have the right number of fields, and extract the number from the hgt column:

Toggle the code

complete_passports <- passports %>%
  filter(key != "cid") %>%
  add_count(ID) %>%
  filter(n == 7) %>%
  select(-n) %>%
  mutate(hgt_value = case_when(
    key == "hgt" ~ readr::parse_number(value),
    TRUE ~ NA_real_)) %>%
  ungroup()

Then we create a check column, which is TRUE when the value for each key meets the required conditions. Those with 7 TRUEs are valid. Note that with case_when() we’ve left the check column as NA when the condition is FALSE, requiring na.rm = TRUE in the call to sum(). We can get round that by adding a final line to the case_when() condition stating TRUE ~ FALSE. TRUE here is a catch-all for all remaining rows not covered by the conditions above, and then we set them to FALSE, but I find the line TRUE ~ FALSE unintuitive.

Toggle the code

complete_passports %>%
  mutate(check = case_when(
    (key == "byr" & value >= 1920) & (key == "byr" & value <= 2002) ~ TRUE,
    (key == "iyr" & value >= 2010) & (key == "iyr" & value <= 2020) ~ TRUE,
    (key == "eyr" & value >= 2020) & (key == "eyr" & value <= 2030) ~ TRUE,
    key == "hgt" & str_detect(value, "cm") & hgt_value >= 150 & hgt_value <= 193 ~ TRUE,
    key == "hgt" & str_detect(value, "in") & hgt_value >= 59 & hgt_value <= 76 ~ TRUE,  
    key == "hcl" & str_detect(value, "^#[a-f0-9]{6}$") ~ TRUE,
    key == "ecl" & value %in% c("amb", "blu", "brn", "gry", "grn", "hzl", "oth") ~ TRUE,
    key == "pid" & str_detect(value, "^[0-9]{9}$") ~ TRUE
  )) %>%
  group_by(ID) %>%
  summarise(check_all = sum(check, na.rm = TRUE)) %>%
  filter(check_all == 7) %>%
  nrow()

[1] 131

Day 5: Binary Boarding

My day 5 data

Part 1: Finding all seat IDs

Toggle the code

library(dplyr)
library(stringr)

The code below sets starts by setting each row number to 127 and each column number to 7, the maximum they can be, then, working along the string, lowering the maximum (or leaving it as is) one letter at a time:

Toggle the code

boarding <- readr::read_tsv("data/AoC_day5.txt", col_names = FALSE) %>%
  rename(binary = X1)

seat_IDs <- boarding %>%
  mutate(row = 127) %>%
  mutate(col = 7) %>%
  mutate(row = if_else(str_sub(binary, 1, 1) == "F", row - 64, row)) %>%
  mutate(row = if_else(str_sub(binary, 2, 2) == "F", row - 32, row)) %>%
  mutate(row = if_else(str_sub(binary, 3, 3) == "F", row - 16, row)) %>%
  mutate(row = if_else(str_sub(binary, 4, 4) == "F", row - 8, row)) %>%
  mutate(row = if_else(str_sub(binary, 5, 5) == "F", row - 4, row)) %>%
  mutate(row = if_else(str_sub(binary, 6, 6) == "F", row - 2, row)) %>%
  mutate(row = if_else(str_sub(binary, 7, 7) == "F", row - 1, row)) %>%
  mutate(col = if_else(str_sub(binary, 8, 8) == "L", col - 4, col)) %>%
  mutate(col = if_else(str_sub(binary, 9, 9) == "L", col - 2, col)) %>%  
  mutate(col = if_else(str_sub(binary, 10, 10) == "L", col - 1, col)) %>%  
  mutate(ID = row * 8 + col) 

seat_IDs %>%
  summarise(max = max(ID)) %>%
  pull(max)

[1] 963

OK, I know I said in the introduction to this post that I would go with the first solution I think of that gets the right answer, and the above does work, but I’m deeply unhappy with the code. There’s too much repetition, I don’t like the use of subtraction when diving by 2 feels more appropriate in a binary context, and it doesn’t feel like I’ve taken full advantage of the mathematical structure of the problem. So, on further reflection, I realise that the way that ID is defined is essentially turning a binary number into a decimal, where we get the binary number as a string by replacing “B” and “R” by “1” and L” and “F” by “0”. Then, I just found, there is a base R function strtoi() that takes a string of digits in a given base and converts it to a base 10 integer, just what we need:

Toggle the code

seat_IDs <- boarding %>%
  mutate(binary = str_replace_all(binary, "L|F", "0")) %>%
  mutate(binary = str_replace_all(binary, "B|R", "1")) %>%
  mutate(ID = strtoi(binary, base = 2)) %>%
  arrange(desc(ID))

seat_IDs %>%
  slice(1) %>%
  pull(ID)

[1] 963

That’s better!

Part 2: Finding my seat ID

We need to find the missing number, so we arrange the IDs in ascending order and look at the gap between each ID and the preceding one. In most cases, that should be one. Where we have a gap of 2, we must have skipped the integer below:

Toggle the code

seat_IDs %>%
  arrange(ID) %>%
  mutate(diff = lag(ID)) %>%
  mutate(gap = ID - diff) %>% 
  filter(gap == 2) %>%
  summarise(my_seat = ID - 1) %>%
  pull(my_seat)

[1] 592

Day 6: Custom Customs

My day 6 data

Part 1: Anyone answers

Toggle the code

library(dplyr)
library(tidyr)
library(stringr)

Within each group, we need to find the number of unique letters within each group. We read in and separate the data using the tricks learnt for Day 4, and take advantage of the rowwise() feature in dplyr 1.0.0.

Toggle the code

customs_groups <- readLines("data/AoC_day6.txt") %>%
  as_tibble() %>%
  mutate(new_group = value == "") %>%
  mutate(group_ID = cumsum(new_group) + 1) %>%
  filter(!new_group) %>%
  select(-new_group) %>%
  group_by(group_ID) 

customs_groups %>%
  summarise(qs = str_c(value, collapse = "")) %>%
  ungroup() %>%
  mutate(qss = str_split(qs, "")) %>%
  rowwise() %>%
  mutate(qsu = list(unique(qss))) %>%
  mutate(count = length(qsu)) %>%
  ungroup() %>%
  summarise(total = sum(count)) %>%
  pull(total)

[1] 6585

Part 2: Everyone answers

Now, instead of unique letters in a group, we need to find the number of letters which appear in all the answers for everyone in the same group. I first note how many people are in each group, then tabulate the number of occurrences of each letter in the group, then count (by summing a logical vector) the number of matches between occurrences of letter and the number in group. Finally, we sum across all groups.

Toggle the code

customs_groups %>%  
  add_count(group_ID, name = "num_in_group") %>%
  group_by(group_ID, num_in_group) %>%
  summarise(qs = str_c(value, collapse = "")) %>%
  ungroup() %>%
  mutate(qss = str_split(qs, "")) %>%
  rowwise() %>%
  mutate(letter_table = list(table(qss))) %>%
  slice(1) %>%
  mutate(in_common = sum(num_in_group == letter_table)) %>%
  ungroup() %>%
  summarise(total = sum(in_common)) %>%
  pull(total)

[1] 3276

Day 7: Handy Haverstocks

My day 7 data

Part 1: Number of colour bags

Toggle the code

library(tidyverse)

We have colour-coded bags that must contain a specific number of other colour-coded bags.

Toggle the code

bags <- read_tsv("data/AoC_day7.txt", col_names = FALSE)

head(bags)

# A tibble: 6 × 1
  X1                                                                            
  <chr>                                                                         
1 wavy bronze bags contain 5 striped gold bags, 5 light tomato bags.            
2 drab indigo bags contain 4 pale bronze bags, 2 mirrored lavender bags.        
3 pale olive bags contain 3 faded bronze bags, 5 wavy orange bags, 3 clear blac…
4 faded white bags contain 5 vibrant violet bags, 4 light teal bags.            
5 mirrored magenta bags contain 2 muted cyan bags, 3 vibrant crimson bags.      
6 dull purple bags contain 1 striped fuchsia bag.

Our first task is to parse the natural language and split the rules into one container/contains pair per line:

Toggle the code

rules <- bags %>%
  mutate(rule = row_number()) %>%
  separate(X1, c("container", "contains"), sep = " bags contain ") %>%
  separate_rows(contains, sep = ",") %>%
  mutate(contains = str_remove(contains, "\\.")) %>%
  mutate(contains = str_remove(contains, "bags|bag")) %>%
  #mutate(contains = str_replace(contains, "no other", "0 other")) %>%
  extract(contains, c('number', 'contains'), "(\\d+) (.+)") %>%
  filter(!is.na(number)) %>%
  mutate(contains = str_trim(contains)) %>%
  mutate(number = as.integer(number))

To find all bags that con eventually contain our shiny gold bag, we first find the bags that can contain it directly. We then find the bags that can contain those bags and take the union of the two levels. We repeat, stopping when going up a level adds no further bags to the vector of bag colours already found. We then subtract 1, because we don’t want to count the original shiny gold bag.

Toggle the code

# function to find all colours that contain a vector of other colours:
contains_colours <- function(colours) {
  rules %>%
    filter(contains %in% colours) %>%
    distinct(container) %>%
    pull(container)
}

bags <- "shiny gold"
old_length <- length(bags)
new_length <- 0

# keeping adding to the vector of bags, until no change
while(old_length != new_length) {
  old_length = length(bags)
  bags <- base::union(bags, contains_colours(bags)) %>% unique()
  new_length <- length(bags)
  #cat(old_length, ", ", new_length, "\n")
}

length(bags) - 1

[1] 274

Part 2: Number of bags

Now we need to discover the number of bags that a shiny gold bag must contain. I figured that lends itself to recursion, but struggled on the details. Hat tip to David Robinson for this solution. I’ve learnt a lot for myself by unpicking how it works.

Toggle the code

count_all_contained <- function(colour) {
  
  relevant_rules <- rules %>%
    filter(container %in% colour)
  
  sum(relevant_rules$number * (1 + map_dbl(relevant_rules$contains, count_all_contained)))
  
}

count_all_contained("shiny gold")

[1] 158730

Day 8: Handheld Halting

My day 8 data

Part 1: Infinite Loop

Our programme gets stuck in an infinite loop. As well as keeping track of the accumulator, we need to keep track of where we’ve visited, and stop when we visit the same instruction twice. We use a data.frame() rather than a tibble() as the former is easier to index into.

Toggle the code

instructions <- 
  read.table("data/AoC_day8.txt", col.names = c("instruction", "value"))

We start with a pretty straight-forward loop, noting that at most it can run for one more than the number of instructions in the programme until it hits an instruction it’s already visited. We update row number to visit next and the accumulator as appropriate.

Toggle the code

instructions$visited <- 0

row <- 1
accumulator <- 0

num_rows <- nrow(instructions)

for (i in 1:(num_rows+1)) {

  if (instructions[row, "visited"] != 0) break
  
  # +1 on number of times the row is visited
  instructions[row, "visited"] <- instructions[row, "visited"] + 1

  # case when the instruction is "acc"
  if (instructions[row, "instruction"] == "acc") {
    accumulator <- accumulator + instructions[row, "value"]
    row <- row + 1
  }
  
  # case when the instruction is "jmp"
  else if (instructions[row, "instruction"] == "jmp") {
    row <- row + instructions[row, "value"]
  }

  # case when the instruction is "nop"
  else if (instructions[row, "instruction"] == "nop") {
    row <- row + 1
  }
}
  
accumulator

[1] 1915

Part 2: Fixing the programme

To break the loop, one of the nop instructions in the programme should be a jmp or vice versa. The plan is to swap these out one by one and check if the programme completes. It’s not a sophisticated approach, but it works fast enough (about a second).

First we note that the broken instruction must be one that we visited in Part 1. Also, an instruction of jmp with a value of 0 will get us stuck in a one-line infinite loop, so we avoid that.

Toggle the code

library(dplyr)

rows_to_check <- instructions %>%
  mutate(row_id = row_number()) %>%
  filter(visited != 0) %>%
  filter(instruction != "acc") %>%
  filter(!(instruction == "nop" & value == 0)) %>%
  pull(row_id)

We have 93 instruction to check. We modify our code from Part 1 slightly, converting it into a function and returning a list with values completes and accumulator. completes is FALSE as soon as we visit a row twice and TRUE if the number of our next row to visit is greater than the number of rows in the programme.

Toggle the code

programme_completes <- function(instructions) {
  
  row <- 1L
  accumulator <- 0
  
  num_rows <- nrow(instructions)
  
  for (i in 1:(num_rows+1)) {
  
    if (instructions[row, "visited"] != 0) {
      return(list(completes = FALSE, accumulator = accumulator)) 
    }
    
    # +1 on number of times the row is visited
    instructions[row, "visited"] <- instructions[row, "visited"] + 1
  
    # case when the instruction is "acc"
    if (instructions[row, "instruction"] == "acc") {
      accumulator <- accumulator + instructions[row, "value"]
      row <- row + 1
    }
  
    else if (instructions[row, "instruction"] == "jmp") {
      row <- row + instructions[row, "value"]
    }
  
    else if (instructions[row, "instruction"] == "nop") {
      row <- row + 1
    }
  
    if (row > num_rows) {
      return(list(completes = TRUE, accumulator = accumulator)) 
    }
  }
}

We now loop over the rows we’ve identified to check, breaking the loop as soon as we find a programme that completes. Finally, we extract the accumulator value from the successful programme.

Toggle the code

instructions$visited <- 0

for (row in rows_to_check) {
  
  # modify one row of the instructions,
  # copying data frame so we don't have to modify it back
  modified_instructions <- instructions
  
  ifelse(instructions[row, 1] == "jmp", 
         modified_instructions[row, 1] <- "nop", 
         modified_instructions[row, 1] <- "jmp") 
  
  # check if the modified programme completes
  check_programme <- programme_completes(modified_instructions)
  
  if (check_programme$completes) 
    break
}

check_programme$accumulator

[1] 944

Day 9: Encoding Error

My day 9 data

Part 1: Weak Link

We have to find the first number in the list which is not the sum of a pair of different numbers in the preceding 25 numbers.

Toggle the code

input <- as.double(readLines("data/AoC_day9.txt"))

There’s a nice trick for finding the pair of numbers in a vector that sum to a target that was doing the rounds on twitter in response to the Day 1 challenge: intersect(input, 2020 - input). For this challenge, we expand on that idea, writing it as a check_sum function. Where there’s more than one pair, it won’t say which pair together, and if the number that’s half the target appears in the addends, it will only appear once in the output. However, for this challenge, we only need to know when there are no pairs that sum to the target, which will be the case when the length of the output of check_sum is 0.

Toggle the code

check_sum <- function(target, addends) {
  intersect(addends, target-addends)
}

Then, it’s simply a case of iterating over windows of length 25, checking whether the following number is the sum of a distinct pair in that window, and returning the first one that isn’t.

Toggle the code

find_invalid_num <- function(vec, win = 25) {
  
  for (i in (win+1):length(vec)) {
    check <- check_sum(vec[i], vec[(i-win):(i-1)])
    
    if (length(check) == 0) return(vec[i])
  }
  
}

find_invalid_num(input)

[1] 507622668

Part 2: Contiguous set

Find a contiguous set in the list that sums to the invalid number from Part 1, and add together the largest and smallest number in that range.

First, we note that after a certain point, all numbers in the input are larger than the target, so we don’t need to consider those. We reduce our input vector accordingly.

Toggle the code

target <- find_invalid_num(input)

input_reduced <- input[1:(max(which(input <= target)))]

To find the contiguous set in the list that sums to the target, we make use of accumulate() from the purrr package. Let the input list be \(x = (x_1, x_2,..., x_n)\). Then accumulate(x, sum) returns \(a = (x_1, x_1 + x_2,..., \sum_{j=1}^n x_j)\). We check whether any element of this vector is equal to the target. If so we index into the input vector appropriately, sum the min and max in the range and we’re done. If not, we consider the sums of all windows starting with the second element of the input list, and so on.

Toggle the code

contiguous_sum <- function(input, target) {
  
  len <- length(input)
  
  for (i in 1:len) {
    a <- purrr::accumulate(input[i:len], sum)
    b <- a == target
    
    if (sum(b) == 1) {
      output_length <- which(b)
      
      contiguous_set <- input[i:(i + output_length - 1)]
      
      return(sum(range(contiguous_set)))
    }
  }
}

contiguous_sum(input_reduced, target)

[1] 76688505

I appreciate that there’s some redundant calculation in this method. The vectors of accumulated sums can contain numbers larger than the target (if writing our own loop, we could break as soon as the accumulated sum got too big). Also, in retrospect, we could have only run accumulate once, then in the second iteration of the loop, subtracted input[1] from the result, in the third iteration subtracted input[2] from that result, etc. However, the function as written is concise and easy to understand, and gets our answer in around a second, so that will do!

Day 10: Adapter Array

My day 10 data

Part 1: Adapter Distribution

This is simply a case of ordering the adapters, prepending 0 and appending the the max in the list plus three, then finding the differences.

Toggle the code

library(dplyr)

Toggle the code

adapters <- 
  readLines("data/AoC_day10.txt") %>%
  as.integer()

adapter_diffs <- c(adapters, 0, max(adapters) + 3) %>% 
  sort() %>%
  diff()

sum(adapter_diffs == 1) * sum(adapter_diffs == 3)

[1] 3034

Part 2: Adapter combinations

Instead of building up sequences of adapters, we see what we can remove from the full list.

First, we check the diffs: are they just 1 and 3 or are there any 2s?

Toggle the code

table(adapter_diffs)

adapter_diffs
 1  3 
74 41

We can’t remove an adapter if its difference with the previous adapter is 3, otherwise the difference between the adapters on either side of it will be too big.

What about diffs of 1? It depends how many ones there are around it. We can check this using the rle() (run length encoding) function

Toggle the code

runs <- rle(adapter_diffs)
runs

Run Length Encoding
  lengths: int [1:48] 3 2 4 2 4 2 4 1 4 2 ...
  values : num [1:48] 1 3 1 3 1 3 1 3 1 3 ...

What is the distribution of lengths of sequences of 1s?

Toggle the code

runs_table <- table(runs$lengths) 
runs_table


 1  2  3  4 
13 14 10 11

We have at most four diffs of 1 in a row.

We need to check that if we remove an adapter, the new differences do not exceed 3. Example sequences really helped me figure out what’s going on here:

If the diff sequence is …, 3, 1, 3,… (e.g. adapters 1, 4, 5, 8)
- 1 option to keep as is
- We cannot remove any adapters
- 1 option in total
If the diff sequence is …, 3, 1, 1, 3,… (e.g. adapters 1, 4, 5, 6, 9)
- 1 option to keep as is
- 1 option to remove one adapter (e.g. the 5)
- we cannot remove two adapters
- 2 options total
If the diff sequence is …, 3, 1, 1, 1, 3,… (e.g. adapters 1, 4, 5, 6, 7, 10)
- 1 option to keep as is
- 2 options to remove one adapter (e.g. the 5 or 6)
- 1 options to remove two adapters (e.g. the 5 and 6)
- We cannot remove three adapters
- 4 options total
If the diff sequence is …, 3, 1, 1, 1, 1, 3,… (e.g. adapters 1, 4, 5, 6, 7, 8, 11)
- 1 option to keep as is
- 3 options to remove one adapter (e.g. 5, 6, or 7)
- 3 options to remove two adapters (e.g. any two of 5, 6, and 7)
- We cannot remove three adapters
- 7 options total

Finally, we multiply each run length of difference of 1s with the number of options we have for removing adapters, then take the product of those products.

Toggle the code

runs_df <- tibble(lengths = runs$lengths, values = runs$values)

options <- tibble(lengths = c(1,2,3,4), options = c(1,2,4,7))

runs_df %>%
  filter(values == 1) %>%
  left_join(options, by = "lengths") %>%
  summarise(prod_options = prod(options)) %>%
  pull(prod_options) %>%
  format(scientific = FALSE)

[1] "259172170858496"

Day 11: Seating System

My day 11 data

Part 1: Changing layout

My code for Day 11 runs a little slow (about 10 seconds for Part 1 and 80 seconds for Part 2), so for the sake of being able to rebuild this page quickly as I keep updating it working through the challenges, I will demonstrate this code with the test input provided as an example.

Toggle the code

library(dplyr)
library(stringr)
library(tidyr)

First we read in the data and convert it to a matrix (using the datapasta package for the test input):

Toggle the code

# layout <- readr::read_tsv("data/AoC_day11.txt", col_names = FALSE)

layout <- tibble::tribble(
  ~X1,
  "L.LL.LL.LL",
  "LLLLLLL.LL",
  "L.L.L..L..",
  "LLLL.LL.LL",
  "L.LL.LL.LL",
  "L.LLLLL.LL",
  "..L.L.....",
  "LLLLLLLLLL",
  "L.LLLLLL.L",
  "L.LLLLL.LL"
  )

Toggle the code

# get number of columns for matrix
num_col <- layout %>%
  mutate(length = str_length(X1)) %>%
  slice(1) %>%
  pull(length)

# split layout into characters and turn to vector
layout_vec <- layout %>%
  mutate(X1 = strsplit(X1, split = character(0), fixed = TRUE)) %>%
  pull(X1) %>%
  unlist()

# organise into matrix
initial_layout <- matrix(layout_vec, ncol = num_col, byrow = TRUE)

Next, we write a helper function that, given a matrix and row and column indices, returns a vector of the adjacent seats. We need to take care when indexing into the matrix, so we treat all corner and edge cases separately. Fiddly, but gets the job done.

Toggle the code

get_adj <- function(mat, i,j) {
  
  nr <- nrow(mat)
  nc <- ncol(mat)
  
  # corner cases
  if (i == 1 & j == 1) {adj <- c(mat[1,2], mat[2,1:2])}
  else if (i == 1 & j == nc) {adj <- c(mat[1,(nc-1)], mat[2,(nc-1):nc])}
  else if (i == nr & j == 1) {adj <- c(mat[nr,2], mat[nr-1,1:2])}
  else if (i == nr & j == nc) {adj <- c(mat[nr-1, (nc-1):nc], mat[nr, nc-1])}  
  
  # edge cases
  else if (i == 1) {adj <- c(mat[1, c(j-1,j+1)], mat[2, (j-1):(j+1)])}
  else if (i == nr) {adj <- c(mat[nr, c(j-1,j+1)], mat[nr-1, (j-1):(j+1)])}
  else if (j == 1) {adj <- c(mat[c(i-1, i+1), 1], mat[(i-1):(i+1), 2])}
  else if (j == nc) {adj <- c(mat[c(i-1, i+1), nc], mat[(i-1):(i+1), nc-1])}
  
  # inside cases
  else {adj <- c(mat[i-1,(j-1):(j+1)], mat[i,c(j-1,j+1)], mat[i+1,(j-1):(j+1)])}
  
  adj
}

Once we have a vector of surrounding seats, we can apply the rules in the problem to determine whether a given seat needs to change state. The needs_changing helper function does that. It’s overkill at this point to give options to specify the function for finding the vector of seats to check, and the maximum number of occupied seats people can tolerate around them, but (spolier alert) I put in these options when working on the challenge in Part 2.

Toggle the code

needs_changing <- 
  function(mat, i,j, get_surround = get_adj, max_occupied = 4) {
  
  surround <- get_surround(mat, i,j)
  n_occupied <- sum(surround == "#")
  
  if ((mat[i,j] == "L") & (n_occupied == 0)) return(TRUE)
  
  else if ((mat[i,j] == "#") & (n_occupied >= max_occupied)) {
    return(TRUE)
  }
  
  else return(FALSE)
}

Since floor spaces don’t change, we only need to consider seats. We save the indices of the seats into a data frame, so we can vectorise over it using tidyverse functions. However, when we’ve determined the seats that need changing, using our needs_changing function, we need to convert those indices from a data.frame into a matrix, in order to index into the layout matrix appropriately and make the changes.

Toggle the code

seats <- which(initial_layout != ".", arr.ind = TRUE)

seats_df <- as.data.frame(seats) %>%
  rename(i = row, 
         j = col)

Toggle the code

layout <- initial_layout
iters <- 0

# loop until there are no further changes
repeat {
  
  change <- 0
  
  seats_to_change <- 
    seats_df %>%
    rowwise() %>%
    mutate(change_seat = needs_changing(layout,i,j))   
  
  change <- sum(seats_to_change$change_seat)
  
  if (change == 0) break
  
  indices_to_change <- 
    seats_to_change %>%
    filter(change_seat) %>%
    select(i,j) %>%
    as.matrix()  

  layout[indices_to_change] <- 
    setdiff(c("L", "#"),  layout[indices_to_change])
  
  iters <- iters + 1
}

part_1_iters <- iters
sum(layout== "#")

[1] 37

On the test set, this takes 5 iterations. On the full data set, my answer is 2316, and it took 107 iterations.

Part 2: Looking further

Now, people look to the first seat they can see in each direction, and will change from occupied to unoccupied if five or more of them are occupied.

The plan is to write a function that extracts full vectors from a given seat to the edge of the layout matrix in each of the eight directions, then finds the first seat in each of those directions, and finally collects those into a vector of the seats under consideration when determining if a change is needed. Then I can reuse the loop from Part 1, just changing the arguments in the calls to needs_changing.

Here’s a helper function to get the first seat in a vector looking in one direction:

Toggle the code

get_first_seat_from_vec <- function(vec) {
  
  if (any(vec %in% c("#", "L"))) {
    return(vec[min(which(vec != "."))])
  }
  
  return(NA)
}

Now, if I thought getting adjacent seats to a given seat in Part 1 was fiddly, it’s nothing on getting a vector from a given seat to the edge of the matrix. There are many cases to consider to make we we don’t go out of bounds. In the diagonal directions, first we get a matrix of the indices of the matrix we need, then subset into the matrix accordingly.

Toggle the code

# takes a layout matrix (elements ".", "#", "L")
# returns vector with first "L" or "#" encountered in each direction
get_first_seat <- function(mat, i,j) {
  
  nr <- nrow(mat)
  nc <- ncol(mat)
  
  # North
  if (i == 1) N <- NA
  if (i > 1) N <- mat[(i-1):1,j]
  
  # South
  if (i == nr) S <- NA
  if (i < nr) S <- mat[(i+1):nr,j]
  
  # East
  if (j == nc) E <- NA
  if (j < nc) E <- mat[i, (j+1):nc]
  
  # West
  if (j == 1) W <- NA
  if (j > 1) W <- mat[i, (j-1):1]
  
  # how far in each direction to edge of matrix
  to_N <- i - 1
  to_S <- nr - i
  to_E <- nc - j
  to_W <- j - 1
  
  # North-West
  NW_length <- min(to_N, to_W)
  
  if (i == 1 | j == 1) NW <- NA
  else {
    mat_index <- 
      matrix(c((i-1):(i-NW_length), (j-1):(j-NW_length)), ncol = 2)
    NW <- mat[mat_index]
  }
  
  # North-East
  NE_length <- min(to_N, to_E)
  
  if (i == 1 | j == nc) NE <- NA
  else {
    mat_index <- 
      matrix(c((i-1):(i-NE_length), (j+1):(j+NE_length)), ncol = 2)
    NE <- mat[mat_index]
  }
  
  # South-East
  SE_length <- min(to_S, to_E)
  
  if (i == nr | j == nc) SE <- NA
  else {
    mat_index <- 
      matrix(c((i+1):(i+SE_length), (j+1):(j+SE_length)), ncol = 2)
    SE <- mat[mat_index]
  }
  
  # South-West
  SW_length <- min(to_S, to_W)
  
  if (i == nr | j == 1) SW <- NA
  else {
    mat_index <- 
      matrix(c((i+1):(i+SW_length), (j-1):(j-SW_length)), ncol = 2)
    SW <- mat[mat_index]
  }

  # vectors from mat[i,j] to the edge in each direction 
  all_vecs <- 
    (list(N = N, S = S, E = E, W = W, NW = NW, NE = NE, SE = SE, SW = SW))
  
  # the first seat in each direction, collapsed to a vector
  first_seats <- purrr::map_chr(all_vecs, get_first_seat_from_vec)
  
  # remove NAs from list and return
  # (these occur either when starting on an edge, 
  # or when there are no seats in a given direction)
  return(first_seats[!is.na(first_seats)])

}

Toggle the code

layout <- initial_layout
iters <- 0

# loop until there are no further changes
repeat {
  
  change <- 0
  
  seats_to_change <- 
    seats_df %>%
    rowwise() %>%
    mutate(change_seat = needs_changing(layout,i,j, get_first_seat, 5))   
  
  change <- sum(seats_to_change$change_seat)
  
  if (change == 0) break
  
  indices_to_change <- 
    seats_to_change %>%
    filter(change_seat) %>%
    select(i,j) %>%
    as.matrix()  

  layout[indices_to_change] <- 
    setdiff(c("L", "#"),  layout[indices_to_change])
  
  iters <- iters + 1
}

part_2_iters <- iters
sum(layout== "#")

[1] 26

On the test set, this takes 6 iterations. On the full data set, my answer is 2128, and it took 87 iterations. Given this is fewer iterations than in Part 1, it must be my code for getting the first seat that’s slowing things down.

I am unsatisfied both by how many lines of code this has taken as well as the time taken to run. The introduction to Advent of Code says that each challenge has a solution that will complete in at most 15 seconds on ten year old hardware. So clearly there’s a better way of doing this. Perhaps something to revisit in the future.

I was late to the game, and that was as far as I managed to get in December 2020. I’m looking forward to taking on the challenge again in 2021!

Last updated

2022-11-11 21:32:05 GMT

Details

source code, R environment

Session info

Toggle

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.1 (2022-06-23)
 os       macOS Monterey 12.6.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/London
 date     2022-11-11
 pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
 quarto   1.2.247 @ /usr/local/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 ! package     * version date (UTC) lib source
 P dplyr       * 1.0.10  2022-09-01 [?] CRAN (R 4.2.0)
 P forcats     * 0.5.1   2021-01-27 [?] CRAN (R 4.2.0)
 P ggplot2     * 3.3.6   2022-05-03 [?] CRAN (R 4.2.0)
 P purrr       * 0.3.4   2020-04-17 [?] CRAN (R 4.2.0)
 P readr       * 2.1.2   2022-01-30 [?] CRAN (R 4.2.0)
 P sessioninfo * 1.2.2   2021-12-06 [?] CRAN (R 4.2.0)
 P stringr     * 1.4.0   2019-02-10 [?] CRAN (R 4.2.0)
 P tibble      * 3.1.8   2022-07-22 [?] CRAN (R 4.2.0)
 P tidyr       * 1.2.0   2022-02-01 [?] CRAN (R 4.2.0)
 P tidyverse   * 1.3.1   2021-04-15 [?] CRAN (R 4.2.0)

 [1] /private/var/folders/xf/jb2591gj41xbj0c4y2d8_7ch0000gn/T/RtmpUYmkWT/renv-library-220258664caa
 [2] /Users/ellakaye/Rprojs/mine/ellakaye-quarto/renv/library/R-4.2/aarch64-apple-darwin20
 [3] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library

 P ── Loaded and on-disk path mismatch.

──────────────────────────────────────────────────────────────────────────────

Reuse

CC BY-SA 4.0

Citation

BibTeX citation:

@online{kaye2020,
  author = {Kaye, Ella},
  title = {Advent of {Code} 2020},
  date = {2020-12-09},
  url = {https://ellakaye.co.uk/posts/2020-12-09_advent-of-code-2020/},
  langid = {en}
}

For attribution, please cite this work as:

Kaye, Ella. 2020. “Advent of Code 2020.” December 9, 2020. https://ellakaye.co.uk/posts/2020-12-09_advent-of-code-2020/.