Advent of code 2023 - day 3 (Haskell)

Categories: programming challenge, Haskell
The elves need to use a gondola lift to continue their adventure, but it's not working 😲. To fix the engine, I decided to give Haskell a go. I was heavily invested in Haskell a few years ago, but other priorities and projects made me lose interest. When I started this puzzle, I had forgotten most of it. I believe Haskell is beautiful as a pure functional language that is well suited for a few specific projects. But being able to write efficient, easy to read code is an art that requires a lot of time and practice. Practice that I have not had in a long time, as you will see. I am writing this article to describe my thought process.

Part 1

[...]

The engine schematic (your puzzle input) consists of a visual representation of the engine. There are lots of numbers and symbols you don't really understand, but apparently any number adjacent to a symbol, even diagonally, is a "part number" and should be included in your sum. (Periods (.) do not count as a symbol.)

Here is an example engine schematic:
467..114..
...*......
..35..633.
......#...
617*......
.....+.58.
..592.....
......755.
...$.*....
.664.598..
In this schematic, two numbers are not part numbers because they are not adjacent to a symbol: 114 (top right) and 58 (middle right). Every other number is adjacent to a symbol and so is a part number; their sum is 4361.

Of course, the actual engine schematic is much larger. What is the sum of all of the part numbers in the engine schematic?


Our input is a string but we need a a list of strings (technically speaking, a list of char lists) whereby each element represents a line. This will allow us to process the input line by line
There a nice utility module available to us that allows us to split any list based on a separator using Eq equality.

I decided to process one line at a time. However, since we need all adjacent symbols (in any direction), if we process a character, we need access to the surrounding characters. In other words, we need to process each line WITH the preceding and succeeding line.


Let's create a function for this.

Haskell
tuplify3Consecutive :: [[Char]] -> [([Char], [Char], [Char])]
tuplify3Consecutive (above:line:below:remainder) = (above,line,below):tuplify3Consecutive (line:below:remainder)
tuplify3Consecutive _ = []

Next, we can associate a boolean with each character to indicate whether it is adjacent to a symbol.
We do this for an entire group of lines and, as is expected for functional languages, use recursion. Expect a lot of recursion today!
Pattern matching makes it easy to access top left, top middle, top right, middle left etc. characters relative to the character to be checked (b2). I include ".." as a separate pattern because we prepend and append one additional line (and each lines gets '.' added in front and after it) to the input to ensure the first and last lines are properly processed.

Haskell
isAdjacentToSymbol :: [Char] -> [Char] -> [Char] -> [(Char, Bool)]
isAdjacentToSymbol (a1:a2:a3:a4) (b1:b2:b3:b4) (c1:c2:c3:c4) = (b2, any (\n -> n `notElem` ('.':['0'..'9'])) [a1, a2, a3, b1, b3, c1, c2, c3]):isAdjacentToSymbol (a2:a3:a4) (b2:b3:b4) (c2:c3:c4)
isAdjacentToSymbol ".." (b1:b2:b3:b4) (c1:c2:c3:c4) = isAdjacentToSymbol "..." (b1:b2:b3:b4) (c1:c2:c3:c4)
isAdjacentToSymbol (a1:a2:a3:a4) (b1:b2:b3:b4) ".." = isAdjacentToSymbol (a1:a2:a3:a4) (b1:b2:b3:b4) "..."
isAdjacentToSymbol _ _ _ = []

Finally, We get all the numbers with foldl. The function processes each characters recursively. We pass the current number being processed (prefix). Once we encounter a character that is no longer a digit, we reset the prefix and either add the number to the list (if it was a part number) or discard it.

Haskell
getPartNumbers :: [Char] -> [Char] -> [Char] -> [Int]
getPartNumbers line above below = 
    let _isAdjacentToSymbol = isAdjacentToSymbol (('.':above) ++ ['.']) (('.':line) ++ ['.']) (('.':below) ++ ['.'])
        (_, numbers, _) = foldl partNumbersFold ("", [], False) (_isAdjacentToSymbol ++ [('.', False)])
    in numbers

partNumbersFold :: (String, [Int], Bool) -> (Char, Bool) -> (String, [Int], Bool)
partNumbersFold (prefix, result, isPartNumber) (c, _isAdjacentToSymbol) = 
    if isDigit c then
        (if _isAdjacentToSymbol
        then (prefix ++ [c], result, True)
        else (prefix ++ [c], result, isPartNumber))
    else ("", if isPartNumber then (read prefix :: Int):result else result, False)

Now all we still have to do is use getPartNumbers for each line group and take the sum of the results.

Haskell
module Part1
    ( doPart1
    ) where
import Data.List.Split
import Data.Char (isDigit)

--other code omitted

getPartNumberSumForLine :: ([Char], [Char], [Char]) -> Int
getPartNumberSumForLine (above, line, below) = sum (getPartNumbers line above below)

doPart1 :: IO ()
doPart1 = do
    fileContents <- readFile "<path to input file>"
    let lineList = splitOn "\n" fileContents
    let partNumberSum = sum $ map getPartNumberSumForLine $ tuplify3Consecutive (("":lineList) ++ [""])
    print partNumberSum
    return ()

Part 2

[...]

The missing part wasn't the only issue - one of the gears in the engine is wrong. A gear is any * symbol that is adjacent to exactly two part numbers. Its gear ratio is the result of multiplying those two numbers together.

This time, you need to find the gear ratio of every gear and add them all up so that the engineer can figure out which gear needs to be replaced.

Consider the same engine schematic again:
467..114..
...*......
..35..633.
......#...
617*......
.....+.58.
..592.....
......755.
...$.*....
.664.598..

In this schematic, there are two gears. The first is in the top left; it has part numbers 467 and 35, so its gear ratio is 16345. The second gear is in the lower right; its gear ratio is 451490. (The * adjacent to 617 is not a gear because it is only adjacent to one part number.) Adding up all of the gear ratios produces 467835.

What is the sum of all of the gear ratios in your engine schematic?


This is a bit tougher, especially because only gears with exactly two part numbers are to be matched. Let's go over the strategy one step at a time.

Warning: this strategy works. But it is naive and can be much better optimized. We are matching each character with another (on one line) which has complexity O(n²) (disregarding the fact that each line of the input has a fixed length).

Saving adjacent '*' positions

Our function isAdjacentToSymbol will no longer suffice.
First of all, we are now only interested in adjacent '*' characters.
Secondly, we don't just need to know which characters match this. We also need to know the position of these symbols.

Writing the code for this puzzle might get a bit complex, so it would be better to define some types that represent:

Haskell
data LinePosition = Top | Middle | Bottom deriving Eq
type GearPosition = (Int, LinePosition)
type GearRatioCandidate = (Int, GearPosition) -- gear ratio, gear position
type NumberPosition = (Int, GearPosition)

Now that that's ready we can write the function to return all gear positions for all characters on a line. At this point gear positions are relative to the character, but those values will be transformed to the absolute index of the asterisk on a line later on.

Haskell
getGearPositions :: Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> [(Char, GearPosition)]
getGearPositions topLeft topMiddle topRight middleLeft middleRight bottomLeft bottomMiddle bottomRight =
    [(topLeft, (-1, Top)),
    (topMiddle, (0, Top)),
    (topRight, (1, Top)),
    (middleLeft, (-1, Middle)),
    (middleRight, (1, Middle)),
    (bottomLeft, (-1, Bottom)),
    (bottomMiddle, (0, Bottom)),
    (bottomRight, (1, Bottom))]

getAdjacentAsteriskPositions :: [Char] -> [Char] -> [Char] -> [(Char, [GearPosition])]
getAdjacentAsteriskPositions (a1:a2:a3:a4) (b1:b2:b3:b4) (c1:c2:c3:c4) =
    let gearPositions = map snd $ filter (\(c, _) -> c == '*') (getGearPositions a1 a2 a3 b1 b3 c1 c2 c3)
    in (b2, gearPositions):getAdjacentAsteriskPositions (a2:a3:a4) (b2:b3:b4) (c2:c3:c4)
getAdjacentAsteriskPositions ".." (b1:b2:b3:b4) (c1:c2:c3:c4) = getAdjacentAsteriskPositions "..." (b1:b2:b3:b4) (c1:c2:c3:c4)
getAdjacentAsteriskPositions (a1:a2:a3:a4) (b1:b2:b3:b4) ".." = getAdjacentAsteriskPositions (a1:a2:a3:a4) (b1:b2:b3:b4) "..."
getAdjacentAsteriskPositions _ _ _ = []

Once we have all gear positions of all characters we can calculate all number positions for all numbers on a line. Here we are making the gear positions absolute ( map (\(i, lp) -> (i+index, lp)) ).
All following digits make up a number, so all gear positions of these digits belong to the number. Hence, we have to append these lists ( allAsteriskPositions ++ map... )
We can have duplicate gear positions, as following digits can point to the same gear position, so we are taking the distinct gear positions (nub/nubBy) as the result.

Haskell
getNumberPositions :: [Char] -> [Char] -> [Char] -> [NumberPosition]
getNumberPositions above line below =
    let adjacentAsteriskPositions = getAdjacentAsteriskPositions (('.':above) ++ ['.']) (('.':line) ++ ['.']) (('.':below) ++ ['.'])
        (_, _, numberPositions, _) = foldl numberPositionFold (0, "", [], []) (adjacentAsteriskPositions ++ [('.', [])])
    in numberPositions

numberPositionFold :: (Int, String, [NumberPosition], [GearPosition]) -> (Char, [GearPosition]) -> (Int, String, [NumberPosition], [GearPosition])
numberPositionFold (index, prefix, result, allAsteriskPositions) (c, adjacentAsteriskPositions) =
    if isDigit c then (index+1, prefix ++ [c], result, allAsteriskPositions ++ map (\(i, lp) -> (i+index, lp)) adjacentAsteriskPositions)
    else
        let number = read prefix :: Int
        in (index+1, "", map (\gp -> (number, gp)) (nubBy (==) allAsteriskPositions) ++ result, [])

Getting and filtering out gear ratios

We consider "gear ratio candidates" asterisk symbols that are adjacent to at least two numbers. However since we are processing one line at a time it is possible that these gear ratio candidates become invalid once another number position (number - asterisk combination) occupies the same gear position. Only Middle and Top line positions can overlap with gear ratio candidates of the previous line.

Haskell
--exclude GRC's and NP that overlap
filterGearRatioCandidates :: [GearRatioCandidate] -> [NumberPosition] -> ([GearRatioCandidate], [NumberPosition])
filterGearRatioCandidates grc ((vNp, (iNp, lpNp)):np) =
    let filteredGearRatioCandidates = filter (\(_, (i, _)) -> lpNp == Bottom || i /= iNp) grc -- bottom NP can never overlap with GRC
        (_grc, _np) = filterGearRatioCandidates filteredGearRatioCandidates np
    in (_grc, if length filteredGearRatioCandidates == length grc then (vNp, (iNp, lpNp)):_np else _np)
filterGearRatioCandidates [] np = ([], np)
filterGearRatioCandidates grc [] = (grc, [])

Next, we have to find potential gear ratios by numbers between the current (top or middle) and the previous line. Only if an asterisk is adjacent to exactly two numbers should it be considered for a gear ratio candidate. Else the number position can no longer be considered for a gear ratio.

Haskell
--previous number positions (prev. line), current number positions -> returns new GRC
getNewGearRatioCandidates :: [NumberPosition] -> [NumberPosition] -> ([GearRatioCandidate], [NumberPosition])
getNewGearRatioCandidates ((lastvNp, (lastiNp, _)):lastNumberPositions) numberPositions =
    let potentialGearRatioCandidates = filter (\(_, (iNp, lpNp)) -> lpNp /= Bottom && iNp == lastiNp) numberPositions
        newNumberPositions =
            if length potentialGearRatioCandidates > 1
            then filter (\(_, (iNp, _)) -> iNp /= lastiNp) numberPositions
            else numberPositions
        (remainderGrc, remainderNp) = getNewGearRatioCandidates lastNumberPositions newNumberPositions
    in case potentialGearRatioCandidates of
        [(vNp, (iNp, lpNp))] -> ((vNp*lastvNp, (iNp, lpNp)):remainderGrc, remainderNp)
        _ -> (remainderGrc, remainderNp)
getNewGearRatioCandidates [] np = ([], np)

Lastly, we should consider the potential gear ratios between numbers on the current line. By design, by this point all number positions that could be part of a gear with three or more numbers have already been filtered out, so there's no need to cross-check with gear ratio candidates nor the previous line.

Haskell
getGearRatioCandidateByNumberPositionsOverlap :: [NumberPosition] -> GearRatioCandidate
getGearRatioCandidateByNumberPositionsOverlap [(vA, (iA, lpA)), (vB, (_, _))] = (vA*vB, (iA, lpA))
getGearRatioCandidateByNumberPositionsOverlap _ = (0,(0, Bottom))
--current number positions (practional), current number positions -> returns new GRC
getNewGearRatioCandidatesFromLine :: [NumberPosition] -> [NumberPosition] -> [GearRatioCandidate]
getNewGearRatioCandidatesFromLine [] _ = []
getNewGearRatioCandidatesFromLine _ [] = []
getNewGearRatioCandidatesFromLine ((_, (iNp, lpNp)):numberPositionsFraction) numberPositions =
    let potentialGearRatioCandidates = filter (\(_, (i, lp)) -> i == iNp && lp == lpNp) numberPositions
    in (if length potentialGearRatioCandidates == 2 then
        getGearRatioCandidateByNumberPositionsOverlap potentialGearRatioCandidates:getNewGearRatioCandidatesFromLine numberPositionsFraction numberPositions
        else getNewGearRatioCandidatesFromLine numberPositionsFraction numberPositions)

After all of this has been implemented we can just call the functions, in order, for every line to process to get the list of gear ratios.

  1. Filter out invalid gear ratio candidates
  2. Get new gear ratio candidates (from a number on the current line and previous line)
  3. Append new gear ratios from the current line
  4. Add filtered out gear ratio candidates to the list of gear ratios. INCLUDE top new gear ratio candidates (as these can't possbly be invalidated by the succeeding line)
  5. Pass the new gear ratio candidates + gear ratio candidates of numbers from the last two lines to the next line.
Haskell
-- each lastNumberPosition is coupled with an Int that indcates how many lines ago it was passed
module Part2
    ( doPart2
    ) where
import Data.List.Split
import Data.Char (isDigit)
import Data.List (nubBy)

--other code omitted

getGearRatios :: ([Int], [GearRatioCandidate], [(NumberPosition, Int)]) -> ([Char], [Char], [Char]) -> ([Int], [GearRatioCandidate], [(NumberPosition, Int)])
getGearRatios (gearRatios, gearRatioCandidates, lastNumberPositions) (above, line, below) =
    let numberPositions = getNumberPositions above line below
        (fiteredGearRatioCandidates, filteredNumberPositions) = filterGearRatioCandidates gearRatioCandidates numberPositions
        (newGearRatioCandidates, filteredNumberPositions2) = getNewGearRatioCandidates (map fst lastNumberPositions) filteredNumberPositions
        newGearRatioCandidates2 = newGearRatioCandidates ++ nubBy (==) (getNewGearRatioCandidatesFromLine filteredNumberPositions2 filteredNumberPositions2) -- add GRC from within current line with getNumberPositionOverlapWithNumberPositions
        --move TOP newGearRatioCandidates2 to fiteredGearRatioCandidates
        (newGearRatioCandidates3, fiteredGearRatioCandidates2) = foldl (\(ngrc, fgrc) (v, (i, lp)) -> 
                                                                        if lp == Top then (ngrc, (v, (i, lp)):fgrc) 
                                                                        else ((v, (i, lp)):ngrc, fgrc)) 
                                                                        ([], fiteredGearRatioCandidates) newGearRatioCandidates2
    in (gearRatios ++ map fst fiteredGearRatioCandidates2,
        newGearRatioCandidates3,
        map (\fnp -> (fnp, 1)) (filter (\(_, (_, lp)) -> lp == Middle || lp == Bottom) filteredNumberPositions2)
        ++ map (\(fnp, i) -> (fnp, i+1)) (filter (\((_, (_, lp)), i) -> i < 2 && lp == Bottom) lastNumberPositions)
        )

doPart2 :: IO ()
doPart2 = do
    fileContents <- readFile "C:\\Users\\Administrator\\source\\repos\\AdventOfCode\\inputFiles\\2023_3.txt"
    let lineList = splitOn "\n" fileContents
        (gearRatios, _, _) = foldl getGearRatios ([], [], []) (tuplify3Consecutive (("":lineList) ++ [""]))
    let gearRatioSum = sum gearRatios
    print gearRatioSum
    return ()