The Devil is in the Data website presents solutions to Project Euler problems in the R language for statistical computing. Project Euler (named after Swiss mathematician Leonhard Euler) is a competition with computational problems. Participants solve these problems with computer code written in many different languages.

This blog describes each problem in detail, including theoretical background and the complete code to find the solution. Some of the early problems are quite trivial, but they provide an excellent introduction to coding in R.

My aim is to complete the first 100 Euler problems and post a solution every Thursday (Australian time). I don’t claim to be an expert in R and love to read comments with improved alternative solutions.

If you are also solving Euler problems, then it would be fun to connect. My friend key is: 1009266_RbWqFI3r9PF44pD2d66syqRcmEaIqWi3.

Spoiler Alert: If you believe that the solutions to Project Euler problems should not be shared then do not read any further.

Euler Problem 29 is another permutation problem that is quite easy to solve using brute force. The MathBlog site by Kristian Edlund has a nice solution using only pen and paper.

Raising a number to a power can have interesting results. The video below explains why this pandigital formula approximates $e$ to billions of decimals:

$(1 + 9^{-4^{6 \times 7}})^{3^{2^{85}}} \approx e$

Euler Problem 29 Definition

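Consider all integer combinations of $a^b$ for $2 \leq a \leq 5$ and $2 \leq b \leq 5$:

$2^2 = 4$, $2^3 = 8$, $2^4 = 16$, $2^5 = 32$
$3^2 = 9$, $3^3 = 27$, $3^4 = 81$, $3^5 = 243$
$4^2 = 16$, $4^3 = 64$, $4^4 = 256$, $4^5 = 1024$
$5^2 = 25$, $5^3 = 125$, $5^4 = 625$, $5^5 = 3125$

If they are then placed in numerical order, with any repeats removed, we get the following sequence of 15 distinct terms:

4, 8, 9, 16, 25, 27, 32, 64, 81, 125, 243, 256, 625, 1024, 3125

How many distinct terms are in the sequence generated by $a^b$ for $2 \leq a \leq 100$ and $2 \leq b \leq 100$?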

This code simply calculates all powers from $2^2$ to $100^{100}$ and determines the number of unique values. Since we are only interested in their uniqueness and not the precise value, there is no need to use Multiple Precision Arithmetic.

# Initialisation
target <- 100
terms <- vector()
i <- 1
# Loop through values of a and b and store powers in vector
for (a in 2:target) {
    for (b in 2:target) {
        terms[i] <- a^b
        i <- i + 1
    }
}
# Determine the number of distinct powers
answer <- length(unique(terms))
print(answer)
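
The double loop can also be written as one vectorised expression. The sketch below is not from the original post: it assumes base R's outer function, and the helper name distinct_powers is made up for this illustration. It relies on the same floating-point comparison as the loop, so it is only a cross-check.

```r
# Count the distinct values of a^b for 2 <= a, b <= target (hypothetical helper)
distinct_powers <- function(target) {
    powers <- outer(2:target, 2:target, "^")  # matrix holding every a^b
    length(unique(as.vector(powers)))
}

# Worked example from the problem definition (a and b between 2 and 5)
print(distinct_powers(5))
```

Calling distinct_powers(100) should give the same count as the loop.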

Euler Problem 28 takes us to the world of the Ulam Spiral. This is a spiral that contains sequential positive integers in a square spiral, marking the prime numbers. Stanislaw Ulam discovered that a lot of primes are located along the diagonals. These diagonals can be described as polynomials. The Ulam Spiral is thus a way of generating quadratic primes (Euler Problem 27).

Ulam Spiral (WikiMedia).

Euler Problem 28 Definition

Starting with the number 1 and moving to the right in a clockwise direction, a 5 by 5 spiral is formed as follows:

21 22 23 24 25
20  7  8  9 10
19  6  1  2 11
18  5  4  3 12
17 16 15 14 13

It can be verified that the sum of the numbers on the diagonals is 101. What is the sum of the numbers on the diagonals in a 1001 by 1001 spiral formed in the same way?

Proposed Solution

To solve this problem we do not need to create a matrix. This code calculates the values of the corners of a matrix with size $n$. The lowest corner number in the matrix with size $n$ is $n(n - 3) + 3$. The corner numbers increase by $n - 1$.

The code steps through all matrices from size 3 to 1001. The solution uses only the odd-sized matrices because these have a centre. The answer to the problem is the sum of all corner numbers plus the 1 in the centre.

size <- 1001 # Size of matrix
answer <- 1 # Starting number
# Define corners of subsequent matrices
for (n in seq(from = 3, to = size, by = 2)) {
    corners <- seq(from = n * (n - 3) + 3, by = n - 1, length.out = 4)
    answer <- answer + sum(corners)
}
print(answer)
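
The loop can be collapsed into a single sum. This is a sketch rather than the original method: it uses the fact that the four corners of a ring of size n are n^2, n^2 - (n - 1), n^2 - 2(n - 1) and n^2 - 3(n - 1), which add up to 4n^2 - 6(n - 1). The function name diagonal_sum is hypothetical.

```r
# Sum of both diagonals of a number spiral (size must be odd and at least 3)
diagonal_sum <- function(size) {
    n <- seq(from = 3, to = size, by = 2)  # sizes of the successive rings
    # Each ring adds its four corners: 4 * n^2 - 6 * (n - 1)
    1 + sum(4 * n^2 - 6 * (n - 1))
}

print(diagonal_sum(5))     # the 5 by 5 example from the definition: 101
print(diagonal_sum(1001))
```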

Plotting the Ulam Spiral

We can go beyond Euler Problem 28 and play with the mathematics. This code snippet plots all the prime numbers in the Ulam Spiral. Watch the video for an explanation of the patterns that appear along the diagonals.

Ulam Spiral prime numbers.

The code creates a matrix of the required size and fills it with the Ulam Spiral. The code then identifies all primes using the is.prime function from Euler Problem 7. A heat map visualises the results.
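
A minimal sketch of the approach described above, under stated assumptions: the spiral is filled clockwise as in Euler Problem 28, and is_prime_simple is a stand-in trial-division test written for this example, not the is.prime function from Euler Problem 7. The names spiral_matrix and is_prime_simple are hypothetical.

```r
# Stand-in primality test by trial division (assumption, not the original is.prime)
is_prime_simple <- function(n) {
    if (n < 2) return(FALSE)
    if (n < 4) return(TRUE)
    if (n %% 2 == 0) return(FALSE)
    d <- 3
    while (d * d <= n) {
        if (n %% d == 0) return(FALSE)
        d <- d + 2
    }
    TRUE
}

# Fill a size x size matrix (size odd) with a clockwise spiral from the centre
spiral_matrix <- function(size) {
    m <- matrix(0, nrow = size, ncol = size)
    row <- col <- (size + 1) / 2
    value <- 1
    m[row, col] <- value
    d_row <- c(0, 1, 0, -1)  # right, down, left, up
    d_col <- c(1, 0, -1, 0)
    dir <- 1
    step <- 1
    while (value < size^2) {
        for (turn in 1:2) {  # every step length is used for two sides
            for (s in seq_len(step)) {
                if (value == size^2) break
                row <- row + d_row[dir]
                col <- col + d_col[dir]
                value <- value + 1
                m[row, col] <- value
            }
            dir <- dir %% 4 + 1
        }
        step <- step + 1
    }
    m
}

size <- 101
spiral <- spiral_matrix(size)
primes <- matrix(sapply(spiral, is_prime_simple), nrow = size)
# Flip and transpose so the plot has the same orientation as the printed matrix
image(t(primes[size:1, ]), col = c("white", "black"), axes = FALSE)
```

With size set to 5, spiral_matrix reproduces the example spiral from the problem definition.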

Prime numbers are the engine of the Internet economy. One of the reasons prime numbers are so useful is that no simple formula is known that generates all of them. This has not stopped mathematicians from trying to find formulas to generate prime numbers.

Euler problem 27 deals with quadratic formulas that can be used to generate sets of prime numbers. We have already discussed this in the post about the Ulam Spiral. This Numberphile video discusses quadratic primes.

Euler Problem 27 Definition

Euler discovered the remarkable quadratic formula:

$n^2 + n + 41$

It turns out that the formula will produce 40 primes for the consecutive integer values $0 \leq n \leq 39$. However, when $n = 40$, $40^2 + 40 + 41 = 40(40 + 1) + 41$ is divisible by 41, and certainly when $n = 41$, $41^2 + 41 + 41$ is clearly divisible by 41.

The incredible formula $n^2 - 79n + 1601$ was discovered, which produces 80 primes for the consecutive values $0 \leq n \leq 79$. The product of the coefficients, $-79$ and $1601$, is $-126479$.

Considering quadratics of the form: $n^2 + an + b$,

where $|a| < 1000$ and $|b| \leq 1000$, where $|n|$ is the modulus/absolute value of $n$, e.g. $|11| = 11$ and $|-4| = 4$.

Find the product of the coefficients, $a$ and $b$, for the quadratic expression that produces the maximum number of primes for consecutive values of $n$, starting with $n = 0$.
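
The two formulas quoted in the definition can be checked with a few lines of R. This is a sketch for illustration only: is_prime_simple is a hypothetical trial-division helper written for this example, not the is.prime function used elsewhere on this blog, and consecutive_primes is likewise a made-up name.

```r
# Hypothetical helper: trial-division primality test that rejects values below 2
is_prime_simple <- function(x) {
    if (x < 2) return(FALSE)  # negative numbers, 0 and 1 are not prime
    if (x < 4) return(TRUE)
    if (x %% 2 == 0) return(FALSE)
    divisors <- seq(3, max(3, floor(sqrt(x))), by = 2)
    all(x %% divisors != 0)
}

# Number of consecutive n = 0, 1, 2, ... for which n^2 + a * n + b is prime
consecutive_primes <- function(a, b) {
    n <- 0
    while (is_prime_simple(n^2 + a * n + b)) n <- n + 1
    n
}

print(consecutive_primes(1, 41))      # Euler's formula
print(consecutive_primes(-79, 1601))  # the second formula in the definition
```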

Proposed Solution

The only way to solve this problem is through brute force, reducing the solution space to optimise for speed (source: mathblog.dk). Because the outcome of the equation must be prime for $n = 0$, $b$ also has to be prime. We can use the prime sieve from Euler Problem 3, which reduces the candidates for $b$ from 2000 to 168 options. When we insert $n = 1$, the expression becomes $1 + a + b$. If $1 + a + b$ has to be prime and $b$ has to be an odd prime number, then $a$ can only be an odd number. However, when $b = 2$, $a$ has to be even.

Euler Problem 27 code

# esieve() and is.prime() are defined in the posts for Euler Problems 3 and 7
seq.b <- esieve(1000) # b has to be prime
max.count <- 0
for (b in seq.b) {
    if (b == 2) {
        seq.a <- seq(-998, 998, 2) # a has to be even when b = 2
    } else {
        seq.a <- seq(-999, 999, 2) # a has to be odd when b is an odd prime
    }
    for (a in seq.a) {
        n <- 0 # Find sequence of primes for a and b
        while (is.prime(n^2 + a * n + b)) {
            n <- n + 1
        }
        # Store maximum values
        if (n > max.count) {
            max.count <- n
            max.a <- a
            max.b <- b
        }
    }
}
answer <- max.a * max.b
print(answer)

A few years ago a fraction broke the internet. What happens when you divide 1 by 998001?

What is special about this fraction is that it lists every three-digit number except for 998. Look carefully at the sequence to see that it is 000, 001, 002, 003, 004, 005 and so on. After it has reached 999, the sequence continues from the start. This fraction thus has 2997 recurring decimals. James Grime from Numberphile explains this mathematical oddity with his usual enthusiasm.

The decimal fraction of 1/998001 is a recurring decimal. These are decimal numbers with periodic digits (repeating its values at regular intervals). Euler problem 26 asks us to analyse recurring decimals (reciprocal cycles).

Euler Problem 26 Definition

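A unit fraction contains 1 in the numerator. The decimal representation of the unit fractions with denominators 2 to 10 are given:

1/2 = 0.5
1/3 = 0.(3)
1/4 = 0.25
1/5 = 0.2
1/6 = 0.1(6)
1/7 = 0.(142857)
1/8 = 0.125
1/9 = 0.(1)
1/10 = 0.1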

Where 0.1(6) means 0.166666…, and has a 1-digit recurring cycle. It can be seen that 1/7 has a 6-digit recurring cycle. Find the value of d < 1000 for which 1/d contains the longest recurring cycle in its decimal fraction part.

Solution

A051626 describes the length of the recurring numbers in 1/n in the On-Line Encyclopaedia of Integer Sequences. To solve Euler Problem 26, we need to generate the first 1000 numbers of this sequence and find out which number has the longest recurring cycle.

R can only display up to 22 decimals by using options(digits=22). The base R capability is unsuitable for solving this problem, so I wrote some code to perform long division the old-fashioned way.

The recur function divides 1 by any arbitrary integer. The code continues until the decimal terminates, for example 1/4 = 0.25, or when a recurring pattern emerges, e.g. 1/7 = 0.(142857).

The function has two arguments: x is the input number. The output argument determines the outcome of the function: “len” returns the length of the recurring decimals. Any other value shows the result of the calculation as a string. Using the European notation, the recurring part of the decimals is shown between brackets, e.g. 1/14 = 0.0(714285).

recur <- function(x, output = "") {
    # Prepare variable
    if (x == 0) return(NaN)
    if (x == 1) return(0)
    x <- floor(abs(x))
    # Initiate vectors to store decimals and remainders
    dec <- vector()
    rem <- vector()
    # Initiate values
    i <- 1
    r <- 10
    rem <- r
    # Long division
    repeat {
        dec[i] <- floor(r / x)
        r <- 10 * (r %% x)
        # Test whether the number is terminating or repeating
        if (r == 0 || r %in% rem) break
        rem[i + 1] <- r
        i <- i + 1
    }
    # Determine number of recurring digits
    rep <- ifelse(r != 0, length(rem) - which(r == rem) + 1, 0)
    # Output
    if (output == "len") {
        return(rep)
    } else {
        if (rep != 0) {
            if (rep == length(dec)) {
                l <- "("
            } else {
                l <- c(dec[1:(length(dec) - rep)], "(")
            }
            dec <- c(l, dec[(length(dec) - rep + 1):length(dec)], ")")
        }
        return(paste0("0.", paste0(dec, collapse = "")))
    }
}
A051626 <- sapply(1:1000, recur, "len")
answer <- which.max(A051626)
print(answer)
recur(998001)

You can view the latest version of this code on GitHub.
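
If only the length of the cycle is needed, the long division can be reduced to tracking remainders. The following is a sketch of that idea and not the recur function from this post; cycle_length is a hypothetical name. A cycle starts as soon as a remainder repeats, and a remainder of zero means the decimal terminates.

```r
# Length of the recurring cycle of 1/d (0 when the decimal terminates)
cycle_length <- function(d) {
    seen <- integer(d)  # seen[r + 1] is the step at which remainder r first appeared
    r <- 1 %% d
    step <- 1
    while (r != 0 && seen[r + 1] == 0) {
        seen[r + 1] <- step
        r <- (r * 10) %% d
        step <- step + 1
    }
    if (r == 0) 0 else step - seen[r + 1]
}

print(cycle_length(7))  # 1/7 = 0.(142857)
lengths <- sapply(1:999, cycle_length)
print(which.max(lengths))
```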

The Fibonacci Sequence occurs in nature: The nautilus shell.

Euler Problem 25 takes us back to the Fibonacci sequence and the problems related to working with very large integers.

The Fibonacci sequence follows a simple mathematical rule but it can create things of great beauty. This pattern occurs quite often in nature, like the nautilus shell shown in the image. The video by Arthur Benjamin at the end of this post illustrates some of the magic of this sequence.

Large Integers in R

By default, R prints numbers with seven significant digits and switches to scientific notation for large values, which hides the exact value. You can increase the number of printed digits with the options function, up to a maximum of 22. However, R stores numbers as double-precision floating-point values, so integers above $2^{53}$ (about 16 digits) can no longer be represented exactly.
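
A minimal illustration of this limit, assuming R's standard double-precision numbers (53 bits of precision). The variable names are chosen for this example only.

```r
# Doubles carry 53 significant bits
precision_bits <- .Machine$double.digits
print(precision_bits)

# Above 2^53 not every integer can be stored: adding 1 changes nothing
limit <- 2^53
print(limit + 1 == limit)

# Printing more digits shows the stored value but does not add precision
print(limit + 1, digits = 22)
```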

The first solution uses the GMP library to manage very large integers. This library also contains a function to generate Fibonacci numbers. This solution cycles through the Fibonacci sequence until it finds a number with 1000 digits.

library(gmp) # GNU Multiple Precision Arithmetic Library
n <- 0
fib <- fibnum(n)
while (nchar(as.character(fib)) < 1000) {
    n <- n + 1
    fib <- fibnum(n) # Determine next Fibonacci number
}
answer <- n # n is the index of the Fibonacci number stored in fib
print(answer)

This is a very fast solution but my aim is to solve the first 100 Project Euler problems using only base-R code. The second solution uses the big.add function I developed to solve Euler Problem 13.

t <- proc.time()
fib <- 1 # First Fibonacci number
cur <- 1 # Current number in sequence
pre <- 1 # Previous number in sequence
index <- 2
while (nchar(fib) < 1000) {
    fib <- big.add(cur, pre) # Determine next Fibonacci number
    pre <- cur
    cur <- fib
    index <- index + 1
}
answer <- index
print(answer)

This code is much slower than the GMP library but it was fun to develop.
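
The index can also be estimated without any big-number arithmetic. This sketch is not part of the original solutions: it assumes Binet's approximation $F_n \approx \varphi^n / \sqrt{5}$, which is close enough for counting the digits of large Fibonacci numbers.

```r
phi <- (1 + sqrt(5)) / 2  # golden ratio
digits <- 1000
# F_n has at least `digits` digits when n * log10(phi) - log10(sqrt(5)) >= digits - 1
index <- ceiling((digits - 1 + log10(sqrt(5))) / log10(phi))
print(index)
```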

Euler Problem 24 asks to develop lexicographic permutations which are ordered arrangements of objects in lexicographic order. Tushar Roy of Coding Made Simple has shared a great introduction on how to generate lexicographic permutations.

Euler Problem 24 Definition

A permutation is an ordered arrangement of objects. For example, 3124 is one possible permutation of the digits 1, 2, 3 and 4. If all of the permutations are listed numerically or alphabetically, we call it lexicographic order. The lexicographic permutations of 0, 1 and 2 are:

012 021 102 120 201 210

What is the millionth lexicographic permutation of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9?

The digits 0 to 9 have $10! = 3628800$ permutations (including combinations that start with 0). Most of these permutations are, however, not in lexicographic order. A brute-force way to solve the problem is to determine the next lexicographic permutation of a number string and repeat this one million times.

nextPerm <- function(a) {
    # Find longest non-increasing suffix
    i <- length(a)
    while (i > 1 && a[i - 1] >= a[i])
        i <- i - 1
    # i is the head index of the suffix
    # Are we at the last permutation?
    if (i <= 1) return (NA)
    # a[i - 1] is the pivot
    # Find rightmost element that exceeds the pivot
    j <- length(a)
    while (a[j] <= a[i - 1])
        j <- j - 1
    # Swap pivot with j
    temp <- a[i - 1]
    a[i - 1] <- a[j]
    a[j] <- temp
    # Reverse the suffix
    a[i:length(a)] <- rev(a[i:length(a)])
    return(a)
}
numbers <- 0:9
for (i in 1:(1E6 - 1)) numbers <- nextPerm(numbers)
answer <- numbers
print(answer)

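The nextPerm function follows these steps:

Find the largest index i such that a[i - 1] < a[i]. If no such index exists, then this is already the last permutation.

Find the largest index j such that j >= i and a[j] > a[i - 1].

Swap a[j] and a[i - 1].

Reverse the suffix starting at a[i].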

Combinatorics

A more efficient solution is to use combinatorics, thanks to MathBlog. The last nine digits can be ordered in $9! = 362880$ ways. So the first 362880 permutations start with a 0. By extending this thought, it follows that the millionth permutation must start with a 2, because $2 \times 362880 = 725760 < 1000000 \leq 3 \times 362880$.

From this rule, it follows that the 725761st permutation is 2013456789. We now need 274239 more lexicographic permutations:

$1000000 - 725761 = 274239$

We can repeat this logic to find the next digit. The last 8 digits can be ordered in $8! = 40320$ ways. Because $\lfloor 274239 / 40320 \rfloor = 6$, we skip six of the remaining digits (0, 1, 3, 4, 5, 6, 7, 8, 9), so the second digit is the seventh one, which is 7.

This process is repeated until all digits have been used.
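
The repeated division can be sketched in a few lines of base R. This is an illustration of the combinatorial idea and not code from MathBlog; the function name kth_permutation is hypothetical.

```r
# Find the k-th lexicographic permutation without enumerating the others
kth_permutation <- function(digits, k) {
    digits <- sort(digits)
    k <- k - 1  # work with a zero-based rank
    result <- c()
    while (length(digits) > 0) {
        block <- factorial(length(digits) - 1)  # permutations per leading digit
        idx <- k %/% block                      # number of leading digits to skip
        result <- c(result, digits[idx + 1])
        digits <- digits[-(idx + 1)]
        k <- k %% block                         # rank within the chosen block
    }
    paste(result, collapse = "")
}

print(kth_permutation(0:2, 4))     # fourth permutation of 0, 1, 2
print(kth_permutation(0:9, 1e6))
```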

Euler Problem 23 deals with abundant numbers. These are numbers for which the sum of their proper divisors is greater than the number itself.

12 is an abundant number because the sum of its proper divisors (the aliquot sum) is larger than 12: (1 + 2 + 3 + 4 + 6 = 16).

All highly composite numbers or anti-primes greater than six are abundant numbers. These are numbers that have so many divisors that they are considered the opposite of primes, as explained in the Numberphile video below.

Euler Problem 23 Definition

A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.

A number n is called deficient if the sum of its proper divisors is less than n and it is called abundant if this sum exceeds n.

As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis, even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.

Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.
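
The three categories in this definition can be sketched as a small classifier. The helper names aliquot_sum and classify are made up for this illustration, and the divisor search is deliberately naive.

```r
# Sum of the proper divisors of n (naive search up to n / 2)
aliquot_sum <- function(n) {
    if (n < 2) return(0)
    candidates <- 1:(n %/% 2)
    sum(candidates[n %% candidates == 0])
}

# Classify n as deficient, perfect or abundant
classify <- function(n) {
    s <- aliquot_sum(n)
    if (s < n) "deficient" else if (s == n) "perfect" else "abundant"
}

print(sapply(c(8, 12, 28), classify))
```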

This solution repurposes the proper.divisors function introduced for Euler Problem 21. That function returns all divisors of a number, including the number itself, which is why the code subtracts n to obtain the aliquot sum. The first code snippet creates the sequence of all abundant numbers up to 28123 (sequence A005101 in the OEIS). An abundant number is one where its aliquot sum is larger than n.

# Generate abundant numbers (OEIS A005101)
A005101 <- function(x){
    abundant <- vector()
    a <- 1
    for (n in 1:x) {
        aliquot.sum <- sum(proper.divisors(n)) - n
        if (aliquot.sum > n) {
            abundant[a] <- n
            a <- a + 1
        }
    }
    return(abundant)
}
abundant <- A005101(28123)

The solution to this problem is also a sequence in the Online Encyclopedia of Integer Sequences (OEIS A048242). This page states that the highest number in this sequence is 20161, not 28123 as stated in the problem definition.

The second section of code creates a list of all candidate numbers up to 20161, any of which might not be the sum of two abundant numbers. The loop then sieves every sum of two abundant numbers from the list. The answer is the sum of the numbers that remain.

# Create a list of potential numbers that are not the sum of two abundant numbers
A048242 <- 1:20161
# Remove any number that is the sum of two abundant numbers
for (i in 1:length(abundant)) {
    for (j in i:length(abundant)) {
        if (abundant[i] + abundant[j] <= 20161) {
            A048242[abundant[i] + abundant[j]] <- NA
        }
    }
}
A048242 <- A048242[!is.na(A048242)]
answer <- sum(A048242)
print(answer)

Wacław Sierpiński was a mathematical genius who developed several of the earliest fractals. The Sierpiński triangle is an easy to conceptualise geometrical figure but it hides a fascinating mathematical complexity. Start by drawing an equilateral triangle and draw another one, upside down, in its centre. Then repeat this step in each of the three outer triangles that result, and so on, ad infinitum.

The original Sierpinski triangle will eventually disappear into Cantor dust, a cloud of ever shrinking triangles of infinitesimal size. The triangle is self-similar: no matter how far you zoom in, the basic geometry remains the same.

The Chaos Game

A fascinating method to create a Sierpinski Triangle is a chaos game. This method uses random numbers and some simple arithmetic rules. Sierpinski Triangles can be created using the following six steps:

Define three points in a plane to form a triangle.

Randomly select any point on the plane.

Randomly select any one of the three triangle points.

Move half the distance from your current position to the selected vertex.

Plot the current position.

Repeat from step 3.

This fractal is an application of chaos theory, as this random process converges to a complex ordered geometry. The game only works with random numbers and when selecting random vertices of the triangle.

Sierpinski Triangle Code

This code implements the six rules in R. The code first initializes the triangle, defines a random starting point and then runs a loop to place random dots. The R plot engine does not draw pixels but uses characters, which implies that the diagram is not as accurate as it could be but the general principle is clear. The x11() and Sys.sleep() commands are used to plot during the for-loop.

# Sierpinski Triangle
# Initialise triangle
p <- c(0, 500, 1000)
q <- c(0, 1000, 0)
x11()
par(mar = rep(0, 4))
plot(p, q, col = "red", pch = 15, cex = 1, axes = FALSE)
# Random starting point
x <- sample(0:1000, 1)
y <- sample(0:1000, 1)
# Chaos game
for (i in 1:10000) {
    Sys.sleep(.001)
    n <- sample(1:3, 1)
    x <- floor(x + (p[n] - x) / 2)
    y <- floor(y + (q[n] - y) / 2)
    points(x, y, pch = 15, cex = 0.5)
}

This algorithm demonstrates how a seemingly chaotic process can result in order. Many other versions of chaos games exist, which I leave to the reader to play with. If you create your own versions then please share the code in the comment box below.

Euler problem 22 is another trivial one that takes us to the realm of ASCII codes. ASCII is a method to convert symbols into numbers, originally invented for telegraphs.

Back in the 8-bit days, ASCII art was a method to create images without using lots of memory. Each image consists of a collection of text characters that give the illusion of an image. Euler problem 22 is, unfortunately, a bit less poetic.

Euler Problem 22 Definition

Using names.txt, a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list. So, COLIN would obtain a score of 938 × 53 = 49,714.

What is the total of all the name scores in the file?

This code reads and cleans the file and sorts the names alphabetically. The charToRaw function determines the numerical value of each character in each name. The output of this function is the hex ASCII code for each character, which as.numeric converts to a decimal number. The letter A is number 65, so we subtract 64 from each value before adding the values for each name.

# ETL: reads the file and converts it to an ordered vector.
names <- readLines("https://projecteuler.net/project/resources/p022_names.txt", warn = FALSE)
names <- unlist(strsplit(names, ","))
names <- gsub("[[:punct:]]", "", names)
names <- sort(names)
# Total Name scores
answer <- 0
for (i in names) {
    value <- sum(sapply(unlist(strsplit(i, "")), function(x) as.numeric(charToRaw(x)) - 64))
    value <- value * which(names == i)
    answer <- answer + value
}
print(answer)
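
The scoring step can also be written without looking up each position with which. The sketch below assumes upper-case ASCII names, as in the Project Euler file; name_value and total_name_score are hypothetical helper names, and utf8ToInt is used in place of charToRaw.

```r
# Alphabetical value of one upper-case name (A = 1, ..., Z = 26)
name_value <- function(name) sum(utf8ToInt(name) - 64)

# Total score: each value multiplied by the name's position in the sorted list
total_name_score <- function(names) {
    names <- sort(names)
    values <- vapply(names, name_value, numeric(1))
    sum(values * seq_along(names))
}

print(name_value("COLIN"))  # 3 + 15 + 12 + 9 + 14
print(total_name_score(c("COLIN", "ANNA")))
```

Applied to the cleaned names vector, total_name_score(names) should give the same total as the loop.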

We can have a bit more fun with this problem by comparing this list with the most popular baby names in 2016. The first section of the code extracts the list of popular names from the website. The rest of the code counts the number of matches between the lists.

# Most popular baby names
library(rvest)
url <- "https://www.babycenter.com/top-baby-names-2016.htm"
babynames <- url %>%
    read_html() %>%
    html_nodes(xpath = '//*[@id="babyNameList"]/table') %>%
    html_table()
babynames <- babynames[[1]]
# Convert Project Euler list and test for matches
proper <- function(x) paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2)))
names <- proper(names)
sum(babynames$GIRLS %in% names)
sum(babynames$BOYS %in% names)

Euler problem 21 takes us to the realm of amicable numbers, which are listed in sequence A259180 in the OEIS. Amicable, or friendly, numbers are the most romantic numbers known to maths. Amicable numbers serve absolutely no practical purpose, other than mathematical entertainment.

A related concept is a perfect number, which is a number that equals the sum of its proper divisors. Mathematicians have also defined sociable numbers and betrothed numbers which are similar to amicable numbers. But perhaps these are for another Euler problem.

Euler Problem 21 Definition

Let $d(n)$ be defined as the sum of proper divisors of $n$ (numbers less than $n$ which divide evenly into $n$). If $d(a) = b$ and $d(b) = a$, where $a \neq b$, then $a$ and $b$ are an amicable pair and each of $a$ and $b$ are called amicable numbers.

For example, the proper divisors of 220 are 1, 2, 4, 5, 10, 11, 20, 22, 44, 55 and 110; therefore $d(220) = 284$. The proper divisors of 284 are 1, 2, 4, 71 and 142; so, $d(284) = 220$.

Evaluate the sum of all the amicable numbers under 10000.

The first part of the code provides a function that lists all divisors of a given integer x, including x itself, so the loop subtracts the number to obtain the sum of its proper divisors. The loop determines the divisors for the numbers 220 to 10,000, calculates their sum and then checks if these numbers are amicable. When the code finds an amicable number, the counter jumps to the sum of the divisors to check for the next one.

proper.divisors <- function(x) {
    divisors <- vector()
    d <- 1
    for (i in 1:floor(sqrt(x))) {
        if (x %% i == 0) {
            divisors[d] <- i
            if (i != x / i) {
                d <- d + 1
                divisors[d] <- x / i
            }
            d <- d + 1
        }
    }
    return(divisors)
}
answer <- 0
n <- 220
while (n <= 10000) {
    div.sum <- sum(proper.divisors(n)) - n
    if (n == sum(proper.divisors(div.sum)) - div.sum && n != div.sum) {
        answer <- answer + n + div.sum
        print(paste0("(", n, ",", div.sum, ")"))
        n <- div.sum
    }
    n <- n + 1
}
print(answer)
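
An alternative to computing the divisors of each number separately is to build all divisor sums in one pass, in the style of a sieve. This is a sketch under that assumption and not the method used above; the variable names are chosen for this example.

```r
limit <- 10000
# d[n] accumulates the sum of the proper divisors of n
d <- numeric(limit)
for (i in 1:(limit %/% 2)) {
    multiples <- seq(2 * i, limit, by = i)  # i is a proper divisor of each of these
    d[multiples] <- d[multiples] + i
}

# Keep n when its partner d[n] is a different number in range that maps back to n
n <- 1:limit
partner <- d[n]
valid <- partner >= 1 & partner <= limit & partner != n
amicable <- n[valid][d[partner[valid]] == n[valid]]
print(amicable)
print(sum(amicable))
```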

Amicable numbers were known to the Pythagoreans, who credited them with many mystical properties. Before we had access to computers, finding amicable numbers was a task that required a lot of patience. No algorithm can systematically generate all amicable numbers, and until 1946 only 390 pairs were known. Medieval Muslim mathematicians developed several formulas to create amicable numbers, but the only way to be complete is using brute force.