Euler Problem 18 & 67: Maximum Path Sums

A pedigree is an example of a binary tree: Euler Problem 18

An example of a pedigree. Source: Wikimedia.

Euler Problem 18 and 67 are exactly the same besides that the data set in the second version is larger than in the first one. In this post, I kill two Eulers with one code.

These problems deal with binary trees, which is a data structure where each node has two children. A practical example of a binary tree is a pedigree chart, where each person or animal has two parents, four grandparents and so on.

Euler Problem 18 Definition

By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.

3
7 4
2 4 6
8 5 9 3

That is, 3 + 7 + 4 + 9 = 23. Find the maximum total from top to bottom of the triangle below:

75
95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23

As there are only 16,384 routes, it is possible to solve this problem by trying every route. However, Problem 67, is the same challenge with a triangle containing one-hundred rows; it cannot be solved by brute force, and requires a clever method! ;o)

Solution

This problem seeks a maximum path sum in a binary tree. The brute force method, as indicated in the problem definition, is a very inefficient way to solve this problem. The video visualises the quest for the maximum path, which takes eleven minutes of hypnotic animation.

A more efficient method is to define the maximum path layer by layer, starting at the bottom. The maximum sum of 2+8 or 2+5 is 10, the maximum sum of 4+5 or 4+9 is 13 and the last maximum sum is 15. These numbers are now placed in the next row. This process cycles until only one number is left. This algorithm solves the sample triangle in four steps:

Step 1:

3
7 4
2 4 6
8 5 9 3

Step 2:

3
7 4
10 13 15

Step 3:

3
20 19

Step 4:

23

In the code below, the data is triangle matrix. The variables rij (row) and kol (column) drive the search for the maximum path. The triangle for Euler Problem 18 is manually created and the triangle for Euler Problem 67 is read from the website.

path.sum <- function(triangle) {
    for (rij in nrow(triangle):2) {
        for (kol in 1:(ncol(triangle)-1)) {
            triangle[rij - 1,kol] <- max(triangle[rij,kol:(kol + 1)]) + triangle[rij - 1, kol]
        }
        triangle[rij,] <- NA
    }
    return(max(triangle, na.rm = TRUE))
}

# Euler Problem 18
triangle <- matrix(ncol = 15, nrow = 15)
triangle[1,1] <- 75
triangle[2,1:2] <- c(95, 64)
triangle[3,1:3] <- c(17, 47, 82)
triangle[4,1:4] <- c(18, 35, 87, 10)
triangle[5,1:5] <- c(20, 04, 82, 47, 65)
triangle[6,1:6] <- c(19, 01, 23, 75, 03, 34)
triangle[7,1:7] <- c(88, 02, 77, 73, 07, 63, 67)
triangle[8,1:8] <- c(99, 65, 04, 28, 06, 16, 70, 92)
triangle[9,1:9] <- c(41, 41, 26, 56, 83, 40, 80, 70, 33)
triangle[10,1:10] <- c(41, 48, 72, 33, 47, 32, 37, 16, 94, 29)
triangle[11,1:11] <- c(53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14)
triangle[12,1:12] <- c(70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57)
triangle[13,1:13] <- c(91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48)
triangle[14,1:14] <- c(63, 66, 04, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31)
triangle[15,1:15] <- c(04, 62, 98, 27, 23, 09, 70, 98, 73, 93, 38, 53, 60, 04, 23)

answer <- path.sum(triangle)
print(answer)

Euler Problem 67

The solution for problem number 67 is exactly the same. The data is read directly from the Project Euler website.

# Euler Problem 67
triangle.file <- read.delim("https://projecteuler.net/project/resources/p067_triangle.txt", stringsAsFactors = F, header = F)
triangle.67 <- matrix(nrow = 100, ncol = 100)
for (i in 1:100) {
    triangle.67[i,1:i] <- as.numeric(unlist(strsplit(triangle.file[i,], " ")))
}
answer <- path.sum(triangle.67)
print(answer)

Trumpworld Analysis : Ownership Relations in his Business Network

Trumpworld by BuzzfeedYou do not need a machine learning algorithm to predict that the presidency of Donald Trump will be controversial.

One of the most discussed aspects of his reign is the massive potential for conflicts of interest. Trump’s complex business empire is entangled with national and international politics.

Buzzfeed has mapped many of the relationships between businesses and people in what they have dubbed Trumpworld. They provided the data to enable citizens data science into the wheelings and dealings of Donald J. Trump. The raw data set consists of three subsets of connections between:

  • Organisations
  • People
  • People and organisations

Trumpworld Analysis

This post analyses the connections between organisations using the mighty igraph package in the R language. The code snippet below converts the data to a graph that can be analysed using social network analysis techniques. I have downloaded the table of the raw data file as CSV files. This data is subsetted to contain only ownership relationships.

# Read data
trumpworld.org <- read.csv("TrumpWorld DataOrg.csv")
trumpworld.org.ownership <- subset(trumpworld.org, Connection=="Ownership")[,1:2]

# Create graph of ownerships
library(igraph)
org.ownership <- graph.edgelist(as.matrix(trumpworld.org.ownership))

# Analysis
nrow(trumpworld.org.ownership)
length(unique(c(trumpworld.org.ownership[,1], trumpworld.org.ownership[,2])))

# Plot Graph
par(mar=rep(0,4))
plot(org.ownership,
 layout=layout.fruchterman.reingold,
 vertex.label=NA,
 vertex.size=2,
 edge.arrow.size=.1
)

Trumpworld analysis - business ownership networkNetwork Analysis

This network contains 309 ownership relationships between 322 firms.

When we plot the data, we see that most relationships are between two firms. The plot is organised with the Fruchterman-Reingold algorithm to improve its clarity.

We can also see a large cluster in the centre. The names have been removed for clarity.

The Trumpland analysis continues with this conglomerate. The second code section excises this connected subnetwork so we can analyse it in more detail.

# Find most connected firm
which.max(degree(org.ownership))
# Create subnetworks
org.ownership.d <- decompose(org.ownership)
# Find largest subnetwork
largest <- which.max(sapply(org.ownership.d, diameter))
#Plot largest subnetwork
plot(org.ownership.d[[largest]],
 layout=layout.fruchterman.reingold,
 vertex.label.cex=.5,
 vertex.size=5,
 edge.arrow.size=.1
)

Digging Deeper

The node with the highest degree identifies the business with the most holdings. This analysis shows that DJT Holdings LLC owns 33 other organisations. These organisations own other organisations. We can now use the cluster function to investigate this subnetwork.

Trumpworld holdings

This Trumpworld analysis shows that the ownership network is clearly a star network. DJT Holdings LLC centrally controls all organisations. Perhaps this graph visualises the management style of the soon to be president Trump. Trump centrally controls his empire, which is typical for a family business.

Does this chart visualise Trump’s leadership style? Is the star network an expression of his lack of trust and thus desire to oversee everything directly?