Motivation
Breaking Bad is one of the finest television shows I’ve seen. I recently binge-watched the entire five seasons of the show. In doing so, I noticed the unique presentation of the opening credits where cast members have an elemental symbol from the periodic table highlighted in their name, reflecting the importance of chemistry in the show (and also calling back to the design of the title). I wondered how the show runners might have done this, and thought about writing a program to do this “elementalization” automatically.
Problem
From observing how these elements are highlighted in the cast names, I made the following observations -
- every name gets exactly one element
- the position of the element doesn’t matter (could be in the first name, last name etc.)
- the same element can be assigned to multiple cast members (e.g. in S2E7 ‘Negro y Azul’,
Ar
is assigned to both actor Aaron Paul and producer Stewart A. Lyons) - elements don’t cross name boundaries (e.g.
Ar
cannot be assigned to Cynthia Rodgers) - case doesn’t matter - wherever the symbol gets assigned, the first letter is capitalized, and the rest are in lowercase
The problem then is to assign elemental symbols (e.g. H
, Li
, Na
) to a list of cast member names if that symbol
exists in their name in a valid way, which matches my earlier observations. Some names might have multiple possible
assignments, and others might have none. The program should return all possible valid assignments for each name.
Solution
As presented, the problem is easy to solve. Simply iterate through the list of names and for each, check if every possible elemental symbol is present in it and return that list. I manually compiled a list of elemental symbols in the periodic table from Ptable, which you can find here. Since case is unimportant, all the cast member and elemental symbol names are preprocessed to be in lowercase for simplicity. There might be names in which an element can be assigned to multiple places in the name, however I chose to leave this to the person responsible for the credits. The relevant snippet in Python is given below -
|
|
For the first-billed cast from the Pilot, we get the following output -
|
|
We could stop here. However, it was interesting to me to think about the additional constraint of restricting an element
from being used for more than one cast member. It might become repetitive if the common symbols like H
, S
and O
are reused. What would an algorithm to obey this additional constraint look like?
Bipartite Graphs and Maximum Matching
We cannot simply use a greedy algorithm of assigning the first available symbol to a name. Consider two hypothetical
names Br and B. Br can use the symbols B
and Br
, while B can only use the symbol B
. If we first assign B
to Br, then we won’t have any symbol for B, even though we would have had one had we used Br
in the first place.
We could think of backtracking and trying different assignments if we encounter an impossibility, but we wouldn’t know
if there actually is an impossibility or not until we exhaust all options, which could take a long time.
We can instead model this problem using a bipartite graph. On one side, we have a set of nodes, each containing the name of a cast member. On the other side, we have another set of nodes, each containing an elemental symbol. Using the previous solution, we can construct edges between these two sets of nodes, connecting a name to every elemental symbol that can be assigned to it. We need to choose a set of edges (which connect a name to an elemental symbol which can be assigned to it) such that no two edges coincide (because that means we are assigning two elements to the same name) and every name-node lies on an edge in the selection (because we don’t want to leave a name unassigned if we can help it).
This is exactly the problem of finding a maximum
matching in a bipartite graph. There are many well-known
algorithms to solve this. We will use the Hopcroft-Karp
algorithm as implemented in the
networkx
package.
|
|
The output using the cast from the Pilot again is -
|
|
I can imagine the show runners might want to focus the more “dramatic” elemental symbols (i.e. the two-character ones)
on the first-billed cast displayed first (e.g. assigning Br
to Bryan Cranston instead of Betsy Brandt). I can also
imagine one might want to iterate through multiple assignments to find something more appropriate than the first one
returned (I don’t know how the minds of TV show creators work). Refer to Uno, T.
(1997) which proposes an algorithm for this, since it isn’t
directly implemented in networkx
.
Conclusion
The usage of a classic graph algorithm to solve a problem I noticed in the world was satisfying to me. I can think of a few additional questions to ask like -
- Which cast member can be mapped on to the most elements? What about the least number of elements?
- What is the longest name which cannot be mapped onto elements? Is this the same in names across various cultures?