Matt Daniels is a designer, coder, and data scientist at Undercurrent in New York City. His past works include the Etymology of “Shorty” and Outkast, in graphs and charts. He decided to examine the vocabulary of hip hop artists, and this is what he found.
Literary elites love to rep Shakespeare’s vocabulary: across his entire corpus, he uses 28,829 unique words, suggesting he knew over 100,000 words and arguably had the largest vocabulary ever.
I decided to compare this data point against the most famous artists in hip hop. I used the first 35,000 words of each artist’s lyrics. That way, prolific artists, such as Jay-Z, could be compared to newer artists, such as Drake.
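The counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the exact pipeline behind the numbers here: the function name `unique_words_in_first_n` and the tokenization rule (lowercase everything, treat runs of letters with optional internal apostrophes as words) are assumptions made for the example.

```python
import re

# Assumed tokenization: a word is a run of letters, optionally joined by
# straight or curly apostrophes (so "let's" stays one token).
WORD_PATTERN = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def unique_words_in_first_n(lyrics: str, n: int = 35_000) -> int:
    """Count distinct words among the first n words of a lyrics string."""
    tokens = WORD_PATTERN.findall(lyrics.lower())
    # Capping at n words puts prolific and newer artists on the same footing.
    return len(set(tokens[:n]))


if __name__ == "__main__":
    sample = "Cash rules everything around me, cash"
    print(unique_words_in_first_n(sample, n=6))
```

Under this rule, different forms of a word (for example "pimp" and "pimping") count as separate unique words, since no stemming is applied.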
Wu-Tang Clan at #6 is fucking impressive given that 10 members, with vastly different styles, are equally contributing lyrics. Add the fact that GZA, Ghostface, Raekwon, and Method Man’s solo works are also in the top 20 – notably, GZA at #2. Perhaps their countless hours of studio time together (and RZA’s mentorship) exposed each rapper to the others’ vocabulary.
Let’s take a deeper look at Wu-Tang’s five studio albums to better understand each member’s contribution. Here’s a breakdown of the number and percent of words used by each member.
To understand each rapper’s vocabulary (# of unique words) in Wu-Tang’s first five albums, I chose a 3,500 word threshold so that each person was on an equal footing. That way, we could include GZA, but unfortunately had to exclude Ol’ Dirty Bastard, Cappadonna, and Masta Killa, who have too few verses across Wu-Tang’s corpus.
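Here is a rough sketch of how a per-member threshold like this could be applied. It assumes the threshold works as a cap: members with fewer words than the threshold are dropped, and everyone else is measured on their first `threshold` words. The function name `vocabulary_by_member`, the `(member, text)` input format, and the tokenization rule are all hypothetical choices for the example.

```python
import re
from collections import defaultdict

# Assumed tokenization: runs of letters with optional internal apostrophes.
WORD_PATTERN = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def vocabulary_by_member(verses, threshold=3_500):
    """Return {member: unique word count} over each member's first `threshold` words.

    `verses` is an iterable of (member, verse_text) pairs in release order.
    Members whose verses total fewer than `threshold` words are excluded.
    """
    tokens_by_member = defaultdict(list)
    for member, text in verses:
        tokens_by_member[member].extend(WORD_PATTERN.findall(text.lower()))

    return {
        member: len(set(tokens[:threshold]))
        for member, tokens in tokens_by_member.items()
        if len(tokens) >= threshold  # too few verses: cannot compare fairly
    }


if __name__ == "__main__":
    # Toy data with a tiny threshold, just to show the shape of the result.
    toy_verses = [
        ("Member A", "one two three"),
        ("Member B", "one"),
        ("Member A", "two four"),
    ]
    print(vocabulary_by_member(toy_verses, threshold=4))
```

With the toy data, "Member B" falls below the threshold and drops out, the same way the members with too few verses are excluded here.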
U-God and GZA clearly bolster the group’s average. Raekwon and Method Man’s contributions fall below those of the other members, but their data points would still exceed those of most artists in hip hop.