Wednesday, August 21, 2013

First blurb: "Absolutely groundbreaking"


Credit for the first Who's Bigger? blurb goes to Dr. Eric Siegel, founder of Predictive Analytics World and author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die:
"Absolutely groundbreaking: The first full scale, data driven undertaking to weigh the historical and cultural impact of persons. This work injects a much needed dose of quantitative rigor into the field of history itself. How do the greatest legacies of yesteryear stack up, not only against one another, but against the power of today's celebrity royalty? This thorough treatment illuminates, validates, and even augments history as a discipline."

Thursday, August 8, 2013

Ranking TenGrade

There are many ways to rank the world. Who's Bigger? does it through an extensive computational analysis of Wikipedia and the content of scanned books. The company I co-founded, General Sentiment, does it through NLP sentiment analysis of the text of news and social media.

A new startup company, TenGrade, uses a different approach: explicitly asking people where something (anything in the universe) ranks on a 0 to 10 scale. They make this fun and easy using a mobile app, and then let you compare your rankings with various slices of whatever community you care about and see how they change over time.

It is a very interesting approach. To paraphrase Yogi Berra, you can learn a lot from listening to people. As I write this, Pizza is a 7.9, whereas Grapefruit is a 6.8. Barack Obama is a 6.0, which puts him ahead of disgraced baseball player Alex Rodriguez (A-Rod) at 1.0.

Their rankings have the advantage of clear numerical interpretability, although I think they will eventually discover the need to normalize individual rankings: a 7.0 given by a sourpuss means something different from a 7.0 granted by some easy mark. It will be fascinating to see what kinds of things people will feel driven to state an opinion on. As an academic, I would love to experiment with their data and see what we could do with it.
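To make the sourpuss/easy-mark point concrete, here is a minimal sketch of one common way to normalize individual ratings: convert each score to a z-score relative to that rater's own mean and spread. This is only an illustration, not anything TenGrade is known to do; the function name, the rater names, and the scores are all made up.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_by_rater(ratings):
    """Convert raw scores to per-rater z-scores.

    ratings: list of (rater, item, score) tuples (hypothetical format).
    Returns a list of (rater, item, z) tuples in the same order.
    """
    # Collect every score each rater has given.
    by_rater = defaultdict(list)
    for rater, _item, score in ratings:
        by_rater[rater].append(score)

    # Each rater's personal mean and (population) standard deviation.
    stats = {r: (mean(s), pstdev(s)) for r, s in by_rater.items()}

    normalized = []
    for rater, item, score in ratings:
        mu, sigma = stats[rater]
        # A rater who gives everything the same score carries no signal.
        z = (score - mu) / sigma if sigma > 0 else 0.0
        normalized.append((rater, item, z))
    return normalized


# Made-up data: both raters give pizza a 7.0.
ratings = [
    ("sourpuss", "pizza", 7.0),
    ("sourpuss", "grapefruit", 3.0),
    ("sourpuss", "traffic", 2.0),
    ("easy_mark", "pizza", 7.0),
    ("easy_mark", "grapefruit", 9.0),
    ("easy_mark", "traffic", 8.0),
]

for rater, item, z in normalize_by_rater(ratings):
    print(f"{rater:10s} {item:12s} {z:+.2f}")
```

After normalization the same raw 7.0 comes out positive for the sourpuss (it is that rater's favorite) and negative for the easy mark (it is that rater's least favorite), which is exactly the distinction the raw numbers hide.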

TenGrade may face a bit of a chicken-and-egg problem: the rewards from ranking something come from seeing what others say, but until they achieve critical mass there may not be enough content there to make it compelling. Why will people bother to give their opinions ranking everything in the universe? I don't know, but frankly I didn't know why anyone would tweet, post on Facebook, or enter Amazon reviews or complaints on TripAdvisor, either.

Friday, August 2, 2013

"Human Accomplishment", by Charles Murray

It is a strange feeling to finish writing a book, and then suddenly discover an earlier effort with similar interests, methodologies, and ambition.  This book is Human Accomplishment, published in 2003 by Charles Murray, best known for his more controversial work, The Bell Curve.

First, don't worry: this book does not make Who's Bigger? redundant.   We address different questions, about different people, in different ways, with different styles. But if you liked Human Accomplishment, you're going to love our book. :-) There are two basic similarities:
  • Both books measure the historical magnitude of important figures statistically, through analysis of the written record left behind in books and reference works.
  • Both books quest for themes of broader significance, while simultaneously enjoying the parlor-game thrills of deciding who ranks where.
We will give a more thorough analysis in future postings, but it seems useful to record quick impressions of the similarities and differences of our respective books.
    The ranking methodologies are similar in spirit but differ substantially in how they were done.   When writing his book between 1997 and 2002 Murray had access to an important lost technology, called a "secretary", who could manually curate a spreadsheet-scale data set.  By contrast, we are computer scientists who did a Big Data analysis of gigabytes of text.  These differences show up in the properties of our rankings:
    • We rank 850,000 people in all domains of interest, while Murray is interested in the top 4,000 figures in the arts and sciences.
    • We rank figures from the beginning of time until today, while Murray considers only those active before 1950 to eliminate contemporary biases.
    • Our rankings permit direct comparisons of significance to people across different domains. Was Shakespeare bigger than Newton?   We say yes, but Murray is not as interested in such comparisons.
    • Our rankings come from a computational analysis of Wikipedia, whereas he performs a statistical analysis of mentions in selected books and reference works.
    • We identify two different factors (celebrity and gravitas), permitting us to attribute historical significance appropriately for any given figure.   Murray is really only interested in the factor we call gravitas.
    The other differences between our books reflect the scope of questions which we believe can be addressed by this methodology.   Murray is interested in a set of big-picture questions comparing clusters, such as how much bigger the accomplishments of the West are than those of the East, or why certain religious/cultural groups punch above their weight.   His rankings seek to measure genius or greatness in an objective-enough manner to be accepted as a ground truth.

    We are less certain that our own rankings measure virtue unalloyed with notoriety.   The cultural biases inherent in depending on the English-language Wikipedia view of the world seem obvious.  The serious issues we care about more concern the processes of fame and recall.  Are women underrepresented in the historical record?  (Yes.)   Do textbooks and expert panels do a good job of recognizing historical significance, even in retrospect?  (Not really.)   How does interest in historical personages fade with time?  (In a generally predictable manner over 170 years from birth.) We are more interested in historiography -- why are people remembered -- than in history -- why should people be remembered.

    We will soon do a more detailed comparison of our rankings, but my sense is that Murray did a good job at what he sought to measure.   Both of our books are proud of the banality of our respective rankings, meaning that we expect the bulk of our readerships will agree with the bulk of where we position historical figures.   Our rankings appear to correlate quite highly with his on most of the figures I've checked.  I apologize that we did not get to include a discussion of his work in our book, but we hope to have the opportunity to chat with him sometime after our book appears in October.
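
For the curious, one simple way to quantify how well two rankings agree is Spearman's rank correlation. The sketch below is an illustration only, not the comparison we will actually run: the function name is hypothetical, the figures "A" through "E" and their ranks are made up, and it assumes neither list contains ties.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation between two rankings.

    rank_a, rank_b: dicts mapping a name to its rank (1 = biggest).
    Only names present in both rankings are compared. Assumes no ties.
    """
    common = sorted(set(rank_a) & set(rank_b))
    n = len(common)
    if n < 2:
        raise ValueError("need at least two figures in common")

    # Re-rank within the shared set so both lists run from 1 to n.
    order_a = sorted(common, key=lambda name: rank_a[name])
    order_b = sorted(common, key=lambda name: rank_b[name])
    pos_a = {name: i for i, name in enumerate(order_a, start=1)}
    pos_b = {name: i for i, name in enumerate(order_b, start=1)}

    # Classic no-ties formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    d_squared = sum((pos_a[name] - pos_b[name]) ** 2 for name in common)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Made-up ranks for five placeholder figures.
ours = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
theirs = {"A": 2, "B": 1, "C": 3, "D": 5, "E": 4}

print(spearman_rho(ours, theirs))
```

A value near 1 means the two lists order the shared figures almost identically; a value near 0 means the orderings are unrelated.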