In this post, I'd like to offer a couple of comments on the list of most central authors in computer science. First of all, how do Michael Kuhn and Roger Wattenhofer determine how central an author is? Here is the relevant excerpt from the paper.
Analogously to the construction of the Erdos number, we base our method on the co-author-graph. We then create the induced subgraph for each region of interest (PODC, STOC/FOCS/SODA, and computer science). For PODC, for example, this graph would only contain authors that have at least one PODC paper, and so on.
Other than in the construction of the Erdos number, we do not rely on shortest paths, but rather on the PageRank idea: We start several short random walks at different nodes of the graph, and count how often each author gets visited. This idea is then extended to time dependent centrality, by starting the random walks only at authors that have published in the last five years.
What are the results of this approach? Table 1 on page 62 of the paper gives the most central authors for computer science as a whole. Alberto Sangiovanni-Vincentelli (another Italian expat) tops the all-time list, and Noga Alon is the runner-up. Without doubt, the list contains only major players, and concurrency theorists will be happy to see Moshe Vardi at #7 on the list of most "central" authors in computer science. Some analysis of this table is provided in the paper. Here are a couple of quick thoughts and questions from yours truly.
- Computer scientists working in the area of databases feature prominently in the all-time most central authors in CS. Indeed, a look at the list of researchers with the largest number of entries in DBLP, the data set used by the authors, shows that some of the most prolific authors in CS work in that area. Could it be that people publish more and have more coauthors in the field of databases, broadly construed, than in other areas of CS?
- The list of most central authors does not include any Turing award winner, as far as I can tell at first sight. Does this mean that Turing award winners are not central? Of course not! To my mind, this just means that an analysis of the collaboration graph favours prolific authors with lots of coauthors. Consider, by way of example, two giants of CS research like Tony Hoare and Robin Milner, who are not even in the list of top 1000 CS authors. (Neither are Stephen Cook, Don Knuth, Gordon Plotkin or Leslie Valiant to name but a few giants, by the way :-)) By modern standards, their "number of papers" is not outstanding. However, their ideas and writings have had, and still have, enormous impact on computer science research. All of what I have done myself, for instance, is built on their original work, which has kept many computer scientists busy for about thirty years. (Of course, Robin and Tony are not responsible for the noise I have generated myself :-)) These tables are a fun read, and they do tell us something worthwhile about the players in our subject. However, as the authors point out themselves, "Without doubt the future will teach our evaluations a lesson, ultimately revealing in which direction computer science evolves, and maybe even discover the most influential computer scientist. After all, research is not about how many papers we write, or how many citations they get, but rather, what the best contributions are."
- I think that the right-hand side of Table 1 is strongly influenced by the phenomenon of name ambiguity. Consider, for instance, Wei Wang, who is ranked as #2. When I saw that Wei Wang has 86 DBLP entries for 2006, I asked myself the question: "Who is Wei Wang"? This Google search return over 1.8 million entries! It seems to me that Wei Wang is a (very large) disciple of Bourbaki or Lothaire :-)
- Table 2 on page 62 is a veritable who's who in the FOCS/STOC/SODA branch of TCS. It has a truly amazing number of Israeli computer scientists (six out of ten on the "all-time" list, and four out of ten on the "last-five-years" list, I believe).