One of the coolest parts of being a Librarian and Information Scientist is getting to meet and talk with fascinating people working on the cutting edge of tech. A few months ago, I had the privilege of meeting with teams from Google Scholar and ORCID to talk about Linked Open Data integrations.
Something a Google Scholar team member said stuck with me: the way their engineers have learned to tell apart millions of scholars and their academic works is through signals analysis and clustering. Distinguishing authors by name alone, or even by typical metadata, is nowhere near adequate when dealing with millions of publications. Our publications, though, carry distinctive signals, both within the metadata and within the publications themselves.
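Google Scholar's actual disambiguation system is proprietary, but the idea can be sketched in miniature. Suppose each publication record listing the same ambiguous name (say, "J. Smith") is reduced to a set of signals such as coauthors, venues, and keywords; records whose signals overlap enough are clustered together as one author. All the record data below is invented for illustration, and the similarity threshold is an arbitrary assumption:

```python
# Toy illustration (not Google Scholar's actual method): disambiguating
# authors who share a name by clustering publication records on
# overlapping "signals" (coauthors, venues, keywords).

def jaccard(a, b):
    """Overlap between two signal sets (0.0 = disjoint, 1.0 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_records(records, threshold=0.2):
    """Greedy single-link clustering via union-find: two records belong
    to the same author if their signal overlap meets the threshold."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if jaccard(records[i], records[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

# Four papers all listing "J. Smith"; the signals separate two authors.
papers = [
    {"coauthor:lee", "venue:jasist", "kw:metadata"},
    {"coauthor:lee", "venue:jcdl", "kw:linked-data"},
    {"coauthor:patel", "venue:icml", "kw:clustering"},
    {"coauthor:patel", "venue:neurips", "kw:embeddings"},
]
print(cluster_records(papers))  # two clusters: [[0, 1], [2, 3]]
```

A production system would of course use far richer signals and learned similarity measures, but the core move is the same: group records by shared evidence rather than by name strings.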
That’s how Google Scholar tackles the horizontal tracing of signals produced by millions of scholarly authors who are publishing today.
What I’ve done, and I believe I am the first to do so unless someone points out an error in that claim, is to set forth publicly a methodology for doing this vertically, across time, through expert signals tagging and analysis.
I’m quite certain that machine algorithms, once properly instructed, can and will do the same thing I’ve done in my proofs. Whether my approach falls under Natural Language Processing, I don’t really know; that’s not my area of study or expertise.
This makes me wonder whether an NGO or a major research university would be interested in hosting and building a Digital Humanities platform that uses signals analysis to clarify the journeys various signals take through the Gospels and other significant historical texts. That tracing could distinguish distinct compiler voices and tendencies as discrete layers of sequenced historical-textual strata, and I’m sure we could develop compelling ways to visualize and crowd-source it.