Channeling the Ghost of Thomas Kuhn (LODLIB v2.10)

This week’s LODLIB spices things up with numerous inspirational and illuminating quotations from Thomas Kuhn’s The Structure of Scientific Revolutions, strewn throughout Part One. These quotations inevitably convey the outlandish confidence of someone who really believes they are leading a scientific revolution in the study of the Gospels. Whether our five hypotheses, triangulation theorem, and various other proofs and methods are mostly right or mostly wrong, the field will eventually decide! All I can do is keep writing and moving forward.

This week’s LODLIB also contains the author’s submitted version of a short data paper introducing datasets based on Roth’s 2015 edition of GMarc. As the peer-review process begins, we publicly invite Roth and/or Brill’s copyright office to discuss copyright permissions. If the paper is accepted for peer review and publication, our normalized datasets will have a very strong Fair Use argument (transformative use, non-profit educational purpose, proportionality, and no negative, indeed quite likely positive, commercial/market impact, etc.). Formal copyright permissions may therefore be completely unnecessary, but we want to practice good academic neighborliness and open up opportunities for dialogue and even collaboration, including the possibility of co-authorship credit for the normalized datasets. We are more than happy to share credit. Humanities scholarship, especially when it becomes Open Data Science, is inherently and inevitably interdependent and convivial work.

Our process here underlines the vital, constitutional good of scientific progress, which is served by making open, normalized datasets of GMarc available to the global public. Thus far, by leaving two inquiries unanswered, Brill has tacitly declined to make Roth’s concluding reconstruction or entire book open access.

Also salient here is that Roth’s reconstruction is (according to an automated document comparison tool we ran) 77% identical to the Harnack datasets that are based on a public domain work and are now peer-reviewed and published through the Journal of Open Humanities Data and Harvard Dataverse. Our more detailed analysis shows very high levels of correlation between the Harnack and Roth datasets on numerous other metrics.
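For readers curious how a "percent identical" figure of this kind can be computed, here is a minimal sketch in Python. It is not the comparison tool we ran: the function names, the whitespace tokenization, and the choice of the standard library's difflib.SequenceMatcher are all assumptions made for illustration, and a different tool or normalization scheme would yield a somewhat different number.

```python
from difflib import SequenceMatcher


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace.

    Assumption: a real pipeline for Greek text would also normalize
    accents, punctuation, and spelling variants before comparing.
    """
    return text.lower().split()


def percent_identical(text_a: str, text_b: str) -> float:
    """Return the share of matching tokens between two texts, as a percentage.

    SequenceMatcher.ratio() is 2 * M / T, where M is the number of tokens
    in matching blocks and T is the total number of tokens in both texts.
    """
    tokens_a = tokenize(text_a)
    tokens_b = tokenize(text_b)
    # autojunk=False disables the heuristic that ignores very frequent
    # tokens in long sequences, so common words still count as matches.
    matcher = SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)
    return 100.0 * matcher.ratio()


if __name__ == "__main__":
    # Toy English strings stand in for the two reconstructions.
    print(percent_identical("the quick brown fox", "the quick red fox"))
```

With the toy strings above, three of the four tokens in each text match ("the", "quick", "fox"), so the score is 2 * 3 / 8 = 75%. Running the same function over two full reconstructions would give one overall figure comparable in spirit to the 77% reported here, though not necessarily the same value.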

Thus there are two theories of the case: 1) Roth’s reconstruction is highly derivative of Harnack’s reconstruction, and thus any normalized datasets based on Roth’s reconstruction are actually far more derivative of Harnack’s work than of Roth’s; 2) the data underlying both of these GMarc reconstructions are sufficiently clear and consistent that any reconstruction should be expected to overlap considerably with any other. Given the wide disparities among other recent GMarc reconstructions, which the second theory would not predict, the first theory is by far the stronger.