The DOI for this data paper is: https://doi.org/10.5334/johd.63. This is the third data paper and batch of normalized datasets of major reconstructions of Marcion’s Gospel now published in JOHD, following the papers and datasets based on the Harnack and Roth reconstructions. The fourth (based on Klinghardt’s and Nicolotti’s recent reconstructions) has been accepted, cleared for copyright, and typeset, so it should be published soon. Those four papers and datasets together represent all major Greek reconstructions of Marcion’s Gospel published thus far, comprising a new historical computational linguistic corpus of Postclassical Greek that contains 57,241 tokens altogether.
We are also in talks with Jason BeDuhn about creating a Greek reconstruction derived from his 2013 English reconstruction of Marcion’s Gospel. In my view, BeDuhn’s reconstruction is the closest to the actual contents of Marcion’s Gospel in size and content, so a corresponding Greek edition and related normalized datasets would make for a very significant contribution to scholarship. We are also in talks with various international institutions about a collaborative project focused on the restoration of Marcion’s Gospel and resolution of the Synoptic Problem using data science and computational linguistics methods. Interested parties are welcome to connect.
Following up on today’s publication in the Journal of Open Humanities Data of my data paper and accompanying normalized, lemmatized, morphologized, born-digital, and peer-reviewed version of Harnack’s reconstruction of the Gospel of Marcion (GMarc), in v2.13 of my LODLIB I have now released a lemmatized and morphologized dataset of August Hahn’s 1832 reconstruction of GMarc. After many grueling months of work on these Greek texts in parallel, I have also completed lemmatizing and morphologizing the reconstructions of GMarc by Zahn, Klinghardt, and Nicolotti. Since the latter two are based on works still under copyright, we will start conversations to see how best to publish them, and I would like to take this opportunity to invite Klinghardt and Nicolotti publicly to join as collaborators on these datasets. The Zahn dataset will appear in next week’s LODLIB, and I will soon submit both the Hahn and Zahn datasets and accompanying data papers for peer review and formal publication.
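To make the shape of a lemmatized and morphologized dataset concrete, here is a minimal sketch of what such records might look like and how they could be read programmatically. The column names, tab-separated layout, and morphology tag scheme are illustrative assumptions for this example, not the published schema of the datasets described above.

```python
import csv
import io

# Hypothetical sample rows in the spirit of a lemmatized/morphologized
# GMarc dataset: one token per row, with its lemma and a morphology tag.
# Column names and the tagging scheme are invented for illustration.
sample = """token\tlemma\tmorph
ἐγένετο\tγίνομαι\tV-AMI-3S
ἐν\tἐν\tPREP
ταῖς\tὁ\tT-DPF
ἡμέραις\tἡμέρα\tN-DPF
"""

rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))
lemmas = [r["lemma"] for r in rows]
print(lemmas)  # ['γίνομαι', 'ἐν', 'ὁ', 'ἡμέρα']
```

Representing each word as a (token, lemma, morphology) triple is what allows the later tabulations and comparisons across reconstructions to be computed rather than eyeballed.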
In other related news, Jason BeDuhn and I are in talks about how best to structure next month’s Westar SBL session on Q and the Gospel of Marcion. If any scholars specializing in GMarc and/or Q would like to be respondents in the session, please let me know.
Today’s LODLIB update reflects datatype normalization and quality control checks across all of our GMarc datasets (Hahn, Zahn, Harnack, Tsutsui, BeDuhn, Roth, Klinghardt, Nicolotti). While we have released the full texts of only the first three (whose print works are in the public domain), we have made use of all of this normalized data in our new data tabulations (3.7) and data visualizations (3.8). While our own iterative critical edition is still in progress, the counts and graphs for all earlier editions should now remain static, so we are now comfortable building these data tabulations and visualizations into forthcoming journal articles and book reviews.
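As a small illustration of the kind of datatype normalization and quality control checks mentioned above, the sketch below composes Greek text to Unicode NFC, strips stray whitespace, and flags records with missing fields. This is a minimal assumption-laden example, not the project’s actual pipeline; the field names are hypothetical.

```python
import unicodedata

def normalize_token(tok: str) -> str:
    """Normalize a Greek token: compose to Unicode NFC and strip
    surrounding whitespace. A minimal stand-in for the datatype
    normalization step, not the project's real implementation."""
    return unicodedata.normalize("NFC", tok).strip()

def qc_record(record: dict) -> list:
    """Return a list of quality-control problems found in one record.
    An empty list means the record passed these basic checks."""
    problems = []
    for field in ("token", "lemma", "morph"):
        if not record.get(field):
            problems.append(f"missing {field}")
    return problems

rec = {"token": " λόγος ", "lemma": "λόγος", "morph": "N-NSM"}
rec["token"] = normalize_token(rec["token"])
print(rec["token"], qc_record(rec))  # λόγος []
```

Checks like these matter because counts and visualizations built on eight parallel datasets are only comparable if every dataset encodes its Greek identically.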
In other related news, Jason BeDuhn and I are meeting later today to discuss the Westar SBL session on Q and the Gospel of Marcion. Given our overlapping scholarly work, I’m very much looking forward to the conversation. I also received just today the proofs of my forthcoming data paper for the Journal of Open Humanities Data. It’s always nice to see one’s work as it’s about to go to (digital) press.
This week’s version initiates data normalization for the study of the Gospel of Marcion in concert with our freshly revised datasets for the fourth round of review of a short data paper and related datasets we have submitted to the Journal of Open Humanities Data, whose Editor-in-Chief is Barbara McGillivray of the Alan Turing Institute and the University of Cambridge. The peer-review process has been wonderful and indeed transformative for my thinking and methodology.
The normalization of GMarc data (transforming past messy, noisy reconstructions into standardized data) will, mark my words, prove to be the tipping point in the transformation of the scholarly study of the canonical and non-canonical gospel strata into legitimate Data Science. In concert with our new normalization standards and normalized datasets of public domain reconstructions, we also release a slew of data visualizations illustrating the contents and relationships of all past GMarc reconstruction datasets. These visualizations clearly reinforce our scientific hypotheses and proofs that GMarc was in fact the third gospel stratum, based on two sources (the first gospel stratum, Qn, and an early version of Mark).
The age of hagiographical controlling bias and assumptions in Gospel Studies is over. The age of Gospel Data Science is upon us. Scholars can either get on board or get out of the way, but no matter what you do, you can’t stop this.
This week’s version puts us over 400,000 words. In concert with the peer review of our Harnack 1924 datasets for the Journal of Open Humanities Data, we have compiled datasets for other closely related, public domain reconstructions of Marcion’s Gospel. Today’s release features Zahn’s 1892 reconstruction, the second major reconstruction in the history of scholarship. Zahn’s edition totals 10,572 words, far fewer than Hahn’s 14,442, yet far more than Harnack’s 4,338. The disparity between these reconstructions exemplifies how much the results of reconstruction are determined by a priori assumptions and methodologies. We anticipate adding granular word counts by passage and tradition type (single, double, triple) for the editions of Hahn and Zahn in the Data Dictionary (DD 1.6) of next week’s LODLIB update.
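The granular word counts by passage and tradition type could be tabulated along the lines of the sketch below. The passage references, labels, and token counts here are invented purely to illustrate the aggregation, not real GMarc figures.

```python
from collections import Counter

# Hypothetical per-passage records, each labeled with a tradition type
# (single, double, triple) and a token count. Values are illustrative only.
passages = [
    {"ref": "5:12-14", "tradition": "triple", "tokens": 120},
    {"ref": "6:20-23", "tradition": "double", "tokens": 95},
    {"ref": "7:36-50", "tradition": "single", "tokens": 310},
    {"ref": "8:4-8",   "tradition": "triple", "tokens": 88},
]

# Sum token counts per tradition type.
counts = Counter()
for p in passages:
    counts[p["tradition"]] += p["tokens"]

print(dict(counts))  # {'triple': 208, 'double': 95, 'single': 310}
```

Once each passage carries a tradition-type label, the same aggregation can be rerun mechanically for every edition, making the Hahn/Zahn/Harnack disparities directly comparable.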