It’s been a year today since we first publicly released our findings about discovering a scientific solution to the Synoptic Problem and reconstruction of the first Gospel (Qn). Given that, we are transitioning from version 1 to version 2 in the numbering of our LODLIB (Linked Open Data Living Informational Book). Over 1000 pages (many of them 11×17) and almost 350,000 words is pretty amazing progress to have made in a year’s time. While there is still a lot more work to do, it’s good to celebrate this milestone.
Today’s upload has significant improvements, especially to the “Popular Script Translation of the First Gospel” and the “Iterative Critical Edition and Translation of the Third Gospel Stratum”. Many other updates are to be found across the book, following our cycles of continuous improvement. A lot of our work recently has involved checking the wording and manuscript variants of the critical editions of Tertullian’s treatise Against Marcion and Epiphanius’ Panarion (our two main sources of attestations of Marcion’s Gospel) and inserting specific page references to these scholarly texts in our footnotes. We hope our readers and reviewers–present and future–appreciate this scholarly rigor and attention to detail.
As always, constructive feedback and opportunities for collaboration are most welcome! The great thing about a LODLIB, especially one built on scientifically testable hypotheses, is that it can evolve, not only to correct errors, but also to respond to legitimate critique and to build out new proofs.
Today’s upload has many new updates, most notably several new pages on the history of statistical and stylometric scholarship on Marcion’s Gospel, from William Sanday to John Knox to Joseph Tyson and most recently, Daniel A. Smith of Huron University, who earned his PhD under Kloppenborg and whom I had the pleasure of meeting at KU Leuven several years ago. This history of scholarship culminates in a close comparison of Smith’s work and mine, summarized in this table:
The subsequent conclusions explain and evaluate the differences between Smith’s numbers and mine, conceiving of each approach (passage counts, verse counts, word counts) as lenses with increasing levels of magnification or granularity. Smith’s findings dovetail significantly with my own, that Q (in some version) and Mark (in some version) were both sources of GMarc.
Following continuous cycles of improvement, we have made many other content and formatting updates but will leave these for our readers to discover and enjoy. Every week our LODLIB gets a little better!
If there are institutions interested in hosting this work in a short- or long-term research fellowship or position, please let me know.
Today’s upload has numerous updates. The most significant is a complete reformatting of the Iterative Critical Edition of Lk1/GMarc to tabloid landscape, both to facilitate reading and to allow for columns with cross-references to other recent editions of GMarc. We have also started adding specific page references to the Sources Chrétiennes critical edition of Tertullian’s Contra Marcionem by Braun and Moreschini to the footnotes, after having checked these texts against those in Evans and Roth. One significant decision new to this version is the removal of A253, Children welcomed, from QnLk1. Given the unreliability of the Adamantius Dialogue, we now read that signal cascade as originating in Lk2 (117-138), which pictures Jesus as a rabbi practicing circumcision in defiance of the Hadrianic proscription against it. Later strata of Mark and Matthew in the 140s then reframe the story as being about the baptism of children, an early-orthodox substitution for circumcision. Lots of other new and interesting insights and updates are there for scholars and lay readers interested in combing through the reconstructions and notes.
If an international university or research center is interested in hosting a short-term or long-term fellowship or research position that allows me to focus completely on this groundbreaking work bringing together Classics, Religious Studies, Digital Humanities, and Computational Linguistics, please contact me. I love working as a faculty Librarian, but the research that I’m pioneering deserves to have the full support of an institution that can provide not only the basics of salary and benefits, but also close collaborators in Humanities and Computer Science, research assistants, and funds for travel and presentation.
This week’s edition puts us over 730 pages and 303,000 words. The main addition this week is the Lk2-CINP dataset. CINP stands for “Clear and Implicitly Not Present.” Like Lk2-CENP, this dataset records the redactor of Late Luke (Lk2) speaking freely without noise from prior gospel strata and is roughly the same size, representing about 20% of the total word count of Lk2. While we may make additions or subtractions from this dataset in future editions, depending on our restoration and signal transmission tracing work, we are confident that overall this dataset is a high fidelity representation of the Lk2 vocal stratum and thus ideal for modeling and training. We have already started incorporating the Lk2-CINP dataset into our Computational Linguistics analysis and visualizations, which now also include Acts and the Gospel of John for comparison.
This week’s edition puts us at nearly 720 pages and 300,000 words. This is the week where our research really started to integrate with RStudio. We spent quite a bit of time troubleshooting Greek Unicode and UTF-8 encoding issues in RStudio on our main Windows machine and getting the Windows Subsystem for Linux up and running so we can move back and forth between RStudio in both environments. Rather than write Unicode code points throughout our scripts, we decided to front-load this work.
Thus our Code Repository debuts with two major scripts: one that transliterates all Greek Unicode characters into ASCII English-letter equivalents; and another that loads both Greek and English UTF-8 txt files, then quickly and cleanly parses six vectors for use in deep Computational Linguistics analysis (whole, lemma, and morphology for both languages). With the in-book datasets and code, experts and novices in Gospel Computational Linguistics can start to evaluate and build on our research. Our Data Visualizations section (freshly reformatted to tabloid layout) also features a new section that builds on this: Top Ten Words tables and graphs for the Harnack, Roth, and CENP datasets.
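For readers who want a feel for the transliteration step and the Top Ten Words tables, here is a minimal sketch. It is written in Python for illustration and is not the R script from our Code Repository. The letter mapping, the function names `transliterate` and `top_words`, and the choice to drop accents and breathings (so a rough breathing produces no "h") are all assumptions made for this example.

```python
import unicodedata
from collections import Counter

# Hypothetical mapping from base Greek letters to ASCII.
# The scheme itself is an assumption, not the one used in the book.
GREEK_TO_ASCII = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "e", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "u", "φ": "ph", "χ": "ch", "ψ": "ps",
    "ω": "o",
}


def transliterate(text):
    """Return an ASCII rendering of polytonic Greek text."""
    # NFD splits precomposed characters into a base letter
    # followed by combining marks (accents, breathings, iota subscript).
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop every combining mark
        latin = GREEK_TO_ASCII.get(ch.lower())
        if latin is None:
            out.append(ch)  # non-Greek characters pass through unchanged
        else:
            out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


def top_words(text, n=10):
    """Return the n most frequent transliterated words with their counts."""
    words = [w.strip(".,;:·") for w in transliterate(text).lower().split()]
    return Counter(w for w in words if w).most_common(n)


print(transliterate("Ἰησοῦς"))
print(top_words("καὶ ὁ λόγος καὶ ὁ θεός καὶ", 2))
```

Normalizing to NFD first means the mapping only needs the 25 base letters rather than an entry for every accented form, which is one way to avoid scattering code points through later scripts.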
Identified an additional 20 signature features showing statistically significant variance between Lk1/GMarc and Lk2; these will be used in future proofs of the Schwegler hypothesis and our five hypotheses. They now include several features with disproportionately high frequencies in Lk1/GMarc compared to Lk2, not just vice versa. Many of these newly listed features are morphologically nuanced bigrams, trigrams, and four-grams we’ve been identifying over the past several editions of our LODLIB in DD 1.2.
Forked three sections (Computational Linguistics and the Synoptic [Signals] Problem; Data Visualizations; Excursus on Related Topics) from other areas to have their own sections.
Hundreds more “clear” vocal signal tags are now assigned across any and all strata throughout the entire reconstruction in anticipation of the future compilation of NLP training datasets for each vocal stratum.
Dozens of new entries to the Data Dictionary, adding further clarification and disambiguation of the Qn, Lk1, and Lk2 vocal strata.
A new section on the history of scholarship on Computational Linguistics and the Synoptic Problem. Ever wonder why we couldn’t solve the Synoptic Problem before? Faulty understanding and modeling of the problem and only using a fraction of the relevant datasets!
New additions and numerous corrections to our statistical proofs. What happens when you bring together statistics about GMarc’s abundance of triple tradition passages with statistics about its lack of Markan and Lukan passages? Hint: if this were judo or MMA, this would be the submission hold that ends the match against defenders of the early orthodox hypothesis that GMarc is derived from Luke.
A new Lk2 clean vocal stratum training dataset for Natural Language Processing and Computational Linguistics. Ever wonder what the redactor of Late Luke (Lk2) sounds like unfiltered, without synoptic noise? Any of the coders out there eager to have lemmatized and morphologically tagged datasets to test our hypotheses? Here ya go!
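Coders who want to test whether a signature feature really differs in frequency between two strata could start from something like the sketch below. It is Python written for illustration, not our own code, and it assumes a Pearson chi-square test on a 2×2 table, which is only one of several possible tests and not necessarily the one behind the proofs in the book. The function names and the toy token lists are hypothetical; real input would be the lemma sequences from the datasets.

```python
def ngrams(tokens, n):
    """Return all consecutive n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the contingency table [[a, b], [c, d]]."""
    total = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so there is nothing to compare
    return total * (a * d - b * c) ** 2 / denom


def compare_feature(tokens_a, tokens_b, feature):
    """Chi-square statistic for how often an n-gram feature
    occurs in corpus A versus corpus B."""
    n = len(feature)
    grams_a, grams_b = ngrams(tokens_a, n), ngrams(tokens_b, n)
    hits_a = grams_a.count(feature)
    hits_b = grams_b.count(feature)
    return chi_square_2x2(
        hits_a, len(grams_a) - hits_a,
        hits_b, len(grams_b) - hits_b,
    )


# Toy, made-up lemma sequences purely to show the call shape.
corpus_a = ["kai", "lego", "autos"] * 20
corpus_b = ["de", "lego", "pros", "autos"] * 15
stat = compare_feature(corpus_a, corpus_b, ("lego", "autos"))

# With one degree of freedom, a statistic above 3.841 corresponds to p < 0.05.
print(stat > 3.841)
```

Overlapping n-grams are not independent observations, so a statistic like this is better treated as a screening heuristic than as a final proof.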
This week’s edition puts us at almost 650 pages and over 270,000 words. Lots of new additions have been made to the Comparative Restoration (esp. for ch. 12) and to the Data Dictionary. We’ve also made some significant corrections to previous chapters as we continue to follow a cycle of continuous improvement, simultaneously tracing the transmission and syntheses of vocal signals across time and clarifying discrete vocal strata from specific moments in time. I’ve also been enjoying reviewing the scholarly literature in Computational Linguistics about authorship attribution and recognition, figuring out how to adapt the methods of other scholars, and developing new ones specifically customized for ancient Greek texts and the Synoptic Problem. We should have some important scientific findings to announce in the next month or two.
Our first edition of the new year puts us over 610 pages and over 265,000 words. The big addition for this version is a twofold digital edition of Harnack’s reconstruction of the Gospel of Marcion in our in-book Dataset and Code Repository. The first is untagged Greek text for human readers and the second is lemmatized with full morphological tagging for deep Computational Linguistics analysis. We welcome and encourage other scholars to use this dataset to evaluate our hypotheses and come to your own conclusions about whether the Gospel of Marcion is in fact the third gospel stratum, composed partly of early Mark and mostly of the first Gospel (Qn).