The First Gospel (LODLIB v1.38 release notes)

This week’s edition puts us at nearly 720 pages and 300,000 words. This is the week when our research really started to integrate with RStudio. We spent quite a bit of time troubleshooting Greek Unicode and UTF-8 encoding issues in RStudio on our main Windows machine and getting the Windows Subsystem for Linux (WSL) up and running so we can move back and forth between RStudio in both environments. Rather than handle Unicode code points piecemeal throughout our scripts, we decided to front-load this work.
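
For anyone fighting the same battle, here is a minimal sketch of one way to read Greek UTF-8 text safely in base R on Windows (the file name is hypothetical, and our own scripts may differ in the details): mark the input as UTF-8 rather than letting R re-encode it into the local code page.

    # Read the Greek text as raw UTF-8 and mark it as such, instead of letting
    # a connection re-encode it into the Windows native code page.
    greek <- readLines("gmarc_greek.txt", encoding = "UTF-8", warn = FALSE)
    all(validUTF8(greek))      # TRUE if the bytes really are valid UTF-8
    unique(Encoding(greek))    # should report "UTF-8" (ASCII-only lines show "unknown")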

Thus our Code Repository debuts with two major scripts: one that transliterates all Greek Unicode characters into ASCII English letter equivalents; and another that loads Greek and English UTF-8 .txt files, then quickly and cleanly parses six vectors for deep Computational Linguistics analysis (whole-word, lemma, and morphology vectors for each language). With the in-book datasets and code, experts and novices in Gospel Computational Linguistics alike can start to evaluate and build on our research. Our Data Visualizations section (freshly reformatted to tabloid layout) also features a new subsection that builds on this work: Top Ten Words tables and graphs for the Harnack, Roth, and CENP datasets.
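
To give a rough feel for what those two scripts do, here is a hedged sketch in R. The function names, file names, and the stringi-based transliteration are illustrative assumptions, not the repository’s actual code:

    library(stringi)

    # Transliteration sketch: ICU transforms convert Greek letters to Latin
    # equivalents and then strip diacritics down to plain ASCII.
    transliterate_greek <- function(x) {
      stri_trans_general(x, "Greek-Latin; Latin-ASCII")
    }

    # Loader/parser sketch: read a UTF-8 text and split it into a word-level
    # token vector. The repository script goes further, producing whole-word,
    # lemma, and morphology vectors for both Greek and English.
    load_tokens <- function(path) {
      txt <- readLines(path, encoding = "UTF-8", warn = FALSE)
      unlist(stri_split_boundaries(txt, type = "word", skip_word_none = TRUE))
    }

    # Top Ten Words sketch, as in the new Data Visualizations tables:
    # tabulate token frequencies and keep the ten most common.
    top_ten_words <- function(tokens) {
      head(sort(table(stri_trans_tolower(tokens)), decreasing = TRUE), 10)
    }

    # Example usage (file name hypothetical):
    # tokens <- load_tokens("gmarc_greek.txt")
    # top_ten_words(transliterate_greek(tokens))

One advantage of an ICU-style transform over a simple one-to-one character map is that it also handles accented and breathing-marked letters in polytonic Greek.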


The First Gospel (LODLIB v1.35 release notes)

This week’s edition puts us at almost 680 pages and over 280,000 words. Major highlights:

  • A new section on the history of scholarship on Computational Linguistics and the Synoptic Problem. Ever wonder why we couldn’t solve the Synoptic Problem before? A faulty understanding and modeling of the problem, plus reliance on only a fraction of the relevant datasets!
  • New additions and numerous corrections to our statistical proofs. What happens when you bring together statistics about GMarc’s abundance of triple tradition passages with statistics about its lack of Markan and Lukan passages? Hint: if this were judo or MMA, this would be the submission hold that ends the match against defenders of the early orthodox hypothesis that GMarc is derived from Luke.
  • A new Lk2 clean vocal stratum training dataset for Natural Language Processing and Computational Linguistics. Ever wonder what the redactor of Late Luke (Lk2) sounds like unfiltered, without synoptic noise? Any coders out there eager to have lemmatized and morphologically tagged datasets to test our hypotheses? Here ya go! (See the sketch after this list for one way to load it.)
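
For the coders: a minimal sketch of how a lemmatized, morphologically tagged dataset like this might be loaded and summarized in R. The file name and column names below are assumptions for illustration, not the dataset’s actual schema.

    # Read a hypothetical tab-separated training file with one token per row
    # and (assumed) columns: token, lemma, morph.
    lk2 <- read.delim("lk2_clean_vocal_stratum.tsv",
                      fileEncoding = "UTF-8", stringsAsFactors = FALSE)

    # Rank the lemmas of the Lk2 stratum by raw frequency: a first pass at
    # hearing the redactor's unfiltered vocabulary.
    lemma_freq <- sort(table(lk2$lemma), decreasing = TRUE)
    head(lemma_freq, 10)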