about the project

The Origins

Originating as a capstone project for a digital humanities course at the University of Pittsburgh, this site serves as a linguistic exploration of Jules Verne’s famous Vingt mille lieues sous les mers.

As "fanatiques" of the French language, developers Sean Stewart and Beth DeVito decided to use computational methods as the central tool for textual analysis of the language’s literary form. This particular work was an appropriate choice for closer examination for two reasons. First, its author Jules Verne remains one of the most beloved fiction writers in modern French history, and his writing style and genre have won him great success with audiences in his own nation and around the world. Second, the sheer length of the novel makes it an ideal corpus for extensive data collection. Its size presents a great opportunity to take full advantage of digital coding resources and to profit from their virtually limitless scope of application.

An investigation of linguistic aspects of the French literary domain led us to consider verb tense as a unique and mysterious aspect of the language’s structure. One of many French verb forms, the literary tense, also known as the passé simple (simple past), has historically developed as an expressive marker of fiction. Largely absent from colloquial speech and rare in most other types of writing, its usage has been restricted almost entirely to literary narrative, making the literary past a puzzling phenomenon that deserves further research.

The Questions

Having chosen our central text and linguistic focus, we have developed several research questions that we hope to answer with digital tools. In what situations is the literary tense principally used? How do its context and frequency compare to those of other verb forms within the fictional text? Essentially, we hope to explore the prevalence and usage of the literary past within Verne’s Vingt mille lieues. Specifically, we will trace the contexts in which this tense appears, and thus determine the particular circumstances that dictate its usage. We will also note the distribution of other verb forms that appear throughout the story, and compare the usage of these tenses with the literary past to gain an indication of the overall verbal structure of French literature. From our previous knowledge of French grammar, we surmise that the literary tense will appear more often in the narrative portion of the text, whereas other tenses may pervade quotations.
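As a rough illustration of the comparison described above, the sketch below tallies how often each tense occurs in narration versus quotation. It is only a sketch under stated assumptions: the function name, the context labels, the tense labels, and the sample tags are all hypothetical and made up for illustration; they are not results from the novel.

```python
from collections import Counter


def tense_distribution(tagged_verbs):
    """Return {context: {tense: share}} from (context, tense) pairs.

    Each share is the fraction of verbs in that context carrying that tense.
    """
    counts = {}
    for context, tense in tagged_verbs:
        counts.setdefault(context, Counter())[tense] += 1

    shares = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        shares[context] = {tense: n / total for tense, n in counter.items()}
    return shares


# Made-up tags for illustration only; these are NOT data from the corpus.
toy_tags = [
    ("narration", "passe_simple"),
    ("narration", "passe_simple"),
    ("narration", "imparfait"),
    ("quotation", "present"),
    ("quotation", "passe_compose"),
]

for context, distribution in tense_distribution(toy_tags).items():
    print(context, distribution)
```

Working with shares rather than raw counts keeps the two contexts comparable even if the narration contains far more verbs than the quotations.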

The Techniques

Our data collection will take a multifaceted approach, combining techniques we already know with software that is new to us. Once we have found an online version of this large text, we will transform the file into an XML document, which will become the base for more specific and complex structural coding and reorganization.
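To make the first step concrete, here is a minimal sketch of how a plain-text file might be wrapped in XML. It is not the project's actual transformation: the element names `text` and `p`, the `n` and `title` attributes, and the function name are assumptions chosen for illustration, and the sketch uses Python's standard library.

```python
import re
import xml.etree.ElementTree as ET


def text_to_xml(raw_text, title):
    """Wrap plain text in a minimal XML tree, one <p> per paragraph.

    Paragraphs are assumed to be separated by blank lines.
    """
    root = ET.Element("text", {"title": title})
    blocks = re.split(r"\n\s*\n", raw_text.strip())
    n = 0
    for block in blocks:
        words = block.split()
        if not words:
            continue  # skip empty blocks
        n += 1
        p = ET.SubElement(root, "p", {"n": str(n)})
        # Collapse hard line breaks and repeated spaces inside a paragraph.
        p.text = " ".join(words)
    return root


sample = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph."
print(ET.tostring(text_to_xml(sample, "sample"), encoding="unicode"))
```

Numbering the paragraphs at this stage gives later markup, such as verb and quotation tags, a stable location to refer back to.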

Check our Source page for access to transformation sheets and XML files.

Research Progress

Our project is a work in progress. Certain aspects are either finished, in progress or otherwise incomplete.

  • general
      • Create Hypothesis
      • Acquire Digital Copy
      • Generate Unique Verb List
      • Mark-up All Chapters
      • Generate Data
      • Present Data
  • website
      • Create GIT Repository
      • Blog Page
      • Source Page
      • About Page
      • Online Corpus
      • Validate Markup (Valid XHTML 1.0 Strict)
      • Corpus Viewer Interface
      • Data Page
  • bugs
      • Paragraph Elements
      • Quotes
      • Corpus Punctuation
  • Course Presentation
      • Plan Outline
      • Mark-up Chapter Premier
      • Generate Data from Chapter
      • Display Data Graphically

Researchers

Beth DeVito
Linguistics and French
ekd9]at[pitt.edu
University of Pittsburgh

Sean Stewart
Linguistics and Computer Science
ses119]at[pitt.edu
University of Pittsburgh