From Ah! to Little Z
Clustering Spelled Language Sounds in Early Modern Dutch Theatre Plays (1570-1800)
Fieke Smitskamp
Series Digital History
This article is part of a series on digital history in the Netherlands and Belgium. Eleven years after the publication of the widely-read BMGN-issue on digital history in 2013 (https://bmgn-lchr.nl/issue/view/31), this series aims to provide a new state of the field. It comprises four serially published articles, which collectively emphasise the diversity of researchers, questions, methods and techniques that define digital history in 2024. The articles are published online in a new, HTML-based format that better showcases the methods and visualisations of the research published here. With regard to this article, Fieke Smitskamp authors the methodological appendix. The reader can read the methodological sections by clicking on the words in the article text that are marked in yellow, and then a sidebar on the left will pop up. Moreover, the article is enriched by two sound clips.
Introduction[1]
‘O ongelooflijck ding! die swinck die gaf my tevens In een blick tijts meer vreuchts, als al de lust mijns levens. Onmogelijck is het my, uytdrucklijck met bescheyt Te schilderen het beeldt van haar uytnementheyt’.[2]
These are the words of Baron, expressing to his friend Adellaar his great admiration for Lucelle, the woman Baron adores. Imagine the Schouwburg in Amsterdam around 1620 and these opening words of Bredero’s famous play Lucelle passing from the lips of the actor to the ears of the audience. The early modern theatre was the ultimate place for vernacular language to resound. You can listen to an artist's impression of this fragment from Lucelle, as it presumably sounded in the seventeenth century.
The seventeenth century was a period full of developments both in Dutch theatre and in the Dutch language. Theatre developed roughly from a largely oral tradition through the chambers of rhetoric – private societies where members held discussions and trained in poetry and playwriting – to the opening of theatres and official stages, coinciding with the expansion of stage productions and the emergence of new genres such as comedy.[3] At the same time, the Dutch language was also being influenced by the major economic developments of the Republic, which by then had become a world power. The Republic, which was emerging as a new independent nation from the Dutch Revolt (1568-1648), was characterised by the arrival of immigrants – such as workers from Germany and political refugees from the Southern Netherlands or even Southern Europe – the increasing need for cooperation between the provinces, and the growing feeling of unity derived from the status as a new, independent state.[4] These factors pressed the need and wish to make agreements about the written language.[5] The process of language regulation gradually went from a supra-regional uniform language to a certain degree of standardisation during the Renaissance, first merely addressing spelling.[6]
The availability of ever-larger digital corpora of historical texts makes it possible to search them using new methods that could lead to fresh insights into, for example, finding connections between historical people, their possible whereabouts at certain times, collaborations, or co-authorships, and could help to reconstruct an oeuvre or the portfolio of a certain publishing house. Digital methods of style analysis have focused primarily on the word level of texts and have achieved great success.
This article, however, explores the possibilities of a new digital method in the field of computational linguistics. It examines historical texts at the level of spelled language sounds, a more detailed level than the word level, to investigate to what extent patterns of language sounds have a distinctive capacity: whether they reveal differences and similarities between texts or groups of texts that are invisible to the naked eye, and whether they can help to gain knowledge about the period at hand. The main questions are as follows: How can the distinctive character of spelled language sounds in early modern Dutch theatre plays be studied? To what extent can theatre texts be distinguished from each other based on language sound patterns? And, finally, how do these spelling patterns reflect the characteristics or features of the different plays?
To apply this new method, I used historical texts that have a strong connection with the sounds of language, namely theatre texts, dating from an interesting historical period when the Dutch vernacular was undergoing crucial developments, against the background of professionalisation in the theatre world. Historical theatre texts are exactly at the intersection of spoken and written language and therefore the ultimate texts to turn to for the analysis of language sounds.
I analysed a digitised volume of 167 early modern theatre plays originating from 1570 to 1800 (henceforth, the corpus), strictly focusing on presumed language sounds. To do so, I designed and validated a computerised tool in close cooperation with Dr. Ruben Vosmeer, who took care of the programming. Below, I will first sketch the background of the Dutch theatre scene with its great variety of styles over the period in question, and developments related to spelling and pronunciation of the early modern Dutch language. After that, I will describe how digital research methods have offered new options for the study of historical texts. Then, I will argue how the new digital method that I present here, dedicated to the analysis of language sounds on stage, allows me to study the most detailed level of historical texts. In the enriched HTML version of this article, technical aspects of the method used are further explained.
Early modern Dutch theatre and vernacular Dutch language
Originating from an oral tradition, at the end of the sixteenth century and the beginning of the seventeenth century, plays were performed in living rooms, on the street or in the chambers of rhetoricians. Every city, large or small, had its own chambers of rhetoric.[7] By the early seventeenth century, the activities of the Dutch rhetoricians had developed in such a way that there was a growing need for a professional platform. In 1617, Samuel Coster (1579-1665), physician and playwright, together with Pieter Cornelisz. Hooft (1581-1647), historian, poet and playwright, and Gerbrandt Adriaensz. Bredero (1585-1618), poet and playwright, took the initiative to establish the first real theatre, the Nederduytsche Academie, in Amsterdam. The Academie had a particular focus on works in the vernacular language instead of Latin. It therefore became an accessible place where science could be popularised, and offered vernacular writers a wide audience for their work.[8]
In 1637, a new theatre, the Amsterdamse Schouwburg, built by the Dutch architect and painter Jacob van Campen (1596-1657) succeeded the Nederduytsche Academie (See Figure 1). Chambers of rhetoric and street plays did not immediately cease to exist. Although plays were also produced in, for example, The Hague, Leiden, Rotterdam and Delft, the Amsterdamse Schouwburg in particular was the central place where theatre flourished and continued to innovate and develop, as it housed a large number of productions and a growing number of writers, publishers and actors.[9] This ratio is also reflected in the available digitised plays that are included in this study: plays produced in and around Amsterdam are therefore at its heart.
For the themes and design of their plays, seventeenth-century authors reverted to writers from classical antiquity. In their tragedies they attempted to display their literary qualities and profile themselves as continuers of the classical tradition. In the early days of the theatre, comedies and farces were less established than the tragedy genre: At first the comic theatre mainly supplemented tragedies as an after-piece, or a minderemanstoneel (comic intermezzo with socially lower-ranking characters) as part of a tragedy.[10] From 1620, after Bredero had produced comic pieces that could stand on their own feet with elaborate leading roles, farces developed into a full-fledged genre, comparable in scope and maturity to a tragedy, but with an incomparable nature and style.[11]
The Amsterdamse Schouwburg was financially linked to care institutions in the city, such as the oudemannenhuis (old men’s home).[12] That was the price for the political support that enabled the theatre to resist the church, which perceived theatrical activities as threatening and repeatedly tried to have the Schouwburg closed.[13] This arrangement also influenced the theatre’s repertoire: the regents of the Schouwburg, appointed by the mayor, were tasked, among other things, with overseeing the programming, reading plays, and assessing whether plays offered acceptable views on the state or the church. Blockbusters were more than welcome. These were not infrequently translated plays by writers who had had many successes in Spain.[14]
Halfway through the seventeenth century, plays increasingly produced their effect on audiences through visually-oriented displays. After a major renovation in 1664-1665, the Schouwburg provided space for so-called kunst- en vliegwerk (trickery and stagecraft), or technical masterpieces that made spectacles possible, according to the requirements of well-selling plays.[15] The staging of plays took on completely different dimensions, with significant success at the box office.
Shortly after the renovation of the Amsterdamse Schouwburg in 1664, the theatre landscape changed drastically as the new art society Nil Volentibus Arduum was founded, which followed a French classical doctrine.[16] Eventually, Nil Volentibus Arduum gained the upper hand in the Amsterdamse Schouwburg and made a major mark on its repertoire.[17] Many of the founders and members of Nil Volentibus Arduum came from the medical or legal world; they were in fact scientists who, with Descartes’ new approach in mind, unleashed reason on literature: no more Senecan drama with spectacular discoveries, but morality and ‘good taste’, soberly designed as a rationally structured stage that followed Corneille’s interpretation of Aristotle’s Poetics.[18] There were binding regulations for structure, choice of material, character drawing and moral tenor. Hurting or shocking was out of the question. Translations of French plays dominated the Amsterdamse Schouwburg, in addition to theatre with ballet and music.[19] For purely financial reasons, there was some leeway in allowing visual special effects (See Figure 2).
The idea of employing the vernacular language, which was also accessible to a non-Latin-speaking audience, for artistic purposes, already started to advance in the Renaissance. The early Enlightenment made a strong plea to also give platters (people who had no knowledge of foreign and classical languages) the opportunity to pick up on language developments. In the eighteenth-century Enlightenment, the ideal to engage all layers of society in the pursuit of civilisation became more widespread. The aim to bring knowledge, culture and education to the citizens and to educate the lower classes was also reflected on stage through experimentation with new forms of plays.[20]
The language style of plays
Early modern Dutch plays vary enormously in style and atmosphere. Comic plays by Bredero, for example, in which all kinds of figures from the street made themselves heard, had a completely different atmosphere from the serious tragedies by Joost van den Vondel (1587-1679). In addition, the plays varied in the way language was spelled. Not only did authors have their own ideas about spelling, but also people involved in the process of producing theatre texts, such as editors, typesetters and revisors, had an influence.[21] And then, of course, the actors bringing the words alive influenced their sound and pronunciation.[22]
The spelled language is all we have to get as close as we can to the sound of historical Dutch theatre language. Therefore, this project focuses on the language sounds as they were spelled. When we look at language, the seventeenth century is known as a decisive period for the Netherlands, as the first attempts to regulate spelling and grammar date from this period.[23] These initiatives arose from a need to understand each other better. The flourishing world trade, with Amsterdam as an important centre, attracted many traders and workers with their own dialects from other areas within the Republic. With the closing of the Scheldt river as a supply route to Antwerp in 1585, textiles and colonial goods were now also traded through Amsterdam. Moreover, this development resulted in a stream of migrants from the Southern Netherlands to Amsterdam, bringing their own dialect. Also, in the fields of art and science, the Republic had become a serious player on the world stage. As a result of growing business activities, migration and integration were everyday themes. For the Republic, identifying as a new independent nation after the Dutch Revolt, the Dutch vernacular language became more and more a symbol of unity. As early as 1582, it became mandatory to use Dutch in administrative treatises.[24] The more formal unity was achieved and the more economic and political immigrants came to the Republic, the greater the need for regulations for, first of all, the written language.
From the end of the sixteenth century, members of the chambers of rhetoric produced writings on language regulation, intended primarily for their own circle. Spelling and pronunciation were the subject of extensive discussions, particularly about the desired, civilised language for the Dutch Republic. Of great influence was the first printed grammar, Twe-spraack vande Nederduitsche letterkunst (1584), by rhetorician Hendrik Laurensz. Spiegel (1549-1612), who, among other things, introduced a permanent distinction between the articles de and den, and advocated, albeit in vain, that education at the newly founded Leiden University should be provided in the vernacular.[25] Well known is also the contribution of Jacob van der Schuere from 1612 in his Nederduydsche Spellinge.[26] A more specific example is the influence of Renaissance grammarians on the reputation or prestige of the use of /z/:[27] they seemed to prefer pronouncing /z/ wherever possible, even if /s/ was also possible. In that case, a /z/ also had to be spelled out. This preference for /z/ was rooted in the belief that the use of /z/ would be more prestigious.[28]
By the beginning of the eighteenth century, there were all sorts of regulations in the field of style, spelling, grammar and rhetoric, written by ministers, classicists and poets, who often took Vondel’s style and ideas as a guideline for the desired, civilised language.[29] Vondel, as is evident from his Aenleidinge ter Nederduitsche dichtkunste (1650), strove for the highest quality of language use.[30] Although his parents were from Antwerp, and he was a member of the Brabant rhetoricians’ chamber ‘t Wit Lavendel in Amsterdam, from 1625 onwards he had a preference for the language used by ‘people of good upbringing’, thereby abandoning earlier Brabantian phenomena.[31]
Bredero jumped to the rescue for the ‘truest’ Dutch language. He believed that influences from foreign languages had no place in it. He advocated this message in 1611 in clear words on a platform provided by the Amsterdam chamber of rhetoric d’Eglentier, of which he would soon become a member. Unlike Vondel, Bredero did not portray historical heroes in his plays, but citizens and country people, and he used a more phonetic spelling to come close to the language of his characters.[32]
True spelling rules only appeared at the beginning of the nineteenth century.[33] Since there are no audio recordings of historical Dutch, much is uncertain about the pronunciation of the written words. It has even been stated that it is unlikely we would be able to understand a seventeenth-century Dutch ancestor speaking.[34] Thinking about the differences in ideas about spelling and pronunciation among authors, and therefore their plays, and the influence of people involved in the editing process, the question arises whether analysing the language characteristics of these texts based on the very detailed level of spelling rather than style could distinguish specific patterns that are typical for a certain literary period, genre or even an individual author.
Digital approaches in text analysis
Searching historical texts at different levels can help us to gather new information: for instance, to uncover possible connections between works or historical actors, or by reconstructing a friendship, an oeuvre or the portfolio of a famous publishing house.
Digital text analysis is very common these days. From around 2013, when new possibilities for research rapidly opened up with the increasing availability of historical sources in digital formats, the Digital Humanities took a leap forward.[35] However, the actual practice of the Digital Humanities goes back a long way. I will briefly illustrate this with some examples. In search of the origins of methodologies that would later be applied in the Digital Humanities, some may refer to the Index Thomisticus by Jesuit scholar Roberto Busa (mid-twentieth century), in which the most important words out of 179 texts around Thomas Aquinas are listed alphabetically.[36] Or, much earlier, to Leon Battista Alberti – Italian poet, linguist, painter, philosopher, architect and musician – who in the fifteenth century counted the vowels of Latin poems and orations (public speeches) to compare the different genres on this point. The poems turned out to contain considerably more spelled ‘a’, ‘e’ and ‘y’ than the orations.[37] In 1925, classical philologist Alexander Shewan discovered striking alliterations and assonances in Homer’s Iliad and Odyssey, and the first thing he did was count them and then compare the numbers with the numbers of lines.[38] Their search for cultural patterns turns Busa, Alberti and Shewan into practitioners of Digital Humanities avant la lettre.
Nowadays, successful approaches in linguistics to detect cultural trends in large corpora are computational. Style analysis methods generally look at characteristic aspects of the writing style at word level. J. Berenike Herrmann, Karina van Dalen-Oskam and Christof Schöch have developed a definition of ‘writing style’ that transcends the word level analysis by including a qualitative and computational aspect: ‘Style is a property of texts constituted by an ensemble of formal features which can be observed quantitatively or qualitatively’.[39]
Many studies that use larger research corpora and computational methods that can calculate on a large scale aim to trace the origin of texts, determining their period of origin or authorship. In this approach, an unknown text is compared with a large number of known texts that have similar properties to the anonymous text, and preferably also share the same genre. Australian literary scholar John Burrows’s ‘Delta’ method, for example, which measures differences in word usage and frequency, has proven to be quite accurate in comparing writing styles to attribute a text to a likely author.[40]
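The idea behind Delta can be illustrated with a minimal Python sketch. It is not the procedure of any study cited here: the toy texts, the function names and the choice of only three frequent words are assumptions made for demonstration, and the population standard deviation is used for simplicity. Relative frequencies of the most frequent words are standardised to z-scores across the known texts, and the Delta between two texts is the mean absolute difference of their z-scores.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(text, vocab):
    """Relative frequency of each vocabulary word in one text."""
    words = text.lower().split()
    counts = Counter(words)
    return [counts[w] / len(words) for w in vocab]


def burrows_delta(known, unknown, n_words=3):
    """Return {label: Delta} between the unknown text and each known text."""
    # Vocabulary: the most frequent words across all known texts.
    all_counts = Counter(w for t in known.values() for w in t.lower().split())
    vocab = [w for w, _ in all_counts.most_common(n_words)]
    profiles = {label: relative_freqs(t, vocab) for label, t in known.items()}

    # Mean and standard deviation of each word's frequency across known texts.
    columns = list(zip(*profiles.values()))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against division by zero

    def z_scores(profile):
        return [(f - m) / s for f, m, s in zip(profile, mus, sigmas)]

    target = z_scores(relative_freqs(unknown, vocab))
    return {
        label: mean(abs(a - b) for a, b in zip(z_scores(p), target))
        for label, p in profiles.items()
    }


# Toy data (hypothetical): two "known" texts and one text to attribute.
known = {
    "A": "de man en de vrouw en de hond",
    "B": "het huis van het kind van het dorp",
}
deltas = burrows_delta(known, "de man en de hond")
# The smaller Delta points to the more similar candidate, here "A".
```

In practice the method is applied to far longer texts and to many more frequent words than this toy example suggests.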
Patterns in language use can be made visible with computer software, with the frequency of recurrently used words in particular proving to be important in unmasking authors.[41] An example in which stylometric research – meaning the analysis of the writing style at word level – helped to detect an incorrect distribution of credit for a book is the study of authorship in the work written by Betje Wolff and Aagje Deken. While the prevailing view initially was that Wolff was the main author of the piece, analysis using computational linguistics appeared to support both authors’ claims that they had contributed equally to the work.[42] Another example is the search for the author of the Dutch national anthem, the Wilhelmus, in which digital techniques contributed to the great precision of the text analysis and had an enormous impact by proposing a new author.[43]
This article explores how text analysis on the even more detailed level of presumed sounds can contribute to existing digital research methods. I examine a digital corpus of historical texts that had an important relationship with sound because they were intended to be delivered loudly from a stage to the audience. Moreover, these texts date from the period in which the vernacular underwent important developments. I will now focus on the source texts in the corpus that provide the written language sounds to which I will apply my new method.
The corpus and the digital tool LAPA
The selection of plays included in my corpus not only depended on physical transmission – text on paper that has stood the test of time – but also on digitisation. Based on what was available in digitised form in 2014, when the computerised tool was built, I was able to use a preselected volume of 167 plays that had also been employed for other research.[44] Of course, the corpus could have been extended as more digitised plays became available. However, functioning as a proof of concept, the tool was built and tested around the initial corpus.
The aim was to achieve as much representation as possible of various aspects of early modern plays to cover the literary historical movements of the Dutch Renaissance (1570-1669), French Classicism (1669-1730) and eighteenth-century Enlightenment (1730-1800), but also for different genres (tragedy, comedy, farce), metres, authors and editors. These aspects of the plays are considered ‘features’ in this article. The focus is on plays from the region of Holland, and particularly Amsterdam. This bias is prompted by the dominance of Holland as the publishing and theatrical centre in the seventeenth century.[45]
Based on their fixed features, the plays are divided into groups: literary period, genre, type of metre, author and publisher. Each play belongs to different groups, and the groups overlap each other. One play, for example, might belong to the groups Renaissance, tragedies and Vondel, while another play might belong to the groups Renaissance, comedies and Bredero. The plays within the corpus are written by 89 different authors. There are 29 farces, 36 comedies, 92 tragedies and 10 plays in other genres. There are at least 44 different publishers; it is not known or not stated by whom 27 texts were published. A total of 124 works were originally published in Amsterdam. The 27 texts whose publisher is unknown do not mention a place of publication. Of all the pieces, 123 are written in alexandrines, a line of verse that consists of a combination of rhythm, stress and intonation in a six-footed iambic, with a pause – diaeresis – usually occurring after the third foot.[46] The metre of the other 44 texts varies: sometimes there is a recognisable metre; sometimes the plays are in prose style. Bredero and Vondel are represented in the corpus with a relatively large number of plays (8 by the former and 10 by the latter), which allows us to analyse their theatrical pieces within their own cluster of texts. Possibly, in that relatively early stage of the digitisation of historical texts, the most famous ones were chosen first to be digitised.
A major challenge was to extract historical spelled sounds from words and texts and to be able to count and cluster these sounds. To do this for the entire corpus of 167 plays, Ruben Vosmeer and I designed a digital tool called LAPA (Language Pattern Analyser).[47] LAPA is written in the programming language Python.[48] It converts spelled words into presumed language sounds (presumed phonemes) and maps their frequencies to the various groups of play features. The tool follows a strict protocol, designed particularly for this purpose: texts are entered, LAPA applies the protocol, returns the frequencies of language sounds and connects these frequencies to the groups of play features.
Spelled letters on paper do not directly represent a language sound one to one. For example, in Dutch a vowel at the end of a word is usually pronounced differently from when the same vowel occurs in the middle of a word or occurs in combination with another vowel (diphthong). Compare, for example, /au/ and /a/ in Laura, the different /a/’s in dag and dagelijks or /t/ and /d/ in the example hij beleeft to hij is beleefd. Counting all /a/’s, /d/’s and /t/’s in a text would therefore just lead to counting letters.
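The difference between counting letters and counting presumed sounds can be made concrete with a minimal Python sketch. The rule table and the function name `to_presumed_sounds` are hypothetical and far simpler than the conversion rules a tool such as LAPA needs; the symbols are illustrative only. The sketch merely shows that letter combinations and word position, rather than single letters, determine which sound is counted.

```python
# Hypothetical, heavily simplified rule table: a digraph maps to one presumed sound.
DIGRAPHS = {"au": "Au", "aa": "a:", "ee": "e:", "oo": "o:", "ij": "Ei"}


def to_presumed_sounds(word):
    """Convert a spelled word into a list of presumed sounds (toy rules only)."""
    word = word.lower()
    sounds = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            # Two letters that together spell a single sound.
            sounds.append(DIGRAPHS[pair])
            i += 2
        elif word[i] == "d" and i == len(word) - 1:
            # Assumed rule: a word-final spelled d is counted as /t/.
            sounds.append("t")
            i += 1
        else:
            # Fallback: the letter itself stands in for the sound.
            sounds.append(word[i])
            i += 1
    return sounds


laura = to_presumed_sounds("Laura")
beleefd = to_presumed_sounds("beleefd")
beleeft = to_presumed_sounds("beleeft")
```

With these toy rules, the spelled word Laura contains two letters a but only one sound counted as /a/, because the first a belongs to /au/, and beleefd and beleeft yield the same sequence of presumed sounds.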
This article uses the term ‘presumed phonemes’ or ‘presumed language sounds’. Phonemes are mental representations of language sounds that speakers of a language have. In the case of historical texts, we are relying on the written texts and the knowledge about the presumed phonemes, spelling and possible pronunciation, built by historical phonologists. Also, while regulations were in flux, different spellings were used and no dictionaries are available with a phonological representation of the contemporary spelling. Because of these uncertainties, it is crucial that all plays are processed in the same way, so that only one measure is used. Uncertainties about the way words may have sounded are then approached in the same way for every play.
LAPA has a modular structure. This means that the tool is flexible, because each function can be replaced independently by an alternative. The current version of LAPA has been adapted for historical Dutch plays and converts their text into SAMPA, a machine-readable phonetic alphabet. With adjustments to the LAPA modules, the tool can be made suitable for converting other types of text, texts from other periods and even texts in other languages into SAMPA, for example, for the analysis of language sounds in modern newspaper articles, shopping lists, literary prose or historical lottery rhymes.[49]
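What such modularity can look like in Python is sketched below. This is not LAPA's actual architecture: the function names `split_words`, `letters_as_sounds`, `vowels_only` and `analyse` are hypothetical, and the converters are deliberately trivial. The point is only that each step is passed in as a function and can therefore be swapped without touching the rest.

```python
from collections import Counter


def split_words(text):
    """Default tokeniser: lower-case the text and split on whitespace."""
    return text.lower().split()


def letters_as_sounds(word):
    """Placeholder converter: every letter counts as one sound."""
    return list(word)


def vowels_only(word):
    """Alternative converter: keep only the spelled vowels."""
    return [ch for ch in word if ch in "aeiou"]


def analyse(text, tokenise=split_words, convert=letters_as_sounds):
    """Count sounds in a text; both steps are replaceable modules."""
    counts = Counter()
    for word in tokenise(text):
        counts.update(convert(word))
    return counts


default_counts = analyse("dag dag")
vowel_counts = analyse("dag dag", convert=vowels_only)
```

Replacing `convert` by a converter for another period or language would leave the tokenising and counting steps unchanged.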
LAPA provides frequencies of language sounds for the entire corpus of 167 plays, resulting in almost 1 million numbers in all, and maps the frequencies of the language sounds to the various groups of play features such as period, genre, author et cetera. The frequencies are then related to the average numbers in the entire corpus and expressed in percentages. Based on their frequency of occurrence, sounds are classified as ‘normal’ or ‘distinctive’; for the latter I also use the term ‘deviant’. ‘Normal’ means that a sound has a regular frequency: compared to its frequency in other plays within the corpus, the normal sound does not occur significantly more or significantly less often in the play at hand. Normal language sounds make up 95.4% of all counts. ‘Distinctive’ or ‘deviant’ sounds exceed the scope of normality by occurring significantly often or significantly little. These distinctive sounds make up the remaining 4.6% of all counts.
The frequencies of language sounds are expressed relatively, in percentages that indicate how large the share of a certain spelled sound is compared to the other 34 measured sounds. The long /a/, for example, has an average share per play of 2.94% of all language sounds, whereas /z/ has an average of 1.84%. There are 8 plays out of 167 where the long /a/ is considered distinctive or deviant. In those plays, /a/ makes up more than 3.75% or less than 2.13% of all counted language sounds. Also, 19 out of 167 plays have a very high or low frequency of /z/ (more than 3.44% or less than 0.25%). Therefore, in those plays, /z/ is considered distinctive, whereas in the other 148 plays the frequency of /z/ fits the normal scope.
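The two steps described above, expressing counts as relative shares and classifying a share against a normal range, can be sketched in Python as follows. The function names are hypothetical. The article does not state how the limits of the normal range were computed; the `bounds` function assumes a band of k standard deviations around the corpus mean, which is only one possible rule, and the classification example uses the limits quoted above for the long /a/.

```python
from statistics import mean, pstdev


def relative_shares(counts):
    """Express the raw sound counts of one play as percentages of all counted sounds."""
    total = sum(counts.values())
    return {sound: 100 * n / total for sound, n in counts.items()}


def bounds(shares, k=2.0):
    """ASSUMPTION: normal range = mean of the shares +/- k standard deviations."""
    m, sd = mean(shares), pstdev(shares)
    return m - k * sd, m + k * sd


def classify(share, lower, upper):
    """A share outside the normal range is 'distinctive' (deviant)."""
    return "normal" if lower <= share <= upper else "distinctive"


# Limits for the long /a/ as quoted in the text: 2.13% and 3.75%.
LOWER_A, UPPER_A = 2.13, 3.75
average_play = classify(2.94, LOWER_A, UPPER_A)  # the corpus average itself
high_play = classify(3.90, LOWER_A, UPPER_A)     # hypothetical play with many long /a/
low_play = classify(2.00, LOWER_A, UPPER_A)      # hypothetical play with few long /a/
```

A play whose share of the long /a/ equals the corpus average of 2.94% is classified as normal, while the two hypothetical plays with shares of 3.90% and 2.00% fall outside the quoted limits and are classified as distinctive.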
When the aim is to see how the groups of plays differ in the relative frequencies of spelled language sounds, the group of normal sounds is not only too large to study conveniently; its values also stay, by definition, within the normal range and are therefore less informative than the distinctive sounds. Hence, to characterise groups of theatre plays, I have chosen to use the distinctive sounds.
Frequencies characterising the features of plays
Comparative study of the distinctive sounds of the groups demonstrates changes in the frequencies of language sounds of historical theatre over time – for example, how the percentage of plays containing distinctive language sounds diminishes from Renaissance (71.43%) to French Classicism (52.87%), and then increases to 80.65% in the eighteenth-century Enlightenment plays. Also, if we look at the numbers of plays for these groups in which the deviant language sounds occur, we find that the Renaissance has a somewhat more erratic picture with high numbers of plays in which the deviant sounds occur, while for French Classicism the sound that deviates most often only does so in a small number of plays. In the eighteenth-century Enlightenment, the numbers go up again. Tables 3a and 3b display a short overview.
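How a percentage of plays containing distinctive sounds can be derived per group is sketched below in Python. The four records are invented toy data, not plays from the corpus, and the field names and the function `share_with_distinctive` are hypothetical. Because each play belongs to several overlapping groups, it is counted once in every group it belongs to.

```python
from collections import defaultdict

# Invented toy records: each play belongs to several overlapping groups.
plays = [
    {"title": "play 1", "groups": {"Renaissance", "tragedy"}, "distinctive": ["z"]},
    {"title": "play 2", "groups": {"Renaissance", "farce"}, "distinctive": ["g", "s"]},
    {"title": "play 3", "groups": {"Renaissance", "tragedy"}, "distinctive": []},
    {"title": "play 4", "groups": {"French Classicism", "tragedy"}, "distinctive": []},
]


def share_with_distinctive(plays):
    """Percentage of plays per group that contain at least one distinctive sound."""
    total = defaultdict(int)
    flagged = defaultdict(int)
    for play in plays:
        for group in play["groups"]:
            total[group] += 1
            if play["distinctive"]:
                flagged[group] += 1
    return {group: round(100 * flagged[group] / total[group], 2) for group in total}


shares = share_with_distinctive(plays)
```

In this toy data set, two of the three Renaissance plays contain a distinctive sound, so the group scores 66.67%, whereas the single French Classicism play scores 0%.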
It is remarkable that the frequencies of distinctive sounds differ so much from each other from period to period. In general, many distinctive sounds can be found in the Renaissance period. This could be explained by the idea that in that period the regulation of language was in full swing, and there was a lot of discussion about it, as demonstrated by the examples of Bredero and Vondel. The French Classicism period was dominated by Nil Volentibus Arduum. That movement influenced theatrical themes and also had specific views on language use.[50] In the period from around 1730, new forms and styles were added. The fact that the numbers of distinctively occurring sounds differ so much from each other in each period could be an indication that language sounds reflect the various contemporary influences during these periods, such as the role of Nil Volentibus Arduum in regulating the theatre’s style and repertoire. To interpret this correctly, further research is needed that looks into a possible relationship between specific language sound frequencies over time and the rise and fall of specific influencing factors, such as shifts in themes based on specific involvement by church or the city administration, language rules that were prescribed from the perspective of Nil Volentibus Arduum, and the development of new forms in the eighteenth-century Enlightenment.
The position of the spelled language sound /z/ also deserves highlighting. /z/ is at the top of the list of deviant language sounds for the Renaissance period, occurring significantly little in these plays. However, in both the French Classicism period and the eighteenth-century Enlightenment we can see that this scarcity of /z/ has disappeared: apparently, its frequency has normalised. An emerging frequency of /s/ as a possible replacement is not evident from the results of this study.
However, the most noteworthy finding is that all groups have their very own list of deviant language sounds. Overlap between these lists is scarce. Apparently, the frequencies of language sounds have the ability to clearly distinguish groups of plays from one another, even if the groups share many of the same plays. This is reflected in the plays by Vondel and Bredero, for example.
Vondel and Bredero had quite their own ideas when it comes to their choices in the use of language and spelling, and the types of characters they presented on stage. Thanks to a sufficient number of titles written by Bredero and Vondel in the corpus, it was possible to create their own group, clustering and analysing the frequencies of their personal language sounds alone. The same could be done for titles published by Jacob Lescailje (1611-1679), poet, bookseller, publisher and printer in Amsterdam, and Isaak Duim (1696-1782), poet and publisher in Amsterdam.
Bredero has a remarkable profile: not one of his plays is without deviant sounds. In other words, all his plays in this selection contain sounds that occur particularly little or particularly often compared to the rest of the corpus, whereas 61 plays in the entire corpus – nearly 37% – do not contain any deviant language sounds at all. There is an average of 2.6 deviating sounds per Bredero play, and a total of 12 sounds whose frequency deviates from the normal in his works, with /g/, /z/ and /s/ being his most frequently deviant sounds.
Vondel’s plays show different patterns: no deviant language sounds are found in 40% of the included plays, but in the other 60% they are. The average number of deviating sounds per play is 1.2 for Vondel – not even half of the 2.6 for Bredero. Vondel’s most frequently deviating sounds are /o/ (as in pot) and /s/. In total there are 8 language sounds with notable frequencies in Vondel’s plays. Vondel and Bredero therefore both have a very individual way of using language.
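The summary figures reported for Bredero and Vondel (the share of plays without deviant sounds, the average number of deviating sounds per play, and the number of distinct deviating sounds) can all be derived from a per-play list of flagged sounds. The following is a minimal sketch under stated assumptions: the play labels and flagged sounds are invented, the function name `summarise` is hypothetical, and the code does not reproduce LAPA's own implementation. Only the final calculation uses figures from the text.

```python
from collections import Counter
from statistics import mean

# Hypothetical input: for each play, the sounds flagged as deviant.
# Play labels and flags are invented for illustration only.
deviant_by_play = {
    "play_a": ["g", "z", "s"],
    "play_b": ["g"],
    "play_c": [],
    "play_d": ["g", "s"],
}


def summarise(deviant_by_play):
    """Summarise how often, and which, deviant sounds occur in a group of plays."""
    counts = [len(sounds) for sounds in deviant_by_play.values()]
    tally = Counter(s for sounds in deviant_by_play.values() for s in sounds)
    return {
        # Fraction of plays in which no sound was flagged at all.
        "share_without_deviants": sum(1 for c in counts if c == 0) / len(counts),
        # Average number of flagged sounds per play.
        "mean_deviants_per_play": mean(counts),
        # Number of different sounds flagged anywhere in the group.
        "distinct_deviant_sounds": len(tally),
        # The two sounds flagged in the most plays.
        "most_common": tally.most_common(2),
    }


summary = summarise(deviant_by_play)
print(summary)

# Figure reported in the text: 61 of the 167 plays contain no deviant sounds.
share_corpus = round(100 * 61 / 167, 1)
print(share_corpus)
```

Applied to the real per-play results, the same summary would yield the values quoted above for each author, such as 2.6 deviating sounds per play for Bredero and 1.2 for Vondel.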
It should also be noted that the patterns of language sounds in Vondel’s plays are clearly different from the patterns for the group of tragedies. This is interesting, because Vondel’s tragedies represent a large proportion of the group of tragedies (10 out of 92) and therefore also influence the sound patterns for that group. Although some language sounds occur distinctly in both groups, the ranking is rather different, and there are also distinct language sounds in tragedies that do not occur at all in Vondel. Apparently, the numbers of the other tragedies were so different that they limited Vondel’s influence on the group of tragedies. The result is that Vondel’s plays, also within the group of tragedies, particularly stand out in terms of the frequencies of language sounds.
A notable difference between the Vondel and the Bredero plays is the use of /z/. It is particularly scarce in Bredero and in the Renaissance group to which both authors belong, whereas in Vondel /z/ is not among the deviant language sounds at all. This might illustrate Vondel’s preference for the more prestigious /z/ in his serious tragedies, resulting in a normalised frequency of /z/. Bredero, by contrast, portrayed street characters on stage and made less prestigious language choices in his comedies and farces, and consequently used the ‘posh’ /z/ much less.
Validation through identification of blind and orphaned plays¶
LAPA appears to generate promising results that characterise the plays from the corpus on the basis of the frequency of their sounds. However, as a newly designed tool, LAPA needs to be tested for quality, a process known as validation. To do that, I first tested whether the results of clustering and analysing language sounds also hold for plays within the corpus whose features such as period, genre and author had been blacked out (‘blind’ plays), and subsequently for plays outside the corpus, also with the descriptive data blacked out (‘orphaned’ plays). All plays were provided by DBNL (KB).[51] For each blind or orphaned play, the frequencies of the individual language sounds were compared to the frequencies of the various groups for period, genre, metre, author and editor, and predictions were made as to which group the play was most likely to belong to. In addition, predictions were made to exclude Bredero or Vondel as authors, and Lescailje or Duim as editors.
To make these predictions, the focus was on the language sounds in each group whose frequency deviated most from the average. The percentage of each deviating language sound was considered as well. A detailed example is described in the methodological appendix. In both cohorts of plays, from within and outside the corpus, predictions were made for all features. For the first group, a total of 48 predictions were made, of which 45 were correct. Table 5 shows the distribution of correct and incorrect predictions per feature of the plays from within the corpus.
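The comparison step described above can be illustrated with a minimal sketch. It is not LAPA's implementation: the group profiles, the corpus averages, the frequencies of the blind play and the scoring rule (the summed absolute difference over each group's two most deviant sounds) are all hypothetical assumptions, chosen only to show the general idea of matching a play to the group whose characteristic sounds it resembles most.

```python
# Hypothetical illustration only: the data and the scoring rule are
# assumptions and do not reproduce LAPA's actual procedure.

# Relative frequency (%) of three spelled language sounds per group (invented).
GROUP_PROFILES = {
    "Renaissance": {"z": 0.8, "s": 6.3, "g": 3.5},
    "French Classicism": {"z": 1.6, "s": 5.0, "g": 3.0},
    "Enlightenment": {"z": 1.9, "s": 5.4, "g": 2.7},
}

# Corpus-wide average frequency (%) of the same sounds (invented).
CORPUS_AVERAGE = {"z": 1.5, "s": 5.5, "g": 3.3}


def most_deviant_sounds(profile, average, top_n=2):
    """Return the top_n sounds whose group frequency differs most from the average."""
    ranked = sorted(profile, key=lambda s: abs(profile[s] - average[s]), reverse=True)
    return ranked[:top_n]


def predict_group(play, profiles=GROUP_PROFILES, average=CORPUS_AVERAGE):
    """Pick the group whose most deviant sounds lie closest to the play's frequencies."""
    def distance(group):
        profile = profiles[group]
        sounds = most_deviant_sounds(profile, average)
        return sum(abs(play[s] - profile[s]) for s in sounds)

    return min(profiles, key=distance)


# A 'blind' play: its metadata is hidden, only sound frequencies are known.
blind_play = {"z": 0.9, "s": 6.1, "g": 3.6}
predicted = predict_group(blind_play)
print(predicted)
```

With these invented values the blind play lies closest to the Renaissance profile. A real comparison would use the group frequencies produced by LAPA and would be repeated for genre, metre, author and editor.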
Predictions regarding the period were most often successful. Also, predictions concerning genre and metre were often correct. Predicting the author or editor of a play is much more specific and therefore much more difficult. However, many predictions that excluded Vondel or Bredero as authors, and Lescailje or Duim as editors, or that attributed works to Bredero, Vondel, Lescailje and Duim turned out to be correct.
Evaluating the second group, ten orphaned plays from outside the corpus, we learn that the more deviant sounds occur, the more difficult it is to make accurate assessments, because the spread of options increases. For the blind plays within the corpus, by contrast, the presence of more deviant sounds would make the task easier, as there was more information with which to exclude options. For these reasons, it is difficult to make strong claims here. It was found, however, that the orphaned plays behave in a way comparable to the plays from within the corpus.
Table 6 shows the distribution of correct and incorrect predictions per feature of the plays from outside the corpus. For this group, 55 predictions were made, of which 49 were correct and 6 incorrect. The periods were correctly predicted for all plays. For the metre, almost all predictions were correct. The score looks different for the predictions of genre: there were more incorrect predictions in this respect. Furthermore, it turns out to be easier to judge that a play was not authored by Vondel than whether or not it was written by Bredero. Apparently Vondel’s language sounds are so specific that it is easier to tell from the frequencies that a play was not written by him than is the case with Bredero. Finally, it turned out to be quite possible to indicate based on the sound patterns that it was very unlikely that a play was edited by Lescailje or Duim.
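The prediction counts reported for the two cohorts can be expressed as success rates. The short calculation below uses only the figures given in the text (45 of 48 and 49 of 55). The helper name `accuracy` and the combined rate are added here for illustration and are not reported in the study itself.

```python
# Prediction counts as reported in the text: (correct, total) per cohort.
cohorts = {
    "blind plays (within corpus)": (45, 48),
    "orphaned plays (outside corpus)": (49, 55),
}


def accuracy(correct, total):
    """Share of correct predictions as a percentage, rounded to two decimals."""
    return round(100 * correct / total, 2)


for name, (correct, total) in cohorts.items():
    print(f"{name}: {correct}/{total} = {accuracy(correct, total)}%")

# Combined rate over both cohorts (an illustrative addition).
total_correct = sum(correct for correct, _ in cohorts.values())
total_made = sum(total for _, total in cohorts.values())
print(f"combined: {total_correct}/{total_made} = {accuracy(total_correct, total_made)}%")
```

Because the two cohorts differ in size and in the features predicted, the rates are best read as rough indications rather than as directly comparable scores.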
There is one prediction that needs highlighting. It concerns the claim about an orphaned play from the second cohort in the test case – plays from outside the corpus – that according to its language sound patterns was attributed to Bredero. Its particularities in terms of frequencies of language sounds were so close to Bredero’s that this text could very well have been written by him. The prediction was incorrect though: the author of the piece was Jan Jansz. Starter. However, the similarities between Bredero and Starter are not too surprising: Starter was the author who was approached by Bredero’s publisher, Van der Plasse, to complete the last two scenes of Bredero’s Angeniet when the author died halfway through. The attribution to Bredero for this play is incorrect, but the mistake also shows that Starter’s language use is very close to Bredero’s.[52]
Conclusion¶
Early modern Dutch theatre texts date from a period when the Dutch vernacular was undergoing crucial developments towards standardisation. At the same time, the Dutch theatre world was rapidly expanding and evolving: for instance, the comic genre matured, both state and church influenced the repertoire, and there was a shift to visual aspects and classical doctrines. Theatre plays are situated at the intersection of spoken and written language and are therefore the ultimate texts to turn to for the analysis of presumed language sounds. Digital analysis of these historical sources provides an opportunity to gain new knowledge about the past.
This study of a cohort of 167 seventeenth- and eighteenth-century theatre plays shows how spelled language sounds, derived from written words by the self-designed digital tool LAPA, seem to have both distinctive and characteristic qualities. The results demonstrate that language sound frequency patterns can characterise specific literary periods, genres, metres, authors and editors. There are shifts in the frequencies of certain spelled language sounds between the Renaissance, the French Classicism period and the eighteenth-century Enlightenment, which seem to be connected with contemporary discussions about language use. For example, there is a notable scarcity of /z/ in the Renaissance, which appears to have normalised in the course of the seventeenth century.
LAPA seems to be a useful instrument to complement or support existing tools for text analysis and computational linguistics. This new method analyses and characterises historical texts not at the word level but at the detailed level of spelled language sounds, which allows the researcher to take the heterogeneity of spelling within a corpus into account. A small validation study of 20 blind or orphaned theatre plays supported the predictive value of this tool. For the blind and orphaned plays respectively, 48 and 55 predictions were made on various characteristics of the plays based on differences in presumed language sounds, of which 45 and 49 were correct. Further research is needed to confirm these indications.
This study has several limitations. When LAPA was developed, the number of digitised plays was restricted. This could have had a negative influence on the reliability of the results. To increase the reliability and perhaps improve the quality of the analysis and its predictive value, it might be worthwhile to use LAPA on a larger scale by expanding the research corpus. Such an expansion, and a comparison with other types of text or with plays from other periods, are obvious options. LAPA’s modular structure is already designed for this purpose.
Additionally, to refine the results of LAPA and improve the tool’s historical accuracy, further development of the historical phonological framework is needed. The limited number of language sounds used in this research can be expanded based on a more comprehensive historical phonological foundation. This may involve layering knowledge about the pronunciation of historical Dutch into categories, organised according to the degree of certainty we have about them.
Furthermore, SAMPA is a phonetic alphabet developed for modern-day Dutch and current dialects and does not contain a wide span of language sounds. It did, however, fulfil the need for a concise and transparent model for generating countable entities for the limited number of language sounds included in this first edition of LAPA. To refine the historical accuracy in future research, a switch to X-SAMPA (Extended Speech Assessment Methods Phonetic Alphabet) could be made.
Characterising theatre plays based on language sounds using LAPA offers numerous starting points for further research. It would, for example, be interesting to use LAPA to deepen the investigation into changes in language sound frequencies over time, such as the increasing use of /z/ from the French Classicism period onwards and how that relates to the rise of prestige around /z/ initiated by Renaissance grammarians. Reconstructions of the repertoire of various chambers of rhetoric or theatre companies can also be considered, and the research could gain further depth if more layers are added to the types of language sounds that are distinguished. This would offer an opportunity to explore whether there is a difference in distinguishing capacity between, for example, vowels and consonants or between plosives and labials. Finally, distinguishing between various levels of certainty about the historical pronunciation of language sounds across the various literary periods could sharpen the results and could help us to touch upon the magic of the early modern Dutch theatre with our ears.
In the seventeenth century, there was a strong engagement between the performers on stage and their audience.[53] In keeping with that style, I conclude this article by letting the words of Bredero resonate once again:
Nu hoort, ghy Heeren hoort: Heeft iou dit spel verheucht, Beweecht, of wel gesmaackt? bewijset ons met vreucht. Met hant-geklap verblijt, en doet mijn alle nae, En soo’t u wel behaecht, soo roept een-stemmich Jae?[54]
With these final words, which you can also hear in the sound clip below, Bredero reaches out to his audience in the voice of the character Ian Neef, inviting the people to confirm they have enjoyed the play by shouting in unison ‘Yes!’. We will not find this level of accessibility at the end of a Vondel tragedy, but it fits seamlessly with Bredero’s vision of ‘true’ language. As the curtain closes, the seventeenth-century Dutch spectators ensure a generous expression of their enthusiasm (see Figure 3).
[1] I would like to thank Ruben Vosmeer, Karina van Dalen-Oskam, Janneke van der Zwaan and Inger Leemans for their contribution to the research, Toon de Zoeten (voice and recording) and Piet van Reenen (phonological support) for the realisation of sound clips, KB/DBNL for their cooperation, and Sander Anten, Antske Fokkens, Piet van Reenen, the editorial board of BMGN – LCHR and the anonymous reviewers for their feedback, which helped to improve the paper.
[2] Gerbrandt Adriaensz. Bredero, Lucelle (Culemborg: Tjeenk Willink-Noorduijn 1972) 69. ‘Oh, incredible thing! That brief glance gave me, in the blink of an eye, more joy than all the pleasures of my life. It is impossible for me, with precise and thoughtful words, to paint the image of her extraordinary beauty.’ (All translations are my own).
[3] Frans Blom, Podium van Europa (Amsterdam: Querido 2021) 17-46.
[4] Erika Kuijpers, Migrantenstad. Immigratie en sociale verhoudingen in 17e-eeuws Amsterdam (Hilversum: Verloren 2005) 15-20.
[5] Jozef van Loon, Historische fonologie van het Nederlands (Schoten: Acco 2014) 254.
[6] Nicoline van der Sijs, Taal als mensenwerk. Het ontstaan van het ABN (Den Haag: SDU 2004) 22.
[7] Blom, Podium van Europa, 17-23.
[8] Mieke B. Smits-Veldt, ‘24 september 1617: Inwijding van de Nederduytsche Academie. De opbloei van het renaissance-toneel in Amsterdam’, in: M.A. Schenkeveld-van der Dussen a.o. (eds.), Nederlandse literatuur, een geschiedenis (Groningen: Noordhoff Uitgevers 1993) 196-201, 196.
[9] Blom, Podium van Europa, 28-77.
[10] René van Stipriaan, ‘Bredero laat in zijn komedie Moortje de carnavaleske maskerade herleven. Komisch toneel en vermaakscultuur in de noordelijke Nederlanden in de zeventiende eeuw’, in: Rob Erenstein a.o. (eds.), Een theatergeschiedenis der Nederlanden. Tien eeuwen drama en theater in Nederland en Vlaanderen (Amsterdam: AUP 1996) 162-167, 164.
[11] René van Stipriaan, Het volle leven. Nederlandse literatuur en cultuur ten tijde van de Republiek (circa 1550-1800) (Amsterdam: Bert Bakker 2002) 168-170.
[12] Blom, Podium van Europa, 26-28.
[13] H. Duits, ‘11 november 1621: De Amsterdamse kerkeraad stuurt twee afgezanten naar de burgemeesters om te klagen over een opvoering van Samuel Costers Iphigenia in de Nederduytsche Academie. De moeizame relatie tussen kerk en toneel in de zeventiende eeuw’, in: Erenstein, Een theatergeschiedenis, 180.
[14] Kim Jautze, Léonor Àlvarez Francès and Frans Blom, ‘Spaans theater in de Amsterdamse Schouwburg (1638-1672): kwantitatieve en kwalitatieve analyse van de creatieve industrie van het vertalen’, De Zeventiende Eeuw 32:1 (2016) 14.
[15] Karel Porteman and Mieke B. Smits-Veldt, Een nieuw vaderland voor de muzen. Geschiedenis van de Nederlandse literatuur 1560-1700 (Hilversum: Verloren 2013) 525; Ton Amir, ‘26 mei 1665: De opening van de verbouwde Schouwburg te Amsterdam. Van suggestie naar illusie; kunst- en vliegwerken in de Amsterdamse Schouwburg’, in: Erenstein, Een theatergeschiedenis, 258-265.
[16] Porteman and Smits-Veldt, Een nieuw vaderland, 525.
[17] Blom, Podium van Europa, 403.
[18] Porteman and Smits-Veldt, Een nieuw vaderland, 692.
[19] Ibid., 711.
[20] Inger Leemans and Gert-Jan Johannes, Worm en donder. Geschiedenis van de Nederlandse literatuur 1700-1800: de Republiek (Amsterdam: Bert Bakker 2013) 344-353.
[21] Van der Sijs, Taal als mensenwerk, 280; Pieter van Reenen, ‘Twee problematische foneemopposities door de eeuwen heen: /s/-/z/ en /f/-/v/ in het Nederlands’, Taal en Tongval 73:2 (2021) 189. DOI: https://doi.org/10.5117/TET2021.3.VANR.
[22] See, for example, Bettina Noak, ‘Vondel as a Dramatist: The Representation of Language and Body’, in: Jan Bloemendal and Frans-Willem Korsten (eds.), Joost van den Vondel (1587-1679): Dutch Playwright in the Golden Age (Leiden: Brill 2012) 115-138. DOI: https://doi.org/10.1163/9789004218833_008.
[23] Van der Sijs, Taal als mensenwerk, 156.
[24] Van Loon, Historische fonologie, 254.
[25] Ibid., 252-253.
[26] Jacob van der Schuere, Nederduydsche Spellinge. Uitgegeven, ingeleid en toegelicht door Dr. F.L. Zwaan (Groningen/Djakarta: Wolters 1957).
[27] Phonemes (mental representations of language sounds) are usually presented between / /. In this article both ‘presumed phonemes’ and spelled language sounds of which we may not know the historical phoneme are presented between / /.
[28] Van Reenen, ‘Twee problematische foneemopposities’, 190-191.
[29] Gijsbert Rutten and Rik Vosters, ‘Language Standardization “from above”’, in: Wendy Ayres-Bennett and John Bellamy (eds.), The Cambridge Handbook of Language Standardization (Cambridge: CUP 2021) 68-69. DOI: https://doi.org/10.1017/9781108559249.003.
[30] Joost van den Vondel, Aenleidinge ter Nederduitsche dichtkunste (Werkgroep Utrechtse Neerlandici 1977).
[31] Van Loon, Historische fonologie, 255.
[32] René van Stipriaan, De hartenjager (Amsterdam: Querido 2018) 52-53.
[33] Andreas Krogull and Gijsbert Rutten, ‘“Lowthian” Linguistics Revisited: Codification, Prescription, and Style in a Comparative Perspective’, in: Luisella Caon, Marion Elenbaas and Janet Grijzenhout (eds.), Language Use, Usage Guides and Linguistic Norms (Newcastle upon Tyne: Cambridge Scholars Publishing 2021) 133-134.
[34] Van der Sijs, Taal als mensenwerk, 203.
[35] See also: BMGN – Low Countries Historical Review (BMGN – LCHR) 128:4 (2013), Special Issue on Digital History. https://bmgn-lchr.nl/issue/view/31.
[36] Gerben Zaagsma, ‘On Digital History’, BMGN – LCHR 128:4 (2013) 3-29, 7. DOI: https://doi.org/10.18352/bmgn-lchr.9344.
[37] Bernard Ycart, ‘Alberti’s Letter Counts’, Literary and Linguistic Computing 29:2 (2014) 255-265. DOI: https://doi.org/10.1093/llc/fqt034.
[38] Alexander Shewan, ‘Alliteration and Assonance in Homer’, Classical Philology 20:3 (1925) 193-209. DOI: https://doi.org/10.1086/360690.
[39] J. Berenike Herrmann, Karina van Dalen-Oskam and Christof Schöch, ‘Revisiting style, a key concept in literary studies’, Journal of Literary Theory 9:1 (2015) 25-52. DOI: https://doi.org/10.1515/jlt-2015-0003; Karina van Dalen-Oskam, Het raadsel literatuur: Is literaire kwaliteit meetbaar? (Amsterdam: AUP 2021) 32.
[40] Ibid., 23-39; Hugh Craig and Brett Greatley-Hirsch (eds.), Style, Computers, and Early Modern Drama: Beyond Authorship (Cambridge: CUP 2017); Mike Kestemont, Gunther Martens and Thorsten Ries, ‘A Computational Approach to Authorship Verification of Johann Wolfgang Goethe’s Contributions to the Frankfurter gelehrte Anzeigen (1772-73)’, Journal of European Periodical Studies 4:1 (2019) 115-143. DOI: https://doi.org/10.21825/jeps.v4i1.10188.
[41] Van Dalen-Oskam, Het raadsel literatuur, 26; John Burrows, ‘“Delta”: A Measure of Stylistic Difference and a Guide to Likely Authorship’, Literary and linguistic computing 17:3 (2002) 267-287. DOI: https://doi.org/10.1093/llc/17.3.267.
[42] Van Dalen-Oskam, Het raadsel literatuur, 29.
[43] Mike Kestemont a.o., Van wie is het Wilhelmus? De auteur van het Nederlandse volkslied met de computer onderzocht (Amsterdam: AUP 2017).
[44] The corpus was previously used for the Embodied Emotions project. See Janneke M. van der Zwaan, Inger Leemans, Erika Kuijpers and Isa Maks, ‘HEEM, a Complex Model for Mining Emotions in Historical Text’, in: 2015 IEEE 11th International Conference on e-Science (Munich 2015) 22-30. DOI: https://doi.org/10.1109/eScience.2015.18.
[45] Willem Frijhoff and Marijke Spies, Bevochten eendracht (Den Haag: SDU 1999) 106.
[46] Anonymous author, Algemeen letterkundig lexicon, DBNL, https://www.dbnl.org/tekst/dela012alge01_01/dela012alge01_01_01995.php. Afzonderlijk verschenen publicaties (2012-…).
[47] Fieke Smitskamp and Ruben Vosmeer, LAPA (Language Pattern Analyser): A Digital Tool for the Analysis of Language Sound Patterns in Historical Dutch Theatre Plays. https://github.com/fiekesmitskamp/LAPA_Language-Pattern-Analyser.
[48] https://www.python.org/doc/essays/blurb/.
[49] More about the digital analysis of lottery rhymes: Marly Terwisscha van Scheltinga, Sara Budts and Jeroen Puttevils, ‘(Fe)male Voices on Stage: Finding Patterns in Lottery Rhymes of the Late Medieval and Early Modern Low Countries with and without AI’, BMGN – LCHR 139:1 (2024) 4-28. DOI: https://doi.org/10.51769/bmgn-lchr.13872.
[50] Leemans and Johannes, Worm en donder, 325-327; Van der Sijs, Taal als mensenwerk, 276.
[51] DBNL (www.dbnl.org) is a collaboration between the Taalunie, the Flemish Heritage Libraries and the Koninklijke Bibliotheek te Den Haag (KB).
[52] See also: Van Stipriaan, De hartenjager, 104-105.
[53] J.A. Worp, Geschiedenis van den Amsterdamschen Schouwburg 1496-1772 (Amsterdam: S.L. van Looy 1920) 70.
[54] Bredero, Lucelle, 195. ‘Now listen, my lords, listen: If this play has delighted, moved, or pleased you, show us your joy. Rejoice with applause, and follow my lead, and if it has pleased you well, then call out in unison, “Yes!”’ (All translations are my own).