(4.1.1) The "Biblioteca Italiana Telematica" Project (http://cibit.unipi.it): a Progress Report
Mirko Tavoni
University of Pisa, Italy
Eugenio Picchi
CNR, Italy
The Biblioteca Italiana Telematica (Italian Library Online) is a digital library of representative texts of the Italian cultural tradition from the Middle Ages to the 20th Century (literature, language, history, philosophy, art, music and cultural history in its general sense). The library aims to be a major research and teaching facility at the service of researchers and students of Italian language and culture throughout the world.
The library has been set up by the CIBIT (Centro Interuniversitario Biblioteca Telematica Italiana), which includes interdisciplinary research groups in 16 Italian universities and also sees to the library's management. The project's software implementation has been entrusted to Eugenio Picchi of the Institute of Computational Linguistics (ILC) of the Italian National Research Council (CNR).
The project is financed by the Italian Ministry of Higher Education (MURST): research programmes of relevant national interest, Biblioteca Telematica Italiana: the Italian cultural tradition on the Internet (1998-99), and Textual memory: editions, studies and tools for computer analysis of the Italian heritage (2000-2001).
The design of the Biblioteca Telematica Italiana has been quite complex due to the many different functions that had to be developed for and integrated within it. The purpose of the present paper is to illustrate:
The objectives established for the project can be summarised as follows:
The project includes three main kinds of specialised expertise:
Particular attention has been paid to the quality of the services offered and their optimisation. Besides offering the possibility of reading and downloading texts, we have also aimed to provide online all of those text-analysis tools offered in stand-alone linguistics, philology and computational lexicography applications. We hold such tools to be crucial features for the very concept of an online library.
Storage, analysis and querying of the chosen text corpus are handled by the DBT system (Data Base Testuale), developed at the Pisa ILC by Eugenio Picchi. The main effort in integrating the text-analysis tools with the online library has been directed at providing, over a network connection, the same functionality and uses as the local programs.
Procedures for reading and querying texts are by far the services most frequently demanded. The utmost attention has therefore been devoted to response times, to the ability to serve the greatest possible number of simultaneous users, and to offering the maximum possible guarantee of recovery from most problems that may arise. Such considerations underlie the decision to develop the consultation and query system in the Java language, which satisfies a wide range of requirements: it is a stable tool on both the server and client ends; it makes the service accessible from all Internet-capable hardware and software platforms through a suitable browser; and it provides stateful sessions, that is, sessions able to maintain information about the sequence of requests made by each user. A set of specialised applets therefore provides access to the library's consultation functions: searching the catalogue of the available e-texts in order to select the desired text and/or sub-corpora using a series of pre-set bibliographical selection keys; reading single texts; and querying the selected texts through the typical querying and text-analysis procedures of DBT. The applets have been developed with the aim of maintaining the greatest possible compatibility with the DBT system, to which many users are accustomed.
While for the foregoing, most-used functions, our efforts were largely directed at providing optimal service, for other, more specialist functions, it was decided instead to favour simplicity of use, a feature furnished more readily by a development and access technology other than Java. The techniques adopted here are ISAPI/CGI scripts, which enable establishing lighter client/server connections, with on-the-fly creation of HTML pages in response to different user requests and interactions. Such scripting has been applied to the querying of lemmatised texts, that is, texts for which morpho-syntactic analysis and classification have been performed beforehand. Queries can thereby be launched also on the para-textual data containing a wide range of information on each word, such as: its lemma, the lemma's grammatical classification, the form's morpho-syntactic classification. Such a wealth of information stored in a text allows for qualitatively more precise and far more productive search functionality.
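As a rough illustration of why para-textual data makes searches more precise, the following Python sketch queries a tiny lemmatised text by lemma and by grammatical class. The Token record, its field names, the tag values and the four sample words are assumptions made for this example only; they do not reproduce the DBT data format or the ISAPI/CGI scripts described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    form: str   # the word as it appears in the text
    lemma: str  # dictionary headword (hypothetical annotation)
    pos: str    # grammatical class of the lemma (hypothetical tag set)
    morph: str  # morpho-syntactic description of the form (hypothetical)


# Toy lemmatised "text": four hand-annotated Italian word forms.
TEXT = [
    Token("amo", "amare", "V", "ind.pres.1sg"),
    Token("amava", "amare", "V", "ind.imperf.3sg"),
    Token("amore", "amore", "N", "masc.sing"),
    Token("amori", "amore", "N", "masc.plur"),
]


def query(tokens, lemma=None, pos=None):
    """Return the forms whose annotations match every criterion given."""
    return [
        t.form
        for t in tokens
        if (lemma is None or t.lemma == lemma) and (pos is None or t.pos == pos)
    ]


# All inflected forms of the verb "amare", however they are spelled.
verb_forms = query(TEXT, lemma="amare")
# All nouns, regardless of lemma.
noun_forms = query(TEXT, pos="N")
```

A plain string search for "am" would return all four forms; the lemma query separates the verb from the noun, which is the kind of precision the lemmatised texts are meant to provide.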
Particular tools for the online querying of texts with metric markers have been developed using MS-Access. In anticipation of the forthcoming Java or ISAPI applications necessary to make such tools directly available online, they have for the moment been implemented using the Windows NT Terminal Edition that, through a special plug-in, allows browser access to the server-side OS through an Internet connection.
Finally, by integrating the contributions of the different specialists outlined in the preceding section, we have set up the Biblioteca Telematica Italiana Web site (http://cibit.unipi.it), within which easy access to the different functions has been organised ergonomically from four fundamental points of entry: Reading, Catalogue, Collections and Advanced Searches.
Results
The textbase currently online (Dec. 1999) contains about 900 texts. The Biblioteca Italiana Telematica currently offers the following services through the Internet:
Greg Lessard
Michael Levison
Queen's University, Canada
1. Introduction
One of Chomsky's early distinctions separated 'rule-governed creativity' from 'rule-changing creativity'. In the first, applications of a rule within a formal system produce new output, while in the second, some mechanism is used to change the set of rules themselves. While the overall concept of language as a rule-governed creative system has now gained general currency in linguistics, the distinction itself has found less favour. One group, however, which has used the concept, if not the term, is OuLiPo, the 'Ouvroir de Littérature Potentielle' (Potential Literature Working Group), a group of French poets, novelists and mathematicians which for the past several decades has dedicated itself to the exploration of constraints as a device for enhancing poeticity. Most notable among them have been authors like Raymond Queneau and Georges Perec. Two well-known examples of their work are 'Cent mille milliards de poèmes' (One hundred thousand billion poems) by Queneau, which uses sheets of substitutable lines of a sonnet to produce 10 to the 14th power individual poems, and 'La Disparition' (The Disappearance) by Perec, a novel of more than 300 pages in which the letter e never appears. (See references for details.)
More generally, OuLiPo has set as its goal the formulation and exemplification of a wide range of formal constraints which may be used to enhance and enrich poetic production. In what follows, we will survey some of the basic types of constraints. We will then show how some of these constraints may be embodied in a natural language generation system. The goal will be to illustrate how such a system may provide a prosthetic extension to humans' ability to exhaustively draw out the implications of a particular rule. Finally, we will show how such a system may also be used to modify the constraints themselves.
2. A brief overview of some OuLiPo constructs
The operations described in OuLiPo texts may apply at all linguistic levels, from phonemes, to letters, to words, sentences and texts. On the systemic level, they include addition and subtraction of elements and restriction of choices, while on the serial level they include concatenation and permutation. Operations may be purely formal, or involve semantics as well. Finally, the elements involved may be comparable or disparate. Let us consider some examples:
Phonemes or letters
The examples described above are formalisable, but they have been generated by humans, using their intuition at a particular point in space and time. From the computational perspective, it is interesting to ask to what extent the computer may provide us with a prosthetic device to exhaustively explore the consequences of a particular construct. In this vein, we have used the VINCI generation environment to model examples such as those above.
Briefly, VINCI is a collection of metalanguages which allows a linguist to specify grammatical information, and an interpreter which produces utterances based on the specification provided. It includes the following components:
Most of the preceding examples can be captured by the VINCI formalism.
Shared initial letter
The following rule produces a sequence of masculine singular noun phrases, where each noun begins with the letter 'm'.
SN = DET[masc,sing,déf] N[masc,sing]/ "m*"
Sample output includes:
le moyen, le maillot, le malheureux, le moment, le mètre, le mal
Lipograms in e
These are more complex, since they require several steps, including a pre-sorting of the lexicon, to remove words containing 'e' internally, as well as those words whose morphology rule adds an 'e' (cf. grand - grande). An additional step can include a device for choosing words which begin with a vowel, thereby allowing the masculine article (cf. l'ami), or which are feminine singular (la condition). The following simple rule illustrates a specification of a small part of this:
SN =
{Masculine singular human nouns starting in 'a'} ( DET[masc, sing] N[masc, sing, humain]/ "a*" ADJ[masc, sing, humain]
| {or}
{Feminine singular human nouns followed by adjectives whose feminine form is identical to the masculine form ($13), thus avoiding 'e'} DET[fém, sing] N[fém, sing, humain] ADJ[fém, sing, humain]/$13
| {or} {etc. etc.}
Sample output includes:
l'ami obscur, l'amant amusant, l'avocat assis, la condition citron
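The filtering logic behind such a lipogram can be sketched outside VINCI. The Python fragment below is only an illustration under stated assumptions: the mini-lexicon, the gender labels and the function names are hypothetical, accented variants of 'e' are treated as forbidden, and adjectives are left out. It builds definite noun phrases and discards any phrase containing the letter, which automatically rejects consonant-initial masculine nouns because their article is 'le'.

```python
# Letters counted as 'e' for the lipogram (assumption: accented variants
# are forbidden as well).
FORBIDDEN = set("eéèêë")
VOWELS = set("aàâeéèêiîoôuûy")

# Hypothetical mini-lexicon of (noun, gender) pairs.
NOUNS = [
    ("ami", "masc"),
    ("amant", "masc"),
    ("avocat", "masc"),
    ("moment", "masc"),
    ("condition", "fem"),
    ("maison", "fem"),
    ("mère", "fem"),
]


def has_e(text):
    """True if the text contains the forbidden letter in any variant."""
    return any(ch in FORBIDDEN for ch in text.lower())


def noun_phrase(noun, gender):
    """Definite singular noun phrase, eliding the article before a vowel."""
    if noun[0] in VOWELS:
        return "l'" + noun
    return ("le " if gender == "masc" else "la ") + noun


def lipogram_phrases(nouns):
    """Keep only the noun phrases that contain no 'e' at all."""
    phrases = (noun_phrase(noun, gender) for noun, gender in nouns)
    return [p for p in phrases if not has_e(p)]


phrases = lipogram_phrases(NOUNS)
```

Testing the whole phrase rather than the noun alone is what lets the article constraint fall out for free: a masculine noun survives only if it begins with a vowel.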
S+7
This can be achieved by pre-processing the lexicon so that each noun points to its seventh successor. A transformation can then be applied which replaces each noun by the word pointed to. If the pointers are in lexical field 13 with tag s7, we may write:
ROOT = S7 : SN
SN = choose Ge : Genre, No : Nombre; DET[Ge,No] N[Ge,No]
S7 = TRANSFORMATION DET N : 1 2/ @13:s7
For the sake of clarity, we ignore a possible gender change.
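The two steps just described, pre-computing a pointer to the seventh successor and then substituting, can be mimicked in a few lines of Python. This is a sketch rather than a rendering of the VINCI transformation: the twelve-noun lexicon is hypothetical, the wrap-around at the end of the list is an assumption, and, as in the rule above, gender agreement is ignored.

```python
# Hypothetical noun lexicon, kept in alphabetical order.
NOUNS = sorted([
    "arbre", "bateau", "cheval", "chien", "fleur", "jardin",
    "livre", "maison", "nuage", "oiseau", "pomme", "route",
])

# Pre-processing step: each noun points to its seventh successor.
# Wrapping around at the end of the list is an assumption of this sketch.
SEVENTH = {
    noun: NOUNS[(index + 7) % len(NOUNS)]
    for index, noun in enumerate(NOUNS)
}


def s_plus_7(words):
    """Replace every known noun by the noun it points to; keep other words."""
    return [SEVENTH.get(word, word) for word in words]


result = s_plus_7("le chien mange le livre".split())
```

With this lexicon 'chien' is replaced by 'pomme' and 'livre' by 'bateau'; the article stays 'le' in both cases, which is exactly the gender mismatch the text sets aside.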
Word formation
VINCI possesses a rich set of devices for forming new words. For example, given the example 'dévisager' described above, the following rule systematically searches a lexicon for nouns having the semantic trait 'partieducorps' (bodypart) and produces appropriate output:
"*e"|N|partieducorps.suj|?|?|?| _makes_ ["dé" + #1 + "r"]|V| | | | | %
dévisager déventrer détêter dépoitriner dépatter déorganer déoreiller démembrer délanguer délèvrer déjouer déjamber dégorger défoier défacer déboucher débarber déailer déépauler
Again, we have left the rule underspecified. Additional steps are required to add an -s before nouns beginning with a vowel.
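For readers unfamiliar with the VINCI notation, the same derivation can be approximated in Python. The lexicon entries, the tuple layout and the add_s switch are assumptions of this sketch; the switch stands in for the additional step mentioned above and is not part of the rule as given.

```python
VOWELS = set("aàâeéèêiîoôuûy")

# Hypothetical lexicon entries: (word, category, semantic traits).
LEXICON = [
    ("visage", "N", {"partieducorps"}),
    ("ventre", "N", {"partieducorps"}),
    ("organe", "N", {"partieducorps"}),
    ("bras", "N", {"partieducorps"}),  # skipped: does not end in 'e'
    ("table", "N", {"meuble"}),        # skipped: not a body part
]


def form_verbs(lexicon, add_s=False):
    """Derive "dé" + noun + "r" from body-part nouns ending in 'e'.

    With add_s=True the prefix becomes "dés" before a vowel-initial noun.
    """
    verbs = []
    for word, category, traits in lexicon:
        if category == "N" and "partieducorps" in traits and word.endswith("e"):
            prefix = "dés" if add_s and word[0] in VOWELS else "dé"
            verbs.append(prefix + word + "r")
    return verbs


plain = form_verbs(LEXICON)
with_s = form_verbs(LEXICON, add_s=True)
```

Run without the switch, the sketch reproduces forms such as 'déorganer' from the sample output; with it, the same noun yields 'désorganer'.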
Homosyntactic structures
It is clear that all the previous rules can be used iteratively to produce multiple occurrences of the same syntactic structure.
4. Meta-generation
In the cases discussed above, the goal has been to generate utterances based on grammatical descriptions, the rationale being that humans are less good than computers at exhaustively enumerating the products of a grammatical rule. It is however possible to carry the process one step further. Since VINCI grammatical specifications are themselves only text files, it is possible to use VINCI itself to generate grammatical rules which in turn generate new utterances. The advantage of such an approach comes in cases where the computer can be made to generate a large number of potential new rules, each of which generates a large number of possible products. The result can be to cause the human to see previously unthought-of rule possibilities.
To illustrate this, consider the following problem. Languages like French (and English) allow for multiple prefixes to be added to a base form. For example, Queneau uses the word 'archidyssymètrique', which has two prefixes, while a form like 'non-anti-defoliant' is a possible English form. One way of capturing such possibilities would be to enumerate all rule combinations by hand. This would be tedious and prone to error. A better alternative would be to allow a meta-rule to generate a large number of possible prefixation rules and then to use these to produce typical examples. For example, consider the following meta-rules:
MANYPREFIXES = (PREFIX: MANYPREFIXES | PREFIX )
PREFIX = (PREFIX_NON | PREFIX_RE | ...)
ROOT = (MANYPREFIXES: N | MANYPREFIXES: V | MANYPREFIXES : ADJ )
where PREFIX_NON etc. are transformations.
When applied, these meta-rules produce grammatical rules like:
ROOT = PREFIX_NON : N
ROOT = PREFIX_NON : PREFIX_RE : V
ROOT = PREFIX_RE : PREFIX_NON : ADJ
ROOT = PREFIX_NON : PREFIX_NON : PREFIX_RE : ... V
and so on, which in turn can be used to generate actual utterances. Obviously, additional elements would be required to control more precisely the nature of the prefixes, but the essential principle should be clear.
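The expansion performed by the meta-rules can be imitated with a short Python enumeration. This is a sketch under stated assumptions: only the two prefixes named above are used (the rest of the list is elided in the original), the function name is hypothetical, and the recursion of MANYPREFIXES is cut off at a maximum depth chosen by the caller, since the meta-rule itself is unbounded.

```python
from itertools import product

# Only the two transformations named in the text; the original list
# continues with "...", which is left unfilled here.
PREFIXES = ["PREFIX_NON", "PREFIX_RE"]
CATEGORIES = ["N", "V", "ADJ"]


def generate_rules(max_prefixes):
    """Enumerate every ROOT rule with 1..max_prefixes stacked prefixes."""
    rules = []
    for depth in range(1, max_prefixes + 1):
        # Each chain is one expansion of MANYPREFIXES of the given length.
        for chain in product(PREFIXES, repeat=depth):
            for category in CATEGORIES:
                rules.append("ROOT = " + " : ".join(chain + (category,)))
    return rules


rules = generate_rules(2)
```

Even with two prefixes and a depth of two, the enumeration yields 18 rules, which suggests how quickly hand enumeration would become tedious as prefixes are added.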
An intermediate step is also possible, in which VINCI hands control temporarily back to a user. So, for example, a researcher might have put in front of him or her a partially developed meta-rule and be asked to insert particular values.
5. Conclusions
We assume, with OuLiPo, that poeticity hinges at least partially on dynamic playing with constraints. Similarly, we have as a premise that at least some aspects of inspiration or creativity involve finding previously unseen patterns. Given these premises, a device which puts potential patterns before us (which prosthetically increases our power to envisage new devices) is of interest.
In the proposed paper, we will illustrate these assumptions with a richer range of examples and with a more detailed discussion of the theoretical issues which underpin them.
6. References
Bens, J. (1980) OuLiPo 1960-1963. Christian Bourgois, Paris.
Bergens, A. (1963) Raymond Queneau. Droz, Genève.
Lessard, G. and Levison, M. (1995) Le logiciel VINCI: lexigrammaire et génération automatique. Lexiques-grammaires comparés et traitements automatiques (edited by J. Labelle). Université du Québec à Montréal, pp. 175-185.
Levison, M. and Lessard, G. (1995) New Words from Old: A Formalism for Word Formation. Computers and the Humanities 29:463-479.
OuLiPo (1981) Atlas de littérature potentielle. Gallimard, Paris.
OuLiPo (1981) La bibliothèque oulipienne. Slatkine, Paris.
Perec, G. (1969) La Disparition. Denoël, Paris.
Queneau, R. (1989) Oeuvres complètes, tome I. Edited by Cl. Debon. Gallimard, Paris.
Queneau, R. (1969) Chêne et chien. Gallimard, Paris.
(4.1.3) An American National Corpus: a Large Balanced Text Corpus for American English
Catherine Macleod
New York University, USA
Nancy Ide
Vassar College, USA
Introduction:
The importance of corpora as resources has become more and more accepted over the years. Many types of corpora have been used for various different purposes but if one is searching for examples of "general" application and not restricting oneself to a particular sub-language, the development of a balanced corpus is of primary importance. Of equal importance is the adoption of a uniform standard annotation. The main areas of application of a text corpus are lexicography (also computational lexicography) and natural language processing, including specifically, adaptation to different domains and genres. For these purposes the corpus must be large (at least 100 million words), contemporary, heterogeneous, uniformly annotated and, for use in the United States, must contain American English. The size will ensure the adequate representation of infrequent words. The selection of contemporary texts is important for both lexicography and NLP, particularly in view of the significant changes in common text genres over the last few years brought about by electronic communication. Heterogeneity ensures that the range of language usage needed for the creation of "general language resources" is represented, and that one can explore a wide spectrum of language genres for NLP. Uniform annotation is paramount in any corpus and the collection of American texts ensures that the grammatical and lexical differences found in British English will not interfere with the classifying of American English.
Background:
The first American text corpus that strove for this balance was the Brown Corpus, developed by Kucera and Francis at Brown University in the 1960s. It was the model for many corpora that followed and is still being used today. However, it is a small corpus (one million words) and somewhat dated (the texts are at least 30 years old). It is true that a written language changes rather slowly over time with regard to grammar, but there are changes in structure and there are quite frequent additions of new lexical items.
Recently, the British National Corpus (BNC) was released. It is a rather carefully balanced corpus and a very large corpus (one hundred million words). It also has the advantage of covering the time period from where the Brown Corpus left off until 1993. There are, nonetheless, two distinct disadvantages for Natural Language researchers and dictionary producers in the United States: (1) the corpus is, as yet, unavailable for use outside of Europe and (2) the corpus contains texts of British not American English.
Differences between American and British English:
The grammar of American English (A.E.) varies from British English (B.E.) quite significantly. For example, British English often makes use of a to-infinitive complement where American English does not. In the following examples from the BNC, "assay", "engage", "omit" and "endure" appear with a to-infinitive complement; there were no examples found in our corpus of this construction although the verbs themselves did appear.
Examples:
B.E. "Jerome crept to the foot of the steps, and there halted, baulked, rather, like a startled horse, drew hard breath and ASSAYED TO MOUNT, and then suddenly threw up his arms to cover his face, fell on his knees with a lamentable, choking cry, and bowed himself against the stone of the steps."
B.E. "A magnate would ENGAGE TO SERVE with a specified number of men for a particular time in return for wages which were agreed in advance and paid by the Exchequer."
B.E. "'What did you OMIT TO TELL your priest?'"
A.E. "'What did you OMIT TELLING your priest?'"
B.E. "But Carteret's wife, who frequented health spas, could not ENDURE TO LIVE with him or he with her: there were no children."
A.E. "But Carteret's wife, who frequented health spas, could not ENDURE LIVING with him or he with her: there were no children."
For the first two verbs, one can argue that there is not an equivalent verbal meaning in A.E. but, for the last two, the meaning can be paraphrased in A.E. by the gerund.
Adverbial usage is also different. The B.E. use of "immediately" in sentence initial position is not allowed in A.E. For example, B.E. "Immediately I get home, I will attend to that." is incorrect in A.E. where we would say "As soon as I get home, I will attend to that."
Other syntactic differences are formation of questions with the main verb "have". In B.E., one can say, "Have you a pen?" where A.E. speakers must use "do" ("Do you have a pen?"). Support verbs for nominalizations also differ. Note the B.E. "take a decision" vs the A.E. "make a decision".
With these considerable differences and the fact that lexical items may be over- or under-represented or not present at all, it is clear that a corpus of American English is needed.
The proposed American National Corpus:
As seen above, the corpora we have been working with are inadequate, and the BNC, although meeting our standards of size and balance, does not deal with our language. In 1998, at the first LREC conference, a proposal was made to create an American National Corpus (ANC) much along the lines of the BNC (Fillmore et al., 1998 [1]).
The corpus should be, as far as possible, contemporary (1990s). It should be both static (like the BNC) and dynamic (like COBUILD). We will add regular increments but retain the capability to return to the initial corpus as well as to the static stages between increments.
The corpus will be both balanced and heterogeneous. The collection of more than 100 million words will make this possible. 100 million words of the ANC should be comparable in balance to the BNC, to enable comparative studies of British and American English. There is no set definition of what it means for a corpus to be balanced. The BNC made a principled effort to balance its corpus (see the BNC User's Reference Guide [2] for a breakdown of the corpus). The ANC will use this as a model. However, since it is also desirable to provide significant components from a wide range of styles, the remaining text will be varied rather than balanced (i.e. we will not try for differing percentages of texts according to their representative importance in the language but will try for smaller samples of a greater variety of texts). The corpus will be annotated at two levels, which serve two different user groups. The Base Level will be annotated fully automatically, marking document, paragraph, sentence and token boundaries, with a POS tag on each token. Level 1 will be heavily manual and will add text structure (titles, headers, footnotes, tables, captions, lists, etc.) following the CES standard (Ide et al. [3]).
Progress towards the creation of the ANC:
The ANC has progressed since its genesis at LREC 98. In May of 1999 the first ANC meeting preceded the Dictionary Society of North America (DSNA) meeting at the University of California at Berkeley. It was attended by a number of representatives of publishing houses. The idea of an American National Corpus was well received and plans for a second meeting were agreed upon.
The second meeting took place at New York University. Invitees to this meeting included not only those present at the May meeting but publishers from Japan and representatives from various software companies from the U.S. and Europe. More substantial issues were discussed including the structure of the consortium, questions of balance in the corpus, funding, time schedules and licensing agreements. Some questions were decided, others such as balance and licensing were referred to committees for further discussion.
The shape of the consortium and future plans:
Licensing and base-level annotation are to be handled through LDC (UPenn). UPenn will obtain licenses from text providers and provide licenses to users. With regard to data rights, there will be multiple classes. The expectation is that there will be some subset of the data which can be made available under a form of general public license, and hence can be freely redistributed under this license.
The membership agreement provides for paid memberships from commercial organizations. These members will receive the data as soon as it is processed and have exclusive rights to this data for a period of three years. They are expected to make monetary as well as data contributions. The data will be freely available to non-profit educational and research organizations (aside from a nominal fee for licensing and distribution).
Our plan is for the base level to be paid for with consortium fees. We have a 3-year time-frame starting Jan. 2000, with 10% of the corpus deliverable by summer 2000. Level 1 annotation, which will require external funding, will proceed as that funding permits. It may therefore lag as much as a year behind the base level corpus. Our goal is a fully annotated level 1 corpus compliant with the CES standard.
References
[1] Fillmore, C., Ide, N., Jurafsky, D. and Macleod, C. (1998). "An American National Corpus: A Proposal". The Proceedings of LREC, Granada, Spain, May 28-30, pp. 965-969.
[2] Burnard, L. (ed) (1995). "British National Corpus: User's Reference Guide for the British National Corpus", Oxford University Computing Service, May, 1995, pp. 13-19.
[3] Ide, N., Romary, L. and Bonhomme, P. (submitted). "CES/XML: An XML-based Standard for Linguistic Corpora". Submitted to the Second International Language Resources and Evaluation Conference.
[4] Francis, W.N. and Kucera, H. (1964). "Manual of Information to Accompany 'A Standard Sample of Present-Day Edited American English, for Use with Digital Computers'". Department of Linguistics, Brown University, Providence, RI (revised 1979).
A 'New' Computer-Assisted Literary Criticism?
Chair: Raymond G. Siemens
Malaspina University-College, Canada
(4.2.1) Electric Theory (Truth, Use, and Method)
Tamise J. Van Pelt
Idaho State University, USA
First Wave critics of the electronic environment - especially those critics who discuss hypertext in the late '80s and early '90s - make an interesting discovery about theory and computing: electronic literacy confirms post/structural theories of reading and writing. Given a computer environment, theory finds that its principles are indeed true.
Defining semiosis as the reading of codes and the writing of signs, J. David Bolter (1991) argues that "the theory of semiotics becomes obvious, almost trivially true, in the computer medium" (196). Similarly, George P. Landow (1992) extends Bolter's claim to encompass the theories of Barthes, Foucault, Bakhtin, and Derrida, pointing to the bilocation of ideas of textual openness, of the network, of polyvocality, and of decenteredness both in post-structural theory and computerized hypertextual writing. Thus, Landow concludes: "something that Derrida and other critical theorists describe as part of a seemingly extravagant claim about language turns out precisely to describe the new economy of reading and writing with electronic virtual, rather than physical, forms" (8). Even the hypertext novel can be an exemplar of literary criticism; for instance, Michael Joyce's Afternoon embodies both the content and the practice of psychoanalytic theory, using resistance as a literal compositional principle and placing desire at the center of the novel's reader-text interface.
This amazing convergence between post/structural theory and the computing medium suggests that hypertext's ability to stage the principles informing diverse (and even contradictory) literary theories defines a property of the electronic medium itself. Since the computer environment offers theory a self-validating medium, the question that once determined the authority of a theory - can its principles help the reader discover the truth of a text? - seems futile to ask. If Dillon is right and our technologies are actually embodiments of our theories (in Rouet, 8, emphasis mine), then this self-validating function of electrified theory undermines, by inevitable affirmation, the critical position of the reader-theorist. Because the electronic medium reconfigures the relation between the reader and the text, it literalizes the way that the reader brings theory to the text: In a medium where "the text is a stage and reading is direction" (Douglas, title), the very agency of the "reader" tends to strip away rather than develop critical distance. Thus, Espen Aarseth replaces the idea of the "reader" with the more ambivalent term "user" to describe the person who interacts with the electronic medium. User, Aarseth writes, "suggest[s] both active participation and dependency, a figure under the influence of some kind of pleasure-giving system" (Cybertext 174). Since a user-theorist of e-text operating under the print assumptions connecting theory to authoritative truth finds in electric theory the headiness of addiction to confirmation, electric theory necessitates attention to use itself.
To better define issues of user dependency and control in the convergence of theory with the electronic medium, I will examine two very different methods of computerized theoretical study that solve the problem of literalness: Earl Jackson, Jr.'s semiotics and psychoanalysis website and Havholm & Stewart's computer modeled structural narratology. Both methods arise from the computing environment and would be impossible without it. Whereas the former site places the user of electric theory in a webbed environment of emergent meanings and open-ended exploration, the latter practice programs theoretical principles in order to radically constrain theoretical outcomes. Either method's willingness to replace truth-value with use-value provides a way out of the self-confirming impasse created by the encounter between First Wave theoretical assumptions and the computer medium.
(4.2.2) French Neo-Structuralist Schools and Industrial Text Analysis
William Glen Winder
University of British Columbia, Canada
Parallel to, and to some degree in reaction to, French post-structuralist theorization (as championed by Derrida, Foucault, and Lacan, among others) is a French "neo" structuralism built directly on the achievements of structuralism using electronic means. We will begin this talk by examining some exemplary approaches to text analysis in this neo-structuralist vein that have appeared over the past 10 years.
Some of these approaches have specific "deliverables" and are promising because of the well-defined focus of their research: Sator's topoi dictionary, E. Brunet's statistical software, and E. Brill's grammatical tagger will serve to illustrate projects of this type. Other research is more theoretical in nature and represents over-arching models of (electronic) textual study. Two examples we will consider are Jean-Claude Gardin's expert systems approach and François Rastier's interpretative semantics.
These practical and theoretical approaches have in common a fundamental hypothesis: archives of natural language texts are a valuable and as yet untapped resource for any project to formalise human understanding, whether that project be industrial or traditionally humanistic in nature. (Thus, for example, the Brill tagger uses the Frantext literary database to generate the rule base for the tagger.) In a very real and practical sense, authors are painstaking programmers. They formalise meaning and create, through their writing, databases of expertise in various domains, which range from how we use language to how we perceive the world and exchange information about it. That expertise is precisely what computers must acquire if they are to perform the more advanced tasks increasingly asked of them, whether in the context of humanities research or in an industrial setting.
Textual archives which combine texts and expertise are destined to play an important role in our increasingly electronic society because programmers face an information barrier. Advanced programming projects require that programmers describe real-world objects with exponentially increasing detail and precision. Such massive requirements for description cannot be met by the efforts of any single group of programmers: it may well be that only the mass of textual material, accumulated over the centuries in literary texts and scientific writing, has enough descriptive weight to allow programs to break the information barrier and perform qualitatively more advanced tasks.
Textual research itself faces the same kind of information barrier. In this paper, we will consider how this "Wissenschaft" accumulation of expertise is related to and complements the neo-structuralist approach. Ultimately, electronic critical studies will be defined by their strategic position at the intersection of the two technologies shaping our society: the new information processing technology of computers and the representational techniques that have accumulated for centuries in written texts. Understanding how these two information management paradigms complement each other is a key issue for the humanities, for computer science, and vital to industry, even beyond the narrow domain of the language industries. It will be the contention of this paper that the direction of critical studies, a small planet long orbiting in only rarefied academic circles, will be radically altered by the sheer size of the economic stakes implied by industrial text analysis.
(4.2.3) A Theory for Literature (Created for the World Wide Web, E-Mail, Chat Spaces, Databases, and Other Electronic Technologies)
Dene M. Grigar
Texas Woman's University, USA
Beginning with Michael Joyce's seminal hypertext, 'afternoon, a story', and moving to the recent webtext by Kathleen Yancey and Michael Spooner, 'Not (Necessarily) a Cosmic Convergence', Dene Grigar provides examples of the new literary writing generated by electronic non-print technologies, such as the World Wide Web, MOOs, databases, and other types of computer-generated media, and discusses the theories that have emerged to explain them. Specifically, she looks at theories of hypertext, posited by Jay David Bolter, George Landow, and Johndan Johnson-Eilola; of synchronous or real-time writing, found in MOOs and MUDs, developed by Cynthia Haynes and Jan Rune Holmevik, Mick Doherty, and Sandye Thompson; and of online writing and webtexts, articulated by John Barber, Victor Vitanza, and others. Although electronic writing remains at the early stages of development in this late age of print (Bolter 2) and early age of electronic writing (Barber and Grigar 12), it is fast becoming an important medium of literary expression. By bringing these examples and ideas together, the author suggests some guidelines for understanding and discussing electronic writing that will serve as the starting point for the development of an overarching literary theory for these emergent literary texts.
(4.2.4) Computer-Mediated Discourse, Reception Theory, and Versioning
Susan Schreibman
University College Dublin, Eire
This paper will address how computer-mediated discourse provides new opportunities and challenges in two areas of literary criticism, Reception Theory and Versioning. Although extremely different critical modes, they can be viewed as belonging to opposite ends of the space-time continuum, with Versioning taking advantage of the computer's ability to enhance our understanding of literature through space, and Reception Theory through time.
Versioning is a relatively new development in the area of textual criticism. Since the end of the Second World War, the basic theory under which most textual critics operated was to provide readers with a text that most closely mirrored authorial intention. This philosophy of editing produced texts which, by and large, never existed in the author's lifetime. They were eclectic texts: the editor, armed with his intimate knowledge of the author and the text, assumed the role of author-surrogate to create a text which mirrored final authorial intention. To do this the editor swept away corruption which had entered the text through the publication process by well-meaning editors, compositors, wives, heirs, etc. He also swept away any ambiguity left by the author herself. Thus, in the case of narrative, choosing a bit here from the copy text, a bit there from the first English edition, a sentence there from the second American edition, and a few lines from the original manuscript, that elusive but canonical authorial intention could be restored. In the case of poetry, editors were forced to choose one published version of a text over another, and substantively ignored authorial ambiguity, such as that of Emily Dickinson, who left alternative readings of certain words in many of her poems.
By the mid-1980s, the monolithic approach to textual editing began to lose favour with a new generation of textual critics, such as Jerome McGann and Peter Shillingsburg, who, responding to new critical discourses, including Reception Theory, viewed the text as a product, not of corruption, but of social interaction among any number of agents: author, editor, publisher, compositor, scribe, translator. It was also recognised that authorial intention was often a fluid state; particularly in the case of poetry it was possible to have several "definitive" versions of any one work which represented the wishes of the poet at a particular point in time.
One reason, I would argue, that newer theories of textual criticism took so long to be developed was that until the advent of HTML, the World Wide Web, and the spatial freedom of the Internet, textual critics had no suitable medium to display a fluid concept of authorship. Any attempt to present anything but a monolithic text which represented final authorial intention was doomed to failure. As early as 1968 William H. Gillman et al. undertook what was to be a definitive edition of Emerson's Journals and Notebooks in six volumes. Lewis Mumford, writing in the New York Review of Books, put paid to this editorial method with his review article entitled "Emerson Behind Barbed Wire":
The cost of this scholarly donation is painfully dear, even if one puts aside the price in dollars of this heavy make-weight of unreadable print. For the editors have chosen to satisfy their standard of exactitude in transcription by a process of ruthless typographic mutilation.
In 1984 Hans Walter Gabler's Synoptic edition of Ulysses encountered the same resistance from both the editing community and the Joyceans. These early efforts at representing the fluidity of authorship were doomed to failure because of the two-dimensionality of the printed text. 'Reading' such texts as Gillman's and Gabler's became impossible for all the arrows, dashes, crosses, single underlining, double underlining, footnotes, endnotes and asterisks. I would thus argue that the theoretical stance against presenting anything but a monolithic text representing someone's final intention (which was more likely than not the editor's) was a product as much of the medium as of concurrent theoretical modes, e.g. New Criticism.
Hypertext has the spatial richness to overcome the limitations of the book's two-dimensionality to present, not only works in progress, but the richness and ambiguity of authorial intention. No longer do editors have to choose between the three Marianne Moore poems entitled "Poetry", but all three can be accommodated within the hypertextual archive. Furthermore, all of Moore's revisions can also be displayed. Depending on the skill, expertise and needs of the user, a single monolithic "Poetry" can be displayed (possibly for a secondary school class), two versions, or even all three versions could be viewed (for a Freshman poetry course), or all three versions and several of the manuscript drafts (for a graduate-level course on research methods). By utilising a markup language such as SGML, all these texts could be encoded to create an "Ur" version of the poem in which lines across versions can be displayed and compared.
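As a minimal sketch of how such an encoding might support line-by-line comparison and audience-dependent display, the following Python fragment parses a small XML document and aligns lines across versions. The element and attribute names (`poem`, `version`, `l`, `kind`, `n`) and the placeholder readings are assumptions made purely for illustration; they are not drawn from any actual edition or encoding scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding: element names, attributes and readings are
# illustrative placeholders, not taken from a real edition.
SAMPLE = """
<poem title="Sample">
  <version id="v1" kind="published">
    <l n="1">placeholder line one</l>
    <l n="2">placeholder line two</l>
  </version>
  <version id="v2" kind="published">
    <l n="1">placeholder line one, revised</l>
  </version>
  <version id="v3" kind="draft">
    <l n="1">placeholder line one, drafted</l>
    <l n="2">placeholder line two</l>
  </version>
</poem>
"""


def align_lines(root):
    """Map each line number to the reading found in each version."""
    table = {}
    for version in root.findall("version"):
        for line in version.findall("l"):
            table.setdefault(line.get("n"), {})[version.get("id")] = line.text
    return table


def view_for(root, kinds):
    """Return the ids of the versions to display for a given audience."""
    return [v.get("id") for v in root.findall("version") if v.get("kind") in kinds]


root = ET.fromstring(SAMPLE)
table = align_lines(root)

# Display every line number with the readings of the versions that contain it.
for n in sorted(table, key=int):
    print(n, table[n])

# A class that needs only the published versions would see these ids.
print(view_for(root, {"published"}))
```

The same aligned table could serve every audience; only the set of versions passed to the display step would change.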
No longer do space and the cost of publishing images in traditional formats have to guide edited editions. Facsimile versions which were only produced for the most canonical of authors now can be produced and "published" at extremely low cost. Furthermore, languages such as SGML or XML facilitate the linking of image and text files, annotation, notes, links to other relevant texts, and so on. As with my previous example, all or some of this apparatus can be turned on depending on the needs of the audience.
The needs of the audience bring me to my second theoretical mode that can find richer expression in digitisation, Reception Theory. Reception Theory seeks to provide present-day readers with a snapshot of a text's history across time, and how previous generations of reader-response have gone into shaping our conception of the text. It also seeks to make available to present-day audiences the historical, psychological, social and/or semantic codes of the text as it was received at some point in the past. If the meaning of a work is created in the interaction between text and reader, hypermedia has the potential to create a three-legged stool in which present-day readers overhear the dialogue created between past readers and the text.
As with my previous example, no longer does the cost associated with the production of printed material have to be the prime consideration in constructing a reception theory text. As it is now economically feasible to produce facsimile archives for un-canonical authors, reception theory archives can be constructed across time providing the reader access to objects that would have been unthinkable only a generation ago. For example, take the construction of a reception theory archive of W. B. Yeats's poetry. No longer does the author of the text have to content herself with providing one or two black and white reproductions of the first Cuala Press editions of Yeats's early poetry to demonstrate the semantic codes embedded in those early editions. In a digital edition, it is possible to include full colour shots of these texts, in addition to the Macmillan editions, which, as a standard trade publication, are stripped of those codes. Furthermore, the early reception of these poems was no doubt influenced by other objects of the Arts and Crafts movement in Ireland and England. These objects, prints, paintings, even wallpaper, could be digitised to create a lexia of non-textual meaning. As with my previous example on Versioning, it is possible to conceive a digital archive that could serve many audiences, with features turned on or off as necessary.
Hypertext can provide a vehicle for accessing the ways in which readers realised the aesthetic interpretation of the text if those readers left a record of that experience; that trail can be textual: critical (reviews, articles, critical texts), personal (letters, diaries), creative (the re-writing of a creative work). It can also involve other media, a painting or ballet based on a poem, myth or folktale. These acts of interpretation can be presented to present day users in much the same format as we are used to in a two-dimensional article, i.e., a block of text with hyperlinks (rather than footnotes) to relevant primary sources.
On the other hand, we have not yet realised a form of critical discourse which does not mirror the two-dimensional spaces we have used for the last 500 years to express ideas. Hypertext criticism in future will, no doubt, embody a new form for critical discourse which is shaped by the new medium. Criticism which takes advantage of the three-dimensional space of the computer is so new that the various paradigms of editorial/authorial intervention have not been fully realised, let alone understood, by both the creators and users. A case in point is that we still do not have appropriate language to describe these new objects of computer-mediated critical discourse: terms like "article", "book", "collection of essays", "review" etc. have meaning appropriate to the printed word, but possibly not to the digitised one.
And indeed there are costs associated with this new medium which will govern, to a large extent, the scope and content of archives: copyright costs, the production of machine-readable versions of texts, the cost of digitising images, hardware, software, etc. One's ideal archive is governed by a balancing of costs, and tradeoffs, such as deeper encoding vs encoding of a greater number of texts, will play a significant part in the creation of resources.
In addition, the searching, retrieval and display of objects in a digital archive will, by default, reflect the bias of both the editor/author and the system designer. Yet, unlike critical theory presented in a two-dimensional space, much of the bias will be invisible to the user, as it will be buried, for example, in SGML or XML encoding. Thus there will be new challenges in learning to "read" this new model of literary discourse, which will, in turn, no doubt, foster new theoretical modes.
(4.3.1) The FALMER Project: Toward an Electronic Critical Edition
Michel Bernard
Université de la Sorbonne-Nouvelle (Paris III), France
What will critical editions be in the electronic era? Hubert de Phalèse, a research center at La Sorbonne-Nouvelle University (Paris III), in accordance with its pragmatic approach to literary computing problems, decided to launch this debate by putting online a critical edition of the complete works of Lautréamont / Isidore Ducasse (http://www.cavi.univ-paris3.fr/phalese/hubert1.htm). This edition is an integral hypertext (in which nearly every word of the text is linked with a comment), which gathers all that one usually finds in critical editions, but on a scale which has no equivalent on paper: variants, philological, literary and encyclopedic comments, biography, bibliography, iconography, index, etc.
This prototype poses, concretely, a certain number of problems, on several levels:
Technical: Which interface is to be used? The purely automatic search engines (including the uses of Java and other script languages) appeared unsuited and a new device of computer-assisted indexing was developed. It makes it possible to provide the user with a lemmatized index and, especially, lexical cards which can be enriched at will. The current solution of publishing online has some drawbacks, but it has the advantage of offering consultation of the edition to the greatest number of users and of inviting them to take part in it.
Contents: The new medium is virtually infinite. What is a critical edition to contain now that we are no longer concerned with its volume? All the versions of the text, for example, can now be offered for reading. But does one have to publish the intertexts, contemporary works, criticism, etc.? How can the networked interconnection of several resources enrich a critical edition? Under which scientific and legal conditions?
Validation: Can this type of edition be regarded as more reliable than the paper editions? According to which protocols will such editions be judged? One of the risks is the appearance of a great quantity of work without scientific guarantee. How will the possibilities of collective work and permanent updating modify our conception of what a philological work should be?
Publication: Who will bear the production and distribution costs of such electronic products? Will the redistribution of the budget headings in this type of edition lead academics to transform themselves into distributors, or will the traditional publishers change their practice? In addition, new prospects open up with the critical edition, which it will be necessary to evaluate and explore in order to understand their real potential.
Group work: Computing and the Internet support the participation of a growing number of contributors around an intellectual work. What will be the role of each one (project director, computing specialists, humanists, students, active readers, etc.)? The very concepts of author and reader will no longer have the same meaning.
Real time: The possibility of permanent update offered by an Internet site makes it possible to re-evaluate the traditional concepts. It is indeed no longer essential to put a fully completed work online, and errors that are noticed can be corrected immediately. In addition, this type of edition makes it possible to account for the current state of research in the field, which makes it akin to a journal (one whose frequency of publication is, moreover, much higher than that of any scholarly journal).
Interactivity: The possibility of putting creators and users of electronic publishing in contact, by means of electronic mail, also makes this type of edition akin to a permanent conference. It is possible, in the long term, that scientific communities (specialists in an author, for example) will gather around great electronic projects, which they would keep alive by publishing the results of their work there.
Cost: The very low cost of putting such an edition online (I except the working time of researchers) makes it possible both to consider undertakings from which the traditional publishers shrink (very large corpora, works of interest only to a few specialists) and to allow researchers to publish, under good working conditions, works that are small in size or that do not fit in the framework of current university editions.
Multi-media: What can be the contribution of multi-media to a scientific work like a critical edition? Everything in this field remains to be invented, because the traditional edition accustomed us to purely textual tools, primarily for reasons of cost. Sound and visual illustrations will bring a very interesting dimension to the literary text (publication of manuscripts, interpretations, iconographic documents, contemporary pieces of music, etc.), from the teaching point of view as in the research field, but one must be wary of the facile effects to which mass-market electronic publishing has accustomed us.
It is all the more urgent to answer these questions as the share of electronic documentation in literary studies is bound to increase, as in the other documentary fields, until it gradually replaces the traditional media. Consequently, the survival of the texts and their formal characteristics will be closely related to the devices which will ensure their transmission, their conservation and their reading.
(4.3.2) The Miguel Cervantes Digital Library: The Hispanic Voice on the WEB
Andrés Pedreño
Universidad de Alicante, Spain
This paper describes the philosophy behind what represents one of the most ambitious projects of its kind ever to have been undertaken in the Spanish-speaking world: The Miguel Cervantes Digital Library (http://cervantesvirtual.com/). It explains the reasons behind its creation, the private-public sector alliance which has made it possible, and the new ground being explored by its creators in terms of innovative application of digital methods and of new services it offers to its audience world-wide.
The Miguel Cervantes Digital Library is the result of a unique collaboration between Alicante University and Spain's biggest bank, the Banco Santander Central Hispano, which have joined forces to create the world's biggest digital library of Spanish-language works. It represents an example of successful partnership between university and business, with the Santander Central Hispano Bank providing complete sponsorship for the full development of the project. The University, on the other hand, provides the academic expertise, technological know-how and qualified workforce necessary to fulfil objectives and ensures international use of the Library's resources by way of collaboration agreements with universities and institutions all over the world. The paper will address the issue of this partnering of academia and private enterprise as a case study of how two vastly different institutions have successfully worked together in the overall management and vision of a large, global project.
The Miguel Cervantes Digital Library hopes to act as inspiration to other non-English speaking cultures to create their own novel digital tools which can be used by a multiracial and multilingual student and academic community of Internet users world-wide. Far from being a static collection of digitised books, the Library is envisaged as a vehicle for the Hispanic academy to promote their works, as a window to Hispanic literature and culture for scholars of Hispanic languages and cultures, and as a voice for the Hispanic university community world-wide. The actual content of the Library reflects this ambition as it includes sections such as:
During the past two years we have learned much from other worthy initiatives that are being undertaken in Spain, Latin America and further afield to digitise material reflecting Hispanic languages, literature and cultures. The author will provide a brief overview of the most notable projects in this area, and will offer a picture of the state of the art as to what techniques are being used to produce electronic resources in the Hispanic community at present and what the future holds.
The final section of the paper will deal with the technical underpinnings of this project at present and in the future. Techniques used for digitisation of texts, manuscripts, images and voices will be discussed, and advances in this area made possible by certain strategic partnerships with universities, libraries and other institutions in Spain and abroad will be acknowledged.
(4.3.3) Textual Variation, Electronic Editions and Hypertext
Edward Vanhoutte
Office for Scholarly Editing and Document Studies (BEB/OSEDS), Belgium
The advent of the electronic paradigm to the field of scholarly editing and textual criticism opens up new possibilities for both the production process and the delivery of products which may herald a new era in scholarly editing. A new practice including text encoding, automated tagging, automatic collation, the use of scripting languages, etc. creates new kinds of editions in which the record of textual variation becomes a central point of attention, both on the markup- and on the delivery-side. In this paper I will address both sides through a problematization of the apparatus criticus or variorum as an essential part of an electronic edition.
The new reality of scholarly editing in which for instance archives become editions and editions include archives (McGann 1996 and Robinson 1996), and hybrid editions come into being as a combination of critical, diplomatic, facsimile and reading editions (De Smedt & Vanhoutte 2000), calls for a new theoretical framework and a thorough critique of the theory and practice of paper-based critical philology. The existing rationales of textual criticism cannot simply be transposed from the hard-copy to the electronic paradigm. This would for instance mean a transposition of the impracticality and illegibility of the apparatus criticus or variorum to a medium which is essentially structured differently. The editor of a paper-based edition tries to design the apparatus as an economic and compact model in which to store textual variety, often through a combination of variants. This more than once results in an insuperable density and thus a malfunctioning of a tool which should above all be transparent and consultable. The apparatus criticus or apparatus variorum of a printed critical edition fails, through its form and formality, in what it intends to do, that is to provide a substitute for and a documentation of each complex source in such a physical form that it is usable for the interested scholar (Vanhoutte 1998). The possibility of including digital facsimiles in an electronic edition discharges the apparatus from the theoretical imperative of being a substitute. The need for documentation of textual variety, however, remains, but in electronic editions, compactness is being overruled by explicitness. On the markup-side, this means the use of a system which can tag every reading as well as genetic commentary, meta-data variation and variation over structural boundaries (Smith 1999).
On the delivery-side, this means the use of a system which can supply the user with the possibility to consult every witness on its own and, optionally, in combination with a suggested orientation text to an acceptable level of granularity.
Over the past couple of years, several gentle solutions have been put into practice, which either function on the markup- or on the delivery-side (corresponding roughly with what Vanhoutte 1999 respectively calls the Archive- and Museum-function), but none of which provides a full answer to this documentation-maxim. Taking the three fundamental requirements of electronic scholarly editions - accessibility, longevity and intellectual integrity (Sperberg-McQueen 1994) - as parameters for an evaluation of possible designs of electronic editions, I will argue that there are at least three sorts of editions in the electronic paradigm: electronic editions, electronic editions with hypertext functionality and hypertext editions which do not meet the requirements of electronic editions. Further, this paper will (re)formulate additions to these requirements, focussing on modern literature.
Because hypertext is the visualization of linking, which DeRose & Van Dam (1999) define as "the ability to express relationships between places in a universe of information", and because links are either explicitly marked or generated automatically by making use of some sort of markup, the syntax of this markup and the markup-language become essential in designing a hypertext and/or an electronic edition.
On the basis of the three fundamental requirements for electronic editions, the syntax of the markup (language) and the orientation towards a markup- or a delivery-side, I distinguish three sorts of editions in the electronic paradigm:
A. hypertext editions:
From B and C it follows that the explicit documentation of textual variation in an apparatus criticus or variorum linked to a base text is no longer a prerequisite for an electronic edition. Textual variation can be documented implicitly by the textual description of each witness or document source. The extraction of alternative views of the witnesses, optionally projected together with one version which functions as an orientation text, can be a possible solution to the problems concerning the instability of a (base) text. This paper will conclude with some notes on the use of XML and XSL in a construct which caters for both the markup- and the display-side.
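The idea of documenting variation implicitly can be illustrated with a minimal Python sketch: each witness is simply transcribed in full, and the variants relative to whichever witness the user chooses as orientation text are derived on demand. The sigla, the sample readings and the function names are hypothetical, and word-level comparison with the standard `difflib` module stands in for a real collation tool; it is not the construct based on XML and XSL that the paper describes.

```python
from difflib import SequenceMatcher

# Hypothetical witnesses: sigla and readings are invented for illustration.
WITNESSES = {
    "A": "the river runs past the old house",
    "B": "the river ran past the old house",
    "C": "the river runs past the house",
}


def variants(orientation, witness):
    """List (orientation reading, witness reading) pairs where the texts differ."""
    a, b = orientation.split(), witness.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string marks an omission (or an addition) on one side.
            pairs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return pairs


def apparatus(orientation_siglum, witnesses):
    """Derive the variants of every other witness against the chosen orientation text."""
    base = witnesses[orientation_siglum]
    return {
        siglum: variants(base, text)
        for siglum, text in witnesses.items()
        if siglum != orientation_siglum
    }


# Any witness can serve as orientation text; no base text is privileged.
for siglum in WITNESSES:
    print(siglum, apparatus(siglum, WITNESSES))
```

Because the apparatus is computed rather than stored, changing the orientation text requires no change to the transcriptions themselves.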
References
DeRose, Steven J. and Van Dam, A. (1999). Document Structure and markup in the FRESS hypertext system. Markup Languages: Theory & Practice. 1/1, 7-32.
De Smedt, Marcel and Vanhoutte, E. (2000). Stijn Streuvels. De teleurgang van den Waterhoek. Elektronisch-kritische editie. KANTL/Amsterdam University Press, Gent/Amsterdam.
Sperberg-McQueen, C.M. (1994). Textual Criticism and the Text Encoding Initiative. Paper presented at MLA '94, San Diego, 1994 <http://www.uic.edu:80/orgs/tei/talks/mla94.html>.
Sperberg-McQueen, C.M. and Burnard, L. (1994). Guidelines for Electronic Text Encoding and Interchange. (P3). Text Encoding Initiative, Chicago and Oxford.
McGann, J. (1996). "The Rationale of HyperText.", TEXT, 9, 11-32.
Robinson, P. (1996). Geoffrey Chaucer. The Wife of Bath's Prologue. Cambridge University Press, Cambridge.
Robinson, P., Burnard, L., Proffitt, M. and Driscoll, M. (1999). Initiatives Towards a Standard Encoding for Manuscript Descriptions. Session at DRH99, London, 1999.
Smith, D. (1999). "Textual Variation and Version Control in the TEI.", Computers and the Humanities, 33, 103-112.
Vanhoutte, E. (1998). "Where is the editor? Resistance in the creation of an electronic critical edition." Paper presented at DRH98, Glasgow, 1998.
(4.4.1) Computing in the School of English and Scottish Language And Literature (SESLL) at the University of Glasgow
Jean G Anderson
University of Glasgow, UK
The STELLA project (Software for Teaching English and Scottish Language and Literature) has served the School of English and Scottish Language and Literature at the University of Glasgow since 1987. We have produced teaching packages and brought research databases into use in teaching. All our programs have been thoroughly tested in the classroom, and have been subjected to comment and detailed criticism by thousands of students, and their teachers, since their first appearance.
Teaching Packages
English Grammar: an Introduction
This revised program takes the university student of English from identifying parts of speech in context through the more complex processes involved in parsing phrases and clauses. The central feature of the teaching package is the exercise. There are five types, from naming parts of speech in context through to parsing phrases and clauses. The grammatical model is Hallidayan, but the primary reference is to the online coursebook. (Professor CJ Kay & Dr J.B. Corbett)
The Basics of English Metre:
This package uses a traditional approach to introduce students to the main conventional metres used in English verse, especially iambic pentameter. It can be used both with students who require a straightforward grounding in the subject and with those who study it in greater depth. (Professor CJ Kay)
ARIES Assisted Revision in English Style:
ARIES is designed to help the user achieve competence in written style. It consists of interactive exercises, each focusing on problem areas in punctuation, grammar and spelling, with explanations and examples. (Professor CJ Kay)
The Essentials of Old English:
These courses teach the rudiments of Old English from scratch. The courses consist of graded sessions, each a combination of different kinds of materials, including gap-filling, parsing, word lists and comprehension exercises. There is substantial help and texts with translations and notes. The current set of programs falls into two parts: Basics - a single series of short exercises designed for beginners in the subject; Plus - a suite of five distinct sets of exercises, designed to support a structured learning-programme whereby students of Old English develop a thorough grounding in the principles of Old English grammar. (Dr JJ Smith)
A Guide to Scottish Literature:
Essays on Scottish Literature, covering the period from 1350 to 1920. The essays are followed by notes, questions and topics for discussion, and are suitable for use in courses from sixth-year studies upwards. The material is divided into three sections: Medieval and Renaissance Literature 1375-1700; Poetry and Fiction 1700-1900; Poetry, Fiction and Drama Since 1920. (Professor DG Gifford)
A Guide to Older Scots:
This package consists of a reference book for students of the Older Scots language. It contains an introduction, an historical outline, a summary of the chronology, information on spelling, pronunciation, vocabulary, grammar and style, and comprehensive reading lists. (Dr JJ Smith)
An Anthology of 16th and Early 17th Century Scots Poetry:
The anthology contains works of twenty of the major Scottish poets of the period. It is linked to online teaching materials for a course in medieval and renaissance Scottish Literature. (T. Van Heijnsbergen)
STELLA also maintains a web site for Scots texts, STARN, the Scots Teaching and Research Network <http://www2.arts.gla.ac.uk/COMET/level2.htm>, and email discussion lists for Scottish Literature and Language <http://www.mailbase.ac.uk/lists/scotlanglit-all/sub.html>.
A Guide to Piers Plowman:
This is a full hypertext edition of the B-text, complete with notes and supplementary materials. It is intended for a spectrum of undergraduate users and provides assistance in areas such as glossing and explanatory notes. (DM O'Brien)
The Historical Thesaurus of English:
This is a major research database project which is creating the first historical thesaurus to be compiled for any of the world's languages, and will include almost the entire recorded vocabulary of English from Old English to the modern period. It will resemble works like Roget's Thesaurus in that words will be arranged according to their meanings rather than listed alphabetically. It will, however, differ from any other thesaurus so far produced by listing obsolete words and obsolete meanings of current words as well as treating contemporary English comprehensively.
(4.4.2) Experience and Expertise: The Humanities Advanced Technology and Information Institute, University of Glasgow
Ann Gow
University of Glasgow, UK
The mission of the Humanities Advanced Technology and Information Institute (HATII) is to encourage actively the use of information technology and information to improve research and teaching in the arts and the humanities.
The Humanities Advanced Technology and Information Institute builds on over 15 years' experience in humanities computing at the University of Glasgow. The current expertise advances research and teaching practices through an expanding academic programme in humanities computing at introductory, honours, and postgraduate level. The undergraduate courses include Introduction to Digitisation for Research and Preservation and Cultural and Heritage Computing.
In addition to supporting collaborative research projects within the Faculty of Arts of the University of Glasgow, the Humanities Advanced Technology and Information Institute manages its own research programme in the area of humanities and heritage computing. Examples of research projects include:
The six main areas of activity in HATII are:
(4.4.3) A Case Study of an Evaluation Questionnaire Concerning the Integration of a Hypermedia Project in University Courses
Liliane Gallet-Blanchard
Marie-Madeleine Martinet
Université de Paris-Sorbonne, France
Summary
The presentation will develop the results of a questionnaire survey following the introduction of hypermedia products in cultural history and humanities computing courses. The experiment concerns the use of a hypermedia CD-ROM on Georgian Cities, developed by the research centre 'Cultures Anglophones et Technologies de l'Information' (CATI) at the Université de Paris-Sorbonne; a prototype was demonstrated in a paper at ACH-ALLC 1999. In the academic year 1999-2000 the CD-ROM is being used for teaching in four universities (Paris-Sorbonne, Nanterre, Lille and Mulhouse), in classes on both eighteenth-century cultural history and humanities computing. Concurrently, a website was created to serve as supporting material for the courses; it contains chapters presenting the CD-ROM, sections on documentation methods, and gateways to relevant websites on the 18th century (in particular 'virtual cities' sites developed by the Universities of Bath and Missouri), on literary resources, and on architectural history. This was part of a strategy, entrusted to the research centre, to develop the teaching of documentation methods and electronic resources in humanities courses.
Presentation
The presentation will consist of a demonstration of the products based on the analysis of the questionnaires, highlighting the sections which were shown by this survey to be significant in terms of learner use of media. Illustrative material such as syllabus outlines, students' worksheets and documents used concurrently will be available, as well as samples of website pages.
The survey
The survey was based on four questionnaires, two for teachers and two for students, in each case one pre-program and one post-program. The questionnaires were worked out on the models recommended by the Oxford CTI (Porter) and by the Open University's Programme on Learner Use of Media (PLUM). The CTI questionnaire was used especially for its format and for its structure, which relates the general background of humanities computing courses to the details of the current experiment. The PLUM recommendations which were particularly relevant were those on alternative conditions of use in the instructional context, and on the interaction between interface and contents, assessing improvements both in approach and in knowledge. The present questionnaires emphasized two main types of question: (1) the integration of IT resources in the syllabus (Deegan), both apart from and including the present CD-ROM; and (2) the interface and navigation procedures of the present CD-ROM and their relation to the contents, focusing on the potential of multimedia presentation for the development of interdisciplinary approaches (Zuern). The survey also included monitoring reports by the teachers, in which they recorded the stages in the students' use of the CD-ROM and the students' methods for resorting to other materials used in conjunction with it (for instance comparison with websites or paper documents, and note-taking).
The results
A detailed analysis of the survey will show the results sorted according to several criteria, e.g. level of studies, or subject. It will also correlate the answers according to previous experience of IT in the curriculum.
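As an illustration of what sorting the results by criterion might involve, the following minimal sketch groups responses by one field and averages a rating per group. It is not the survey's actual analysis: the field names (`level`, `subject`, `prior_it`, `score`) and every value below are hypothetical placeholders, not data from the questionnaires.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical post-program responses; placeholders only, not survey data.
responses = [
    {"level": "undergraduate", "subject": "cultural history", "prior_it": True, "score": 4},
    {"level": "undergraduate", "subject": "humanities computing", "prior_it": False, "score": 3},
    {"level": "postgraduate", "subject": "cultural history", "prior_it": True, "score": 5},
    {"level": "postgraduate", "subject": "humanities computing", "prior_it": False, "score": 2},
]


def mean_score_by(rows, criterion):
    """Group the rows by one criterion and return the mean score per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[criterion]].append(row["score"])
    return {key: mean(values) for key, values in groups.items()}


# Sort the same answers by level of studies, then by previous IT experience.
by_level = mean_score_by(responses, "level")
by_prior_it = mean_score_by(responses, "prior_it")
```

The same function serves each criterion named in the text (level of studies, subject, previous experience of IT), so further breakdowns need only a different field name.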
References
Deegan, Marilyn (2000). 'From innovation to integration.' Villes visite virtuelle. PUPS, Paris. 7-19.
PLUM <http://iet.open.ac.uk/Plum/evaluation/contents.html>
Porter, Sarah, and Lisa McRory (1998). 'Digital Text in Humanities Teaching.' In Lou Burnard, Marilyn Deegan and Harold Short (eds) The Digital Demotic: A Selection of Papers from Digital Resources in the Humanities 1997. King's College, Office for Humanities Communication, London.
Talarico, Kathryn Marie (1999). 'Cyberspace without Tears: Fundamental Approaches to the Uses of Technology in the Classroom.' Literary and Linguistic Computing 14.2 (June 1999): 199-210.
Zuern, John (1999). 'Timelines OnLine: Hypermedia and Information Architecture in the Representation of Intellectual History.' Paper at ACH-ALLC, University of Virginia, 12 June 1999.
(4.4.4) Collaborative Campus: Rhetoric, Technology and Classroom Re-Composition
Lissa Holloway-Attaway
Georgia Institute of Technology, USA
Patricia Worrall
Gainesville College, USA
Angela Mitchell
Laura Andrews
Christy Desmet
University of Georgia, USA
Greg VanHoosier-Carey
Georgia Institute of Technology, USA
Our group proposes to demonstrate the technologies, methodologies, and assessment criteria used to develop a University System of Georgia grant-funded, multi-campus distance education project. The project, initiated in September 1999, links faculty and students on three campuses in a collaborative first year writing course through the use of synchronous and asynchronous Internet-based applications. The three schools participating in the project are uniquely diverse in student population and institutional and disciplinary objectives: Gainesville College (a two-year liberal arts school), the University of Georgia (a four-year liberal arts research university), and Georgia Institute of Technology (a four-year technical/engineering institute). The virtual learning community created with the technologies enables students to conduct group discussions, complete critical reading and writing assignments, and design hypertext research projects. The students' practical experience using these collaborative writing tools is reinforced by the course content: a cultural analysis of scientific and technological rhetorical representations. Using fiction, non-fiction, film, and new media, students explore how cultural perceptions of identity and community are transformed by innovative theories of "progress." In our presentation, we will 1) discuss the project's disciplinary and pedagogical objectives, 2) demonstrate the synchronous and asynchronous collaborative environments used to facilitate group discussion, conduct research, and implement hypertext documents, 3) discuss the methods used to promote the desired collaboration via these tools, and 4) present the preliminary results from our assessment of the course.
Collaborative Campus was motivated by our interest in exploring more efficient and pedagogically effective technologies for hybrid conventional/distance learning education, specifically for technology-infused writing courses. Developed in Fall Semester 1999 and delivered and assessed in Spring and Summer 2000, the project is one that promises to extend recent developments in technology and humanities research. Currently, a significant amount of university interest in educational technology is being directed towards distance education initiatives. This is a mixed blessing for many of us in the computers and writing community. On the one hand, it can mean increased funding for equipment and for teaching initiatives focused on integrating computing technology into current curricula. On the other hand, it often requires vigilance in resisting the delivery-oriented, technology-based format that has become commonplace in many distance education courses. Although many humanities instructors have experience teaching with technology that promotes collaboration, the reality is that much of the current technology is either not technically suited or not sufficiently robust to facilitate distance education writing courses. Furthermore, the relationship between learning and pedagogy in local computer lab classrooms vs. delivery in virtual environments has been insufficiently explored. At present, programs such as Daedalus are not Internet-capable, and text-based environments such as MOOs are not conducive to modes of collaboration beyond informal conversation. In order to conduct truly collaborative computer-intensive writing courses in campus labs and in virtual classrooms, we need both tools and methods suited to this more dispersed teaching arrangement.
Georgia Institute of Technology's School of Literature, Communication, and Culture (LCC) has been committed to researching and developing technologies that will promote collaboration and increase rhetorical skills in both local and distance learning environments. From 1996 to 1999, the Pilot Writing Program in Electronic Environments enabled faculty and students to explore in increasing depth the use of Internet technologies in the humanities. Faculty working in LCC, including those in the New Media Center for Education and Research and in the Information Design and Technology graduate program, worked in the Pilot Program to develop strategies that combined pedagogical interests with technological development. Using the Pilot Program as a model and extending it to include a larger institutional community, the Collaborative Campus project continues to refine the use and design of educational technologies in computer lab and distance learning settings. Although Gainesville College and the University of Georgia had clearly established technologies in the writing classroom as a priority and had technical/pedagogical support services in place for faculty, our project allows for the continuing development of effective tools and teaching methods for computer-intensive education. The project faculty, teaching a diverse student population on each of the University System of Georgia campuses, is positioned to provide a broad spectrum of experience about the effectiveness of educational technologies in humanities courses and to advise others interested in developing interdisciplinary technology-infused curricula.
The collaborative writing technologies utilized by project participants (140 students and 7 faculty) support a variety of rhetorical and pedagogical purposes, while challenging the students to adapt their communication styles to the distinct capabilities of each. TechLINC, our custom-designed synchronous application, is a virtual campus with classrooms, design studios, and "outdoor" meeting spaces that adapts the Palace software, a popular Internet graphical chat environment, to the educational needs of our project. Like much of the traditional collaboration software used in writing courses, TechLINC allows students to conduct and log synchronous discussions; however, whereas most other programs work only on local-area networks, TechLINC is accessible from any computer on the Internet. Thus it facilitates the multi-campus collaboration necessary among project participants. More importantly, TechLINC radically changes the nature of distance collaboration by augmenting text-based discussions with visual and audio resources. TechLINC's ability to interface with web browsers and multimedia Internet applications allows students simultaneously to access web pages or to hear and see streaming audio and video as they conduct their discussions. These capabilities, in turn, make it possible for students, in real time and at a distance, to do joint Internet research, collaboratively develop web pages, or listen to and discuss a recorded news program or interview. They also provide a means for distance collaboration between instructors and students. Instructors teaching with TechLINC each have a virtual office in which they can meet with students to review assignments. Because of TechLINC's ability to interface with web browsers, instructors and students can also jointly examine and discuss assignments uploaded to the Web, or to Web Crossing, our asynchronous application.
Web Crossing allows students to respond in threaded discussions to study questions, post drafts for collaborative peer review, and link to Internet sites that document their research or preview their own hypertext projects. Because Web Crossing is accessible through a web browser and requires no specialized software, it is a very cost-efficient means of providing students with collaborative writing and research opportunities. As a counterpoint to the more informal discussions generated in TechLINC, Web Crossing archives the real-time discussions of the chat application and provides a more structured electronic environment in which to post extended responses to critical reading, writing, and design-related issues.
Successfully integrating technology and teaching is a primary objective for the project participants, and the course content, along with the writing, on-line research, and hypertext design assignments, was developed in conjunction with evaluative criteria to measure student learning using different pedagogic modes suited to each technology. Assessment methodologies, including pre- and post-course surveys, focus groups, and independent writing evaluations by non-participating teaching faculty, will critique a number of areas relevant to collaborative computing technologies and rhetorical studies: synchronous vs. asynchronous communication styles, the rhetorical capabilities of hypertext vs. conventional essay writing, and student attitudes about technology as a writing and research tool. In general, the project is designed to foster among students a more rigorous imperative for discovering how electronic writing is used to communicate disciplinary content, while providing faculty with precise data to measure student success in diverse electronic environments.