Welcome to the African Anaphora Project
This website was initiated with the support of NSF grant BCS 0303447 and is currently supported by NSF grant BCS 0523102 (Ken Safir, Principal Investigator).
The main goal of the African Anaphora (Afranaph) Project, as it is presently constituted, is to develop rich descriptions of a wide range of African languages in order to serve the interests of linguistic research into the nature and distribution of anaphoric effects. Anaphoric readings, in the sense intended here, are those readings where one linguistic form, such as a pronoun, reflexive or reciprocal, refers back to a previously mentioned form in the sentence or in the discourse. It appears that every human language has at least some specialized forms that achieve such effects. This website explores those forms and effects for every African language that native speaker linguist consultants are willing to help us with.
Although our project is informed specifically by the research goals of generative grammar, it is our intention to make the data we collect as accessible as possible to any linguist with an interest in these languages or more general issues in crosslinguistic comparison. The data we present is collected on the basis of complex and comprehensive questionnaires that are to be filled out by native speaker linguist consultants. This project has become more feasible at this point in history not only because there are an unprecedented number of trained African linguists who are potential participants in our project, but also because the resources of the web and the internet make it possible for more efficient remote participation. Our method begins with a questionnaire (and there will be more to follow), which has been developed to explore every domain in which at least some language has been known to use specialized forms to achieve anaphoric readings.
Our website also offers a means of attracting the participation of otherwise isolated scholars who have much to offer, while providing useful training for our interns and graduate assistants who help to run the site (see Staff). In the course of our operations, we hope that the network of consultants and researchers our project brings together will make it possible to explore other areas of grammar (outside anaphora) by the same means and with the same network, creating, in effect, a community space for research into African languages.
The languages for which we have collected data are arranged on separate pages in this site called "case files". The case file for each language in our study has the data elicited by our standard questionnaire, supplemented by follow-up elicitations designed to explore areas that are richer in the language in question than the same area may be in some other language. In addition, each case file has a short grammar sketch of the language, a bibliography, and a sketch of the anaphora system (highlighting aspects of anaphora in the language that appear to raise interesting theoretical or analytic issues). In the future we hope that each case file will also contain audio files, papers available on line, and responses to additional questionnaires on other topics. Those interested in going directly to any particular case file can simply click on the listed case files in the left margin, but users are encouraged to look at About the Case Files first, where certain conditions on the use of case file materials are discussed.
Interested users can consult the Anaphora Questionnaire (AQ) to see our basic field elicitation document. A fuller description of the goals of our project is available in the form of a revised version of the Project Statement originally submitted to NSF. Information about our Elicitation of Primary Data, how to Contact Us, how to Become a Consultant, and our Plans for the Future of the site is available below.
Our History and Prospects
The work of the African Anaphora Project (AAP) was initially supported by NSF grant BCS 0303447, a small, one-year exploratory grant (2003) that was renewed for 2004. This preliminary support enabled us to explore the potential that our method of elicitation and our website resources might hold for future research. Because our funds were limited, we restricted ourselves to developing only five case files in order to hone our methods, our mode of operation and the presentation of data on our site. The future of the project, however, depended on further funds, both to extend our investigations to the widest possible range of languages and to improve the information technology that supports our site, so that participation in our project and access to our results could be made as easy as possible. In March 2005, we introduced this, our provisional website.
In the summer of 2005, NSF approved new, broader funding for our project, and we are now in the process of expanding our research to additional languages (see Become a Consultant) and developing the site technology (see Plans for the Future). It is not clear whether the introduction of new technology will result in a reconstituted website that is discontinuous with this one, or if our website and operations will evolve incrementally until we have introduced most of the features we envision. However, until all our completed case files are visible and easy for visitors to use, we will not have realized our goals. On the other hand, the resources in place assure that our project has the potential for a long future ahead of it.
The Elicitation of Primary Data
Our elicitations are designed to be highly detailed and specific to portions of grammars that can be compared across a wide range of languages for the purpose of theoretical exploration.
Our theoretical bias is toward nativist accounts of language competence, hence we posit a universally available language-forming capacity in human beings that operates no differently in Africa than it does anywhere else. From this point of view, even if the world's languages vary enormously, thanks to relatively small formal differences with large effects, or because lexicons and phonological forms must differ, there is still a core plan common to all the world's languages that can most profitably be studied by closely examining how they vary. Some impossibilities, for example, are common to all the world's languages, hence we are just as interested in what linguistic forms are not possible as we are in what forms are possible. Since our goal is not pedagogy, we may explore constructions that many speakers consider marginal, especially if they sharply distinguish the marginal ones from similar constructions that are completely unacceptable. We fully hope and intend, however, that researchers with goals different from ours may find our data pertinent and accessible.
An important aspect of our methodology is that our questionnaires are designed to be filled out by consultants who are native speakers trained as linguists (at more than rudimentary levels of training). At this point in history, there is an ever-widening pool of potential consultants who fit this description, and we have recently begun to recontact many of those who volunteered earlier but whom we had to turn away given our limited resources.
Elicitations only begin with the questionnaire. After the questionnaire has been filled out, there ensues a follow-up process with a fair amount of back and forth to clarify the responses and questions. Additional follow-up questions are then asked to explore issues and constructions in greater depth, for example in those domains where one language has a more articulated set of distinctions than others. Thus the presentation of the data will contain more than just the answers to the questionnaire to be found in every language case; there will also be emphasized points of interest with additional data, as our resources permit.
When the elicitations are complete, a sketch of the results is written (by the project director, the consultant, or a collaboration between them and/or other consultants). It is part of our plan for the future that questionnaires will be available to be filled out directly online, but this technology is still under development.
Our anaphora questionnaire (AQ) is designed with every known language in mind and, in particular, with our current understanding of linguistic anaphora. Thus the AQ is in no way customized to the effects present in any particular language, but instead addresses as many interesting questions as we are aware of for the domain of grammar we are studying. The design of the questionnaires will naturally reflect the perspectives of, and serve the research purposes of, those who design them, but as the project develops, questionnaires may be developed by researchers with different perspectives and interests, including our native speaker linguist consultants. Naturally, we expect that our questionnaires will be revised as commentary and experience show their weaknesses. Additional questionnaires on other aspects of grammar are part of our plan for the future.
Making Our Results Accessible
It is our goal to make our thick descriptions as accessible as possible to researchers of a wide variety of theoretical perspectives, as well as to those interested in using some of our results for pedagogical purposes. The presentation of the data in an AQ response (AQR) will closely follow the format of our elicitation questionnaires, which will be as uniform as possible across languages. This uniformity will permit close comparisons, but every AQR is laced with commentary by our consultants and so each one takes on an individual character in terms of how the analytic comments are introduced. When our IT platform grows more sophisticated, we expect comparisons across AQRs to be available with a click or two by means of functions available in a toolbox. The toolbox, however, is still in the works.
It is our intention that, as much as possible, our data should be available through open source applications, especially since many of our participants and site users may lack the resources to use proprietary products. Our files are currently available in .pdf form (read only), which can be read with Adobe Acrobat Reader, a program available free of cost on the internet from the manufacturer. If you want to write a .pdf file, you have several options: Adobe Acrobat and Mac OS X (PDF functionality is built in and described here) are commercial solutions; free solutions include the Windows software listed here and the cross-platform GhostScript software. For those who are interested in participating, manipulable versions of the questionnaire are available (see Become a Consultant). In the meantime, if you have any difficulties accessing our materials, please Contact Us. However, when the software template for our AQRs is functional, no text editing program will be necessary to see our data online.
Become a Consultant
Our method begins with native speaker linguists who are willing to fill out the AQ to the best of their ability and to participate with us in follow-up questions. Although it is possible and anticipated that some linguists may choose not to reveal their identities in order to participate in the project, we find that most of our consultants do wish to be known, and some of these consultants may be asked to contribute to some of the other aspects of our files, including the grammar sketch, the anaphora sketch, the translated tale (with morpheme breakdowns and glosses), the bibliography and so forth. Moreover, we expect to pay our consultants for various levels of participation, but our funds are limited, so we can only work on so many cases at a time.
Although the initial elicitation is our pre-prepared AQ, the process of developing the file is an interactive one between our consultants and our project workers (including the principal investigator, his graduate assistant, and additional caseworkers that are occasionally employed). If consultants are inspired to use any of our data in their work on their language, they are encouraged to post their work on our site within the relevant case file or in our technical reports, a feature we hope to develop (see our Plans for the Future).
However, as stated above, our method begins with the AQ, and we encourage those considering participation in the project to explore the AQ first. This document is long and detailed and most potential consultants will discover that a great deal is requested of them. While linguistic training is required, the main effects of that training that are pertinent to filling out the AQ are that you are sensitive to matters of grammar in your language, that you are accustomed to some basic terminology, and that you have a sense of what a linguist might be interested in knowing. Some people have this latter sense with little training, but the AQ is not designed with naive native speakers in mind, and moreover, the level of commitment required is more typical of those who have a professional interest in matters of grammar.
If a review of the scope of the AQ does not discourage you, then we urge you to contact us at afranaph@rci.rutgers.edu and tell us a bit about yourself, including a brief account of your linguistic training, the language(s) you speak natively, your affiliation, if you have one, and how we may most conveniently maintain contact with you (e.g., do you have regular internet access, intermittent internet access, any level of web access, access to printers, or do you need to send and receive hard copy by mail or some other means). As much as our project is designed to exploit the modern conveniences of computers and the web, we are also committed to facilitating data collection consistent with our goals in whatever way is expedient. For example, we are prepared to transcribe handwritten materials sent to us by our consultants if that is the only way to gain access to what you know, and we have done so in the past.
Some of you may consider collecting data for us by administering the AQ to others who are native speakers of the subject language. We have little experience of this so far, but we hope that the experiences of those who are willing to try such an experiment will instruct us on the viability of such projects. We anticipate considerable complications in the follow-up phase, as we would need to contact the same speakers with additional questions through the mediation of the data collector. However, we want to be as flexible as possible, especially in cases where there is no native speaker linguist we can count on, and even more so if the language in question is in any danger of extinction.
In addition to the .pdf version of the AQ mentioned above, which is read only, those who would like a version that they can manipulate and respond to with interlinear commentary and examples should download the .doc version, which can be read by MS Word. If you do not have MS Word, please let us know and we will do our best to make a questionnaire available to you, perhaps even, if the postal service in your country is efficient, with a hard copy sent by snail mail. We also have a version of the AQ in Corel WordPerfect.
It is our hope that the network of consultants and researchers that our project will bring together will make it possible to explore other areas of grammar (outside anaphora) by the same means and with the same network, creating, in effect, a community space for research into African languages.
About the Case Files
The materials available in the case files are the property of the African Anaphora Project (AAP) and are not to be published by anyone without the consent of the AAP director (currently the principal investigator of the NSF grant that supports the site). This right is not intended to discourage the use of any of our materials as long as proper standards of attribution are respected by those who employ our data in their published work. In other words, we expect, and actively hope, that researchers will lift portions of our data from the case files for the purpose of making analytic or theoretical arguments, but wholesale reproduction of our data without analysis or attribution is prohibited. The authors of the grammar sketches and the anaphora sketches and any accompanying articles that appear in a case file retain the right to publish these materials elsewhere without consulting the AAP director, but such articles and sketches are not to be excerpted unless the usual standards of attribution are met.
Not all of the case files currently on the site are complete, but we do not open a file unless we have at least a fairly complete AQR. We tend to stop working on a file for the time being once we have a complete AQR, a tale, an anaphora sketch and a grammar sketch. Sometimes a selected bibliography is also present as part of the grammar sketch or in a separate file. On most case file pages there is a heading About this file which contains pertinent information about the state and conditions of data collection, when we last changed something, and anything else that might be pertinent to that file. If a file is protected by password, that means it is not ready to be displayed because the work is too incomplete, but even these files can be accessed if you contact us and ask for permission to see them. We hope that researchers who encounter problems accessing the files on our site or those who have ideas about how our project, or our case files, could play a more effective role for the research community will send their comments to us (contact us).
We expect that the case files will undergo periodic revision, and any change in a case file that has been posted will result in a new version number, such as "1.0" followed by "1.1", if the revision is minor, and by "2.0" if the revision is considerable or the last version was "1.9". Citations should respect these version numbers. We will archive old versions of the files. This is a precaution that we observe for instances where a researcher cites a judgment in 1.0 that may be revised in 1.1 on the basis of convincing new evidence. No one looking at 1.1 would be able to recover the source of the researcher's published judgment if this precaution is not observed. Naturally, we hope to avoid such issues by getting things right the first time, but if we fail we want to be sure that the reputations of well-meaning researchers are protected from our errors.
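The numbering rule just described can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not a description of any software the project uses; the function name next_version and the "major.minor" string format are assumptions made for the sketch.

```python
def next_version(current: str, major_revision: bool) -> str:
    """Return the version number that follows `current`.

    `current` is assumed to be a string of the form "major.minor",
    e.g. "1.0". A considerable revision, or a minor revision to a
    file already at ".9", moves to the next whole number.
    """
    major_text, minor_text = current.split(".")
    major, minor = int(major_text), int(minor_text)

    if major_revision or minor >= 9:
        # Considerable revision, or the last version was "x.9".
        return f"{major + 1}.0"

    # Minor revision: only the second number advances.
    return f"{major}.{minor + 1}"
```

Under this sketch, a minor revision to "1.0" yields "1.1", while either a considerable revision or a minor revision to "1.9" yields "2.0".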
Plans for the Future (and work in progress)
The current scope and configuration of our project and our web site are in line with the nature of our funding up to now and the limited resources that we have been able to call upon, but with the additional funding that has recently been granted by NSF, there are some natural extensions of our current project that we hope to pursue as aggressively as we can.
Data gathering
-
We are always on the lookout for those who might be willing to participate in our project as native speaker linguist consultants. Some hear of our project through the occasional e-mail solicitations we send out and others have met our representatives at conferences, but many consultants have come to us through personal contact or word of mouth. This has so far been enough to keep us busy, but we expect to make greater efforts to publicize our project in the future with a view toward attracting new consultants.
-
We intend to make it possible for consultants to fill out the questionnaires online with code-protected access that allows them to work gradually through the questions.
-
We plan to develop a French-language version of the AQ, which will facilitate elicitations for those whose second language is French.
-
We will devise new questionnaires for other aspects of anaphora (a logophoricity questionnaire developed by O. Adesola and Ken Safir will be ready in 09/05), but we will also commission questionnaires that explore other well-studied, compact empirical domains that have been known to vary in interesting ways, such as the nature of questions or the nature of specialized focus constructions, in the languages that have them.
-
We will develop cooperative networks of researchers that permit participants in the site, including our linguist consultants, to initiate empirical investigations broader than their native language(s). In other words, we expect our research platform to be flexible enough to serve the interests of anyone who has a good idea about what can insightfully be investigated using our resources, including phonological or semantic phenomena.
-
It seems a long way off just now, but we hope that audio files will eventually be added to our case files. In addition to important information about intonation that can influence anaphoric interpretation or the acceptability of phrases, well-developed audio files may also be a resource for phonologists interested in some of the languages in our case files.
Researcher Network Formation
-
In addition to the listserve that we will develop on our current site, we hope to eventually open a chat room for the members of the community we serve, in order to facilitate interaction and draw together researchers with common interests.
-
As our usership grows, we will have a bulletin board space and a newsletter space reporting what's new on the site and who is up to what.
-
We hope to develop a software library of open source materials that our users can download for their projects, including tree diagram programs, fonts, other forms of graphic representation, etc., and perhaps more ambitious analytic tools for speech recognition and data analysis. Participants in the project would make themselves available as references so that someone unfamiliar with a particular program could ask the listed reference person for advice on how to download it, install it and use it.
-
For each case file, we will develop a list of linguist consultants who are willing to be contacted by other researchers with questions about their language. In this way it may eventually be possible to have several consultants available for a novel project on a language-specific basis. This will also serve to form links between researchers on a given language, organized around the development of the case file for the language they speak.
Data Presentation
This is the domain that will be most affected by increased IT resources that until recently were beyond our reach. The template described below is in development.
-
The same template that is used for data entry by consultants on line will become the matrix for data presentation, such that a completed questionnaire can be almost instantly transformed to the searchable, readable, manipulable document that is seen online.
-
A search and comparison toolbox will be available that will permit our users to find and/or compare examples, glosses, translations, questionnaire section numbers and the like.
-
All characters and representations will be presented in a format that can be read on any computer, perhaps by means of an open source program that users can download for this purpose, perhaps by tools built into the site, or both.
-
All data representation will be made consistent with established data-sharing protocols, such as those suggested by the Linguistic Data Consortium. For example, we want the data we collect to be converted into an XML data format, a standard method of data representation that will allow the data to be used by tools developed in later phases of the project or by independent researchers. Similarly, bibliographic entries will be made consistent with existing protocols.
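As a rough illustration of what conversion to XML might involve, the following Python sketch wraps a single questionnaire example in an XML record using the standard library. The element and attribute names (example, language, section, text, gloss, translation) are hypothetical, chosen only to mirror the kinds of fields mentioned above; they are not a schema adopted by the project or recommended by the Linguistic Data Consortium, and the values shown are placeholders rather than real data.

```python
import xml.etree.ElementTree as ET


def build_example_record(language, section, text, gloss, translation):
    """Build one hypothetical <example> element for a questionnaire item."""
    record = ET.Element(
        "example", attrib={"language": language, "section": section}
    )
    # Each field becomes a child element so tools can search it separately.
    ET.SubElement(record, "text").text = text
    ET.SubElement(record, "gloss").text = gloss
    ET.SubElement(record, "translation").text = translation
    return record


def to_xml_string(record):
    """Serialize the record to a Unicode XML string."""
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values only; no real language data is shown here.
    record = build_example_record(
        language="LANGUAGE NAME",
        section="SECTION NUMBER",
        text="ORIGINAL SENTENCE",
        gloss="MORPHEME-BY-MORPHEME GLOSS",
        translation="ENGLISH TRANSLATION",
    )
    print(to_xml_string(record))
```

Keeping the example text, gloss and translation in separate elements is one way a search and comparison tool could later retrieve or compare them field by field.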
Additional Features
-
As our project grows and new work draws on our database, occasional papers will be published on the site as a series of technical reports connected with our project.
-
Useful links to other web sites will appear in appropriate places throughout the site.