Structuring a database is not an easy task. During this year of work, we have faced many challenges that demanded considerable intellectual effort and reflection. Nevertheless, I have heard from “digital humanists” and programmers that, because we have a software developer, we are not making the database: someone else is doing it for us. The underlying argument is that we need knowledge of basic principles of programming, such as HTML and CSS, to claim authorship in the making process. Having those programming skills today is helpful. However, the fact that our participation in the programming is limited does not mean we are not the main creators of the database. This post shows some of the main challenges that make us, the historians, crucial for this type of project, and it is, in part, an answer to technocratic points of view on the relationship between historians and software developers.
First, the concept of the project, databasing baptismal records, is ours. This project is not something that anyone could have imagined without the proper historical training. You need to know about the sources, their internal logic, the institutions that produced them, paleography, and the relevant languages. It is important to decide which fields can be extracted from the sources without violating the integrity of the documents. We have to respect historical concepts and to know that their meanings changed over time. We decided how to organize the fields in a coherent and hierarchical way. We need to translate our needs to programmers without historical training. We, the historians, are the most important actors. Thus, HTML and CSS play a minor role in conceiving the idea. The development stage is crucial, but it should not be confused with the first step. This assertion holds for those cases where social scientists rely on programmers to materialize their projects.
We had important elements in our favor when we started this project. First, the digitized copies of the original documents are available online. The project “Ecclesiastical & Secular Sources for Slave Societies” (ESSSS) has digitized and posted online parish records from Colombia, Brazil, Cuba, and Florida. Without this amazing repository, our database would have been impossible. In addition, these baptismal records are geographically, linguistically, and temporally diverse but, due to the centralized nature of the Catholic Church, they are also homogeneous sources, regardless of language, period, and region. This circumstance makes them the perfect candidate for building a transnational standardized database. It also makes it feasible to move the data from the digitized documents to an accessible, searchable, malleable, and “cleaner” digital format. It sounds easier than it is, though.
Defining the categories or fields that will be in the search tool is definitely challenging. Even though the documents are homogeneous, new information often shows up, and we need to decide whether it deserves an individual field. Databases must have a limited universe of regular fields to be functional. We restricted our variables to those that regularly appear in the documents; those that do not show up frequently are included in the field “Miscellaneous.” Deciding the fields is not the only challenge. Naming the fields is another difficult step. Take the example of race and ethnicity. Categories, language, and meanings of race differ over time and by region. For instance, there are sometimes equatable categories of race in the Portuguese-speaking and Spanish-speaking worlds. English-speaking regions have had different definitions of race. In both cases, race categories are subject to change over time. We do not want to violate the documents; thus, we kept race as it appears in the sources, including the original language. Something similar happens with African ethnic designations in the Americas. Across different regions, African origins are defined in every document as nations. We keep the term “nation” as it appears in the document, although sometimes these categories do not represent an ethnic identity that carried meaning in an African context. These decisions were reached after long discussions and after reading the most important historiography on the topic. There is always ample room for disagreement. The next post will discuss how we structured the fields in a relational diagram.
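As a rough illustration of the field decisions described above, here is a minimal Python sketch. It is not our actual schema: the field names (`name`, `baptism_date`, `race`, `nation`), the `build_record` helper, and the sample values are all invented for this example. It only shows the two principles at work: a limited set of regular fields with everything else going to “Miscellaneous,” and race and nation stored exactly as written in the source, together with the language of the document.

```python
from dataclasses import dataclass, field

# Hypothetical list of regular fields; the project's real field list is not given here.
REGULAR_FIELDS = ("name", "baptism_date", "race", "nation")


@dataclass
class BaptismRecord:
    name: str = ""
    baptism_date: str = ""
    race: str = ""             # kept verbatim, in the original language of the source
    nation: str = ""           # kept verbatim, as the document names it
    source_language: str = ""  # language of the source document, e.g. "pt" or "es"
    miscellaneous: dict = field(default_factory=dict)  # infrequent information


def build_record(transcribed: dict, source_language: str) -> BaptismRecord:
    """Place regular fields in their own slots and everything else in Miscellaneous."""
    record = BaptismRecord(source_language=source_language)
    for key, value in transcribed.items():
        if key in REGULAR_FIELDS:
            setattr(record, key, value)        # stored as transcribed, not normalized
        else:
            record.miscellaneous[key] = value  # rare item: no dedicated field
    return record


# Invented sample entry, for illustration only.
example = build_record(
    {"name": "Maria", "race": "parda", "nation": "Mina", "godparent_occupation": "sailor"},
    source_language="pt",
)
```

In this sketch the term “parda” is never translated or mapped to a category from another language; any comparison across regions would have to be a separate, explicit interpretive step rather than something built into the record itself.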