The Design of the Database
As introduced in the section Context, there are various types of data – catholic institutions and people – in the almanacs. What’s more, these data are interconnected with each other: in a specific year, for example, a church may be attending a few other smaller churches, while having multiple priests residing in it. In other words, as shown in the picture below, there are a lot of many-to-many relations between catholic institutions (for the sake of simplicity, I named them “church”) and catholic institutions, as well as between catholic institutions and people. Plus, this section will be updated once I finish redesigning the table structure on the backend.
Besides, it is noteworthy that the relationship between catholic institutions and catholic institutions, as well as between catholic institutions and people, can change over time. In the following demonstrative chart, we can see that compared with the year 1869, church001 no longer attends church002 in 1870, and person002 switches to church002 instead of staying in church001. Similar changes in how institutions and people are related to each other take place very frequently in the almanacs.
Therefore, in order not only to track and search for the evolvement of metadata (for example, how a catholic church’s name changes over time) but also the evolvement of the relationship between different categories of data, a database is needed to store the metadata and the relationship between metadata. This chart illustrates how I designed the tables:
The table “church” records the unique, immutable institution ID (instID) for each catholic institution. The table “churchInYear” corresponds to all the potentially mutable information of a specific institution in a specific year – its name, language, type, diocese, place, etc. Similarly, the table “person” records the unique, immutable person ID (persID) for each person, and “personInYear” has the information of a person’s name, title, suffix, etc. in a specific year. “churchChurch” and “churchPerson” tables are middle tables indicating church-church relationships and church-person relationships in specific years. Note that several fields like uniqueInstID, uniquePersID, and uniqueAttendingInstID do not belong to the original data from the almanacs but are autogenerated in the data preprocess step. These fields are added to facilitate the association of tables.
With all these tables created and associated, the database records the information of catholic institutions, people, and the relationship between them in different years.
Click here to see how I used different web frameworks to build this database.