Computational Analysis of Feature Distributions in Image-Text Compositions in Digital Collections – Svenja Lange – Ph.D. Fellow

Personal Background and Interests

As a PhD fellow supervised by Prof. Weddigen, I work with digitized art historical collections that combine image and text components with metadata. In line with Digital Visual Studies and its research areas of visual and textual computing and the spatial analysis of image layouts, my project centers on image-text compositions. While staying at the Bibliotheca Hertziana to familiarize myself with the collections, Digital Humanities endeavors, and art historical concepts, I developed a research concept that combines automated research on digital collections with art historical and linguistic research interests.
During my B.A., in which I specialized in linguistics and media studies, including studies of multimodality, I became aware of the potential of quantitative and computational approaches to complex data, as well as of the entangled definitions and limitations that can come with manual analysis of such data. For this reason, I decided to pursue an M.Sc. in computer science to gain a better foundation in programming and algorithmic thinking, specializing in media informatics and in AI and machine learning. Before joining DVS, I worked as a software developer for media digitization workflows and as a research assistant in the library of the Max Planck Institute for Mathematics in the Sciences, where I developed computer vision applications for scanned documents. This work included Document Layout Analysis, which is especially relevant to my current project.
My current focus lies in computational humanities and automated research on digital collections. Computational approaches in the humanities are gaining significance as large collections of digital data become available. With my research, I would like to improve the understanding of the potential of these approaches with respect to collections and to gain statistical insights into the roles of features and records in the relevant fields of research.


PhD Project – Distributional Analysis of Features and Metadata in Digital Collections with Image-Text Compositions


Multimodal data has been analyzed across disciplines to gain insights into meaningful interrelation structures across assumed signs or semiotic artifacts. In my PhD project, I plan to develop a computational approach to the analysis of interrelation structures in image-text compositions. For this purpose, I will combine computer vision and Natural Language Processing methodology with a background in art history and in studies of multimodality and semiotics. Early modern emblems are an interesting case study in this respect because they offer an allegorical composition of text (a motto and a description) with an image. Both visual and textual features of this data will be analyzed with regard to the contexts in which they appear and their distributions within and across collections.
Furthermore, metadata units are taken into account, which provide a fruitful basis for research in data-relevant fields. The project will range from the development of a feature extraction framework and method that is both general and adaptable to the nature of a defined variety of image-text compounds, through the analysis of the feature distributions, to visualizations of the results.

The objectives of this study include a statistical research basis relevant to the fields connected to the collections and to automated research on digital collections itself. Furthermore, the study may reveal patterns in the computational understanding of image-text compositions. A combination of state-of-the-art machine learning and more transparent traditional computational methods can further provide insights into relevance metrics and into some of the potential of a computational research discipline connected to these fields.

Data Scopes
The project involves the definition of a data scope, which must consider the relevance of computer vision features across visual compositions, the availability of data, and further constraints such as language style and historical focus and range. In this respect, the study of early modern emblems can involve the analysis of print media with a drawn appearance in allegorical, historical, multicultural and multilingual contexts and with a wide variety of visual and textual appearances. This implies a complex range of data that requires normalization techniques for cross-collection studies, although some computational steps may be simplified by the visual appearance and allegorical composition of the emblems. These properties also make their study interesting in terms of symbolism and iconography: a more statistical account of the distributions of certain symbolisms can be achieved, since a given visual appearance or text (a signifier) does not necessarily refer to a single meaning (signified) and needs to be studied in its setting.
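One simple form of normalization for cross-collection comparison can be sketched as follows. This is an illustration only, not the project's method: the collection names, the motif labels, and the helper `relative_frequencies` are hypothetical. The sketch converts raw feature counts into relative frequencies so that collections of different sizes become comparable.

```python
from collections import Counter


def relative_frequencies(records):
    """Map each feature label to its share of all feature occurrences.

    `records` is a list of emblems, each given as a list of feature labels.
    """
    counts = Counter(label for record in records for label in record)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Hypothetical toy data: two collections of different size.
collections = {
    "collection_a": [["anchor", "dolphin"], ["anchor"], ["skull"]],
    "collection_b": [["anchor"], ["skull", "hourglass"]],
}

normalized = {
    name: relative_frequencies(records)
    for name, records in collections.items()
}

for name, freqs in normalized.items():
    print(name, freqs)
```

Relative frequencies only correct for collection size. Differences in language, spelling, or annotation practice would need separate normalization steps.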

Feature Scopes
With respect to this study, visual features are defined in the scope of layout arrangements and components, while text features are defined via automatic annotation linkages at several levels. The analysis of interrelation structures will also take place at several levels, ranging from rather direct annotation linkages within emblems to further contexts and interrelations within collections (and possibly across sub-collections). The study of interrelations between features and metadata can be seen in the context of relevance (e.g. the study of the relevance of certain features and further metadata annotations within a collection and across fields). At the same time, the metadata annotations can enhance the quality of this statistical research, provided that biases are considered.
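One possible way to quantify the relevance of a feature for a metadata value is pointwise mutual information (PMI). The sketch below is an assumption for illustration, not the measure chosen by the project: the motif labels, the decade metadata, and the helper `pmi_table` are hypothetical. PMI compares how often a feature and a metadata value occur together with how often they would co-occur if they were independent.

```python
import math
from collections import Counter


def pmi_table(records):
    """Compute PMI (base 2) for each observed (feature, metadata) pair.

    `records` is a list of (features, metadata_value) tuples, one per emblem.
    Pairs that never co-occur are omitted from the result.
    """
    n = len(records)
    feature_counts = Counter()
    meta_counts = Counter()
    joint_counts = Counter()
    for features, meta in records:
        meta_counts[meta] += 1
        for feature in set(features):
            feature_counts[feature] += 1
            joint_counts[(feature, meta)] += 1
    return {
        (f, m): math.log2(
            (c / n) / ((feature_counts[f] / n) * (meta_counts[m] / n))
        )
        for (f, m), c in joint_counts.items()
    }


# Hypothetical toy data: detected motifs and a decade taken from metadata.
records = [
    ({"anchor"}, "1550s"),
    ({"anchor", "skull"}, "1550s"),
    ({"skull"}, "1600s"),
    ({"skull"}, "1600s"),
]

scores = pmi_table(records)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

A positive score indicates that a feature appears with a metadata value more often than independence would predict, a negative score less often. PMI is unstable for rare features, so minimum counts or smoothing would be needed on real data, and skewed metadata coverage can bias the scores.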