
Workshop: Digitale Editionen der Zeitgeschichte zwischen KI und Linked Open Data

November 7, 2025, 20:43

Date: December 4–5, 2025
Venue: Kommission für Geschichte des Parlamentarismus und der politischen Parteien (KGParl), Schiffbauerdamm 40, Berlin
Organizer: Kommission für Geschichte des Parlamentarismus und der politischen Parteien (KGParl)

Description

On December 4 and 5, 2025, the workshop “Digitale Editionen der Zeitgeschichte zwischen KI und Linked Open Data” (Digital Editions of Contemporary History between AI and Linked Open Data) will take place at the offices of the Kommission für Geschichte des Parlamentarismus und der politischen Parteien (KGParl) in Berlin.

The workshop is devoted to the current challenges and opportunities that the use of artificial intelligence (AI), large language models (LLMs), and Linked Open Data (LOD) brings to digital editions of historical sources. The focus is on contemporary-history materials of a political, administrative, and diplomatic character – such as parliamentary sources, cabinet records, and regulations.

The event aims to shed light on the potential and limits of new technologies in digital scholarly editing and to discuss how they affect editorial standards, workflows, and scholarly use. At its center is the question of which methodological and technical innovations are suited to making digital editions sustainable in the long term, interoperable, and connectable to scholarly infrastructures.

The workshop brings together international experts from scholarly editing, the digital humanities, and computer science to develop perspectives for the future of editorial work in the context of AI-supported analysis and semantically linked data.


Program

Thursday, December 4, 2025

13:00–13:30 – Welcome (KGParl): greetings and introduction

13:30–14:30 – Max-Ferdinand Zeterberg (SUB Göttingen): A digital edition based on a labeled property graph

14:30–15:30 – Stephan Kurz (Austrian Academy of Sciences, Vienna): We’re catching the calendars! Data modeling for the exchange of (minutes) editions

15:30–16:00 – Break

16:00–17:00 – Michael Schonhardt (Technische Universität Darmstadt): “This request violates the content guidelines”: LLMs and RAG in editorial practice

17:00–18:00 – Daniela Schulz (Herzog August Bibliothek Wolfenbüttel): AI and the “Edition der fränkischen Herrschererlasse” – why not (yet)!

Friday, December 5, 2025

09:00–10:00 – Hennie Brugman (KNAW Humanities Center, Amsterdam): Publication of historical parliamentary resolutions using automatic text recognition and modern web standards

10:00–11:00 – Dimitra Grigoriou (Austrian Academy of Sciences, Vienna): Overcoming Historical NER Challenges: A Case Study of the Austrian Ministerial Council Protocols

11:00–11:30 – Break

11:30–12:30 – Monika Jantsch, Peter Land (Deutscher Bundestag): The documentation and information system for parliamentary materials and its API

12:30–13:30 – Maximilian Kruse (KGParl): Open by Default? Why many digital editions are not as open as they seem


Contact:
Kommission für Geschichte des Parlamentarismus und der politischen Parteien (KGParl)
Email: kruse@kgparl.de, juengerkes@kgparl.de

Note:
The workshop will also be streamed via Zoom; remote participation is possible.
Please register by email to kruse@kgparl.de.

________________

Workshop: Bring your own Linked Open Data (Mainz, December 3–4)

October 27, 2025, 21:03

Do you work with RDF data (Linked Open Data), or do you use Wikidata or the Wikibase ecosystem? Would you like to work hands-on with your own data and get support while doing so?

Then come to the next Bring Your Own Data Lab (BYODL) of the HERMES | Humanities Education in Research, Data, and Methods data competence center at the DH Lab of the Leibniz Institute of European History (IEG) in Mainz! This time everything revolves around Linked Open Data (LOD) in the humanities and cultural studies. Two intensive, practice-oriented days await you, with peer feedback and targeted support from the experts Martina Trognitz and Christian Erlinger – and best of all: you will work directly on your own data!

The essentials at a glance:

📅 December 3–4, 2025
📍 In-person workshop: Leibniz Institute of European History (IEG), Mainz

What to expect:
🛠 Hands-on sessions: practical work with your own datasets
🔎 Deep dives into specific LOD topics and use cases
🤝 Collegial exchange, peer feedback, and targeted expert support

Experts: Martina Trognitz and Christian Erlinger

Further information:

The workshop assumes basic knowledge of Linked Open Data and offers an in-depth treatment of specific topics and use cases in the humanities and cultural studies. It covers:

  • Linked Open Data in research practice
  • Wikibase for research data and questions …
  • Practical work with tools and queries in Wikidata

Following a series of impulse talks, the invited experts will support participants in applying Linked Open Data to their own research data and research questions, using the Wikidata platform as an example.
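
For a flavor of the kind of Wikidata queries such a session might build toward, here is a minimal sketch in Python using SPARQLWrapper; the query itself, the entity ID wd:Q31855 (“research institute”), and the property wdt:P31 (“instance of”) are illustrative examples, not workshop materials.

```python
# A minimal sketch (not workshop material): querying Wikidata's public
# SPARQL endpoint with SPARQLWrapper. Entity/property IDs are examples.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper(
    "https://query.wikidata.org/sparql",
    agent="byodl-example/0.1 (workshop illustration)",
)
endpoint.setQuery("""
SELECT ?institute ?instituteLabel WHERE {
  ?institute wdt:P31 wd:Q31855 .   # instance of: research institute
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["instituteLabel"]["value"])
```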

You can find the detailed program here: https://hermes-hub.de/aktuelles/events/byodl-2025-12-03.html

Registration:

Note: Participation is capped at 15 people – it pays to be quick!
Registration is open until November 24, 2025.

Contact: Dr. Judit Garzón Rodríguez: hermes@ieg-mainz.de

Please include the following information when registering:

  • Your field of study
  • What experience do you have with Linked Open Data?
  • What kind of data will you bring?

The team of the HERMES competence center at the DH Lab of the IEG Mainz looks forward to your participation!

DiHMa.Lab with MaRDI and NFDI4Objects on October 14 and 15 at FU Berlin

September 7, 2025, 18:08

On October 14 and 15, 2025, the DiHMa.Lab of Freie Universität Berlin, in cooperation with the ZIB, invites you to a joint workshop of the MaRDI and NFDI4Objects consortia at Freie Universität Berlin. Centered on the theme of “Objects and Methods”, the interdisciplinary workshop will bring together perspectives from mathematics and the object-based humanities. The aim is to bring the approaches, tools, and research questions of both disciplines into dialogue, to make interfaces visible, and to discuss shared methodological challenges.

The preliminary program is still being updated:

Day 1

October 14, 2025, 10:00–12:00

Ahead of the workshop there will be an editathon: “Object meets Maths: Wikidata als Knowledge Hub”. Feel free to come by with data and notebooks.

12:00 – Lunch in the Mensa (at your own expense)

October 14, 2025 (13:00–16:45)

  • Icebreaker and round of introductions (13:00–13:45)
  • Presentation of the NFDI consortia (14:00–14:45)
  • Statistical methods (15:00–15:45)
  • Data integration / knowledge graph (16:00–16:45)
  • From 17:00: informal get-together at the Eierschale (at your own expense)

Day 2

October 15, 2025 (09:00–13:00)

  • Scholarly object collections and multimodal research data (9:00–9:45)
  • Use case data analysis: “Windows on Data” (10:00–10:45)
  • Interdisciplinary work (challenges / difficult cases / what cannot be operationalized? / what is already well operationalized?) / MaRDI as a methods database (11:00–11:45)
  • Fishbowl: collaborating on mathematical methods for cultural-historical research questions (12:00–13:00)
  • From 13:00: joint lunch in the Mensa (at your own expense)

Venue: Villa Engler, Altensteinstraße 2-4, 14195 Berlin (at the main entrance of the Botanical Garden)

All information can be found here: https://www.ada.fu-berlin.de/ada-labs/dihma-lab/DiHMa-MARDI-N4O/index.html

Registration: https://www.ada.fu-berlin.de/ada-labs/dihma-lab/Modulbeschreibungen_DIHMa/PM-Anmeldung_DHIMA_LAB/index.html

Best regards, your local organizers
Anja Gerber, Christin Keller, Dennis Mischke, Fabian Fricke, Marco Reidelbach, Marcus Weber

Interesting digital humanities data sources

August 26, 2025, 12:00

I bookmark sources of data that seem interesting for digital humanities teaching and research:

  • showing humanists what data & datafication in their fields can look like
  • having interesting examples when teaching data-using tools
  • trying out new data tools

I’m focusing on sharing bookmarks with data that’s already in spreadsheet or similar structured format, rather than e.g.

  • collections of digitized paper media also counting as data and worth exploring, like Josh Begley’s racebox.org, which links to full PDFs of US Census surveys re:race and ethnicity over the years; or
  • 3D data, like my colleague Will Rourk’s on historic architecture and artifacts, including a local Rosenwald School and at-risk former dwellings of enslaved people

Don’t forget to cite datasets you use (e.g. build on, are influenced by, etc.)!

And if you’re looking for community, the Journal of Open Humanities Data is celebrating its 10th anniversary with a free, global virtual event on 9/26 including “lightning talks, thematic dialogues, and community discussions on the future of open humanities data”.

Data is being destroyed

U.S. fascists have destroyed or put barriers around a significant amount of public data in just the last 8 months. Check out Laura Guertin’s “Data, Interrupted” quilt blog post, then the free DIY Web Archiving zine by me, Quinn Dombrowski, Tessa Walsh, Anna Kijas, and Ilya Kreymer for a novice-friendly guide to helping preserve the pieces of the Web you care about (and why you should do it rather than assuming someone else will). The Data Rescue project is a collaborative project meant “to serve as a clearinghouse for data rescue-related efforts and data access points for public US governmental data that are currently at risk. We want to know what is happening in the community so that we can coordinate focus. Efforts include: data gathering, data curation and cleaning, data cataloging, and providing sustained access and distribution of data assets.”

Interesting datasets

The Database of African American and Predominantly White American Literature Anthologies

By Amy Earhart

“Created to test how we categorize identities represented in generalist literature anthologies in a database and to analyze the canon of both areas of literary study. The dataset creation informs the monograph Digital Literary Redlining: African American Anthologies, Digital Humanities, and the Canon (Earhart 2025). It is a highly curated small data project that includes 267 individual anthology volumes, 107 editions, 319 editors, 2,844 unique individual authors, and 22,392 individual entries, and allows the user to track the shifting inclusion and exclusion of authors over more than a hundred-year period. Focusing on author inclusion, the data includes gender and race designations of authors and editors.”

National UFO Reporting Center: “Tier 1” sighting reports

Via Ronda Grizzle, who uses this dataset when teaching Scholars’ Lab graduate Praxis Fellows how to shape research questions matching available data, and how to understand datasets as subjective and choice-based. I know UFOs sounds like a funny topic, and it can be, but there are also lots of interesting inroads, like the language people use reflecting hopes, fears, imagination, otherness, certainty. A good teaching dataset: there aren’t too many fields per report, and those include mappable, timeline-able, and narrative-text fields, plus a very subjective, interesting one (a taxonomy of UFO shapes). nuforc.org/subndx/?id=highlights

The Pudding

Well researched, contextualized, beautifully designed data storytelling on fun or meaningful questions, with an emphasis on cultural data and how to tell stories with data (including personally motivated ones, something that I think is both inspiring for students and great to have examples of how to do critically). pudding.cool

…and its Ham4Corpus use

Shirley Wu’s interactive visualization for The Pudding of every line in Hamilton uses my ham4corpus dataset (and data from other sources), which might be a useful example of how an afternoon’s work with open-access data (Wikipedia, lyrics) and some simple scripted data cleaning and formatting can produce foundations for research and visualization.

Responsible Datasets in Context

Dirs. Sylvia Fernandez, Miriam Posner, Anna Preus, Amardeep Singh, & Melanie Walsh

“Understanding the social and historical context of data is essential for all responsible data work. We host datasets that are paired with rich documentation, data essays, and teaching resources, all of which draw on context and humanities perspectives and methods. We provide models for responsible data curation, documentation, story-telling, and analysis.” 4 rich dataset options (as of August 2025), each including a data essay, the ability to explore the data on the site, and programming and discussion exercises for investigating and understanding the data. Datasets: US National park visit data, gender violence at the border, ~1k early 20th-century poems from African American periodicals, and the top 500 “greatest” novels according to OCLC records on novels most held by libraries. responsible-datasets-in-context.com

Post45 Data Collective

Eds Melanie Walsh, Alexander Manshel, J.D. Porter

“A peer-reviewed, open-access repository for literary and cultural data from 1945 to the present”, offering 11 datasets (as of August 2025) useful in investigations such as how book popularity & literary canons get manufactured. Includes datasets on “The Canon of Asian American Literature”, “International Bestsellers”, “Time Horizons of Futuristic Fiction”, and “The Index of Major Literary Prizes in the US”. The project ‘provides an open-access home for humanities data, peer reviews data so scholars can gain institutional recognition, and DOIs so this work can be cited’: data.post45.org/our-data.html

CBP and ICE databases

Via Miriam Posner: A spreadsheet containing all publicly available information about CBP and ICE databases, from the American Immigration Council americanimmigrationcouncil.org/content-understanding-immigration-enforcement-databases

Data assignment in The Critical Fan Toolkit

By Cara Marta Messina

Messina’s project (which prioritizes ethical critical studies of fan works and fandom) includes this model teaching assignment on gathering and analyzing fandom data, and understanding the politics of what is represented by this data. Includes links to 2 data sources, as well as Destination Toast’s “How do I find/gather data about the ships in my fandom on AO3?”.

(Re:fan studies, note that there is/was an Archive of Our Own dataset—but it was created in a manner seen as invasive and unethical by AO3 writers and readers. Good to read about and discuss with students, but I do not recommend using it as a data source for those reasons.)

Fashion Calendar data

By Fashion Institute of Technology

Fashion Calendar was “an independent, weekly periodical that served as the official scheduling clearinghouse for the American fashion industry” from 1941 to 2014; 1972–2008’s Fashion International and 1947–1951’s Home Furnishings are also included in the dataset. Allows manipulation on the site (including graphing and mapping) as well as download as JSON. fashioncalendar.fitnyc.edu/page/data

Black Studies Dataverse

With datasets by Kenton Ramsby et al.

Found via Kaylen Dwyer. “The Black Studies Dataverse contains various quantitative and qualitative datasets related to the study of African American life and history that can be used in Digital Humanities research and teaching. Black studies is a systematic way of studying black people in the world – such as their history, culture, sociology, and religion. Users can access the information to perform analyses of various subjects ranging from literature, black migration patterns, and rap music. In addition, these .csv datasets can also be transformed into interactive infographics that tell stories about various topics in Black Studies.” dataverse.tdl.org/dataverse/uta-blackstudies

Netflix Movies & Shows

kaggle.com/datasets/shivamb/netflix-shows

Billboard Hot 100 Number Ones Database

By Chris Dalla Riva

Via Alex Selby-Boothroyd: Gsheet by Chris Dalla Riva with 100+ data fields for every US Billboard Hot 100 Number One song since August 4th, 1958.

Internet Broadway Database

Found via Heather Froehlich: “provides data, publishes charts and structured tables of weekly attendance and ticket revenue, additionally available for individual shows”. ibdb.com

Structured Wikipedia Dataset

Wikimedia released this dataset sourced from their “Snapshot API which delivers bulk database dumps, aka snapshots, of Wikimedia projects—in this case, Wikipedia in English and French languages”. “Contains all articles of the English and French language editions of Wikipedia, pre-parsed and outputted as structured JSON files using a consistent schema compressed as zip” huggingface.co/datasets/wikimedia/structured-wikipedia. Do note there has been controversy in the past around Hugging Face scraping material for AI/dataset use without author permission, and differing understandings of how work published in various ways on the web is owned. (I might have a less passive description of this if I went and reminded myself what happened, but I’m not going to do that right now.)
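
If you do decide to experiment with it despite those caveats, loading should look roughly like the sketch below. I haven’t verified this against the dataset card, so treat the config name and field names as assumptions and inspect a record first.

```python
# A rough sketch, not verified against the dataset card: the config name
# ("20240916.en") and the record fields are assumptions.
from datasets import load_dataset

# streaming=True avoids downloading the full multi-gigabyte dump up front
wiki = load_dataset("wikimedia/structured-wikipedia", "20240916.en", streaming=True)

first = next(iter(wiki["train"]))
print(first.keys())  # see what fields a record actually has before relying on any
```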

CORGIS: The Collection of Really Great, Interesting, Situated Datasets project

By Austin Cory Bart, Dennis Kafura, Clifford A. Shaffer, Javier Tibau, Luke Gusukuma, Eli Tilevich

Visualizer and exportable versions of many interesting datasets on all kinds of topics.

FiveThirtyEight’s data

I’m not a fan for various reasons, but their data underlying various political, sports, and other stats-related articles might still be useful: data.fivethirtyeight.com. Or look at how and what they collect and include in their data, and what subjective choices and biases those reveal :)

Zine Bakery zines

I maintain a database of info on hundreds of zines related to social justice, culture, and/or tech topics for my ZineBakery.com project—with over 60 metadata fields (slightly fewer for the public view) capturing descriptive and evaluative details about each zine. Use the … icon then “export as CSV” to use the dataset (I haven’t tried this yet, so let me know if you encounter issues).

OpenAlex

I don’t know much about this yet, but it looked cool and is from a non-profit that builds tools to help with the journal racket (Unsub for understanding “big deals” values and alternatives, Unpaywall for OA article finding). “We index over 250M scholarly works from 250k sources, with extra coverage of humanities, non-English languages, and the Global South. We link these works to 90M disambiguated authors and 100k institutions, as well as enriching them with topic information, SDGs, citation counts, and much more. Export all your search results for free. For more flexibility use our API or even download the whole dataset. It’s all CC0-licensed so you can share and reuse it as you like!” openalex.org
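
As a quick taste of that API, a sketch like the following should work; the endpoint and the search and per-page parameters reflect my reading of the public docs, so double-check them before building anything on top.

```python
# A small sketch of querying the OpenAlex works endpoint; parameter and
# field names reflect my understanding of the public docs — verify first.
import requests

resp = requests.get(
    "https://api.openalex.org/works",
    params={"search": "digital humanities", "per-page": 5},
)
resp.raise_for_status()

for work in resp.json()["results"]:
    print(work["display_name"], work.get("publication_year"))
```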

Bonus data tools, tutorials

Matt Lincoln’s salty: “When teaching students how to clean data, it helps to have data that isn’t too clean already. salty offers functions for “salting” clean data with problems often found in datasets in the wild, such as pseudo-OCR errors, inconsistent capitalization and spelling, invalid dates, unpredictable punctuation in numeric fields, missing values or empty strings”.

The Data-Sitters Club for smart, accessible, fun tutorials and essays on computational text analysis for digital humanities.

Claudia Berger’s blog post on designing a data physicalization—a data quilt!—as well as the final quilt and free research zine exploring the data, its physicalization process, and its provocations.

The Pudding’s resources for learning & doing data journalism and research

See also The Critical Fan Toolkit by Cara Marta Messina (discussed in datasets section above), which offers both tools and links to interesting datasets.

Letterpress data, not publicly available yet…

I maintain a database of the letterpress type, graphic blocks/cuts, presses, supplies, and books related to book arts owned by me or by Scholars’ Lab. I have a very-in-progress website version I’m slowly building, without easily downloadable data, just a table view of some of the fields.

I also have a slice of this viewable online and not as downloadable data: just a gallery of the queerer letterpress graphic blocks I’ve collected or created. But I could get more online if anyone was interested in teaching or otherwise working with it?

I also am nearly done developing a database of the former VA Center for the Book: Book Arts Program’s enormous collection of type, which includes top-down photos of each case of type. I’m hoping to add more photos of example prints that use each type, too. If this is of interest to your teaching or research, let me know, as external interest might motivate me to get to the point of publishing sooner.

Our Journey to Praxathon

April 18, 2025, 12:00

My cohort just finished our second week of Praxathon and I wanted to reflect on the development of our project and how we ended up focusing on conducting text analysis of the UVa students’ satirical publication, The Yellow Journal.

For me, this project started back in 2018 when I was accepted into The Yellow Journal as a second year undergraduate student at UVa. The Yellow Journal is an anonymously-published satirical newspaper that has operated on and off since 1913. Undergraduate students know The Yellow Journal for its members’ semesterly tradition of disrupting libraries during the first day of finals by raucously distributing the publication while masked and wearing all yellow… and often blasting Yellow by Coldplay or Black and Yellow by Wiz Khalifa on giant speakers. I started my tenure as a satirical writer with the headline and article below:

Hardest Part of Getting Accepted into the Comm School is Needing to Replace All of Your Friends, Student Says

As the season of applying to the McIntire School of Commerce approaches for second years, older students reflect on their prior application experiences. Kody, a fourth year in the Comm school, explains that the application itself was easy; he had no doubt in his mind that he would get in. The hardest part was letting go of all of his non-Comm friends afterwards. “I just can’t let failure into my life,” Kody explains. “Once you’re in the Comm School, you have to start setting standards for your friends, and most of my friends weren’t meeting mine.” Kody was on the fence about keeping his Batten friends, but eventually decided against it. “Hanging out with them is bad for optics, in my opinion,” Kody stated. “While Batten kids are also good at networking, I can’t let their morals get in my way. They’re all about government intervention… hey dummies, what about the invisible hand?” Drew, an Economics major, elaborates on his ended friendship with Kody: “The minute my roommate Kody got accepted, he turned to me and asked me to move out. I was heartbroken, we had been living together since first year. In fact, he’s also my cousin. But I understand… it had to be done.” Drew wasn’t sure if it was worth it to even continue college after his rejection from Comm. To him, having no diploma at all is better than getting a non-Comm Economics degree.

Outside of writing headlines and articles, Yellow Journal members were also in the midst of digitizing and archiving the entire history of the paper on our Google Drive. The publication started in 1913, but it was only published regularly starting in 1920 and then was subsequently banned in 1934 by the UVa administration due to its anonymity. The publication then resumed in 1987, having its own office next to The Cavalier Daily with a modest amount of revenue from selling ad placements. The paper was discontinued again in 1999, but a group of students revived it in 2010 which resulted in its current, ongoing iteration.

In late 2019, I realized that we were approaching 100 years since The Yellow Journal was published regularly and I applied to a few grants that could possibly fund a special anniversary issue. I wanted to use the extensive archive work that members had so painstakingly organized for future members to look back on. The idea was to publish some highlights from our archive, especially the jokes that still remained relevant today. With quarantine in March 2020, however, interest from my collaborators waned and I eventually abandoned that project. I knew that I wanted to return to working on a project about The Yellow Journal someday because it provided such unique insight on the student experience of the University. Also, even 100 years later, many of the early issues are still so funny.

My position as a former member of The Yellow Journal was definitely the reason that the subject was brought up as a possible topic for our Praxathon, but I don’t think this project would have necessarily worked with other cohorts. The final section of our charter is titled “Make Learning a Playful Process.” That was a big goal of our cohort: to approach the work in a fun, lighthearted way. I wasn’t completely sure about the viability of that pledge when we first wrote the charter. I didn’t know the rest of my cohort well at the time and I was still very much operating in “traditional graduate classroom” mode. As we are approaching the end of the year, however, I think I can now safely say that we made every single part of Praxis fun and playful. I spend a good portion of my time in Praxis attempting to stifle my laughter at Oriane’s 10,000 things to commit to GitHub, Shane’s river drawing, or Brandon’s attempts to find new phrases because we accidentally made him insecure about saying “for what it’s worth.”

When I first pitched The Yellow Journal as an idea for Praxathon, I was mainly thinking about how it made sense as a project in a practical way: we already had access to high quality digitized records of all of the issues. The scope seemed manageable and it did not require too much preparatory work. As we’ve progressed in the project, I’ve slowly realized why it resonated with us as a group beyond logistics. Since we’re all graduate students at UVa, we are all familiar with and invested in the University’s history (especially told from a student perspective). We want to have fun with the material, which has led to many instances of us sitting in the fellows lounge and reading funny headlines out loud to each other.

Most of all, I think that the way we’ve developed the project has played into our individual and collective strengths. I never even thought about looking at student records from the 1920s and 30s but Gramond, being an incredible historian and lover of data, introduced us to that possibility. Oriane has done some amazing research on the history of the University at the time period that we’re looking at and, more generally, on analyzing satire. Because of her research on poetry, Amna was already interested in many of the text analysis methods that we’re using so she has expertly led us in thinking about how to apply those to The Yellow Journal. Kristin, as always, has shown herself to be an amazing problem solver, ready to tackle any coding task with such resolve and creativity. I just love assigning tasks to people so I have commandeered our Trello board.

Our poster will hopefully be done in the next few weeks, but it is clear to me now that the process, or journey, through the Praxathon is much more important than the end product. As I read through our charter again, I realize how true to our goals we’ve been and how interdisciplinary (and fun!) our final project is.

The M.E. Test

April 15, 2025, 12:00

I recently gave a workshop for the US Latino Digital Humanities Center (USLDH) at the University of Houston on introductory text analysis concepts and Voyant. I don’t have a full talk to share since it was a workshop, but I still thought I would share some of the things that worked especially well about the session. USLDH recorded the talk and made it available here, and you can find the link to my materials here.

I had a teaching observation when I was a graduate student, and one comment always stuck with me. My director told me, “this was all great but don’t be afraid to tell them what you think.” I’ve written elsewhere about how I tend to approach classroom facilitation as a process of generating questions that the group explores together. This orientation is sometimes in conflict with DH instruction, where you have information that simply needs to be conveyed. I had this tension in mind while planning the USLDH event. It was billed as a workshop, and I think there’s nothing worse than attending a workshop only to find that it’s really a lecture. How to balance the generic expectations with the knowledge that I had stuff I needed to put on the table? As an attempt to thread this needle, I structured the three-part session around a range of different kinds of teaching moves: some lecture, yes, but also a mix of open discussion, case study, quiz questions, and free play with a tool.

The broad idea behind the workshop entitled “Book Number Graph” is that people come to text analysis consultations with all varieties of materials and a range of research questions. Most often, my first step in consulting with them is to ask them to slow down and think more deeply about their base assumptions. Do they actually have their materials in a usable form? Is it possible to ask the questions they are interested in using the evidence they have? I built the workshop discussions as though I was prepping participants to field these kinds of research consultations, as though they were digital humanities librarians.

First, the “book” portion of the workshop featured a short introduction to different kinds of materials, exploring how format matters in the context of digital text analysis. We discussed how a book is distinct from an eBook is distinct from a web material, and how all of these are really distinct from the kind of plain text document that we likely want to get to. I used here a hypothetical person who shows up in my office and says, “Oh yeah, I have my texts. I’m ready to work on them with you. Can you help me?” And they will hand me either a stack of books or a series of PDF files that haven’t been OCR’d. I introduced workshop participants to the kinds of technical and legal challenges that arise in such situations so that they’ll be able to better assess the feasibility of their own plans. This all built to a pair of case studies where I asked the participants how they would respond if a researcher came to them with questions for their own project.

First case study: I am interested in a text analysis project on medieval Spanish novels. Oh yeah I have my texts. Can I meet? What kinds of questions would you ask this person? What kinds of problems might you expect? How would you address them?

Second case study: I want to study the concept of the family as discussed in online forums for Mexican-American communities. Can we meet to discuss? What kinds of questions would you ask this person? What kinds of problems might you expect? How would you address them?

With these case studies, I hoped to give participants a glimpse into the real-world kinds of conversations that I have as a DH library worker. For the most part, consultations begin with my asking a range of questions of the researcher so as to help them get new clarity on the actual feasibility of what they want to do. I hoped for the participants to question the formats of the materials for these hypothetical researchers and point out a range of ethical and legal concerns. Hopefully they would be able to ask these questions of their own work as well.

  • Has anyone made this available before?
    • If yes… Can I use it? Under what terms?
    • If not… Do I have access to the texts myself?
      • If yes… What format are they in?
      • If not available as plain text… Can I convert them into the format I need?
  • What do I want to do with these texts? Is it allowed?

For the second section of the workshop entitled “number,” I gave participants an introduction to thinking about evidence and analysis, distinguishing between what computers can do and the kinds of things that readers are good at. Broadly speaking, computers are concrete. They know what’s on the page and not what’s outside of it. Researchers in text analysis need to point software to the specific things that they are interested in on the page and supplement this information with any other information outside of the text. Complicated text analysis research questions have at their core really simplistic, concrete, measurable things on the page. You are pointing to a thing and counting. For examples of the things that computers can readily be told to examine, we discussed structural information, proximity, the order of words, frequency of words, case, and more.

To practice this, I adapted an exercise that I was first introduced to by Mackenzie Brooks but that was developed by librarians at the University of Michigan. To introduce TEI, the activity asks students to draw boxes around a printed poem as a way to identify the different structural elements that you would want to encode. For my purposes, I put a Langston Hughes poem on the Zoom screen and asked participants to annotate it with all sorts of information that they thought a computer would be capable of identifying.

[Image: Langston Hughes poem ready to be annotated]

The result was a beautiful tapestry of underlines and squiggles. Some of the choices would be very easy for a computer: word frequency, line breaks, structural elements. But we also talked about more challenging cases. We know the poem’s title because we expect to see it in a certain place on the page. The computer might be pointed to it by flagging the line that comes after three blank line breaks. But what if this isn’t always the case? It was good practice in how to distinguish between the information we bring to the text and what is actually available on the page. We talked about the challenges in trying to bridge the gap between what computers can do and what humans can do, to try and think through how a complicated intellectual question might take shape in a computationally legible form.

Kinds of things that can be measured:

  • Sequences of characters
  • Case
  • Words (tokens)
  • Structural elements, with some caveats
  • Proximity
  • Order (syntagmatic axis)
  • Metadata – often has to be added manually
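
To make that list concrete, here is a minimal sketch of what a few of those measurements look like in code; the file name, the three-blank-lines title heuristic, and the word chosen for the proximity check are placeholders, not the workshop’s actual exercise.

```python
# A minimal sketch of measuring "things on the page"; file and word
# choices are placeholders, not the workshop's actual exercise.
import re
from collections import Counter

text = open("poem.txt", encoding="utf-8").read()

# Words (tokens) and their frequency
tokens = re.findall(r"[A-Za-z']+", text)
print(Counter(t.lower() for t in tokens).most_common(5))

# Case: which tokens appear capitalized?
capitalized = {t for t in tokens if t[0].isupper()}
print(sorted(capitalized)[:5])

# Structural elements, with caveats: flag a line following three blank lines
lines = text.splitlines()
for i, line in enumerate(lines):
    if line.strip() and i >= 3 and all(not lines[j].strip() for j in (i - 3, i - 2, i - 1)):
        print("possible title or heading:", line)

# Proximity: tokens within a five-word window of a word of interest
for i, t in enumerate(tokens):
    if t.lower() == "night":
        print(tokens[max(0, i - 5): i + 6])
```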

Wrapping all this together, I introduced what I called the M.E. test for text analysis research. To have a successful text analysis project you have to have…

[Slide: MATERIALS – appropriate, accessible. EVIDENCE – identifiable, measurable]

  • Materials that are…
    • appropriate to your questions and
    • accessible for your purposes.

You must also have

  • Evidence that is…
    • identifiable to you as an expression of your research question and
    • legible to the tool you are using.

Materials and Evidence. M and E.

M.E.

The next time you sit down to do text analysis, ask yourself, “What makes a good question? M.E. Me!”

[Image: XKCD comic on imposter syndrome, describing an expert in imposter syndrome who immediately questions her own expertise]

Painfully earnest? Sure! But this was a nice little way for me to tie in what I often joke is my most frequently requested consultation topic: imposter syndrome. The M.E. question is both a test for deciding whether or not a text analysis research question is appropriate and a call for you to recognize that you can handle this work. A nice little way for you to give yourself a pump up, because I believe that these methods belong to anyone. Anyone can handle these kinds of consultations. They’re more art than science at the level we are discussing. You just have to know the correct way to approach them. Deep expertise can come later. If you are too intimidated to get started you will never get there.

From there, I closed the “number” portion of the workshop with a couple more case study prompts. I asked participants to respond to two more scenarios as though someone had just walked into their office with an idea they wanted to try out.

Prompt: I am interested in which Shakespeare character is the most important in Romeo and Juliet.

Prompt: I am interested in how space and place are represented in literature of the southeastern United States.

The hypothetical consultation prompts involved, first, an interest in finding the most important characters in a particular Shakespeare play and, second, an interest in space and place in southeastern American literature. In each case, we discussed questions of format and copyright, but we also got to some fairly high-level questions about what kinds of evidence you could use to discuss the research questions. For importance, participants proposed measuring either number of lines for each character or who happens to be onstage for the greatest amount of time. For space and place, we discussed counting place names using Python (a nice way to introduce concepts related to Named Entity Recognition). In each case, my goal was to give the workshop participants a sense of how to test and develop their own research questions by walking them through the process I use when talking with researchers asking for a fresh consultation.
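
For instance, a first pass at the place-name counting we discussed might look like the sketch below with spaCy; the input file is hypothetical, and entity labels like GPE and LOC are spaCy’s categories, not a definitive gazetteer of the Southeast.

```python
# A rough sketch of counting place names with spaCy's NER; assumes
# `pip install spacy` and `python -m spacy download en_core_web_sm`.
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")
text = open("southern_novel.txt", encoding="utf-8").read()  # hypothetical file

doc = nlp(text)
# GPE = geopolitical entities, LOC = other locations, FAC = facilities
places = Counter(ent.text for ent in doc.ents if ent.label_ in ("GPE", "LOC", "FAC"))
print(places.most_common(10))
```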

USLDH has shared the recording link, so feel free to check out the recording if you want to see the activities in action. The slides can be found here. And never forget the most important thing to ask yourself the next time you’re working on a text analysis problem:

“What makes a good research question? Me.”

Personal Data Story

March 24, 2025, 12:00

I just gave a workshop online for Dr. Jennifer Isasi’s course at Pennsylvania State University. Normally I would share the text I used for events like these, but since it was a workshop I don’t quite have something formal to share in that mode. The abstract and title I provided for the event can give some flavor of things:

Organized Chaos: Humanities Data and Cleaning with a Purpose

This workshop covers the theory and practice of preparing humanities data for analysis. Humanities data is notoriously difficult to work with, but we will explore how its messiness is actually what conveys a wealth of information. We will discuss the affordances and tradeoffs inherent in data cleaning in the humanities, when it is worth doing, and when it is worth maintaining the sense of organized chaos that our materials demand. Practical discussions will include working with dates, irregular spellings, and data organization. We will use OpenRefine to practice working with data, with examples drawn from the real world. Participants should come to the workshop with OpenRefine installed on their computer.

So basically, the broad point of the workshop was that data cleaning in a humanities context can best be thought of as a kind of organizing of many possibilities, as living with chaos rather than trying to eliminate it. Your data will never really be “clean,” per se, no matter how much you try to pursue those ends. And in trying to impose an order that does not exist in reality you eliminate a lot of important cultural and linguistic differences that we as humanists care about quite a bit. All of this was a layer over what was essentially a workshop on OpenRefine, a fabulous open-source power tool for working with messy data. You can find my slides at this link.

While I don’t have a full text worth sharing, I do think that there is one piece of the workshop that worked especially well that I wanted to document. As is often the case, at the beginning of our time participants introduced themselves to me and to each other. Dr. Isasi helped fill in gaps about who the people in the room were, where they were from, and what they work on. Dr. Isasi then read my bio, which is always an uncomfortable moment for me. I never quite know what to do while someone is talking about me in this way or how to transition gracefully from that discomfort to the topic at hand. This time I decided to sit in that space a bit longer and tie it directly to what we were doing by having the first phase of the workshop address what I called my “personal data story.”

I began by sharing an internet search for my name and asking students what they made of it:

[Image: screenshot of a Google search for “brandon walsh”, containing numerous photographs of a character from the TV show 90210; the speaker’s image only appears at the very end]

If you were alive at a certain point in the 1990s, you likely already know the answer to what you are looking at. I share my name with a particular character from Beverly Hills, 90210, a popular TV show that premiered in 1990. I have been haunted by this data point my whole life. My earliest memory of it was in kindergarten, but as recently as last year someone started laughing the moment I spoke my name while trying to book a doctor’s appointment [Update: between drafting and publishing this post a faculty member from a different department pointed the connection out again!]. The workshop participants very quickly recognized that there was more than one Brandon Walsh out there in this particular dataset. I pointed to this as an example of how messy data can be when we’re dealing with people. How do you represent these differences in a data set? It can get complicated quite quickly. I also took the moment to point out that I’m slowly creeping up in the search rankings. If you zoom out several times, I finally show up at the very bottom of the page. I’m coming for you, Brandon Walsh.

As the next stage of my data journey, I asked participants to consider a particular piece of mail that I’ve been getting my whole life.

[Image: question for the audience that reads: “My whole life, I have gotten mail from modeling agencies. This mail is addressed to Brandonm. What do you think happened to cause this error behind the scenes?”]

My whole life I’ve been getting mail to a particular person named Brandonm, all one word, from modeling agencies asking me to come in and do runway shoots. It started when I was in elementary school. Every few years I think I’m finally free and forget about it. But then I’ll get a fresh call asking me to come in. I asked the participants to guess what they thought might be going on. They immediately had the same thought I did, as someone asked, “does your middle name start with the letter M?” It does indeed. My middle name is Michael, and I would be willing to bet money that someone accidentally merged two columns in a table at some point. Brandon M Walsh became Brandonm Walsh, and a star was born. In the context of this particular workshop, I found it interesting for the way in which this story shows that data errors can follow you your whole life whether you realize it or not.

For the last stage of my data journey, I gave the participants a more technical exercise. Not about me, per se, but rather about a particular kind of data problem that was quite pivotal to one stage of my life.

[Image: question for the audience presenting several dates formatted differently, asking them to identify the issue and develop a plan to address it]

I presented five pieces of data to the participants and asked them a series of questions:

  • What are we looking at? What are these things in front of you?
  • What’s going on with them?
  • And then, if you cared about such things, how would you correct them?

The data in front of the group, of course, consisted of dates in a variety of different formats. These are the sort of thing that can be quite confusing, especially if you’re talking to people from different geographic locations, different cultural contexts. The most confusing are the two formats that interchange day and month but keep the year in the same place. The participants described how they would first decide on a standard and then convert each date one at a time to conform by moving digits around and editing the punctuation separating things. I then revealed the trick of this prompt: this was actually a real-life job interview question I got when I was interviewing for a programming position at the University of Virginia Library. And I think it’s a really good example of a technical question that one might get in an interview. You have to think through a problem, talk about how you would solve it, and display a lot of technical understanding. But you don’t have to actually write any code on a whiteboard. If I recall correctly, my own response was “these are dates that are formatted incorrectly, and I would start by using regular expressions to try and work out how to massage the dates into a particular format. Otherwise it can make data processing and computational work quite challenging, if not impossible.”
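
A sketch of that regular-expression approach is below. The sample dates are stand-ins rather than the actual interview data, and treating slashed dates as month-first is an assumed convention: exactly the kind of decision you have to make explicit before converting anything.

```python
# A sketch of the regex approach; sample dates are stand-ins, and treating
# slashed dates as month-first is an assumed convention, not a given.
import re

dates = ["03/24/2025", "24.03.2025", "2025-03-24", "24-03-2025", "March 24, 2025"]

MONTHS = {m: i for i, m in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

def normalize(date):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if unrecognized."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):              # already ISO
        return date
    m = re.fullmatch(r"(\d{2})/(\d{2})/(\d{4})", date)        # assume US month-first
    if m:
        return f"{m.group(3)}-{m.group(1)}-{m.group(2)}"
    m = re.fullmatch(r"(\d{2})[.-](\d{2})[.-](\d{4})", date)  # assume day-first
    if m:
        return f"{m.group(3)}-{m.group(2)}-{m.group(1)}"
    m = re.fullmatch(r"([A-Za-z]+) (\d{1,2}), (\d{4})", date)
    if m:
        return f"{m.group(3)}-{MONTHS[m.group(1)]:02d}-{int(m.group(2)):02d}"
    return None  # flag for human review rather than guessing

print([normalize(d) for d in dates])
```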

I use this technical question all the time in mock interviews because it’s easy to remember. In the context of this particular workshop, it also served as a good pivot from the individual, personal stakes of data management to the ways the same questions might arise in a real-world professional context. At this point, the participants quite wisely began to make connections to Katie Rawson and Trevor Muñoz’s piece entitled “Against Cleaning.” In that piece, Rawson and Muñoz discuss the need to preserve the cultural context of data. Rather than cleaning away difference, the students suggested that we actually add a second column for the new, cleaned data fields as we work through them. In this way, we would preserve the original information while also gaining a new set of material that we could use for computation. Adding data rather than taking away difference.
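
In pandas terms, the students’ suggestion might look like this sketch; the file and column names are hypothetical.

```python
# A small sketch of "adding data rather than taking away difference":
# keep the messy original column, add a cleaned one beside it.
# File and column names are hypothetical.
import pandas as pd

df = pd.read_csv("records.csv")

# normalized dates go in a new column; the original stays untouched
df["date_cleaned"] = pd.to_datetime(df["date"], errors="coerce", format="mixed")
# (format="mixed" requires pandas >= 2.0)

# rows where cleaning failed stay visible for human review
print(df.loc[df["date_cleaned"].isna(), "date"])
```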

There’s more in the slides if you’re interested. The workshop drew heavily on the work that I have been doing in my class this semester on “Data for the Rest of Us.” I’ll keep sharing more about that work in the future.

Training: RDM for Humanities and Social Sciences 2025

March 19, 2025, 16:24

RDM covers a wide range of subjects, and its extensive guidance requires practical implementation. Within KU Leuven, there are training sessions specifically designed to cultivate practical RDM skills. For researchers in the Humanities and Social Sciences, we recommend the upcoming training sessions below to get acquainted with RDM.

These events are only open to KU Leuven researchers and staff.

RDM Workshop for PhDs in Humanities and Social Sciences

Program

Research data management (RDM) refers to how you handle your data during and after your research project to ensure they are well organized, structured, of high quality and Findable, Accessible, Interoperable and Reusable (FAIR). During this session you will learn best practices for the management of research data according to the FAIR data principles. We consider the technical, legal, and ethical aspects of research data, secure storage of materials, documentation and metadata, research data sharing, reusing data shared by others, and more. This solid grounding in basic RDM skills will help you make informed decisions on how to handle your research data. Additionally, you will learn how to write and maintain your own Data Management Plan (DMP).

The training consists of two parts: 

  • A short general introduction to Research Data Management (20’–25’)
  • Followed by small interactive group sessions, where participants discuss their Data Management Plan (DMP) under the guidance of an RDM expert.

Practicalities

  • When: March 25, 2025 from 14:00 to 16:00
  • Where: Online
  • For whom: This training is mainly aimed at doctoral researchers, preferably at the start of their PhD or project.
  • Price and registration: Free but mandatory
  • More info: Click here

Workshop Documentation & Metadata for Qualitative Research

Program

Documentation and metadata are essential for understanding your data in detail, and they help other researchers find and use your data. They make your data more Findable, Accessible, Interoperable and Reusable (FAIR) and improve the reproducibility of your research. Documentation and metadata are therefore of crucial importance for good Research Data Management.

Through an introductory presentation, interactive exercises, polls and brainstorming sessions you will practice how to:

  • Organise data files and folders
  • Identify information in a dataset and within data files
  • Search for a metadata standard
  • Use metadata schemes
  • Deposit a dataset in RDR

Practicalities

  • When: April 24, 2025 from 13:00 to 16:00
  • Where: University Library, Colloquium (Mgr. Ladeuzeplein 21, 3000 Leuven)
  • For whom: This workshop is intended for researchers who need to know the basics of documentation & metadata.
  • Price and registration: Free but mandatory
  • More info: Click here

Call for Contributions: »Digitale Editionen der Zeitgeschichte zwischen KI und Linked Open Data: Herausforderungen und Perspektiven«

February 21, 2025, 17:19

To mark the fifth anniversary of the digital TEI edition »Fraktionen im Deutschen Bundestag. Sitzungsprotokolle 1949 bis 2005« (fraktionsprotokolle.de), the KGParl invites experts from the digital humanities, scholarly editing, and related disciplines to the workshop »Digitale Editionen der Zeitgeschichte zwischen KI und Linked Open Data: Herausforderungen und Perspektiven« in Berlin on December 4–5, 2025.

The aim of the workshop is to discuss the impact of artificial intelligence, large language models, and Linked Open Data on digital editions – with a particular focus on political, administrative, and diplomatic sources such as parliamentary and parliamentary-group minutes, regulations, and cabinet records.

At the center is the question of which methodological and technical innovations are needed to make digital editions of parliamentary-administrative sources sustainable in the long term, interoperable, and usable for research.

To the Call for Contributions

Contributions should be submitted as a short abstract (no more than 200–300 words) that clearly outlines the research question, methodology, and expected results.

Please send your abstract to juengerkes@kgparl.de by March 31, 2025.

Workshop series: Introduction and Deep Dive into Linked Open Data (LOD)

February 19, 2025, 23:31

The HERMES data competence center invites you to a two-part workshop series on Linked Open Data (LOD). The events offer both an introduction to and a deeper engagement with working with LOD, and are aimed at researchers and students in the humanities and cultural studies – regardless of career stage.

Workshop 1: Introduction to Linked Open Data

📅 March 19, 2025 | ⏰ 9:00–17:00 | 🖥 Online via Zoom

This foundational workshop (designed and organized by the HERMES Data Carpentries team) introduces the theoretical and technical concepts of LOD. Topics include:

  • Semantic Web & RDF – how does linked data work?
  • From CSV to RDF – converting data into the LOD format (see the sketch after this list)
  • Annotation, vocabularies & ontologies – giving data meaning
  • SPARQL queries & LOD publication – searching and using data efficiently
  • Visualization tools – clear visual presentations of LOD data
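
As a taste of the CSV-to-RDF step, a minimal sketch with rdflib might look like this; the schema.org terms, base URI, and input file are illustrative choices, not the workshop’s materials.

```python
# A minimal sketch of converting a CSV to RDF with rdflib; the vocabulary,
# base URI, and input file are illustrative, not workshop materials.
import csv
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

SCHEMA = Namespace("https://schema.org/")
BASE = Namespace("https://example.org/person/")  # placeholder base URI

g = Graph()
g.bind("schema", SCHEMA)

with open("people.csv", encoding="utf-8") as f:  # columns: id, name, birth_year
    for row in csv.DictReader(f):
        person = BASE[row["id"]]
        g.add((person, RDF.type, SCHEMA.Person))
        g.add((person, SCHEMA.name, Literal(row["name"])))
        g.add((person, SCHEMA.birthDate, Literal(row["birth_year"])))

print(g.serialize(format="turtle"))
```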

Who is it for? The workshop is ideal for anyone just getting started with LOD. No prior knowledge required!

Workshop 2: Deep dive & your own applications (Bring Your Own Data Lab)

📅 June 5–6, 2025 | 📍 Leibniz Institute of European History (IEG), Mainz

For everyone who wants to deepen their knowledge and link their own research data with LOD! In the BYODL workshop, experts will accompany specific use cases and support individual research questions. Topics:

  • Working with Wikidata & specialized ontologies
  • Vocabularies for research data
  • Tool criticism – critical perspectives on data processing

The workshop combines impulse talks with hands-on application, so that participants can apply LOD directly to their own research projects.

Requirements:

  • Workshop 1 or equivalent prior knowledge is a prerequisite for participating in Workshop 2.
  • You will need to bring your own laptop.

You can take part in just one workshop or in both!

Further information & registration: https://hermes-hub.de/events/intern/carpentries_byodlab_workshopreihe_2025.html

Zine Bakery: catalog as dataset research

September 16, 2024, 12:00

A catalog is also a dataset, which means that thanks to my Zine Bakery project’s zine catalog, I’ve got a hand-built, richly described, tidily organized dataset I know well. Seeing my zine catalog as a dataset opens it to my data science and digital humanities skillset, including data viz, coding, and data-based making. Below, I share some of the data-driven scholarship I’ve pursued as part of my Zine Bakery project.

[Photo: Amanda Wyatt Visconti presenting virtually at the DH 2024 conference, giving a talk on data-driven making]

A peek under the hood

Just a small portion of my thematic tagging: I’ve got 134 different tags used on catalog zines (as of 9/16/2024). [Screenshot: a portion of the Zine Bakery catalog, showing a variety of thematic tags including AI, anti-racism, and coding]

Below, a zoomed-out screenshot of my tagging table, which does not capture the whole thing (it’s about twice as wide and twice as tall as what’s shown), and a zoomed-in view. [Screenshots: zoomed-out and zoomed-in views of the zine catalog’s underlying thematic-tags-to-zine-titles table]

The tags are just one of many fields (78 total fields per zine, as of 9/16/2024) in my database. [Screenshot: a portion of the Zine Bakery catalog, showing several zine titles]

I’m able to easily pull out stats from the catalog, such as the average zine length in my collection being 27 pages (and shortest, longest zine lengths):

[Screenshot: catalog stats showing an average zine length of 27 pages, a longest zine of 164 pages, and a shortest of 4 pages]
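
For anyone curious, pulling those numbers out of a CSV export of the catalog is a few lines of pandas; the file and column names in this sketch are hypothetical.

```python
# A quick sketch of computing those stats from a CSV export of the
# catalog; the file and column names are hypothetical.
import pandas as pd

zines = pd.read_csv("zine-bakery-catalog.csv")
pages = zines["page_count"].dropna()

print(f"average length: {pages.mean():.0f} pages")
print(f"longest: {pages.max():.0f} pages; shortest: {pages.min():.0f} pages")
```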

Data-driven making research

My Spring 2024 peer-reviewed article “Book Adjacent: Database & Makerspace Prototypes Repairing Book-Centric Citation Bias in DH Working Libraries” discusses the relational database I built underlying the Zine Bakery project, as well as 3 makerspace prototypes I’ve built or am building based on this data.

One of those projects was a card deck and case of themed zine reads, with each card displaying a zine title, creators, and a QR code linking to free reading of the zine online. [Photo: an example themed reading card deck, prepared for the ACH 2023 conference’s #DHmakes (digital humanities making) session — an open plastic case holds a card describing the “#DHMakes at #ACH2023” project governing the readings chosen for the deck, next to a fanned pile of playing-card-style cards showing tech, GLAM, and social justice zine titles such as “Kult of the Cyber Witch #1” and “Handbook for the Activist Archivist”; the top card, “Design Justice for Action”, lists the zine’s creators (Design Justice Network, Sasha Costanza-Chock, Una Lee, Victoria Barnett, Taylor Stewart), the hashtags #DHMakes #ACH2023, and a QR code linking to an online version of the zine]

[Photo: a fake, adult-size skeleton (Dr. Cheese Bones) wearing the ACH 2023 #DHMakes crew’s collaborative DH making vest, which boasts a variety of neat small making projects such as a data visualization quilt patch and felted conference name letters; one of my themed reading card decks is visible half-tucked into its vest pocket. Photo and Dr. Bones appearance by Quinn Dombrowski]

My online zine quilt dataviz will eventually be an offline actual quilt, printed on fabric with additional sewn features that visualize some of the collection’s data. [Screenshot: a very colorful digital grid of around 200 zine front covers]

The dataset is also fueling design plans for a public interactive exhibit, with a reading preferences quiz that results in a receipt-style printout zine reading list: My sketches and notes planning the layout of the Mini Book List Printer's acrylic case. A photo of a spiral-bound sketchbook, white paper with black ink. The page is full of notes and drawings, including sketches of a simplified Mac Classic-style computer case, as well as the various pieces of acrylic that would need to be cut to assemble the case and their dimensions. The notes contain ideas about how to assemble the case (e.g. does it need air holes?), supplies I needed to procure for the project, and note working out how to cut and adhere various case piece edges to achieve the desired final case dimensions.

Author's sketch of what the final Mini Book List printer should look like. A rough drawing in black ink on white paper, of a computer shaped like a simplified retro Mac (very cubic/boxy); the computer screen reads "We think you'll enjoy these reads:" followed by squiggles to suggest a list of suggested reads; from the computer's floppy drive hole comes paper receipt tape with squiggles listed on it to suggest a reading recommendation list printout on receipt-width paper. There are sparkly lines drawn around the receipt paper, with an annotation stating these denote "magic" rather than light, as there are no LEDs in this project.

I’m also experimenting with ways to put digital-only zines visibly on physical shelves: Photo of materials for the Ghost Books project artfully arranged on a floor, including a swirl of blue LEDs with silicone diffusion making them look like neon lights, superglue, acrylic and glass cut to size to be assembled into a rectangular-prism/book shape with smooth or crenellated edges, and one of the books I'm basing the initial prototype on (10 PRINT) because of its interesting blue and white patterned cover.

Zine Bakery: research roadmap

18 August 2024, 12:00

Some future work I’m planning for my Zine Bakery project, which researches, collects, and amplifies zines at the intersections of tech, social justice, and culture.

Critical collecting

  • Ethical practices charter: how do I collect and research?
    • Finish drafting my post on ethics-related choices in my project, such as
      • not re-hosting zines without creators’ informed, explicit consent, so that catalogue users use zine creators’ versions and see their websites; and
      • taking extra care around whether creators of zines made for classes gave consent outside of any implicit pressures related to grades or the teacher serving as a future job reference
    • Read the Zine Librarians Code of Ethics in full, and modify my charter with citations to their excellent project.
  • Collecting rationale: why do I collect, and what do I/don’t I collect?

  • Identify areas I need to collect more actively, toward the Zine Bakery @ Scholars’ Lab goals of a welcoming, diverse collection reflecting SLab’s values and our audience

  • Contact zine creators: I already don’t display, link to, etc., zines whose creators haven’t positively indicated they want people to. But I could also contact creators to see if they want something added or edited in the catalogue, or if their preferences on replication have changed since they published the zine; and just to let them know about the project as an example of something citing their work.

  • Accessibility:
    • Improve zine cover image alt text, so that rather than just title and creators, it also includes a description of important visual aspects of the cover such as color, typography, illustration, and general effect. Retry Google Vision AI (a rough sketch follows this list), write descriptions manually, or look at existing efforts to mark up (e.g. comics TEI) and/or extrapolate image descriptions.
    • Look into screen-reading experience of catalogue. Can I make a version (even if it requires scheduled manual exports that I can format and display on my website) that is more browsable?
    • Run website checks for visual, navigational, etc. accessibility
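As a starting point for the Google Vision AI option above: a minimal sketch, assuming the google-cloud-vision Python client with configured credentials (the file path and helper name are hypothetical), that drafts richer alt text from auto-detected labels, for manual review and editing:

```python
# Minimal sketch: draft cover alt text from Google Vision AI labels.
# Assumes google-cloud-vision is installed and credentials are configured;
# the helper name, file path, and output wording are illustrative only.
from google.cloud import vision

def draft_cover_alt_text(cover_path: str, title: str, creators: str) -> str:
    client = vision.ImageAnnotatorClient()
    with open(cover_path, "rb") as f:
        image = vision.Image(content=f.read())
    # Label detection returns broad tags, not a true description --
    # a human still needs to review and rewrite the draft.
    labels = client.label_detection(image=image).label_annotations
    visuals = ", ".join(label.description.lower() for label in labels[:5])
    return f'Cover of "{title}" by {creators}; visual elements include: {visuals}.'

# Example (hypothetical file path):
# print(draft_cover_alt_text("covers/example-zine.jpg", "Example Zine", "Example Creator"))
```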

Data, website, coding

  • Better reader view:
    • Create a more catalogue-page-like interface for items
    • Make them directly linkable so when I post or tweet about a zine, I can link people directly to its metadata page
  • Self-hosted data and interface: explore getting off AirTable, or keeping it as a backend and doing regular exports to reader and personal collecting interfaces I host myself, using data formats + Jekyll (a rough export sketch follows this section’s list)

  • Make metadata more wieldy for my editing:
    • I wish there were a way to collapse or style multiple fields/columns into sections/sets.
    • I might be able to hackily do this (all-caps umbrella field for a section? emojis?); or
    • use an extension allowing styled views (unsure if these are friendly for bulk editing);
    • the self-hosted options mentioned above might let me handle this better (use or make my own, better viewing interface)
  • Crosswalk my metadata to xZINECOREx metadata? So it’s interoperable with the Zine Union Catalogue and other metadata schema users

  • File renaming:
    • I started with a filename scheme using the first two words of a zine title, followed by a hyphen, then the first creator’s name (and “EtAl” if other creators exist)
      • I quickly switched to full titles, as this lets me convert them into alt text for my zine quilt
      • I need to go back and regularize this for PDFs, full-size cover images, and quilt-sized cover images.
  • Link cover images to zine metadata (or free e-reading link, if any?) from zine quilt vis
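For the self-hosted export idea above: a minimal sketch, assuming the standard Airtable REST API (the base ID, table name, and environment variable below are placeholders), that pulls every record and writes JSON a Jekyll site could read from its _data directory:

```python
# Minimal sketch: export Airtable records to JSON for a Jekyll _data directory.
# The base ID, table name, and token variable are placeholders, not real values.
import json
import os
import requests

BASE_ID = "appXXXXXXXXXXXXXX"   # placeholder Airtable base ID
TABLE = "Zines"                 # placeholder table name
url = f"https://api.airtable.com/v0/{BASE_ID}/{TABLE}"
headers = {"Authorization": f"Bearer {os.environ['AIRTABLE_TOKEN']}"}

records, params = [], {}
while True:  # the Airtable API pages results; follow the offset token
    page = requests.get(url, headers=headers, params=params).json()
    records.extend(r["fields"] for r in page["records"])
    if "offset" not in page:
        break
    params["offset"] = page["offset"]

with open("_data/zines.json", "w") as f:
    json.dump(records, f, indent=2)
```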

Metadata & cataloguing

  • Create personal blurbs for all zines that don’t have one written by me yet

  • Further research collected zines so I can fill in blank fields, such as publication date and location for all zines

Community

  • Explore making my catalogue more available to the Zine Union Catalogue, if my project fits their goals

  • Further refine logo/graphics:
    • finish design work
    • create stickers to hand out, make myself some tshirts :D
  • Learn more about and/or get involved with some of the
    • cool zine librarian efforts (Code of Ethics, ZLUC, visiting zine library collections & archives) and
    • zine fest efforts (e.g. Charlottesville Zine Fest, WTJU zine library)

Research & publication

  • Publication:
  • More visualization or analysis of metadata fields, e.g.
    • timeline of publication
    • heatmap of publication locations
    • comparison of fonts or serif vs. sans serif fonts in zines
  • Digital zine quilt: play with look of the zine quilt further:
    • Add way to filter/sort covers?
    • Add CSS to make it look more quilt-like, e.g. color stitching between covers?

Making

  • Thermal mini-receipt printer:
    • Complete reads/zines recommendation digital quiz and mini-receipt recommendation printout kiosk.
    • Possibly make a version where the paper spools out of the bread holes of a vintage toaster, to go with the Zine Bakery theme?
    • Thanks to Shane Lin for suggesting a followup: possibly create a version that allows printing a subset of zines (those allowing it, and with print and post-print settings congenial to some kind of push-button, zine-gets-printed setup).
  • Real-quilt zine quilt: Print a SLab-friendly subset of zine covers as a physical quilt (on posterboard; then on actual fabric, adding quilt backing and stitching between covers?)

  • More zine card decks: create a few more themed subsets of the collection, and print more card decks like my initial zine card deck

Zine Bakery: topical zine collections

16 August 2024, 12:00

The Zine Bakery catalog is a public view of a subset of the Zine Bakery dataset. It includes most or all of the zines in my personal catalogue, but only a subset of the metadata fields—leaving out fields irrelevant to the public, like how many copies of a zine I have at home, or private data, like links to private PDF backups of zines.
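A minimal sketch of that public/private split (the field names below are placeholders, not the catalog’s actual schema), filtering each record before it’s published:

```python
# Minimal sketch: drop private fields from a catalog record before publishing.
# The field names are placeholders, not the catalog's actual schema.
PRIVATE_FIELDS = {"copies_at_home", "private_pdf_backup_link"}

def public_view(record: dict) -> dict:
    return {k: v for k, v in record.items() if k not in PRIVATE_FIELDS}

full = {"title": "Example Zine",
        "copies_at_home": 2,
        "private_pdf_backup_link": "https://example.org/backup.pdf"}
print(public_view(full))  # {'title': 'Example Zine'}
```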

I recently set up a “Zine Reader’s View” here, which 1) shows only the zines that allow anyone to read them online for free, and 2) includes only the catalogue metadata of most interest to folks looking to read zines (e.g. the metadata about printing zines is hidden).

I also set up my catalogue to link readers directly to just zines with certain themes, like feminist tech zines and digital humanities zines!

Screenshot of the multi-colored buttons on my ZineBakery.com website, linking people to specific subsets of my zine catalogue such as “tech knowledges” zines and “feminist tech” zines.

In addition to viewing the whole public catalogue, you can now easily see each of those themed subsets:

(The “+” means that was the count of zines when I created these tags in early August, but I’m adding more zines all the time.)

My digital humanities makerspace research

6 August 2024, 12:00

My DH 2024 conference talk on my recent book-adjacent data physicalizations and makerspace research, given as part of co-facilitating the #DHmakes mini-conference. What is #DHmakes? Briefly: anyone (you?) DH-adjacent sharing their (DH or not) crafty or making work with the #DHmakes hashtag, and getting supportive community feedback. Resulting collaborations have included conference sessions and a journal article. For an in-depth explanation of #DHmakes’s history, rationale, goals, and examples, see the peer-reviewed article I recently co-authored with Quinn Dombrowski and Claudia Berger on the topic.

Hey! I’m Amanda Wyatt Visconti (they/them). I’m Director of the Scholars’ Lab at the University of Virginia Library.

My background’s in librarianship, literature, and textual scholarship, so a lot of my making is reading- or book-adjacent. I know the ways we do and share knowledge work can take really any format, as can the things that influence our scholarly thinking. I have been informed or inspired by, for example, a literal bread recipe; fictional creative work that explores new possibilities or conveys an ethos I took back to my research; tutorials, informal discussions, datasets, infrastructural and administrative work, zines, social media posts, and countless other ways humans create and share thinking*.

First slide from my DH2024 #DHmakes talk, showing screenshots of my zine grid and zine database, and saying "to amplify & credit more formats of knowledge: data => making!"

Why make book-adjacent prototypes?

“Generous” citation—in whom we cite, and what formats of work we cite—is actually just accurate citation. Academia routinely lags in citing all the emails, attended conference talks, social media posts, elevator conversations, podcasts, reviewer comments, and more that inspire and inform our scholarship. In my particular context of a library-based lab: physical scholarship displays in academic libraries tend to exclude relevant reads that aren’t in a print scholarly book or journal format.

It’s hard to display many of the formats I just listed, but also many people don’t think of them as worth displaying? This sends a message that some scholarly formats or methods are lesser, or not relevant to the building and sharing of knowledge. We know there’s systemic racism, sexism, and other harms in publishing and academia. Limiting ourselves to displaying and amplifying just some of the most gatekept formats of knowledge sharing—books and journal articles—fails at presenting a welcoming, inclusive, and accurate picture of what relevant work exists to inform and inspire around a given topic.

So, I’ve been using making projects to change what scholarly formats and authors the Scholars’ Lab will be able to amplify in its public space…

Data-driven research making

I started by focusing on collecting and describing a variety of DHy digital and physical zines, though I hope to expand the dataset to other formats eventually. (Briefly, you can think of zines as DIY self-published booklets, usually intended for replication and free dissemination, usually in multiple copies as opposed to some artists’ books being single-copy-only or non-replicable.) In the upper-left of the slide is a slice of my digital “zine quilt”, a webpage grid of zine covers from zines in my collection.

Second slide from my DH2024 #DHmakes talk, showing photos of my digital zine cover grid, themed reading card decks, a notebook open to design drawings, and a pile of makerspace supplies including a neon loop and a book cover

Having a richly described zine-y database I know by heart, because I researched and typed in every piece of it, has opened my eyes to ways data can suggest data-based research making.

I’ve got 3 crafting projects based on this zine database so far:

1st, I created a playing card deck that fits in a little case you can slip into your pocket. Each card has the title and creators of a zine, and a QR code that takes you to where you can read the zine for free online. This lets me hand out fun little themed reading lists or bibliographies, as shuffle-able card decks… or potentially play some really confusing poker, I guess?
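A minimal sketch of the QR-code step on those cards, assuming the qrcode Python library (the title and URL below are placeholders):

```python
# Minimal sketch: generate the QR code image for one card.
# The title and URL are placeholders; qrcode.make returns a PIL image.
import qrcode

title = "Example Zine Title"                     # placeholder
read_url = "https://example.org/read-this-zine"  # placeholder

img = qrcode.make(read_url)
img.save(f"cards/{title.lower().replace(' ', '-')}-qr.png")
```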

2nd, I’m learning to work better with LEDs, sheet acrylic, and glass by reverse-engineering a simpler and less gorgeous version of Aidan Kang’s Luminous Books art installation. Kang’s sculptures fill shelves with translucent, glowing boxes that are shaped and sized like books, with colorful book covers. I’ve been prototyping with cardboard, figuring out how to glue glass and acrylic securely, and figuring out programmable lights so I can make these book-shaped boxes pulse and change color. I hope to design and print fake “covers” for non-book reads like a DH project or a dataset. This would let me set these glowy neon fake books on our real book shelves, where the colored light might draw people to look at them and follow a link to interact with the read further.

3rd, I’m hooking up a tiny thermal printer, like the ones that print receipts, to a Raspberry Pi and small display screen. I’m hoping to program a short quiz people can take, that makes the printer print out a little “receipt” of reading recommendations you can take away, based on metadata in my reading database. I’d been working to construct a neon acrylic case that looks like a retro Mac to hold the display and printer, again figuring out how to make a simpler approximation of someone else’s art, in this case SailorHg’s “While(Fruit)”. But naming my collection a “Zine Bakery” got me excited about instead hiding the receipt printer inside a toaster, so the receipt paper could flow out of one of the toaster’s bread holes. You can read more about these book-adjacent making projects at TinyUrl.com/BookAdjacent, or the zine project at ZineBakery.com.
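A minimal sketch of the printing step, assuming the python-escpos library and a USB thermal printer (the vendor/product IDs and the recommendation list below are placeholders):

```python
# Minimal sketch: print a receipt-style reading list on a USB thermal printer.
# The USB vendor/product IDs and the zine list are placeholders.
from escpos.printer import Usb

printer = Usb(0x04B8, 0x0202)  # placeholder vendor/product IDs
printer.text("We think you'll enjoy these reads:\n\n")
for zine in ["Example Zine One", "Example Zine Two"]:  # placeholder quiz results
    printer.text(f"* {zine}\n")
printer.cut()
```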

Unrelatedly: resin!

Completely unrelated to reading: I’ve been learning how to do resin casting! You can think of resin like chemicals you mix up carefully, pour carefully into molds over multiple days and multiple layers of pouring with various pigments and embedded objects, and carefully try not to breathe. It hardens into things like this silly memento mori full-size skull I made, where I’ve embedded novelty chatter teeth and a block of ramen for a brain. Or for this necklace, I embedded multicolor LED bulbs in resin inside of D&D dice molds.

Third slide from my DH2024 #DHmakes talk, showing photos of a translucent frosted resin skull with a ramen brain and chatter teeth, and a light-up D&D dice necklace

(See my recent post on resin casting for more about this work!)

Come #DHmakes with us!

I’ve discovered I really like the experience of learning new crafts: what about it is unexpectedly difficult? How much can I focus on the joy of experimenting and learning, and grow away from frustration that I can’t necessarily make things that are pretty or skillful yet? So I’ve got a weird variety of other things cooking, including fixing a grandfather clock, building a small split-flap display like in old railway stations (but smaller), mending and customizing clothes to fit better, prototyping a shop-vac-powered pneumatic tube, carving and printing linoleum, and other letterpress printing.

To me, the digital humanities is only incidentally digital. The projects and communities I get the most from take a curious and capacious approach to the forms, methods, and fields we can learn from and apply to pursue knowledge, whether that’s coding a website or replicating a historical bread baking recipe. #DHmakes has helped me bring more of that commitment to experimentation into my life. And with that comes the joy of making things, being creative, and having an amazing supportive community that would love y’all to share whatever you’re tinkering with using the #DHmakes hashtag, so I hope you join us in doing that if you haven’t already!

* Some of the text of this talk is replicated from my Spring 2024 peer-reviewed article, “Book Adjacent: Database & Makerspace Prototypes Repairing Book-Centric Citation Bias in DH Working Libraries”, in the DH+Lib Special Issue on “Making Research Tactile: Critical Making and Data Physicalization in Digital Humanities”.

Event: The RDM Open House

7 August 2024, 22:06

Data are the lifeblood of research, and good research data management (RDM) leads to reliable results, increased visibility, and greater impact. To support researchers in implementing high-quality RDM practices, the symbolic doors to our RDM support at KU Leuven will be pushed wide open from the 25th to 29th of November to celebrate best practices, tools, and collaboration during The RDM Open House. The Research Data Management Competence Centre of KU Leuven invites everyone to join for training sessions, workshops, and open discussions. Whether you’re an early career researcher, a seasoned academic, research support staff, or a policymaker, our doors are wide open. No prior expertise needed – just curiosity and a desire to enhance your skills in the field of Research Data Management.

Programme

  • Each day focuses on specific RDM topics, from sessions on the basic principles to a metadata tools fair, workshops on data protection, and lectures on data sharing. For more information about the programme, visit the website
  • Knowledge Hub Community Day (28/11): Co-organized with the FRDN and hosted by KU Leuven, this event unites data stewards, RDM support staff, and professionals interested in open and FAIR data.

Practicalities

  • When: 25th to 29th of November 2024. You can pick and choose the days you would like to attend. There is no requirement to participate for the full week.
  • Where: Sessions take place in Leuven’s city center. Some sessions will be organized both in-person and online for broader accessibility.
  • Who: the RDM Open House opens its doors to everyone: from early career researchers and senior academic staff to research support personnel, students, and policymakers, whether affiliated with KU Leuven or external institutions.
  • Learn more about the event on the website
  • Registration: Click here and reserve your spot before November 11th to join us in celebrating open research data and its best practices!

CfP »Linked Open Data and Literary Studies« (19–20 November 2024, FU Berlin)

13 June 2024, 22:37

The international conference »Linked Open Data and Literary Studies« offers researchers a platform for exchanging insights, methods, and best practices for the use of LOD technologies in literary studies.

More and more institutions, including archives, libraries, and universities, are enriching their datasets with metadata and publishing them as Linked Open Data (LOD) on the semantic web. This trend also benefits the (digital) humanities, including literary studies, and makes it easier to incorporate metadata-driven approaches in the field. LOD can expand the possibilities for exploring the global dimensions of literature.
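To make the kind of work in scope concrete: a minimal sketch, assuming the SPARQLWrapper Python library and the public Wikidata endpoint (the particular author and entity/property IDs are illustrative, not from the call), of querying the semantic web for literary metadata:

```python
# Minimal sketch: query Wikidata's public SPARQL endpoint for literary metadata.
# IDs used: Q5879 = Johann Wolfgang von Goethe, P50 = author,
# P31 = instance of, Q7725634 = literary work.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
sparql.setQuery("""
SELECT ?work ?workLabel WHERE {
  ?work wdt:P31 wd:Q7725634 ;   # a literary work ...
        wdt:P50 wd:Q5879 .      # ... authored by Goethe
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["workLabel"]["value"])
```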

Possible conference topics:

  • literary metadata and knowledge graphs,
  • semantic annotation of literary texts,
  • LOD and underrepresented literary traditions,
  • limits of linked data modelling,
  • LOD repositories,
  • publishing and querying LOD,
  • visualization techniques for exploring literary metadata,
  • data reconciliation,
  • theoretical foundations of LOD for literary studies,
  • case studies on the use of LOD in literary studies projects,
  • possibilities and limits of using linked data for research on literary genres and periods,
  • semantic search & recommendation systems,
  • AI-driven semantic annotation.

Please send your abstracts (200–300 words) and short biographies to Frank Fischer (fr.fischer@fu-berlin.de) by 24 June 2024.

Notification of acceptance will be given by 8 July 2024. Presentations are planned to run about 20 minutes, followed by a 10-minute discussion.

The conference is organized by Research Area 5 of the Cluster of Excellence 2020 »Temporal Communities« and will take place on 19 and 20 November 2024 at Freie Universität Berlin.

The full (English-language) call for papers can be found on the Cluster of Excellence’s homepage:
https://www.temporal-communities.de/calls/papers/cfp-open-data.html

Training: RDM for Humanities and Social Sciences

12 March 2024, 22:12

RDM covers a wide range of subjects, with extensive information that requires practical implementation. Within KU Leuven there are training sessions specifically designed to cultivate practical RDM skills. For researchers in the field of Humanities and Social Sciences, we recommend these upcoming training sessions to get yourself acquainted with RDM.

These events are only open to KU Leuven researchers and staff.

RDM Workshop for PhDs in Humanities and Social Sciences

Program

Research data management (RDM) refers to how you handle your data during and after your research project to ensure they are well organized, structured, of high quality, and Findable, Accessible, Interoperable, and Reusable (FAIR). During this session you will learn best practices for the management of research data according to the FAIR data principles. We consider the technical, legal, and ethical aspects of research data, secure storage of materials, documentation and metadata, research data sharing, reusing data shared by others, and more. This solid grounding in basic RDM skills will help you make informed decisions on how to handle your research data. Additionally, you will learn how to write and maintain your own Data Management Plan (DMP).

Practicalities

  • When: 21 March 2024, 14h00–16h00
  • Where: Online
  • For who: This training is mainly aimed at doctoral researchers, preferably at the start of their PhD or project.
  • Price and registration: free but registration is mandatory
  • More info: Click here.

Workshop Documentation & Metadata in Humanities and Social Sciences

Program

In this workshop we will focus on documentation and metadata. Through an introductory presentation, interactive exercises, polls, and brainstorms, participants will cover the following topics: organising files and folders, identifying information within data files and in datasets, searching for a metadata standard, metadata schemas, and depositing data in the institutional data repository RDR.

Practicalities

  • When: 18 April 2024, 13h00–16h00
  • Where: Physical event at AGORA, M00.E67 Collaborative Study Space
  • For who: This workshop is intended for researchers who need to know the basics of documentation & metadata.
  • Price and registration: free but registration is mandatory
  • More info: Click here.

Data Visualization: On and Off the Screen

By Mac Scott
4 October 2018, 00:33
It’s easy to consider digital rhetoric and writing in terms of always-advancing computer technologies. This isn’t inaccurate, and keeping our fingers on the pulse regarding the rhetorical affordances of new software makes for innovative digital writing, research, and pedagogy. At the same time, however, it’s helpful to remember that digital rhetoric is more than what’s […]

Accessible Data Visualizations

3 January 2018, 23:18
Are you reading this blog post from a computer screen or a screen reader? Did you need to adjust the font or text size or the screen brightness, filter the interface through a browser extension, or rely on an app like Accessibility to access this information? In her entry on “Access” in Keywords for Disability Studies, Bess Williamson […]

When Data Visualization Goes Wrong and Numbers Mislead

29 December 2017, 22:34
Source image: The Most Misleading Charts of 2015, Fixed on Quartz. To some students and readers, one of the rhetorical effects of data visualization is that the mere presence of a pie chart, graph, or timeline on a page confers “legitimacy” to an argument. At worst, this gesture attempts to obfuscate weak evidence. At best, […]