Milad Alshomary
Columbia University, Data Science Institute

Schapiro Center
530 W 120th St
New York, NY 10027
I am a postdoctoral research scientist at the Data Science Institute at Columbia University, with a background in Natural Language Processing (NLP) focusing on human argumentation and the explainability of machine learning models. My research explores the intersection of argumentation and explainability, driven by the increasing need to understand the behavior of AI systems.
I earned my PhD in Computer Science from Paderborn University (July 2018 - December 2023). My doctoral research, conducted under the supervision of Professor Henning Wachsmuth and resulting in the dissertation "Audience-Aware Argument Generation," aimed to advance the effectiveness of argument generation by emphasizing relevance, consideration of the opponent's argument, and the audience's interests. My current postdoctoral research at Columbia University (January 2024 - present), under the supervision of Professor Kathleen McKeown and Professor Smaranda Muresan, focuses on developing methods for authorship attribution, with a particular emphasis on making these models explainable. This work involves studying how humans explain in dialogues and interpreting latent spaces to understand which aspects of style authorship attribution models capture. Recently, I joined a project studying how vision LLMs perform formal analysis of the style of art pieces, a collaboration between computer scientists, science and technology studies (STS) scholars, art historians, and legal scholars.
news
Aug 26, 2025 | I gave a guest talk at the Institute of Artificial Intelligence at Leibniz University Hannover (link). The talk covered my recent work on the explainability of authorship analysis models and how to make them robust for out-of-domain scenarios.
Aug 25, 2025 | Excited to share that our paper, "Generalizable Analysis of Authorial Style by Leveraging All Transformer Layers," has been accepted at EMNLP 2025 as a main conference paper (link). We introduce a novel approach to authorship analysis, achieving state-of-the-art results in out-of-domain scenarios while progressing towards more interpretable models. Looking forward to sharing more insights at the conference!
May 26, 2025 | I gave a guest talk at MBZUAI on the role of explainability and argumentation in the age of AI (link).
Jan 20, 2025 | Our proposal, "Art Images and AI: Latent Space Interpretability, Art History, and the Law," was funded by Columbia's Data Science Institute Seed Funding program (link). We will explore approaches to the explainability of vision LLMs when they perform formal analysis of art pieces.
Jan 20, 2025 | My PhD dissertation received the Dissertation Award 2024 of Paderborn University.
Jun 1, 2024 | I am co-chairing the publication committee for the EMNLP 2024 conference.
Jan 1, 2024 | I started a new position at the Data Science Institute at Columbia University as Postdoctoral Research Scientist. |
Dec 28, 2023 | I successfully defended my PhD thesis at Paderborn University (tweet). The thesis can be found here (link).
Jan 1, 2023 | I am co-organizing the 10th Workshop on Argument Mining (link).
Oct 23, 2022 | I participated in the "Towards a Unified Model of Scholarly Argumentation" seminar at Dagstuhl (link).
Oct 12, 2022 | I attended the COLING 2022 conference to present our paper "A Dialogue Corpus for Learning to Construct Explanations" (link).
Aug 1, 2022 | We are organizing a shared task on identifying human values in argumentative texts at SemEval 2023 (link).
selected publications
- Layered Insights: Generalizable Analysis of Authorial Style by Leveraging All Transformer Layers. arXiv preprint arXiv:2503.00958, 2025
- Latent Space Interpretation for Stylistic Analysis and Explainable Authorship Attribution. In Proceedings of the 31st International Conference on Computational Linguistics, 2025
- Proceedings of the 10th Workshop on Argument Mining, 2023
- SemEval-2023 Task 4: ValueEval: Identification of Human Values behind Arguments. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), 2023
- The Moral Debater: A Study on the Computational Generation of Morally Framed Arguments. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022
- Toward Audience-aware Argument Generation. Patterns, 2021
- Belief-based Generation of Argumentative Claims. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021