Fahim Faisal


Hi there, I’m Fahim!

I am a Ph.D. candidate in the Department of Computer Science at George Mason University, working with the GMU NLP group under the supervision of Dr. Antonios Anastasopoulos. My research centers on making language models work for the world’s underrepresented languages, particularly in low-resource settings where data is scarce and the stakes are high. More broadly, I am interested in the intersection of language, culture, and computation, and increasingly in AI-driven scientific discovery and in evaluating LLMs as their capabilities expand into new frontiers.

I am currently on the job market, seeking faculty positions in NLP/LLMs as well as research scientist roles. I expect to graduate between Spring and Fall 2026. (CV)

News

Aug 30, 2025 One paper on dialectal toxicity detection accepted to Findings of EMNLP 2025.
Aug 22, 2025 Wrapped up my summer internship at Zoom, where I worked on improving the multilingual reasoning capabilities of LLMs using a high-resource expert model.
Nov 1, 2024 I am at EMNLP 2024, presenting my paper at the MRL Workshop.
Aug 25, 2024 DialectBench received the Best Social Impact Award at ACL 2024.
Aug 15, 2024 Wrapped up my summer internship at eBay, where I worked on creating a policy-aligned synthetic dataset for e-commerce LLM safety alignment.

Selected Publications

  1. ACL
    DIALECTBENCH: An NLP Benchmark for Dialects, Varieties, and Closely-Related Languages
    Faisal, Fahim, Ahia, Orevaoghene, Srivastava, Aarohi, Ahuja, Kabir, Chiang, David, Tsvetkov, Yulia, and Anastasopoulos, Antonios
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Aug 2024
  2. ACL
    Dataset Geography: Mapping Language Data to Language Users
    Faisal, Fahim, Wang, Yinkai, and Anastasopoulos, Antonios
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) May 2022
  3. EMNLP
    SD-QA: Spoken Dialectal Question Answering for the Real World
    Faisal, Fahim, Keshava, Sharlina, Alam, Md Mahfuz Ibn, and Anastasopoulos, Antonios
    In Findings of the Association for Computational Linguistics: EMNLP 2021 Nov 2021
  4. AACL
    Phylogeny-Inspired Adaptation of Multilingual Models to New Languages
    Faisal, Fahim, and Anastasopoulos, Antonios
    Accepted for publication in AACL 2022 Nov 2022
  5. JOI
    Mining Temporal Evolution of Knowledge Graphs and Genealogical Features for Literature-based Discovery Prediction
    Choudhury, Nazim, Faisal, Fahim, and Khushi, Matloob
    Journal of Informetrics Nov 2020