Byron Wallace

Data mining
Machine learning
Natural language processing

PhD in Computer Science, Tufts University
BS in Computer Science, University of Massachusetts at Amherst

Byron Wallace is an associate dean of graduate programs, director for the undergraduate data science program, and the Sy and Laurie Sternberg Interdisciplinary Associate Professor in the Khoury College of Computer Sciences at Northeastern University, based in Boston.

Wallace’s research areas include artificial intelligence, data science, machine learning, natural language processing, and information retrieval, with an emphasis on applications in health informatics. He is a member of the applied machine learning group and the Data Science and Analytics Lab at Northeastern. Before joining Northeastern, Wallace taught at Brown University and the University of Texas at Austin.

Wallace develops machine learning and natural language processing methods that make synthesizing the vast biomedical evidence base more efficient. He also works on core machine learning and natural language processing methods, with his recent work delving into convolutional neural network architectures for text. Wallace has also recently been developing hybrid, interactive human–machine learning systems that aim to combine human and machine intelligence.

Wallace's work has been supported by grants from the Army Research Office, the NIH, and the NSF. He won the Tufts University 2012 Outstanding Graduate Researcher award, and his thesis work was recognized as the runner-up for the 2013 ACM Special Interest Group on Knowledge Discovery and Data Mining Dissertation Award. Wallace also co-authored the winning submission for the Health Care Data Analytics Challenge at the 2015 IEEE International Conference on Healthcare Informatics.

Published: August 29th, 2016
RobotReviewer

Lead PI: Byron Wallace

Published: November 1st, 2024
Learning from Natural Language Explanations for Generalizable Entity Matching

Citation: Somin Wadhwa, Adit Krishnan, Runhui Wang, Byron C. Wallace, Luyang Kong. (2024). Learning from Natural Language Explanations for Generalizable Entity Matching EMNLP, 6114-6129. https://aclanthology.org/2024.emnlp-main.352
Published: November 1st, 2024
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs

Citation: Sheridan Feucht, David Atkinson, Byron C. Wallace, David Bau. (2024). Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs EMNLP, 9727-9739. https://aclanthology.org/2024.emnlp-main.543
Published: August 1st, 2024
Open (Clinical) LLMs are Sensitive to Instruction Phrasings

Citation: Alberto Mario Ceballos-Arroyo, Monica Munnangi, Jiuding Sun, Karen Y. C. Zhang, Denis Jered McInerney, Byron C. Wallace, Silvio Amir. (2024). Open (Clinical) LLMs are Sensitive to Instruction Phrasings BioNLP@ACL, 50-71. https://aclanthology.org/2024.bionlp-1.5
Published: August 1st, 2024
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence

Citation: Sebastian Joseph, Lily Chen, Jan Trienes, Hannah Louisa Göke, Monika Coers, Wei Xu, Byron C. Wallace, Junyi Jessy Li. (2024). FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence ACL (1), 8437-8464. https://aclanthology.org/2024.acl-long.459
Published: August 1st, 2024
InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification

Citation: Jan Trienes, Sebastian Joseph, Jörg Schlötterer, Christin Seifert, Kyle Lo, Wei Xu, Byron C. Wallace, Junyi Jessy Li. (2024). InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification ACL (1), 4263-4294. https://aclanthology.org/2024.acl-long.234
Published: June 20th, 2024
Investigating Mysteries of CoT-Augmented Distillation

Citation: Somin Wadhwa, Silvio Amir, Byron C. Wallace. (2024). Investigating Mysteries of CoT-Augmented Distillation CoRR, abs/2406.14511. https://doi.org/10.48550/arXiv.2406.14511
Published: June 1st, 2024
Towards Reducing Diagnostic Errors with Interpretable Risk Prediction

Citation: Denis Jered McInerney, William Dickinson, Lucy C. Flynn, Andrea Young, Geoffrey Young, Jan-Willem van de Meent, Byron C. Wallace. (2024). Towards Reducing Diagnostic Errors with Interpretable Risk Prediction NAACL-HLT, 7193-7210. https://doi.org/10.18653/v1/2024.naacl-long.399
Published: March 29th, 2024
On-the-fly Definition Augmentation of LLMs for Biomedical NER

Citation: Monica Munnangi, Sergey Feldman, Byron C. Wallace, Silvio Amir, Tom Hope, Aakanksha Naik. (2024). On-the-fly Definition Augmentation of LLMs for Biomedical NER CoRR, abs/2404.00152. https://doi.org/10.48550/arXiv.2404.00152
Published: January 16th, 2024
Function Vectors in Large Language Models

Citation: Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau. (2024). Function Vectors in Large Language Models ICLR. https://openreview.net/forum?id=AwyxtyMwaG
Published: January 16th, 2024
Evaluating the Zero-shot Robustness of Instruction-tuned Language Models

Citation: Jiuding Sun, Chantal Shaib, Byron C. Wallace. (2024). Evaluating the Zero-shot Robustness of Instruction-tuned Language Models ICLR. https://openreview.net/forum?id=g9diuvxN6D
Published: December 1st, 2023
Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews

Citation: Hye Sun Yun, Iain James Marshall, Thomas A. Trikalinos, Byron C. Wallace. (2023). Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews EMNLP, 10122-10139. https://aclanthology.org/2023.emnlp-main.626
Published: November 8th, 2023
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State

Citation: Pal, K., Sun, J., Yuan, A., Wallace, B.C., & Bau, D. (2023). Future Lens: Anticipating Subsequent Tokens from a Single Hidden State. ArXiv, abs/2311.04897.
Published: May 23rd, 2023
Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations

Citation: Lucy Lu Wang, Yulia Otmakhova, Jay DeYoung, Thinh Hung Truong, Bailey Kuehl, Erin Bransom, Byron C. Wallace. (2023). Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations ACL (1), 9871-9889. https://aclanthology.org/2023.acl-long.549
Published: May 10th, 2023
Summarizing, Simplifying, and Synthesizing Medical Evidence using GPT-3 (with Varying Success)

Citation: Chantal Shaib, Millicent L. Li, Sebastian Joseph, Iain James Marshall, Junyi Jessy Li, Byron C. Wallace. (2023). Summarizing, Simplifying, and Synthesizing Medical Evidence using GPT-3 (with Varying Success) ACL (2), 1387-1407. https://aclanthology.org/2023.acl-short.119
Published: May 8th, 2023
Revisiting Relation Extraction in the era of Large Language Models

Citation: Somin Wadhwa, Silvio Amir, Byron C. Wallace. (2023). Revisiting Relation Extraction in the era of Large Language Models CoRR, abs/2305.05003. https://doi.org/10.48550/arXiv.2305.05003
Published: December 7th, 2022
That’s the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data

Citation: Denis Jered McInerney, Geoffrey Young, Jan-Willem van de Meent, Byron C. Wallace. (2022). That's the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data EMNLP, 3626-3648. https://aclanthology.org/2022.emnlp-main.238
Published: January 1st, 2022
Evaluating Factuality in Text Simplification

Citation: Ashwin Devaraj, William Sheffield, Byron C. Wallace, Junyi Jessy Li. (2022). Evaluating Factuality in Text Simplification ACL (1), 7331-7345. https://doi.org/10.18653/v1/2022.acl-long.506
Published: January 1st, 2022
PHEE: A Dataset for Pharmacovigilance Event Extraction from Text

Citation: Zhaoyue Sun, Jiazheng Li, Gabriele Pergola, Byron C. Wallace, Bino John, Nigel Greene, Joseph Kim, Yulan He. (2022). PHEE: A Dataset for Pharmacovigilance Event Extraction from Text EMNLP, 5571-5587. https://aclanthology.org/2022.emnlp-main.376
Published: June 6th, 2021
On the Impact of Random Seeds on the Fairness of Clinical Classifiers

Citation: Silvio Amir, Jan-Willem van de Meent, Byron C. Wallace. (2021). On the Impact of Random Seeds on the Fairness of Clinical Classifiers NAACL-HLT, 3808-3823. https://doi.org/10.18653/v1/2021.naacl-main.299
Published: April 25th, 2017
Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization

Citation: Ye Zhang, Matthew Lease, Byron C. Wallace