2022 Research Prize Winners

Outstanding Graduate Research in Computational Science

1st Prize: Non-Parallel Text Style Transfer with Self-Parallel Supervision

Rubio Liu (Computer Science) Advisor: Vosoughi

Style is the packaging in which writing is presented, often influencing the reader's impression of the text's content. Text style transfer models can unwrap a text's packaging and repackage the same content in a different style. However, the performance of existing text style transfer models is severely limited by the non-parallel datasets on which they are trained. In non-parallel datasets, no direct mapping exists between sentences of the source and target style; the style transfer models thus receive only weak supervision of the target sentences during training, which often leads them to discard too much style-independent information or to fail outright at transferring the style. In this work, we propose LaMer, a novel text style transfer framework based on large-scale language models. LaMer first mines roughly parallel expressions in the non-parallel datasets with scene graphs, and then employs MLE training, followed by imitation learning refinement, to leverage the intrinsic parallelism within the data. On three benchmark datasets, LaMer outperforms many baselines, and extensive human evaluations further confirm its effectiveness.
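
The mining step can be illustrated with a deliberately crude stand-in: pairing each source-style sentence with the most content-similar target-style sentence by lexical overlap. LaMer itself aligns scene graphs, not bags of words, and all sentences and names below are invented for illustration.

```python
# Crude stand-in for LaMer's parallel-mining step: pair each source-style
# sentence with the most content-similar target-style sentence, measured by
# overlap of content words (the real method compares scene graphs).
STOPWORDS = frozenset({"the", "a", "and", "was", "so"})

def content_words(sentence):
    """Lowercased word set with common function words removed."""
    return {w for w in sentence.lower().split() if w not in STOPWORDS}

def mine_rough_pairs(source, target):
    """Greedily match each source sentence to its best-overlapping target."""
    pairs = []
    for s in source:
        best = max(target, key=lambda t: len(content_words(s) & content_words(t)))
        pairs.append((s, best))
    return pairs

# Invented non-parallel style corpora (negative vs. positive restaurant reviews)
negative = ["the food was cold and bland", "service was so slow"]
positive = ["the food was hot and delicious", "service was quick and friendly"]

print(mine_rough_pairs(negative, positive))
```

The mined rough pairs would then serve as (weakly) parallel training data for the MLE and imitation-learning stages described above.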

2nd Prize: Hierarchical Tumor Immune Microenvironment Epigenetic Deconvolution

Ze Zhang (Quantitative Biomedical Science, Epidemiology) Advisors: Salas & Christensen

The complexity of the tumor microenvironment (TME) impedes the cost-effective deconvolution of TME cells with high resolution, accuracy, and specificity. We developed a novel tumor-type-specific hierarchical algorithm, HiTIMED, to deconvolve seventeen cell types in the TME for twenty different carcinoma types using DNA methylation data in conjunction with a constrained projection quadratic programming approach. HiTIMED promises new avenues for the study of the TME in assessing clinical outcomes. HiTIMED deconvolution is amenable to application in archival tumor biospecimens and provides a cost-effective, high-resolution cell composition profile, enabling new opportunities to study the relation of the TME to etiologic factors, disease progression, and response to therapy.
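
The constrained-projection step can be sketched as a small quadratic program: estimate cell-type proportions w that minimize ||Rw - m||^2 subject to w >= 0 and sum(w) = 1, where R holds reference methylation profiles and m is a bulk sample. The matrix sizes, values, and names below are invented toys, not the HiTIMED implementation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy reference matrix: CpG sites x cell types (sizes and values are invented)
n_cpgs, n_types = 50, 3
R = rng.uniform(0.0, 1.0, size=(n_cpgs, n_types))

# Simulate a bulk tumor sample as a known mixture of the reference profiles
true_w = np.array([0.6, 0.3, 0.1])
m = R @ true_w + rng.normal(0.0, 0.01, n_cpgs)

def sq_residual(w):
    """Squared error between the mixed reference profiles and the sample."""
    r = R @ w - m
    return r @ r

# Constrained projection: proportions are non-negative and sum to one
constraints = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)
bounds = [(0.0, 1.0)] * n_types
w0 = np.full(n_types, 1.0 / n_types)

res = minimize(sq_residual, w0, method="SLSQP",
               bounds=bounds, constraints=constraints)
print(np.round(res.x, 2))  # should land close to the true mixture
```

The simplex constraints are what make the estimates interpretable as cell-type proportions rather than unconstrained regression weights.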

Ze Zhang, John K. Wiencke, Karl T. Kelsey, Devin C. Koestler, Brock C. Christensen & Lucas A. Salas. HiTIMED: Hierarchical Tumor Immune Microenvironment Epigenetic Deconvolution for accurate cell type resolution in the tumor microenvironment using tumor-type-specific DNA methylation data. Cancer Research (under review), May 2022.

3rd Prize: Influence of pH on the Distribution and Evolution of Lipid Cyclization Genes in Extremophiles

Laura Blum (Earth Sciences) Advisor: Leavitt

Single-celled Archaea are well adapted to thrive in extreme environments on Earth, such as hot springs. Optimal function of the cell membrane is crucial to survival in these high-stress environments, which reach the limits of temperature and pH conditions on Earth. We employed bioinformatics techniques to trace genes that form unique cell membrane lipid structures (grs) across hot spring environments. By analyzing genetic datasets from hot springs worldwide, we detected patterns in gene distribution associated with hot spring pH and temperature. Our results inform the relative importance of different environmental pressures in shaping the evolution of Archaeal lineages in diverse ecosystems.

Blum, L.N., Colman, D.R., Eloe-Fadrosh, E.A., Kellom, M., Boyd, E.S., Zhaxybayeva, O., Leavitt, W.D. (2022, May 19). Distribution of GDGT Membrane Lipid Cyclization Genes in Terrestrial Thermal Springs Linked to pH. [Conference Talk]. AGU AbSciCon, Atlanta, GA, U.S.


Outstanding Undergraduate Research in Computational Science

1st Prize: Entropy-based Metrics for Predicting Choice Behavior Based on Local Response to Reward

Ethan Trepka (Psychological and Brain Sciences) Advisor: Soltani

Our research focuses on predicting how animals distribute their choices in response to reinforcement, with the goal of understanding how the brain supports learning and decision making. We developed metrics based on information theory and used them to study the choices that mice and monkeys made in dynamic decision-making tasks. We found that these metrics could explain a significant amount of variance in choice behavior in both mice and monkeys and could be used to construct more accurate reinforcement learning models of choice.
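
One information-theoretic quantity of this flavor can be sketched as the conditional entropy of an animal's stay/switch strategy given whether the previous trial was rewarded. The data and names below are toy illustrations, not the authors' exact metrics or datasets.

```python
import math
from collections import Counter

def conditional_entropy(strategy, condition):
    """H(strategy | condition) in bits, estimated from paired observations."""
    assert len(strategy) == len(condition)
    n = len(strategy)
    joint = Counter(zip(condition, strategy))
    marginal = Counter(condition)
    h = 0.0
    for (c, s), count in joint.items():
        p_joint = count / n            # P(condition, strategy)
        p_cond = count / marginal[c]   # P(strategy | condition)
        h -= p_joint * math.log2(p_cond)
    return h

# Toy trial sequence: 1 = stay (repeat previous choice), 0 = switch,
# paired with whether the previous trial was rewarded
prev_rewarded = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
stayed        = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0]

print(round(conditional_entropy(stayed, prev_rewarded), 3))
```

A perfectly reward-driven win-stay/lose-switch strategy would give zero conditional entropy; noisier, less reward-dependent choice behavior pushes the value toward one bit.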

Trepka, E., Spitmaan, M., Bari, B.A., Costa, V.D., Cohen, J.Y., Soltani, A. Entropy-based metrics for predicting choice behavior based on local response to reward. Nat Commun 12, 6567 (2021). https://doi.org/10.1038/s41467-021-26784-w 

2nd Prize: No Longer Lost in Translation: Lexicon-based Sentiment Analysis for Ancient Translation Variation

Georgina Davis (Classical Studies) Advisor: Glauthier

Within the field of Classical studies, translations play a pivotal role in engaging non-specialists with the ancient world. Since most Classical texts are written in languages that are no longer spoken, the translator occupies a powerful, mediating position between the scholar and the text. Despite the recent rise of computational linguistic methods, little scholarship within the field makes use of them. In response, this paper presents a novel, cross-disciplinary approach to studying Classical literature by applying lexicon-based sentiment analysis to the study of ancient translation variation. Through a series of case studies, this paper examines several factors that influence translation sentiment, including translator identity, author-specific writing style, conformity within translation practice, and generic tendencies of texts.
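
Lexicon-based sentiment scoring of competing translations can be sketched in a few lines. The lexicon entries and the two English renderings below are invented for illustration; a study of this kind would apply a published sentiment lexicon to real translations.

```python
import re

# Toy sentiment lexicon (invented entries, not a published lexicon)
LEXICON = {
    "rage": -3, "wrath": -2, "anger": -2, "glorious": 3,
    "doom": -3, "ruin": -2, "noble": 2, "bright": 1,
}

def sentiment_score(text):
    """Mean lexicon score over the words of the text found in the lexicon."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

# Two invented English renderings of the same ancient line
t1 = "Sing, goddess, the rage of Achilles, and the ruin it brought."
t2 = "Sing, goddess, the noble anger of Achilles, glorious and bright."

print(sentiment_score(t1), sentiment_score(t2))
```

The divergence between the two scores, aggregated over many passages and translators, is the kind of signal the case studies above examine.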

3rd Prize: Language Models are Multilingual Chain-of-Thought Reasoners

Suraj Srivats (Computer Science) Advisor: Vosoughi

Through experimentation on word problems drawing from the realms of arithmetic and symbolic reasoning, we show that multilingual chain-of-thought prompting can improve the reasoning abilities of large models. English back-translation and few-shot evaluation improve individual non-English solve rates, while multilingual ensembling improves aggregate performance on arithmetic reasoning tasks. We also find that a foreign language's solve rate is correlated with the frequency of its examples seen during training, and that translation ability is indicative of solve rate in foreign languages.
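
The few-shot chain-of-thought setup can be illustrated by assembling a prompt from worked exemplars that show their reasoning before the answer. The exemplar and target question below are invented toys, and no model is actually called.

```python
# Invented worked exemplar for few-shot chain-of-thought prompting
EXEMPLARS = [
    {
        "question": ("Roger has 5 balls. He buys 2 cans of 3 balls each. "
                     "How many balls does he have?"),
        "reasoning": ("Roger starts with 5 balls. 2 cans of 3 balls is "
                      "6 balls. 5 + 6 = 11."),
        "answer": "11",
    },
]

def build_cot_prompt(exemplars, question):
    """Prefix the target question with exemplars that spell out their steps."""
    parts = []
    for ex in exemplars:
        parts.append(f"Q: {ex['question']}\n"
                     f"A: {ex['reasoning']} The answer is {ex['answer']}.")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_cot_prompt(
    EXEMPLARS,
    "A pencil costs 3 coins and an eraser costs 4 coins. What do both cost?",
)
print(prompt)
```

In the multilingual setting studied above, the exemplars, reasoning chains, and questions can each be held in the same language or mixed across languages, which is what the ensembling and back-translation comparisons vary.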