Sept. 11, 2020

Biology Major Researches How Computers Learn to Identify Genetic Material

When Breanna Arbanas’ biology research project fizzled in Fall 2019, she was frustrated. The data wasn’t producing any significant results. But it was one of many lessons the current senior was to learn about research.

Breanna Arbanas ’21 is using machine learning to identify human adenoviruses. “Machine learning is so new that every time a mistake came up I couldn’t Google it,” she said. “Dr. Pad and I would have to analyze the issue and work through it together.”

“This research helped me learn that there are so many things that are going to go wrong, and you just have to take a step back and change your ideology about it and you keep moving forward,” Arbanas said. “I think that about research is fascinating but also frustrating at the same time.”

When Mahadevan, her research mentor, suggested machine learning as a topic, all Arbanas could think of were horror films in which computers took over the world and artificial intelligence was malevolent. But she had recently attended a symposium on artificial intelligence and thought she could focus the project on medical research, given her career goal of becoming a physician.
Machine learning is a subset of artificial intelligence in which a computer improves its performance on a task by learning from data rather than by following explicitly programmed rules. Arbanas’ research project tests whether a computer program can identify which of the many strains of the human adenovirus a sample belongs to. Similar research has been done with viruses like HIV and HPV, all with the goal of finding a faster and easier way for researchers to identify unknown strains of these viruses, which could lead to quicker treatment plans for patients.
“Once you know what something is, it’s automatically easier to treat. It’s automatically easier to understand how to assess the patients better,” Arbanas said. “If we can advance (the diagnostic) process at all, anything following will be easier.”
Human adenovirus infection typically results in fevers, upper respiratory tract symptoms and conjunctivitis, according to the World Health Organization. Arbanas used GenBank, the National Institutes of Health’s free, public-access collection of all publicly available DNA sequences, and found 754 sequences (or varieties) of the genetic material (or genome) of the human adenovirus that they could use. She also used CASTOR, a free, publicly available tool created to teach computers how to identify and classify viral genomes.
After trial and error (she had tried two other tools before CASTOR), Arbanas was able to determine that machine learning shows potential to be used to organize samples of the human adenovirus, matching them to their correct strains. Her results speak in favor of using machine learning as a tool for medical research.
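For readers curious what “matching samples to their correct strains” can look like in code, here is a minimal sketch. It is not CASTOR’s method or Arbanas’ actual workflow: it assumes a simple stand-in technique (counting short DNA “words,” called k-mers, and assigning a new sequence to the group whose average profile it most resembles), and the sequences and the labels `type_A` and `type_B` are invented toy data, not GenBank records.

```python
import math
from collections import Counter


def kmer_profile(seq, k=3):
    """Return the relative frequency of each k-letter word in a DNA sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(v * b.get(key, 0.0) for key, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled_sequences):
    """Average the k-mer profiles of each label to get one centroid per label."""
    sums, sizes = {}, Counter()
    for seq, label in labeled_sequences:
        acc = sums.setdefault(label, Counter())
        for kmer, freq in kmer_profile(seq).items():
            acc[kmer] += freq
        sizes[label] += 1
    return {
        label: {kmer: total / sizes[label] for kmer, total in acc.items()}
        for label, acc in sums.items()
    }


def classify(seq, centroids):
    """Assign a sequence to the label whose centroid is most similar."""
    profile = kmer_profile(seq)
    return max(centroids, key=lambda label: cosine(profile, centroids[label]))


# Invented toy sequences (hypothetical, far shorter than real viral genomes).
TOY_TRAINING = [
    ("ATATATTAATTA", "type_A"),
    ("TTAATATATAAT", "type_A"),
    ("GCGCGGCCGCGC", "type_B"),
    ("CGGCGCGCCGGC", "type_B"),
]

centroids = train(TOY_TRAINING)
print(classify("ATTATAATATTA", centroids))
print(classify("GGCGCGCCGCGG", centroids))
```

The “learning” here is only the averaging step in `train`: the program is never told a rule for telling the groups apart, it derives one from labeled examples. Real tools work with far larger genomes and more sophisticated features and models.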
Poster describing the research, “Classification of Human Adenoviruses Using Machine Learning”

“Since human adenoviruses are an important human pathogen and cause a variety of disease, accurately classifying these viruses is an important step in better understanding the pathogenicity and epidemiology of this virus,” Mahadevan said.
Arbanas learned much of the work on the go and through Mahadevan’s mentorship. She was familiar with GenBank from the genetics course she had taken with him previously. She is in his bioinformatics course this semester and is sailing through it, since her summer research project was essentially an experiential bioinformatics course.
“(Studying) bioinformatics requires attention to detail, good quantitative skills, ability to troubleshoot and solve problems, as well as perseverance and a willingness to learn new computational techniques/methods,” Mahadevan said. “Breanna is an extremely hard-working student, who pays attention to detail and learns new material very quickly. She has a very positive attitude which is necessary to succeed in research, and indeed in life. She is very independent, and I could not have asked for a better research student.” 
Arbanas’ experience has broadened her view of the impact she could have as a doctor. 
“I used to think the medical field meant you wear a white coat and you see patients. But there’s so much more behind the scenes that you can do that at the end of the day will help a patient,” said Arbanas, a biology major and chemistry minor from Jefferson, GA. “It made me wonder what can I contribute beyond seeing a patient, diagnosing and treating them?” 
“Start that conversation and get in as early as possible. My greatest advice is to be bold,” Arbanas said. “A lot of my professors have told me they love working at UT because they love doing undergraduate research with students. Be open, be bold and get your foot in the door. Just like with research, you can try and you might fail but that doesn’t mean you have to stop. Keep going.”