Tools Developed in Wellesley Computer Science Labs Aid Biologists, Medical Researchers in the Study of Diseases

June 21, 2016
An illustration of DNA, RNA and protein. Credit: Janet Iwasa, Creative Commons, via exploringorigins.org

"Owing to advances in genome sequencing technology, the amount of DNA and RNA that is being sequenced is growing rapidly," explained Wellesley computer scientist Brian Tjaden. "A major challenge in the field of genomics is extracting new biomedical insights from this torrent of genome sequencing data."

Tjaden, who is associate professor of computer science and chair of the computer science department, designs approaches for identifying novel genes in a genome, as well as for characterizing the functions of genes and how gene products interact as part of a system that carries out important processes in a cell.

Tjaden's Rockhopper, a comprehensive and user-friendly system for computational analysis of bacterial RNA-seq data, is being used by biologists, genome scientists, and medical researchers to understand which regions of a genome are being activated and when, and to study diseases, including cancer. The system, which generates insights using novel algorithmic methods, has been downloaded more than 7,000 times since Tjaden created it and has an active community of thousands of users.

According to Tjaden, Rockhopper supports the management, processing, analysis, and integration of large sequencing datasets from organisms whose genomes are known and from organisms whose genomes are not yet known. Using sequencing data, Rockhopper identifies new genes and then characterizes the extent to which genes, both new and previously annotated, are expressed.

Another tool from Tjaden's lab, called TargetRNA, identifies mRNA targets of sRNA regulatory action in bacteria. "Many gene products act as regulators, turning on or off other genes. TargetRNA takes the sequence of a regulator and identifies other genes in a genome that are likely targets of regulation," said Tjaden. "While Rockhopper aims to characterize genes and their expression, TargetRNA aims to identify the targets of those genes that act as regulators."

A recent Wellesley Computer Science departmental newsletter described Tjaden as "always on the lookout for students interested in bioinformatics, biotechnology, and algorithms with biomedical applications." A second iteration of the TargetRNA program, called TargetRNA-2, was developed by then-students Mary Beth Kery '15 and Monica Starr Feldman '14. Kery and Feldman are lead authors on a research article describing their work.

Kery, with funding from the Science Center Summer Research Program, had the opportunity to begin development on TargetRNA-2 during the summer after her first year at Wellesley. "Initially, I simply read, since I was new to computational biology and had no bio background," Kery said. "Then I began putting together existing computational biology tools and developing the code for this system. I met weekly with Professor Tjaden to discuss progress, and we would discuss what avenues to try next, and how to interpret wet-lab bio results into computation."

By the end of that summer, Kery said, the team had made significant progress, and TargetRNA-2 was making predictions more accurately and more quickly than all other related state-of-the-art systems. TargetRNA receives approximately 20,000 sequence submissions for analysis from the biology community annually.

Kery, who just completed the first year of her Ph.D. in Human-Computer Interaction at Carnegie Mellon University, said, "I am particularly grateful that I had this opportunity to work with Professor Tjaden [during my] first-year. Researching early on inspired me to continue in research, and helped open new opportunities. I was able to research in different areas of CS throughout my time at Wellesley. Now my dream is to become a professor, too."

Feldman joined the TargetRNA project to design and develop the user interface during her last years at Wellesley. "I took a backend engineering class with Brian and would frequently express interest in the integration of backend engineering with the [user interface]. He saw this interest of mine and gave me an opportunity to contribute to both his research and my passions."

She described her efforts as "putting the frosting on" Tjaden's research. "I turned his algorithm and research into a user-friendly website that anyone could interact with," she said.

Feldman said she came to Wellesley not knowing how software was developed or "even that software was developed." She enrolled in CS110 only after she was unable to enroll in other classes she wished to take. However, she said, "Because of the incredible faculty (always approachable, friendly, nurturing, and encouraging), and the Wellesley Computer Science community, I ended up developing a passion for computing."

Feldman went on to "TA for several courses in the CS Department, help build websites for a few professors, and ended up working on an honors thesis project." She is now a lead software engineer at Apple. "This experience taught me to embrace a change in plans... and not to be afraid to venture into new turf. It also taught me the importance of mentoring and encouraging others," she said.

Tjaden's work is funded in part by grants from the National Science Foundation and the National Institutes of Health.