The data inside us

Yang Dai

Bioinformatics and the future of medicine

In 1990, scientists around the world began a project so complex it seemed unachievable. Called the Human Genome Project, its objective was to decipher the entire set of human DNA. It took thirteen years, but they accomplished the task, uncovering the structure and organization of the roughly 3 billion DNA base pairs that make up the human genome.

Today, this knowledge is the baseline used to discover more about how the human body works, from the ways in which embryos develop in the womb to what can happen in our cells to cause disease. It has also given health care providers the ability to create individualized treatment plans with the help of genome sequencing, moving us toward a new age of personalized medicine.

The project’s success and its lasting impact on society owe much to the earliest stages of bioinformatics, a field that has been advancing rapidly ever since.

What is bioinformatics?

Bioinformatics combines two seemingly opposite disciplines: the study of the natural and living world with computers, machine learning, and big data. Its main goal is to discover more about biological processes by using software programs that can turn data sets into meaningful information. Most often, that data involves DNA sequences, which can be used to develop major research areas such as gene mutation and drug delivery.
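As a rough illustration of what turning sequence data into meaningful information can look like, the sketch below compares two short DNA strings. It is a minimal example under stated assumptions, not a method used by any of the researchers quoted here: the function names `gc_content` and `point_mutations` and both sequences are invented for illustration, and real analyses rely on sequence alignment and far larger data sets.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def point_mutations(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (position, reference_base, sample_base) wherever the two differ.

    Assumes the sequences are already aligned and of equal length;
    positions are zero-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i, r, s) for i, (r, s) in enumerate(pairs) if r != s]


# Made-up sequences, for illustration only.
reference = "ATGGCCATTG"
sample = "ATGGCTATTA"

print(gc_content(reference))               # 0.5
print(point_mutations(reference, sample))  # [(5, 'C', 'T'), (9, 'G', 'A')]
```

Even at this toy scale, the output is a summary (where the sample differs from the reference, and how) rather than raw letters, which is the kind of step that lets researchers ask questions about mutation.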

One example of bioinformatics in action is a recent UIC study that used advanced computational algorithms to study HIV genes. Jie Liang, a Richard and Loan Hill professor in the biomedical engineering department, and his colleagues found a genetic switch that causes latent HIV inside cells, which is normally invisible to the immune system, to start replicating. Once the virus is replicating, the immune system can find it and completely eradicate it from the body. The work could help doctors eventually come up with a cure for HIV.

Research projects such as this one can give scientists a better understanding of how diseases work, which in turn gives them insight into how doctors can treat, or even cure, those diseases. These transformative insights are driving huge growth in the global bioinformatics market. A recent report by Brand Essence, a global market research firm, found that the market was worth nearly $11 billion in 2021 and is projected to grow to $24 billion by 2028.

Building a collaboration pipeline

Bioinformatics and computational modeling have had a long-established presence at UIC, but the University’s launch of the Center for Bioinformatics and Quantitative Biology, which opened three years ago, has helped highlight the fields. Liang, the director, says that the center has not only helped the College of Engineering recruit faculty members, but has also provided many opportunities to bring together researchers across the university.

“We have seen people with strong expertise in modeling, computation, and engineering on the one hand, and people who are studying complex biological and biomedical problems on the other hand sit down together, exchange notes, identify common interests, and strategize joint collaborations,” Liang said.

One of those faculty members taking advantage of the center is Yang Dai, an associate professor of biomedical engineering. Dai, who came to UIC with a mathematics background, started her career creating models to optimize processes and behaviors in factories and other industry settings. But when she arrived at UIC, she found herself pivoting to biomedical engineering. “We started with molecular profiling of tumors, and then generated high-dimensional data so that researchers could tell the difference between healthy and cancerous cells,” Dai said. “At the time I knew very little about biology, so it was sort of a gut feeling that my mathematical tools would have a lot of value and provide interesting research for me.”

To start, Peñalver and Dai are using more traditional machine learning methods to analyze pregnant women’s electronic medical records to look for patterns, outliers, and other important data that could factor into developing depression. They also plan to integrate multiple omics information, such as the gut microbiota, from a longitudinal study that is following a cohort of women throughout their pregnancies and beyond.
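To make the idea of looking for outliers concrete, here is a minimal sketch that flags unusual values in a list of numbers using the modified z-score, a common robust rule based on the median. It is an assumption-laden illustration, not the researchers’ method: the function name `flag_outliers`, the 3.5 cutoff (a conventional default), and the sample values are all hypothetical, and real medical records would involve many variables and careful preprocessing.

```python
from statistics import median


def flag_outliers(values: list[float], threshold: float = 3.5) -> list[int]:
    """Return the indices of values whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation from the median.
    """
    if not values:
        return []
    center = median(values)
    mad = median(abs(x - center) for x in values)
    if mad == 0:
        # More than half the values are identical; this rule cannot rank them.
        return []
    return [
        i
        for i, x in enumerate(values)
        if abs(0.6745 * (x - center) / mad) > threshold
    ]


# Hypothetical measurements for seven records, invented for illustration.
measurements = [4, 5, 5, 6, 5, 4, 19]
print(flag_outliers(measurements))  # [6]
```

A median-based rule is used here because a single extreme value distorts the mean and standard deviation, which can hide the very outlier being sought.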

Dai and Peñalver’s long-term goal is to create an unbiased method for diagnosing perinatal depression, which would help address the inadequate care that an estimated 95 percent of women who have perinatal depression receive. Reasons for this inadequate care include the stigma around depression, which leaves many women reluctant to tell family and health care providers, as well as lack of insurance and fear that some treatments can be harmful for the fetus.

Creating this method as well as extracting clinically relevant information is a monumental challenge. But from this challenge could emerge huge benefits for women’s reproductive and mental health, Peñalver said.

“Machine learning is amazing. It can help us combine such complex data sets, but we still have a ways to go,” she added. “Right now, our technology is pretty good at predicting a specific outcome, like gathering a data set and using it to see who has cancer and who doesn’t have cancer. But data integration over time, over a long process like pregnancy, is not there yet.”

A field with big potential

Dai noted that the bioinformatics field has changed greatly since the work on the Human Genome Project. While the tools and computational technology have grown exponentially since 1990, so have the problems that scientists are now trying to solve. She added that the low-hanging fruit for straightforward projects — what makes up our DNA, for example — has all been collected.

Still, she is confident the insights will continue to come — as are many of her colleagues, including Meishan Lin, the Center for Bioinformatics and Quantitative Biology’s assistant director.

“It is inevitable that bioinformatics and quantitative biology will play a more and more important role in medical and biological research,” Lin said. “Advancement in experimental techniques is driving the growth of biological data. While all of this new growth is creating a huge challenge to handle the overwhelming amount of data, it certainly creates opportunities for researchers to develop more accurate and robust models that could lead to real breakthroughs.”