Artificial intelligence model searches for interacting gene variants linked to breast cancer

The huge amount of genomic data now available lets researchers work out which gene variants are common among groups of people who have cancer. Hundreds or even thousands of gene variants can influence a single disease. With statistical methods, researchers can estimate how much an individual's gene variants increase that person's disease risk.
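The article does not say which statistical method is meant. One common approach is an additive polygenic risk score, which weights each risk allele a person carries by that variant's estimated effect. The following is a minimal sketch of that idea only: the variant names and effect sizes are hypothetical values chosen for illustration, not results from the research described here.

```python
import math

# Hypothetical per-variant effect sizes (log odds ratios).
# Illustrative numbers only, not taken from any study.
EFFECT_SIZES = {
    "snp_a": math.log(1.20),
    "snp_b": math.log(1.05),
    "snp_c": math.log(0.90),
}


def polygenic_score(genotype, effect_sizes=EFFECT_SIZES):
    """Sum of (risk-allele count x log odds ratio) over the scored variants."""
    total = 0.0
    for snp, weight in effect_sizes.items():
        count = genotype.get(snp, 0)  # a missing genotype is treated as 0 risk alleles
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {snp} must be 0, 1 or 2")
        total += count * weight
    return total


# One person's risk-allele counts (0, 1 or 2 copies per variant).
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
score = polygenic_score(person)

# Odds relative to a person carrying none of the risk alleles.
relative_odds = math.exp(score)
```

An additive score like this treats every variant independently, so it cannot capture variants whose effects depend on one another.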

In addition to gene variants, there are variations at single base pairs along a stretch of DNA. These variations cause differences between individuals, but they can also help to localise disease-causing genes. Such single nucleotide polymorphisms (SNPs) can act as markers that indicate a disease. The artificial intelligence model developed at the University of Eastern Finland searches for interacting SNPs associated with breast cancer.

SNPs can be useful in the search for genetic risk factors for cancer. In biomedical research, SNPs are used to compare regions of the genome between cohorts with and without a disease.
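A comparison between cohorts with and without a disease can be illustrated for a single SNP with a simple allele-count test. The sketch below assumes a standard 2x2 chi-square test and an odds ratio. The function names and the counts are made up for illustration, and this is not the machine learning approach developed by the researchers.

```python
import math


def allele_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts."""
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column of the table must be non-empty")
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def chi_square_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the alternative allele in cases relative to controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Made-up allele counts: (alternative, reference) in cases, then in controls.
counts = (60, 140, 40, 160)
chi2 = allele_chi_square(*counts)
p_value = chi_square_p_value_1df(chi2)
ratio = odds_ratio(*counts)
```

A test like this looks at one SNP at a time. In a genome-wide comparison it would be repeated for each SNP, with a correction for the number of tests performed.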

– When SNPs occur within a gene or in a regulatory region near a gene, they may play a direct role in disease by affecting the gene’s function. We have a novel machine learning approach for identifying groups of interacting SNPs that contribute most to breast cancer risk, says researcher Hamid Behravan from the University of Eastern Finland. He works in Kuopio at the Institute of Clinical Medicine.

– We have published several findings on identifying the genetic component of breast cancer risk. Identifying the breast cancer-associated SNPs that reliably distinguish disease cases from healthy controls may be particularly useful in improving breast cancer risk prediction and developing individual treatment strategies, says Behravan.

Since cancer is a multifactorial disease caused by lifestyle, genetic and environmental factors, analysing genetic variants on their own may not be enough to give a comprehensive view of disease risk. According to Behravan, other sources of data are needed.

– We are developing integrative machine learning approaches to combine different sources of data, such as demographic data.

Read the article on the ELIXIR pages