ISB News

Now researchers can explore genomic data across space and time

The figure above is part of a four-step procedure for the multiscale segmentation of genomic signals.

3 Bullets:

  • Understanding systems from a multiscale perspective gives us a more detailed and holistic view of how features or functions from each scale connect and interact in a given system.
  • The challenge is integrating the different types of information that come from each scale in an efficient way that yields the most insight.
  • ISB developed a new tool to make it easier for researchers to integrate, analyze and visualize human genome data at multiple resolutions.

By Varsha Dhankani

Most living and nonliving systems around us have important features at different resolutions – or scales – of space and time. Understanding such systems from a multiscale perspective gives us a more detailed and holistic view of how features or functions from each scale connect and interact in a given system.

Consider Google or Bing Maps. At the global scale of 30,000 feet, we can locate continents, countries and oceans. As we progressively zoom in, we can view cities, streets, homes and buildings. Having the ability to zoom in or out allows meteorologists, for example, to study weather patterns in a concentrated location or more broadly on the global scale.

Human genomes and the underlying biological mechanisms are no exception to this multiscale spatial and temporal nature. The challenge is integrating the different types of information that come from each scale in an efficient way that yields the most insight.

Journal: Nature Methods, online April 13, 2014

Title: Multiscale representation of genomic signals

Authors: Theo A. Knijnenburg, Stephen A. Ramsey, Benjamin P. Berman, Kathleen A. Kennedy, Arian F.A. Smit, Lodewyk F.A. Wessels, Peter W. Laird, Alan Aderem, Ilya Shmulevich

Scientists at ISB have developed a framework to generate a multiscale mapping of the human genome. The study, “Multiscale representation of genomic signals,” was published online on April 13, 2014, in the journal Nature Methods (in advance of the June print issue). The framework allows researchers to integrate, analyze and visualize information across multiple resolutions of the human genome.

In general, biological information is encoded at scales of varying lengths of the genome, and it is crucial to integrate this information across multiple scales to gain a systems-level understanding of the biological mechanisms in health and disease.

The framework, called multiscale signal representation (MSR), divides the genome into a hierarchy. Single bases sit at the top, and their information percolates progressively into larger DNA segments toward the bottom of the hierarchy. This hierarchy enables researchers to summarize genomic information at multiple genomic scales.
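The idea of summarizing one signal at several scales can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the published MSR algorithm: it assumes fixed windows that double in width at each level and uses a plain average as the summary. The function name `multiscale_means` and the sample values are hypothetical.

```python
from statistics import mean


def multiscale_means(signal, num_scales):
    """Summarize a per-base signal at several scales.

    Level k holds the mean of the signal over consecutive windows of
    2**k positions, so level 0 is the single-base signal itself.
    Assumption: fixed, doubling windows and a plain average; the
    published MSR method is more elaborate than this sketch.
    """
    values = list(signal)
    levels = []
    for k in range(num_scales):
        width = 2 ** k
        # The last window may be shorter if the length is not a multiple of width.
        levels.append(
            [mean(values[i:i + width]) for i in range(0, len(values), width)]
        )
    return levels


# Hypothetical per-base signal, invented for illustration only.
toy_signal = [0, 0, 4, 4, 1, 1, 1, 1]
for scale, level in enumerate(multiscale_means(toy_signal, 4)):
    print(f"window of {2 ** scale} bases: {level}")
```

Each level gives a coarser view of the same positions, which is the sense in which a researcher can zoom out from single bases to larger segments.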

The Science

For example, at a scale of 10 to 100 bases, binding sites on the DNA regulate the transcription of genes, and exons within a gene encode parts of the protein that the gene produces. At a scale of thousands of bases, DNA segments called CpG islands are susceptible to a DNA alteration called methylation, which has been linked with the silencing of tumor suppressor genes. Finally, at a scale of megabases, scientists study tightly packed DNA and protein structures such as heterochromatin and nuclear lamina-associated domains (LADs). Heterochromatin has been associated with gene regulation as well as with protecting the integrity of chromosomes. LADs represent a repressive chromatin environment, indicated by low gene-expression levels in these regions.

The MSR approach was used to analyze both murine (mouse-model) data and human colon cancer data to demonstrate that genomic signals contain functional information at multiple scales. For the murine data, predictive models based on a multiscale representation of histone acetylation (a gene regulation mechanism) predicted the activation state of genes more accurately than conventional single-scale approaches.

Similarly, a multiscale representation of DNA methylation in human colon cancer data revealed that the relationship between DNA methylation and gene activation depends on the scale. DNA methylation has been recognized as an important component in organismal development as well as in cancer onset and progression. The MSR approach confirmed the current understanding that methylation and gene expression are inversely correlated at the scale of promoter regions of DNA.

By studying multiple scales concurrently, scientists at ISB observed a scale-dependent relationship between methylation and gene expression: upregulated genes were associated with hypomethylation at small scales but hypermethylation at larger scales. Understanding such epigenetic processes at different scales can help scientists better understand organismal development and can point to alternative targets for cancer therapy. The MSR approach also saves researchers from the complex task of having to narrow down the genomic areas of interest in advance.
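A scale-dependent relationship of this kind can be checked by computing a correlation separately at each scale. The sketch below assumes that methylation has already been summarized per gene at each scale. The function names, the scale labels and all numbers are hypothetical and invented purely for illustration; they are not results from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_by_scale(methylation_by_scale, expression):
    """Correlate expression with methylation separately at each scale.

    methylation_by_scale maps a scale label to one methylation value
    per gene, in the same gene order as the expression values.
    """
    return {
        scale: pearson(values, expression)
        for scale, values in methylation_by_scale.items()
    }


# Invented toy values for four hypothetical genes (not real data).
expression = [1.0, 2.0, 3.0, 4.0]
methylation = {
    "small scale": [0.8, 0.6, 0.4, 0.2],  # falls as expression rises
    "large scale": [0.3, 0.4, 0.5, 0.6],  # rises as expression rises
}
for scale, r in correlation_by_scale(methylation, expression).items():
    print(f"{scale}: r = {r:+.2f}")
```

In the toy values the sign of the correlation flips between the two scales, which is the pattern a single-scale analysis could not show.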

Results from the MSR method encourage reformulation of biological experiments to expand beyond a predominantly gene-centric view of the genome to understand events that occur at much smaller or larger scales. MSR presents a novel and powerful way to unravel the biological information that can’t be observed at any single scale in isolation.

Just as with Google Maps, scientists can now zoom in and out of the human genome map and make discoveries that were not possible with a single-scale map.

– Theo Knijnenburg contributed to this report

About Varsha Dhankani: Varsha is a bioinformatics scientist in the Shmulevich Lab and a member of the ISB Editorial Board.
