Exploring the Multifractal Behavior of the Human Genome T2T-CHM13v2.0: Graphical Representations and Cytogenetics



Summary

This study analyzed the multifractal behavior of the human genome T2T-CHM13v2.0 using Chaos Game Representation (CGR) and other methods, revealing distinct distributions and structural patterns in individual chromosomes and the entire genome.

Highlights

  • The study applied CGR to the complete human genomic sequence T2T-CHM13v2.0, analyzing the entire chromosome assembly and each chromosome separately.
  • Multifractal spectra were determined using two types of box-counting coverage, revealing slight variations across most chromosomes.
  • Chromosomes 9 and Y exhibited the greatest differences in singularity (Hölder exponent), with minor variations in their fractal support.
  • The CGR distributions generally showed an approximate separation between coding and non-coding sections, as well as between CpG and GpC islands.
  • A base-by-base analysis of the fractal support of the CGR uncovered characteristic structural bands in chromosome sequences, which align with patterns identified in cytogenetic studies.
  • The study compared two alternative representations: the Binary Genomic Representation (RGB) and the Markov Chain (MC) representation.
  • The optimal fit was achieved using MC for twelve-base chains, yielding an average percentage error of 2% relative to the full genomic assembly.
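
For readers unfamiliar with CGR, the construction mentioned in the highlights can be sketched as follows. This is a minimal illustration rather than the authors' code: the corner assignment in CORNERS, the starting point, and the helper names cgr_points and box_counts are assumptions made for the example, and conventions vary between papers.

```python
# Corner assignment is an assumption; other CGR conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return the CGR trajectory of a DNA string.

    Each new point is the midpoint between the previous point and the
    corner assigned to the current base.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        corner = CORNERS.get(base)
        if corner is None:  # skip N and other ambiguity codes
            continue
        x = (x + corner[0]) / 2.0
        y = (y + corner[1]) / 2.0
        points.append((x, y))
    return points


def box_counts(points, k):
    """Count points in each cell of a 2**k by 2**k grid over the unit square."""
    n = 2 ** k
    counts = {}
    for x, y in points:
        i = min(int(x * n), n - 1)
        j = min(int(y * n), n - 1)
        counts[(i, j)] = counts.get((i, j), 0) + 1
    return counts
```

With this convention, the cell a point falls into on a 2**k grid is determined by the last k bases read, which is why the occupancy of the grid reflects k-mer frequencies in the sequence.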

Key Insights

  • The multifractal analysis revealed that the human genome exhibits a complex and non-uniform structure, with distinct distributions and patterns in individual chromosomes and the entire genome.
  • The CGR method captured the multifractal behavior of the genome effectively, but it was limited by memory usage and computational complexity.
  • The RGB and MC representations offer alternative ways to analyze the genome. MC fit the data better as chain length increased, with the best fit at twelve-base chains (an average percentage error of 2% relative to the full genomic assembly).
  • The study's findings have implications for our understanding of the structure and function of the human genome, and may inform future research in genomics and epigenomics.
  • The use of multifractal analysis and CGR may also have applications in the study of other complex biological systems and datasets.
  • The study highlights the importance of considering the complex and non-uniform nature of biological systems when analyzing and interpreting genomic data.
  • The findings of this study may also have implications for the development of new methods and tools for analyzing and visualizing large-scale genomic data.
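
As a rough illustration of how box counting leads to multifractal quantities, the sketch below estimates a generalized dimension D_q from box occupancies at several box sizes. It is a generic textbook-style estimator, not the procedure used in the paper; the function names and the input layout (a dict mapping box size to a list of per-box counts) are assumptions made for the example. The mass exponent is τ(q) = (q − 1) D_q, and the singularity spectrum f(α) follows from its Legendre transform, with α = dτ/dq and f(α) = qα − τ(q).

```python
import math


def partition_sum(counts, q):
    """Z(q) = sum of p_i**q over occupied boxes, with p_i = count_i / total."""
    total = float(sum(counts))
    return sum((c / total) ** q for c in counts if c > 0)


def generalized_dimension(counts_by_scale, q):
    """Estimate D_q as the least-squares slope of log Z(q) / (q - 1)
    against log(box size).

    counts_by_scale maps each box size eps to the list of point counts
    per box at that size (a hypothetical layout chosen for this sketch).
    """
    if q == 1:
        raise ValueError("q = 1 requires the information-dimension limit")
    if len(counts_by_scale) < 2:
        raise ValueError("need at least two box sizes to fit a slope")
    xs = [math.log(eps) for eps in counts_by_scale]
    ys = [math.log(partition_sum(c, q)) / (q - 1)
          for c in counts_by_scale.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

For a measure spread uniformly over the unit square, this estimator returns 2 for every q other than 1. A multifractal measure instead gives a D_q that varies with q, which is the kind of non-uniformity the study reports.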




Citation

Alvarez-Ballesteros, Y. A., Quiroz-Juarez, M. A., Del-Rio-Correa, J. L., & Escobar-Ruiz, A. M. (2024). Exploring the Multifractal Behavior of the Human Genome T2T-CHM13v2.0: Graphical Representations and Cytogenetics (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2412.16705
