https://doi.org/10.1140/epje/i2018-11609-8
Regular Article
Analyzing similarities in genome sequences
1 Departamento de Física, Universidade Federal da Paraíba, 58051-970, João Pessoa, PB, Brazil
2 Departamento de Física, Universidade Federal Rural de Pernambuco, 52171-900, Recife, PE, Brazil
3 Departamento de Física, Universidade Federal de Pernambuco, 50670-901, Recife, PE, Brazil
* e-mail: phugof@gmail.com
Received: 24 August 2017
Accepted: 21 December 2017
Published online: 19 January 2018
This article investigates similarity between complete mitochondrial DNA sequences by determining the distribution of the relative frequencies of words of different lengths and characterizing their relevance along the sequences. The degree of similarity is obtained by comparing the distances between words contained in these sequences. Our results indicate that the best groupings among different species depend on the word lengths and their respective relative frequencies. We also observe that the longer the word, the more consistent the grouping of the sequences becomes. Applying these results, together with the prospect of analyzing DNA sequences belonging to a single biological species, may be important for constructing phylogenetic trees, which are appropriate structures for understanding the evolutionary history of species.
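As a rough illustration of the word-frequency idea summarized in the abstract, the following Python sketch counts overlapping words of length k in each sequence, converts the counts to relative frequencies, and compares two sequences through the distance between their frequency profiles. The function names `word_frequencies` and `frequency_distance`, the toy sequences, and the choice of Euclidean distance are assumptions made for this example; the abstract does not specify the distance measure used by the authors, so this should not be read as their method.

```python
from collections import Counter
from math import sqrt


def word_frequencies(seq, k):
    """Relative frequency of each overlapping word of length k in seq."""
    seq = seq.upper()
    n_words = len(seq) - k + 1
    if n_words <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(n_words))
    return {word: count / n_words for word, count in counts.items()}


def frequency_distance(freq_a, freq_b):
    """Euclidean distance between two frequency profiles.

    Assumption: Euclidean distance is an illustrative choice only;
    the source does not state which distance it uses.
    """
    words = set(freq_a) | set(freq_b)
    return sqrt(sum((freq_a.get(w, 0.0) - freq_b.get(w, 0.0)) ** 2
                    for w in words))


# Toy sequences invented for illustration (not real mitochondrial DNA).
seqs = {
    "a": "ACGTACGTACGT",
    "b": "ACGTACGAACGT",
    "c": "TTTTGGGGCCCC",
}

k = 3  # word length; the abstract reports that grouping depends on this value
profiles = {name: word_frequencies(s, k) for name, s in seqs.items()}

d_ab = frequency_distance(profiles["a"], profiles["b"])
d_ac = frequency_distance(profiles["a"], profiles["c"])
print(f"d(a, b) = {d_ab:.3f}, d(a, c) = {d_ac:.3f}")
```

Repeating the comparison for several values of `k` and feeding the resulting pairwise distances to a clustering routine would be one way to explore how the grouping changes with word length.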
Key words: Living systems: Biological Matter
© EDP Sciences, SIF, Springer-Verlag GmbH Germany, part of Springer Nature, 2018