The regulatory genome constrains protein sequence evolution: implications for the search for disease-associated genes
The development of models of protein sequence evolution has far-reaching implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We found substantial variation among central nervous system tissues in the effect of expression variance on evolutionary rate: highly variable genes in the cortex show significantly greater purifying selection than highly variable genes in subcortical regions (Mann-Whitney U, p = 1.4 × 10^-4).
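The cortex versus subcortex comparison above relies on a Mann-Whitney U test. As a rough illustration only, the sketch below implements a two-sided version of that test with the normal approximation and a tie correction, using the Python standard library. The function names and the toy values are assumptions made for this example; they are not the study's data or code, and the abstract does not say which variant of the test was used.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = rank_with_ties(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u1, p_value


# Hypothetical evolutionary-rate values for highly variable genes in two
# tissue groups (made up for illustration, not taken from the study).
cortex_rates = [0.05, 0.07, 0.08, 0.10, 0.11]
subcortex_rates = [0.12, 0.15, 0.16, 0.19, 0.22]
u_stat, p = mann_whitney_u(cortex_rates, subcortex_rates)
print(u_stat, p)
```

For small samples an exact test is usually preferable to the normal approximation used here.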
The remaining tissues cluster in the observed correlation between expression and evolutionary rate, which allows evolutionary analysis of genes in diverse physiological systems, including the digestive, reproductive, and immune systems. Importantly, the tissue in which a gene reaches its maximum expression variance varies significantly (p = 5.55 × 10^-284) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution.
Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary tissue affected by developmental disorders. Using gradient-boosted regression trees to model evolutionary rate under a range of model parameters, the selected features account for up to 62% of the variation in evolutionary rate and provide additional support for the tissue model.
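To make the regression-tree modelling and the "variation explained" figure above more concrete, here is a deliberately simplified sketch of gradient boosting for squared error, using depth-1 trees (stumps) on a single feature and reporting R² on the training data. Everything in it is an assumption for illustration: the function names, the single predictor, the hyperparameters, and the toy numbers. It is not the study's model, feature set, or software.

```python
def fit_stump(x, residuals):
    """Least-squares depth-1 tree on one feature.

    Returns (threshold, left_mean, right_mean). Assumes x contains at
    least two distinct values so that both sides of a split are non-empty.
    """
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def fit_boosted_stumps(x, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps sequentially to the residuals of the current prediction."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, xi):
    """Prediction for one value: base plus the shrunken stump outputs."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if xi <= threshold else right for threshold, left, right in stumps
    )


def r_squared(y, predictions):
    """Fraction of the variance in y accounted for by the predictions."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical per-gene values (made up for illustration).
expression_variance = [0.2, 0.5, 0.9, 1.4, 2.0, 2.7, 3.5, 4.4]
evolutionary_rate = [0.30, 0.28, 0.25, 0.20, 0.14, 0.12, 0.08, 0.07]

model = fit_boosted_stumps(expression_variance, evolutionary_rate)
fitted = [predict(model, xi) for xi in expression_variance]
print(round(r_squared(evolutionary_rate, fitted), 3))
```

Training-set R² overstates predictive performance; a real analysis would evaluate on held-out genes and would typically use an established library rather than a hand-written booster.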
Finally, we investigated several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models that use genetic data to improve the search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a new framework for the study of gene function and disease mechanisms.
Optimization of primers for nuclear protein-coding genes used in beetle phylogenetics and their application in the genus Sasajiscymnus Vandenberg (Coleoptera: Coccinellidae)
Advances in genome biology and the increasing availability of genomic resources allow the development of hundreds of nuclear protein-coding (NPC) markers, which can be used in phylogenetic studies. At lower taxonomic levels, however, it may be more practical to select a handful of suitable molecular loci for phylogenetic inference. Unfortunately, the degenerate primers used for NPC markers can be a major obstacle, because their amplification success rate is low and they tend to amplify nontargeted regions.
In this study, we optimized five NPC fragments widely used in beetle phylogenetics (i.e., two parts of carbamoyl-phosphate synthetase, CADXM and CADMC, together with topoisomerase, wingless, and PEPCK) by reducing the degenerate sites of the primers and slightly shortening the target genes. These five NPC fragments and 6 other molecular loci were amplified to test the monophyly of the coccinellid genus Sasajiscymnus Vandenberg. Analysis of our molecular data set clearly supported the possibility that Sasajiscymnus is monophyletic, but confirmation with extended sampling is required.
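The degenerate primers mentioned above are primers that contain ambiguity codes, so each one is really a pool of concrete sequences. As a small illustration of why fewer degenerate sites can matter, the sketch below counts and enumerates the sequences a degenerate primer stands for, using the standard IUPAC nucleotide ambiguity codes. The function names and the example primer strings are hypothetical; they are not primers from the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct concrete sequences encoded by a degenerate primer.

    Raises KeyError for symbols outside the IUPAC table above
    (for example inosine, which some primers use).
    """
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical primers: the second replaces two ambiguous positions of the
# first with fixed bases, shrinking the pool of sequences.
original = "GGNTAYACNGARGA"
reduced = "GGTTAYACTGARGA"
print(degeneracy(original), degeneracy(reduced))
```

Because `expand` grows multiplicatively with each ambiguous site, it is only practical for primers with modest degeneracy; `degeneracy` alone is enough for comparing candidate designs.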
A fossil-calibrated chronogram generated with BEAST indicates that the genus originated in the Late Cretaceous (77.87 Myr). In addition, a phylogenetic informativeness profile was generated to compare the phylogenetic properties of each gene more explicitly. The results showed that COI provides the strongest phylogenetic signal among all the genes, but PEPCK, topoisomerase, CADXM, and CADMC are also relatively informative. Our results provide insight into the evolution of the genus Sasajiscymnus and also enrich the molecular data resources available for further study.