Functional annotation and enrichment analysis are widely used in the bioinformatic analysis of omics data. Creative Proteomics offers multiple functional annotation and enrichment analysis services, including GO annotation and enrichment analysis, KEGG annotation and enrichment analysis, COG/KOG annotation, domain annotation and enrichment analysis, and subcellular localization. As one of the leading omics companies in the world, we are ready to help you with our Functional Annotation and Enrichment Analysis Service.
What Is Functional Analysis
Common methods for gene (protein) functional analysis include metabolic signaling pathway analysis and Gene Ontology (GO) analysis. Additionally, there are other analyses such as Clusters of Orthologous Groups of proteins (COG) and protein domain analysis.
GO and pathway analyses both study gene function, but they differ in scope. GO describes the functions of individual genes, while pathway analysis looks at how genes and proteins function together within a pathway. GO categorizes gene functions into three major classes: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). Of these, GO BP analysis is the most commonly used. Common pathway data sources include KEGG, Reactome, and BioCarta.
Functional analysis can be divided into two categories: functional annotation analysis and functional enrichment analysis.
What Is Functional Annotation
Functional annotation is the process of attaching biological information to gene or protein sequences. The basic level of annotation uses the sequence alignment tool BLAST to find similar sequences and then annotates genes or proteins based on those similarities. Increasingly, however, additional information on biological function is being added to annotation systems. This additional information allows manual annotation to distinguish between genes or proteins that share the same annotation. With many genomes now sequenced, computational approaches that characterize genes and proteins from their sequence are increasingly important.
Functional annotation consists of three main steps:
Identifying portions of the genome that do not code for proteins
Identifying elements on the genome, a process called gene prediction
Attaching biological information to these elements.
Functional annotation analysis involves annotating genes with GO terms and pathway information. For example, the DDR1 gene may be associated with 20 biological processes (GO BP), such as GO:0001558 regulation of cell growth, GO:0007155 cell adhesion, and GO:0031100 organ regeneration.
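As a minimal sketch of what annotating genes with GO terms amounts to in practice, the example below builds a gene-to-term lookup from a small table. The three DDR1 rows are the terms named above; the table layout, the function names, and the placeholder gene `UNKNOWN1` are assumptions made for illustration, and a real analysis would load annotations from a curated annotation file rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical annotation table: (gene symbol, GO ID, GO term name).
# Only the three DDR1 terms quoted in the text are included here.
ANNOTATIONS = [
    ("DDR1", "GO:0001558", "regulation of cell growth"),
    ("DDR1", "GO:0007155", "cell adhesion"),
    ("DDR1", "GO:0031100", "organ regeneration"),
]


def build_gene_to_terms(rows):
    """Group (gene, go_id, name) rows into a gene -> {go_id: name} mapping."""
    gene_to_terms = defaultdict(dict)
    for gene, go_id, name in rows:
        gene_to_terms[gene][go_id] = name
    return dict(gene_to_terms)


def annotate(genes, gene_to_terms):
    """Return the known GO terms for each gene; unannotated genes map to {}."""
    return {gene: gene_to_terms.get(gene, {}) for gene in genes}


gene_to_terms = build_gene_to_terms(ANNOTATIONS)
# "UNKNOWN1" is a made-up symbol used to show the unannotated case.
result = annotate(["DDR1", "UNKNOWN1"], gene_to_terms)
for gene, terms in result.items():
    print(gene, sorted(terms.items()))
```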
What Is Functional Enrichment Analysis
Functional enrichment analysis is a method for identifying classes of genes or proteins that are over-represented in a large set of genes or proteins and that may be associated with disease phenotypes. The approach uses statistical methods to identify significantly enriched groups of genes. In gene set enrichment analysis (GSEA), DNA microarray or RNA-Seq experiments are still performed and compared between two distinct categories, but the focus is on a gene set rather than on a single gene in a long list. Researchers test whether most of the genes in the set are located at the extremes of the ranked list: the top and bottom of the list represent the largest differences in expression between the two categories. If the gene set falls at either the top (over-expressed) or the bottom (under-expressed), it is considered to be related to the phenotypic differences.
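The idea of checking whether a gene set clusters at the top or bottom of a ranked list can be illustrated with a simple running-sum statistic. The sketch below is a simplified, unweighted variant (every hit counts equally), not the full published GSEA procedure; the function name and the gene labels `g1` to `g8` are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic over a ranked gene list.

    Walking down the list, the sum steps up by 1/n_hit at each gene in the
    set and down by 1/n_miss otherwise. The score is the running sum's
    largest deviation from zero, keeping its sign: positive when the set
    clusters near the top, negative when it clusters near the bottom.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical list ranked from most over-expressed to most under-expressed.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set sits at the top
es_bottom = enrichment_score(ranked, {"g7", "g8"})  # set sits at the bottom
print(es_top, es_bottom)
```

Significance is then usually assessed by comparing the observed score against scores obtained after permuting the sample labels or the gene labels, which this sketch leaves out.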
The general steps of enrichment analysis provided by Creative Proteomics are summarized below:
Calculate a p-value that represents the degree to which the proteins in the set are over-represented at either the top or the bottom of the list.
Evaluate the statistical significance of a node or pathway based on the p-value.
Normalize the p-value for each set and calculate a false discovery rate to account for multiple hypothesis testing.
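The multiple-testing step above can be sketched with the Benjamini-Hochberg procedure, one widely used way to control the false discovery rate. The text does not say which correction is applied, so the choice of method here is an assumption, and the p-values in the example are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values for four gene sets.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
print(adj)
```

A gene set is then reported as enriched when its adjusted value falls below the chosen threshold, for example 0.05.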
Functional enrichment analysis refers to analyzing a gene set to identify significantly enriched functions, typically with a test based on the hypergeometric distribution. Enrichment analysis lets us condense many seemingly scattered differentially expressed genes into a coherent overview of the underlying biological events. For example, we can conclude that the TP53 signaling pathway is related to the occurrence of gastric cancer, rather than stating that the occurrence of gastric cancer is associated with the seven genes BAX, BID, ABL1, ATM, BCL2, BOK, and CDKN1A.
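A hypergeometric enrichment test asks: if n genes were drawn at random from a background of N genes, of which K belong to a pathway, how likely is it that k or more of the drawn genes belong to that pathway? The sketch below computes that upper-tail probability with the standard library. The function name and all of the counts in the usage example are hypothetical and are not taken from any real dataset.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: number of genes in the background
    K: number of background genes annotated to the pathway
    n: number of genes in the list being tested
    k: number of tested genes annotated to the pathway
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probabilities of seeing k, k+1, ..., upper pathway genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 7 of 116 tested genes fall in a 70-gene pathway,
# against a background of 20000 genes.
p = hypergeom_enrichment_p(N=20000, K=70, n=116, k=7)
print(p)
```

A small p-value indicates that the pathway contains more of the tested genes than would be expected by chance.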
Application
To date, functional annotation and enrichment analysis have produced important results in a variety of scientific research fields, such as:
Cancer cell profiling
Complex disorders (such as schizophrenia)
Depression
Spontaneous preterm birth
Genome-wide association studies
Creative Proteomics can provide the following services
How to place an order:
*If your organization requires signing of a confidentiality agreement, please contact us by email
Bioinformaticians at Creative Proteomics now offer functional annotation and enrichment analysis services to our customers. With years of experience in the computational sciences and in-depth knowledge of these powerful technologies, we can help you find what you need. Contact us for detailed information!
Liver Extrahepatic Bile Duct and Gallbladder Genes in a Biliary Atresia Mouse Model
Research Objective
Biliary atresia (BA) is a condition in which the intrahepatic and extrahepatic bile ducts are blocked, leading to obstructive cholangiopathy and ultimately to liver failure. Rotavirus infection can induce BA-like disease in mice. We therefore constructed a time-course expression profile of biliary atresia in neonatal mice infected with rotavirus. We analyzed the differentially expressed genes and their functional roles in the disease samples to elucidate the molecular mechanisms of the biliary atresia model.
Methods and Results
I Experimental Design:
Control samples treated with normal saline, with time points at 3 days (Day3_NS), 7 days (Day7_NS), and ** days (Day**NS), with 3 samples at each time point.
Experimental samples treated with rotavirus, with time points at 3 days (Day3_RRV), 7 days (Day7_RRV), and ** days (Day**_RRV), with 3 samples at each time point.
Microarray Preparation: Microarrays were created for gene expression analysis.
II Expression Data Preprocessing and Differential Expression Analysis of Microarray Data:
Gene expression values were used for Venn diagram analysis across multiple datasets to rapidly identify important genes and observe similarities and differences among differentially expressed genes at the three time points.
The results of the Venn diagram analysis are shown in Figure 1. There were 115 upregulated genes simultaneously appearing at all three time points, while there was only one downregulated gene. Subsequent analysis will focus on these 116 genes.
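The three-way overlap read off the Venn diagram is a set intersection, computed separately for upregulated and downregulated genes. The sketch below shows that operation on made-up gene labels and generic time point names (`T1`, `T2`, `T3`); it does not reproduce the study's data or the tool used to draw the figure.

```python
def shared_genes(per_timepoint):
    """Return the genes present in every time point's gene set."""
    sets = list(per_timepoint.values())
    return set.intersection(*sets) if sets else set()


# Made-up differentially expressed genes per time point.
upregulated = {
    "T1": {"GeneA", "GeneB", "GeneC"},
    "T2": {"GeneA", "GeneB", "GeneD"},
    "T3": {"GeneA", "GeneB"},
}
downregulated = {
    "T1": {"GeneX", "GeneY"},
    "T2": {"GeneX"},
    "T3": {"GeneX", "GeneZ"},
}

shared_up = shared_genes(upregulated)
shared_down = shared_genes(downregulated)
# Genes carried forward: consistently up or consistently down everywhere.
focus_genes = shared_up | shared_down
print(sorted(shared_up), sorted(shared_down), len(focus_genes))
```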
III Functional Enrichment Analysis of Differentially Expressed Genes:
Enrichment analysis tools were used to separately analyze the upregulated and downregulated genes at each time point for their involvement in GO functions and KEGG pathways. The VennPlex Venn diagram provided the expression direction consistency of the 116 differentially expressed genes across the three time points.
IV Intersection Gene Analysis:
Combining the results from the VennPlex Venn diagram analysis, we identified genes that showed a consistent expression direction across the three time points. We then predicted the transcription factors regulating these genes, using transcription factor-target gene relationships selected from the TRANSFAC and JASPAR databases.
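One simple way to turn transcription factor-target relationships into candidate regulators is to keep only the pairs whose target is among the genes of interest and count how many such targets each factor has. The sketch below assumes the relationships have already been exported as (factor, target) pairs; it does not query TRANSFAC or JASPAR, it is not necessarily the method used in the study, and the factor and gene names are placeholders.

```python
from collections import Counter


def rank_regulators(tf_target_pairs, genes_of_interest):
    """Count, per transcription factor, its targets among the genes of interest.

    Duplicate pairs are collapsed first so that a relationship listed in
    more than one source is counted once. Returns (factor, count) tuples
    sorted by descending count.
    """
    unique_pairs = set(tf_target_pairs)
    counts = Counter(
        tf for tf, target in unique_pairs if target in genes_of_interest
    )
    return counts.most_common()


# Placeholder relationships; ("TF1", "GeneA") appears twice on purpose.
pairs = [
    ("TF1", "GeneA"),
    ("TF1", "GeneB"),
    ("TF2", "GeneA"),
    ("TF2", "GeneQ"),
    ("TF1", "GeneA"),
]
ranking = rank_regulators(pairs, {"GeneA", "GeneB", "GeneX"})
print(ranking)
```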