1.1. NCBI Overview
1.1.1. Google for NCBI and go to the NCBI home page.
1.1.2. Briefly go over the Resources by categories. Check all of the databases by clicking on the “All Databases” box.
1.1.3. We will learn to use the NCBI databases with a gene that was identified as significantly differentially expressed in a genome-scale experiment but does not seem to have any known function. Google for “rnaseq kidney transplant”, open the PubMed article, and look for “Table 4”.
1.1.4. Go to the NCBI home page and search for “c10orf105”
1.1.5. Explore the result sets
1.2. Gene Database
1.2.1. Search the NCBI Gene database for “c10orf105”
1.2.2. How many RefSeq transcripts are available for this gene?
1.2.3. How many organisms have orthologs to this gene?
1.2.4. Where does this gene seem to be expressed most?
1.2.5. Are there any references for this gene’s function?
1.2.6. Choose the “Expression” option to view only the expression data, with counts and RPKM values for this gene
1.2.7. Download the “Full Report (text)” file, using the “Send to…” option, for this gene
1.2.8. Open the RefSeq mRNA and protein sequence entries for the first variant of this gene
1.2.9. In the RefSeq page, choose the “FASTA (text)” option and save the mRNA and Protein sequences.
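The Gene lookup and the FASTA download above can also be scripted through the NCBI E-utilities. What follows is a minimal sketch, not part of the original exercise. The helper names (`esearch_url`, `efetch_fasta_url`, `parse_fasta`, `fetch_text`) are illustrative, and the accession passed to `efetch_fasta_url` is assumed to be the RefSeq accession you copied from the Gene page in step 1.2.8.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL that returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_fasta_url(db, accession):
    """Build an EFetch URL that returns one record as FASTA text.

    Use db="nuccore" for an mRNA accession and db="protein" for a protein one.
    """
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_text(url):
    """Download a URL and return the body as text (needs network access)."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")
```

For example, `fetch_text(esearch_url("gene", 'c10orf105[Gene Name] AND "Homo sapiens"[Organism]'))` should return JSON whose `esearchresult.idlist` holds the matching Gene IDs, and `parse_fasta(fetch_text(efetch_fasta_url("nuccore", accession)))` should yield the mRNA record for the accession you supply.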
1.3. Gene Advanced Search
1.3.1. Click on the “Advanced” option below the Gene database search box.
1.3.2. In the resulting page, use the “Organism” filter to find out how many gene entries are present for Homo sapiens.
1.3.3. Search for all the human genes annotated as involved in allergic rhinitis…
1.3.4. You can also create custom filters, to easily group and navigate through the gene search results. Click on the “Manage Filters” option.
1.3.5. If you do not have an NCBI account, please create one and log in. You can also link an outside identity provider to log in to NCBI.
1.3.6. You need this account:
to submit data to NCBI
to create custom filters
to create, save, and use custom email alerts
to save your searches
to maintain a curated list of your PubMed-indexed publications
1.3.7. Create your gene filter list with the following standard filters:
1.3.8. Search the Gene database for “unknown function”. Using the filters that you created, find out how many of the results are found in human.
1.3.9. Combine the above human hits with the faceting menu on the left-hand side and find out how many of those human genes with unknown functions are alternatively spliced.
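The Advanced Search builder assembles a query from field-tagged terms joined with AND, and the same query can be sent to E-utilities to get just the hit count. The sketch below illustrates that idea under stated assumptions: the helper names are hypothetical, the `[Organism]` and `[All Fields]` tags are the ones the Advanced Search page displays, and the JSON layout read by `parse_count` (`esearchresult.count`) should be checked against a live response. It does not cover the “alternatively spliced” facet, which is applied in the web interface.

```python
import json
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def tagged(value, field):
    """Quote a value and attach an Entrez field tag."""
    return f'"{value}"[{field}]'


def all_of(*clauses):
    """Join clauses so that every one of them must match."""
    return " AND ".join(clauses)


def count_url(db, term):
    """Build an ESearch URL that asks only for the number of hits."""
    params = {"db": db, "term": term, "rettype": "count", "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


def parse_count(payload):
    """Read the hit count from an ESearch JSON response body."""
    return int(json.loads(payload)["esearchresult"]["count"])


# Query corresponding to step 1.3.8: "unknown function" restricted to human.
term = all_of(
    tagged("unknown function", "All Fields"),
    tagged("Homo sapiens", "Organism"),
)
url = count_url("gene", term)
```

Fetching `url` and passing the response body to `parse_count` gives a number to compare with the count shown on the results page.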
1.4. Expression Database
1.4.1. Search for the Staphylococcus GEO datasets that are published by ‘nagarajan’.
1.4.2. How many human GEO datasets studied “stress” as the Subset Variable Type? Use the “Show index list” option to get the numbers.
1.4.3. Search for the Expression Profiles of the gene “c10orf105”. Filter the results for “Up/down genes”. Choose the “Profile neighbors” of the resulting hits and run the “Find pathways” option under the “Profile pathways” section.
1.4.4. Do you see any relevant pathways?
1.5. Variation Database
1.5.1. Search the ClinVar database for “BRCA2[gene]”. Which type accounts for most of the pathogenic variants (SNP, deletion, duplication, or indel)?
1.5.2. Find out how many pathogenic ClinVar variants are reported for the BRCA2 gene, along with the number of pathogenic dbVar entries and pathogenic dbSNP entries.
1.6. PubChem Database
1.6.1. Search for “tylenol” in the PubChem compound database.
1.6.2. Open the first hit (acetaminophen) and go to the “Bioactivities” link. Choose the topmost “Target” gene against which tylenol has been characterized as “Active”.
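The PubChem lookup above can be approximated with the PUG REST interface. This is a rough sketch rather than a replacement for the web page: the helper names are illustrative, and the column name used to pick out active rows (for example “Activity Outcome”) is an assumption that should be checked against the header of the CSV you actually download.

```python
import csv
import io
import json
from urllib.parse import quote

PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cids_by_name_url(name):
    """Build a PUG REST URL that resolves a compound name to PubChem CIDs."""
    return f"{PUG}/compound/name/{quote(name, safe='')}/cids/JSON"


def parse_cids(payload):
    """Extract the list of CIDs from the name lookup response body."""
    return json.loads(payload)["IdentifierList"]["CID"]


def assay_summary_url(cid):
    """Build a PUG REST URL for the bioassay summary of one compound (CSV)."""
    return f"{PUG}/compound/cid/{int(cid)}/assaysummary/CSV"


def rows_where(csv_text, column, value):
    """Keep CSV rows whose `column` equals `value`, ignoring case.

    The column name is supplied by the caller because the exact header
    text is an assumption here.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = value.strip().lower()
    return [
        row for row in reader
        if (row.get(column) or "").strip().lower() == wanted
    ]
```

A typical sequence would be to fetch `cids_by_name_url("tylenol")`, take the first CID from `parse_cids`, download `assay_summary_url(cid)`, and then call `rows_where(text, "Activity Outcome", "active")` to list the assays, with their targets, in which the compound was reported as active.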
1.7. UniProt Database
1.7.1. Google for “uniprot”. How many reviewed human protein entries are currently present in UniProt?
1.7.2. How many reviewed human transmembrane proteins are available in the UniProt database? (HINT: Use the Advanced Search.)
1.7.3. Search for the “c10orf105” related protein.
1.7.4. Use the UniProt ID mapping service to convert IDs from one format to another format.
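The counting questions in this section can also be answered with the UniProt REST search endpoint, which reports the total number of hits in a response header. The following is a sketch under assumptions, not the method prescribed by the exercise: the helper names are hypothetical, human is taken to be taxonomy ID 9606, and “transmembrane” is expressed through the keyword `KW-0812`, which may not match exactly what the Advanced Search form produces.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def uniprot_query(reviewed=True, organism_id=9606, extra=()):
    """Combine the reviewed flag, the organism, and extra clauses with AND."""
    parts = [f"reviewed:{str(reviewed).lower()}", f"organism_id:{organism_id}"]
    parts.extend(extra)
    return " AND ".join(f"({p})" for p in parts)


def search_url(query, size=1, fmt="json"):
    """Build a UniProtKB search URL; size=1 keeps the response small."""
    params = {"query": query, "size": size, "format": fmt}
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


def total_results(url):
    """Return the hit count from the X-Total-Results header (needs network)."""
    with urlopen(url, timeout=30) as resp:
        return int(resp.headers["X-Total-Results"])


# Queries corresponding to steps 1.7.1 to 1.7.3.
reviewed_human = uniprot_query()
reviewed_human_tm = uniprot_query(extra=["keyword:KW-0812"])
orf_protein = uniprot_query(extra=["gene:C10orf105"])
```

Calling `total_results(search_url(reviewed_human))` returns a count to compare with the number shown on the UniProt website. The counts change with each UniProt release.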
1.8. Array Express ATLAS
1.8.1. Google for “array express atlas”. Search for “c10orf105”.
1.8.2. Choose the “Differential Expression” tab and browse through the gene expression map.
1.8.3. Search for the expression profile of the two ORFs identified in our allograft publication. Note anything interesting ?
1.9. STRING Database
1.9.1. Google for “string protein”. Search for “trpv1” and choose “human” as the Organism.
1.9.2. Click “Continue” in the next page, after confirming that the returned hit is the correct one. In the resulting page, find the combined score for the strongest associations between the two proteins (SRC and PRKCE).
1.9.3. Click the “Analysis” button and see if you find any pathway related to acetaminophen.
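The same lookup can be sketched against the STRING REST API, which returns interaction partners with their combined scores as JSON. The helper names below are illustrative, human is assumed to be species 9606, and the field names read by `strongest` (`preferredName_A`, `preferredName_B`, `score`) should be confirmed against a live response.

```python
import json
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"


def string_url(method, identifiers, species=9606, **extra):
    """Build a STRING API URL that returns JSON.

    Multiple identifiers are joined with a carriage return, which
    urlencode turns into %0D.
    """
    params = {"identifiers": "\r".join(identifiers), "species": species}
    params.update(extra)
    return f"{STRING_API}/json/{method}?{urlencode(params)}"


def strongest(payload, n=5):
    """Return the n highest scoring pairs as (name_a, name_b, score)."""
    rows = json.loads(payload)
    rows.sort(key=lambda r: float(r["score"]), reverse=True)
    return [
        (r["preferredName_A"], r["preferredName_B"], float(r["score"]))
        for r in rows[:n]
    ]


# Partners of TRPV1 in human, limited to the ten best scoring hits.
partners_url = string_url("interaction_partners", ["TRPV1"], limit=10)
```

Downloading `partners_url` and passing the body to `strongest` lists the top associations, which can be compared with the scores shown in the web view. For the “Analysis” step, `string_url("enrichment", names)` with the list of protein names from the network requests the functional enrichment table.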
1.10. Other databases
1.10.1. Google for “NAR databases”.