New & Noteworthy

New High-throughput GO Annotations Added to SGD

June 06, 2016

We’ve added 1,400 high-throughput (HTP) cellular component GO annotations from a new paper published by Maya Schuldiner’s lab. In this paper, Yofe et al. (2016) devised and implemented a methodology called SWAT (short for SWAp-Tag) and used it to create a parental library of 1,800 strains whose tagged proteins are all known or predicted to localize to the yeast endomembrane system. Once created, this novel acceptor library serves as a template that can be ‘swapped’ into other libraries, allowing rapid conversion to new libraries by simply replacing the acceptor module with a new tag or sequence of choice. As proof of principle, the paper describes the parental library (N’ SWAT-GFP) and its utility as a gateway to the construction of two additional libraries (N’ mCherry and N’ seamless GFP). A high-content screening platform was used to generate images that were then manually reviewed and used to assign subcellular locations for proteins in these collections. Based on these results, SGD has incorporated GO annotations for proteins when at least two of the three tags gave the same cellular localization. In addition, Locus Summary page descriptions have been updated for genes within this collection that did not have a known cellular location prior to this study. Finally, this study also provides access to a list of proteins predicted to contain signal peptides using three different algorithms. We would like to thank Maya Schuldiner and members of her lab for help with the integration of this information into SGD.
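The two-of-three agreement rule described above can be illustrated with a short sketch. This is only an illustration of the stated criterion, not SGD's actual curation pipeline; the function name, the `min_agree` parameter, and the example localization labels are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_localization(
    calls: Sequence[Optional[str]], min_agree: int = 2
) -> Optional[str]:
    """Return the localization reported by at least `min_agree` tags, else None.

    `calls` holds one localization call per tagged library (for example the
    three N' libraries); None marks a tag with no assigned localization.
    """
    # Count only the tags that actually produced a localization call.
    counts = Counter(call for call in calls if call)
    if not counts:
        return None
    location, n_tags = counts.most_common(1)[0]
    return location if n_tags >= min_agree else None


# Hypothetical calls from three tags for a single protein.
print(consensus_localization(["ER", "ER", "vacuole"]))
print(consensus_localization(["ER", "Golgi", "vacuole"]))
```

With three tags and a threshold of two, at most one localization can satisfy the rule, so no tie-breaking is needed.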

Categories: New Data

New SGD Help Video: Yeast-Human Functional Complementation Data

June 30, 2015

Yeast and humans diverged about a billion years ago, but there’s still enough functional conservation between some pairs of yeast and human genes that they can be substituted for each other. How cool is that?! Which genes are they? What do they do?

This two-minute video explains how to find, search, and download the yeast-human functional complementation data in SGD. You can find help with many other aspects of SGD in the tutorial videos on our YouTube channel. And as always, please be sure to contact us with any questions or suggestions.

Categories: Homologs, New Data, Tutorial

Tags: video, yeast model for human disease

Yeast-Human Functional Complementation Data Now in SGD

June 10, 2015

Yeast and humans diverged about a billion years ago. So if there’s still enough functional conservation between a pair of similar yeast and human genes that they can be substituted for each other, we know they must be critically important for life. An added bonus is that if a human protein works in yeast, all of the awesome power of yeast genetics and molecular biology can be used to study it.

To make it easier for researchers to identify these “swappable” yeast and human genes, we’ve started collecting functional complementation data in SGD. The data are all curated from the published literature, via two sources. One set of papers was curated at SGD, including the recent systematic study of functional complementation by Kachroo and colleagues. Another set was curated by Princeton Protein Orthology Database (P-POD) staff and is incorporated into SGD with their generous permission.

As a starting point, we’ve collected a relatively simple set of data: the yeast and human genes involved in a functional complementation relationship, with their respective identifiers; the direction of complementation (human gene complements yeast mutation, or vice versa); the source of curation (SGD or P-POD); the PubMed ID of the reference; and an optional free-text note adding more details. In the future we’ll incorporate more information, such as the disease involvement of the human protein and the sequence differences found in disease-associated alleles that fail to complement the yeast mutation.

You can access these data in two ways: using two new templates in YeastMine, our data warehouse; or via our Download page. Please take a look, let us know what you think, and point us to any published data that’s missing. We always appreciate your feedback!

Using YeastMine to Access Functional Complementation Data

YeastMine is a versatile tool that lets you customize searches and create and manipulate lists of search results. To help you get started with YeastMine we’ve created a series of short video tutorials explaining its features.

Gene –> Functional Complementation template

This template lets you query with a yeast gene or list of genes (either your own custom list, or a pre-made gene list) and retrieve the human gene(s) involved in cross-species complementation along with all of the data listed above.

Human Gene –> Functional Complementation template

This template takes either human gene names (HGNC-approved symbols) or Entrez Gene IDs for human genes and returns the yeast gene(s) involved in cross-species complementation, along with the data listed above. You can run the query using a single human gene as input, or create a custom list of human genes in YeastMine for the query. We’ve created two new pre-made lists of human genes that can also be used with this template. The list “Human genes complementing or complemented by yeast genes” includes only human genes that are currently included in the functional complementation data, while the list “Human genes with yeast homologs” includes all human genes that have a yeast homolog as predicted by any of several methods.

Downloading Functional Complementation Data

If you’d prefer to have all the data in one file, simply visit our Curated Data download page and download the file “functional_complementation.tab”.
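Once downloaded, the file can be read with a few lines of Python. The sketch below is a minimal example under stated assumptions: the column names and their order are hypothetical, chosen to mirror the data fields described above, and the handling of blank or `#`-prefixed lines is also an assumption. Check the actual file before relying on either.

```python
import csv
from typing import Dict, Iterable, List

# ASSUMPTION: these names and this order are illustrative only; they follow
# the fields listed in the announcement, not a documented file layout.
ASSUMED_COLUMNS = [
    "yeast_gene", "yeast_id", "human_gene", "human_id",
    "direction", "source", "pubmed_id", "note",
]


def read_complementation(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse tab-delimited complementation records into dictionaries."""
    # ASSUMPTION: blank lines and lines starting with '#' carry no data.
    data = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    rows = []
    # QUOTE_NONE keeps quotation marks inside free-text notes untouched.
    for fields in csv.reader(data, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Pad short rows so the optional trailing note becomes an empty string.
        fields += [""] * (len(ASSUMED_COLUMNS) - len(fields))
        rows.append(dict(zip(ASSUMED_COLUMNS, fields)))
    return rows


# Usage, assuming the file is in the working directory:
#   with open("functional_complementation.tab", encoding="utf-8") as fh:
#       records = read_complementation(fh)
#   sgd_only = [r for r in records if r["source"] == "SGD"]
```

Reading from any iterable of lines, rather than from a path, makes the function easy to try on a few pasted rows before pointing it at the full file.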

Categories: New Data, Yeast and Human Disease

Tags: yeast model for human disease

Reference Genome Annotation Update R64.2.1

February 23, 2015

SGD curators periodically update the chromosomal annotations of the S. cerevisiae Reference Genome, which is derived from strain S288C. In November 2014, the genome annotation was updated for the first time since the release of the major S288C resequencing update in February 2011. Note that the underlying sequence of the 16 assembled nuclear chromosomes, plus the mitochondrial genome, remained unchanged in annotation release R64.2.1 (relative to genome sequence release R64.1.1).

The R64.2.1 annotation release included various updates and additions. The annotations of 2 existing proteins changed (GRX3/YDR098C and HOP2/YGL033W), and 1 new ORF (RDT1/YCL054W-A) and 4 RNAs (RME2, RME3, IRT1, ZOD1) were added to the genome annotation. Other additions include 8 nuclear matrix attachment sites, and 8 mitochondrial origins of replication. The coordinates of many autonomously replicating sequences (ARS) were updated, and many new ARS consensus sequences were added. Complete details can be found in the Summary of Chromosome Sequence and Annotation Updates.

Categories: Data updates, New Data, Sequence

New Protein Modification and Abundance Data in YeastMine

December 17, 2014

Have you ever wondered what’s happening to your favorite protein as it’s hanging out in the cell? SGD’s advanced search tool, YeastMine, now includes four new templates that can be used to find protein modification and abundance data.

The Gene -> Protein Modifications template retrieves phosphorylation, ubiquitination, succinylation, acetylation and methylation data, currently curated from the following 11 publications: Peng et al. 2003, Hitchcock et al. 2003, Seyfried et al. 2008, Vogtle et al. 2009, Ziv et al. 2011, Mommen et al. 2012, Henriksen et al. 2012, Swaney et al. 2013, Kolawa et al. 2013, Weinert et al. 2013, and Wang et al. 2014.

The Gene -> Experimental N-termini and N-terminal modifications template retrieves experimentally-determined amino-terminal sequence and acetylation data, currently curated from Vogtle et al. 2009 and Mommen et al. 2012.

Lastly, two new templates pull protein abundance data curated from Ghaemmaghami et al. 2003. Gene -> Protein Abundance retrieves molecules/cell counts for a gene or list of genes. The same data can be quickly filtered using the Retrieve -> Proteins in a given molecules/cell abundance range template.
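For readers who export the abundance results and want to apply the same kind of range filter offline, the idea can be sketched as below. This is not the YeastMine template itself; the function name, the inclusive bounds, and the placeholder gene names and counts are all hypothetical.

```python
from typing import Dict, List


def genes_in_abundance_range(
    abundance: Dict[str, float], low: float, high: float
) -> List[str]:
    """Return gene names whose molecules/cell value lies in [low, high]."""
    # ASSUMPTION: both bounds are inclusive; the template may behave differently.
    return sorted(
        gene for gene, count in abundance.items() if low <= count <= high
    )


# Placeholder values for illustration only, not measured abundances.
example = {"GENE_A": 150.0, "GENE_B": 2500.0, "GENE_C": 48000.0}
print(genes_in_abundance_range(example, 1000, 50000))
```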

Please explore these new YeastMine protein data templates, and send us your feedback.

Categories: Data updates, New Data

New Alternative Reference Genomes

December 08, 2014

At SGD, we are expanding our scope to provide annotation and comparative analyses of all major budding yeast strains, and are making progress in our move toward providing multiple reference genomes. To this end, the following new S. cerevisiae genomes have been incorporated into SGD as “Alternative References”: CEN.PK, D273-10B, FL100, JK9-3d, RM11-1a, SEY6210, SK1, Sigma1278b, W303, X2180-1A, Y55. These genomes are accessible via Sequence, Strain, and Contig pages, and are the genomes for which we have curated the most phenotype data, and for which we aim to curate specific functional information. It is important to emphasize that we are not abandoning a standard sequence; S288C is still in place as “The Reference Genome”. However, we do recognize that it is helpful for students and researchers to be able to ‘shift the reference’, selecting the genome that is most appropriate and informative for a specific area of study.

These new genome sequences have also been added to SGD’s BLAST datasets, multiple sequence alignments, the Pattern Matching tool, and the Downloads site. Please explore these new genomes, and send us your feedback.

Categories: Data updates, New Data, Sequence

Tags: reference genome, Saccharomyces cerevisiae, strains

New and improved Locus Summary pages

October 13, 2014

We are pleased to announce that the redesign of our gene-specific pages, which has been ongoing over the past year, is now complete with the release of the reworked Locus Summary page. The page contains all of the information on the previous Locus Summary page, and has a more modern look and feel. Note that the order and organization of the sections has changed, and the order of the tabs across the top of the page has changed as well. New elements on the page include a navigation bar on the left to take you to the different sections of the page, a redesigned map showing genomic context in the sequence section, and a new interactive histogram summarizing expression data. Biochemical pathway information now appears in its own section (see an example), and we have added a History section to replace the previous Locus History tab. If there are no data of a particular type (for example, Pathways), then that section is absent from the page.

Please explore this new page and send us your feedback.

Categories: New Data, Website changes

Redesigned Expression pages

October 06, 2014

The Expression pages have been redesigned and now include a clickable histogram depicting conditions and datasets in which the gene of interest is up- or down-regulated. Expression data are derived from records contained in the Gene Expression Omnibus, and datasets are assigned one or more categories to facilitate grouping, filtering and browsing. Short descriptions of the focus of each experiment are also provided. The PCL files generated for each dataset are used to populate the expression analysis tool SPELL. Also included on the pages are network diagrams which display genes that share expression profiles. The Expression pages provide seamless access to the SPELL tool at SGD, as well as external resources such as Cyclebase, GermOnline, YMGV and FuncBase.

Please explore these new pages, accessible via the Expression tab on your favorite Locus Summary page, and send us your feedback.

Categories: New Data, Website changes

New fungal homolog data at SGD

September 15, 2014

Have you ever wondered about the role played by the homolog of a particular yeast gene in other fungal species? SGD’s advanced search tool, YeastMine, can now be used to find homologs of your favorite Saccharomyces cerevisiae genes in the pathogenic yeast, Candida glabrata. There are now 25 species of pathogenic and non-pathogenic fungi in YeastMine, including S. cerevisiae.

The fungal homologs of a given S. cerevisiae gene can be found using the template called “Gene –> Fungal Homologs.” Fungal homology data comes from various sources including FungiDB, the Candida Gene Order Browser (CGOB), the Yeast Gene Order Browser (YGOB), the Candida Genome Database (CGD), the Aspergillus Genome Database (AspGD) and PomBase, and the results link directly to the corresponding homolog gene pages in the relevant databases.

A results table is generated after each query and the identifiers and standard names for the fungal homologs are listed in the table. As with other YeastMine templates, results can be saved as lists for further analysis. You can also create a list of yeast gene names and/or identifiers using the updated Create Lists feature that allows you to specify the organism representing the genes in your list. The query for homologs can then be made against the custom gene list.

All templates that query fungal homolog data can be found on the YeastMine Home page under the “Homology” tab. The “Gene –> Fungal Homologs” template complements the template “Gene → Non-Fungal and S. cerevisiae Homologs”, which retrieves homologs of S. cerevisiae genes in humans, rats, mice, worms, flies, mosquitos, and zebrafish.

We invite you to watch SGD’s YeastMine Fungal Homologs video tutorial for tips on accessing Fungal Homolog data at SGD. Video tutorials covering other YeastMine features are also available.

Categories: Homologs, New Data, Tutorial

New Sequence, Chromosome, and Contig pages

August 25, 2014

New Sequence pages are now available in SGD for virtually every yeast gene (e.g., HMRA1 Sequence page), and include genomic sequence annotations for the Reference Strain S288C, as well as several Alternative Reference Genomes from strains such as CEN.PK, RM11-1a, Sigma1278b, and W303 (more Alternative References coming soon). Each page includes an Overview section containing descriptive information, maps depicting genomic context in Reference Strain S288C and Alternative Reference strains, as well as chromosomal and relative coordinates in S288C.

The sequence itself includes display options for genomic DNA, coding DNA, or translated protein.

Also available on each Sequence page are links to redesigned S288C Chromosome pages, links to new Contig pages for Alternative Reference Genomes, and a Downloads menu for easy access to DNA sequences of several other industrial strains and environmental isolates. The new Sequence, Chromosome, and Contig pages make use of many of the features you enjoy on other new or redesigned pages at SGD, including graphical display of data, sortable tables, and responsive visualizations. The Sequence pages also provide seamless access to other tools at SGD such as BLAST and Web Primer. Please explore these new pages, accessible via the Sequence tab on your favorite Locus Summary page, and send us your feedback.

Categories: Data updates, New Data, Sequence, Website changes