Overview: Perform exploratory research of variants in pediatric cancer subtypes via 5 custom interactive interfaces. Curated variants have been identified from whole-genome, whole-exome and whole-transcriptome (RNA-seq) sequencing data from 5500 samples.
ONCOPRINT
A placeholder is shown if a diagnosis and/or subtype has not been selected from the Sunburst (Figure 1).
Figure 1: Variants: Oncoprint view, without filtering. A placeholder is shown when a user directly accesses the Variants data facet.
Diagnosis and/or subtype is required for generating a new oncoprint (Figure 2).
warning
Updating the filters will not update the sunburst selection, but it will update the other facet views (e.g., Variant Prevalence, GenomePaint, and/or ProteinPaint).
Figure 2: Variants: Oncoprint view, with filtering. This figure can be generated by following the subtype-centric user experience, where either (i) a diagnosis and/or subtype was selected from the sunburst landing page or (ii) the user selected a diagnosis and/or subtype filter. The filters on the left auto-populate based on the sunburst selection and can later be customized.
tip
- The oncoprint is generated by ProteinPaint. If a user alters the configs within the ProteinPaint UI (e.g., settings, sort, data), the filters will not update, nor will PeCan save this state in the membrane.
- Clicking a gene will enable a button that allows the user to navigate to ProteinPaint or GenomePaint for that gene.
Once the filters are selected and submitted, an oncoprint will render. For example, a user can choose to render an oncoprint for just a disease: Brain, Solid, or Blood (Heme). The user can further customize the oncoprint through the ProteinPaint settings.
A few highlights of the ProteinPaint configs are shown below (Figure 3).
Figure 3: Oncoprint configs. ProteinPaint’s UI has customizable configurations that enable a user to edit and change the visualization.
VARIANT PREVALENCE
A placeholder is shown if a diagnosis and/or subtype has not been selected from the Sunburst (Figure 4).
Figure 4: Variants: Variant Prevalence view, without filtering. A placeholder is shown when a user directly accesses the tab.
Diagnosis and/or subtype is required for generating a new barplot (Figure 5).
Figure 5: Variant Prevalence view, with filtering. This visualization provides variant data for the selected diagnosis and subtype. The Mutation Type legend shows the same variant classes as the Oncoprint.
Once the filters are selected and submitted, a new barplot will render, together with the following mutational data for that diagnosis and/or subtype selection:
- (i) Mutation type proportion (0-1): a bar plot displaying the proportion of each mutation type, with consistent ordering of mutation types for each gene.
- (ii) Mutations: a table view displaying somatic and germline counts.
- (iii) Total number of mutations: a barplot representing the total number of variants in a given gene for the selected diagnosis and/or subtype.
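The three quantities described for this view (per-gene mutation type proportions, somatic and germline counts, and total counts) can be sketched as follows. This is a minimal illustration only: the record layout, the sample values, and the `summarize` function are hypothetical and do not reflect how PeCan or ProteinPaint compute or store these values.

```python
from collections import Counter, defaultdict

# Hypothetical variant records: (gene, mutation_type, origin).
# The real PeCan data model is not described in this document.
variants = [
    ("TP53", "MISSENSE", "somatic"),
    ("TP53", "MISSENSE", "germline"),
    ("TP53", "NONSENSE", "somatic"),
    ("TP53", "FRAMESHIFT", "somatic"),
    ("NRAS", "MISSENSE", "somatic"),
]


def summarize(records):
    """Return per-gene proportions (0-1), somatic/germline counts, and totals."""
    type_counts = defaultdict(Counter)
    origin_counts = defaultdict(Counter)
    for gene, mutation_type, origin in records:
        type_counts[gene][mutation_type] += 1
        origin_counts[gene][origin] += 1

    summary = {}
    for gene, counts in type_counts.items():
        total = sum(counts.values())
        summary[gene] = {
            # Sorting the mutation types keeps their order consistent across genes.
            "proportions": {t: n / total for t, n in sorted(counts.items())},
            "somatic": origin_counts[gene]["somatic"],
            "germline": origin_counts[gene]["germline"],
            "total": total,
        }
    return summary


summary = summarize(variants)
for gene, row in summary.items():
    print(gene, row)
```

The proportions for each gene sum to 1, which is why the first plot uses a 0-1 axis while the total count is shown separately.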
tip
- A user can also choose to render the barplot for just a disease, for example: Brain, Solid, or Blood (Heme).
- Clicking on a gene will enable a button that allows the user to view the gene in either the ProteinPaint or GenomePaint view.
- If a subtype is already selected, the data will also be filtered.
note
For some subtypes, we have identified the most important disease pathway for the gene(s) and have categorized them accordingly. This information is represented by headers, which are ordered by the number of genes in each pathway, with the largest pathway first.
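The header ordering described in this note can be sketched as a simple grouping step. The gene-to-pathway assignments and pathway names below are made up for illustration, and the alphabetical tie-break is an assumption; the document does not say how ties are resolved.

```python
# Hypothetical gene-to-pathway assignments, for illustration only.
gene_pathways = {
    "NRAS": "RAS signaling",
    "KRAS": "RAS signaling",
    "PTPN11": "RAS signaling",
    "TP53": "Cell cycle",
    "CDKN2A": "Cell cycle",
    "KMT2D": "Epigenetic",
}


def group_by_pathway(assignments):
    """Group genes under pathway headers, largest pathway first."""
    groups = {}
    for gene, pathway in assignments.items():
        groups.setdefault(pathway, []).append(gene)
    # Sort by descending gene count; ties fall back to pathway name (assumed).
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


ordered = group_by_pathway(gene_pathways)
for pathway, genes in ordered:
    print(f"{pathway} ({len(genes)}): {', '.join(genes)}")
```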
GENOMEPAINT
The user will be given a GenomePaint view, supported by ProteinPaint, for the full Pediatric dataset (highlighted in red) (Figure 6). See our Methods and Data section to learn more.
Figure 6. GenomePaint view, without filtering. This visualization provides the basic GenomePaint view, showing the gene on a given reference genome for the full Pediatric dataset.
Or, the user can filter the view based on the selected subtype (Figure 7).
Figure 7. GenomePaint view, with filtering. This visualization will provide the basic GenomePaint view where the reference genome, Pediatric dataset, and the selected subtype data will show.
tip
- The user can navigate the UI using the GenomePaint configs (only coordinates or HUGO gene names are accepted).
warning
If the user alters configs using the ProteinPaint UI (e.g., coordinate view, reference genome), the PeCan platform will not save this state in the membrane or update any filters across other views in this facet.
PROTEINPAINT
The user will be given a ProteinPaint view for the defaulted Pediatric dataset (highlighted in red) (Figure 8). See our Methods and Data section to learn more.
Figure 8. ProteinPaint view, without filtering. This visualization will provide the basic ProteinPaint view where the full Pediatric dataset defaults to hg19.
When the user filters by diagnosis and/or subtype, the user will see a subset of data underneath the full Pediatric dataset (Figure 9).
Figure 9. ProteinPaint view, with filtering. This visualization provides the basic ProteinPaint view, showing the variants for the gene in the default Pediatric dataset, with the selected subtype variants underneath the full-dataset lollipop plot.
tip
- The user can navigate the UI using the GenomePaint configs (only coordinates or HUGO gene names are accepted).
warning
If the user alters configs using the ProteinPaint UI (e.g., coordinate view, reference genome), the PeCan platform will not save this state in the membrane or update any filters across other views in this facet.
A few highlights of the ProteinPaint configs are shown below (Figure 10).
Figure 10. ProteinPaint configs. ProteinPaint’s UI has customizable configurations that enable a user to edit and change the visualization.
Certain variants that are classified as pathogenic or likely pathogenic (P/LP) may have a Variant Details page (Figure 11).
Figure 11. Accessing Variant Details. First, select the variant of interest; this will populate a sunburst view. Click list (highlighted in red above), which will populate the variants table. Then, click the pathogenic or likely pathogenic button to access the Variant Details page.
note
If there are multiple variant changes at the same coordinate, a user can hover below the bottom variant to find a total, which opens the variant table for all variants at that amino acid (as seen in Figure 11).
VARIANT DETAILS
A user can access variant details (Figure 12): individual, customized variant pages displaying detailed genomic, clinical, classification, and observed population frequency annotations. These details are similar to the output of our PIE tool, which displays the implementation of a systematic variant classification in the germline and tumor context and expert-recommended sources for variant classification (i.e., ClinGen expert committees).
Figure 12: (P/LP) Variant Details Page. A user interface that displays pathogenic or likely pathogenic context for recommended variant classification.
Variant Details Features
Each variant links to a detailed variant page, which integrates data from a variety of sources.
Summary information
The top of the page shows a summary of the variant, including its genomic and HGVS annotations, predicted effect on the protein, and somatic and germline medals. A description of the gene from Entrez follows, and a custom description or selection rationale may also be entered.
Medal call information
Clicking on one of the medal icons (gold, silver, bronze, unknown) at the top of the page will show a summary of information related to the medal call.
ProteinPaint
This embedded version of ProteinPaint (Zhou et al., Nat. Genet. 2016) enables visual inspection of variant recurrence, hotspots, and enrichment of loss-of-function mutations while highlighting the variant in the context of a number of pediatric datasets, including PCGP. A user can toggle to the ProteinPaint tab to look at further datasets, including COSMIC and ClinVar.
ACMG pathogenicity classification
Formal variant pathogenicity classification is supported by an interface implementing ACMG guidelines (Richards et al., Genet Med. 2015). The system will compute an appropriate pathogenicity score based on the categories detected by the pipeline. ACMG tag descriptions can be found in Table 1.
Tags | Description |
---|---|
PVS1 | truncating variant in a tumor suppressor or other loss-of-function gene. |
PM1 | somatic hotspot in COSMIC. |
PM2 | absent from ExAC non-TCGA or appearing at a frequency not greater than 0.0001. |
BA1 | >5% population frequency in ExAC non-TCGA. |
PM4 | in-frame protein insertions and deletions. |
PS1 | amino acid comparisons made. |
PM5 | pathogenic variants in ClinVar or those identified by the St. Jude germline variant review committee. |
Table 1. ACMG tags. Definitions of ACMG tags.
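To illustrate how the two population-frequency tags in Table 1 relate to an allele frequency, the sketch below maps an ExAC non-TCGA frequency to PM2 or BA1. Only the thresholds come from Table 1. The function name, the parameter name, and the use of `None` to mean "absent from the database" are assumptions, and this is not the pipeline's actual code. The other tags depend on annotations that are not described here and are left out.

```python
from typing import Optional

PM2_MAX_FREQUENCY = 0.0001  # Table 1: frequency "not greater than 0.0001"
BA1_MIN_FREQUENCY = 0.05    # Table 1: ">5% population frequency"


def frequency_tag(exac_non_tcga_af: Optional[float]) -> Optional[str]:
    """Return the frequency-based tag for a variant, or None if neither applies.

    A value of None for the input is assumed to mean the variant is absent
    from ExAC non-TCGA.
    """
    if exac_non_tcga_af is None or exac_non_tcga_af <= PM2_MAX_FREQUENCY:
        return "PM2"
    if exac_non_tcga_af > BA1_MIN_FREQUENCY:
        return "BA1"
    return None


for af in (None, 0.0001, 0.001, 0.2):
    print(af, frequency_tag(af))
```

Frequencies between the two thresholds receive neither tag in this sketch, since Table 1 defines no frequency tag for that range.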
ClinVar and allele frequency
- Matches of the variant in ClinVar are also provided, along with predicted clinical significance and review status.
- Allele frequencies for the variant in the Pediatric Cancer Genome Project (PCGP) (Downing et al. 2012) (somatic and germline), NHLBI ESP 6500, 1000 Genomes (The 1000 Genomes Project Consortium 2015), and ExAC databases are presented both as fractional values and on a log10 plot. Detailed allele population breakdowns are provided for ExAC.
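The value of showing allele frequencies on a log10 scale alongside the fractional values can be seen in a small sketch: rare variants whose frequencies differ by orders of magnitude become evenly spaced. The frequencies below are invented for illustration, and returning `None` for a zero frequency is an assumption; the document does not say how the page plots variants absent from a database.

```python
import math

# Hypothetical allele frequencies for one variant, for illustration only.
frequencies = {
    "ExAC": 0.0001,
    "1000 Genomes": 0.01,
    "NHLBI ESP 6500": 0.0,
}


def log10_frequency(allele_frequency):
    """Return log10 of a fractional allele frequency, or None when it is zero."""
    # log10 is undefined at zero, so an unobserved variant has no point to plot.
    if allele_frequency <= 0:
        return None
    return math.log10(allele_frequency)


for source, af in frequencies.items():
    print(source, af, log10_frequency(af))
```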
Damage prediction algorithms
Precomputed damage-prediction algorithm calls for nonsynonymous coding SNVs are presented from the dbNSFP database. Available algorithms are PolyPhen2 (HVAR), SIFT, CADD, REVEL, FATHMM, MutationAssessor, and LRT. The calls are presented as a list with entries color-coded based on the predicted severity of the result.
Medal ceremony and linkouts
Additional output from medal ceremony classification can also be reviewed. This output is only loosely structured; additional fields here may eventually be integrated into PeCan PIE. Links are provided to relevant dbSNP entries and other information sources.