All Classes and Interfaces
Class
Description
And expression
Annotate using a VCF "database"
Annotate using a VCF "database"
Note: Reads and loads the whole VCF file into memory
Annotate using a VCF "database"
Note: Assumes that the VCF database file is sorted.
Annotate using a tabix indexed VCF "database"
BoolArray is a class that provides a compact representation of a boolean array using a byte array.
Count number of heterozygous samples
Count number of homozygous samples
Count number of reference samples
Count number of ALT samples
A set of DataColumns, indexed by position.
A wrapper for a data column of primitive type T
A Data Column is a column of a specific data type (String, Float, Long, etc.) that is stored using primitive types for memory efficiency.
A column of boolean values, that can also be null
The boolean values are stored in a BoolArray
DataFrameDel is a specialized subclass of DataFrame designed to handle deletion variants.
DataFrameIns is a specialized subclass of DataFrame designed to handle insertion variants.
DataFrameMixed is a specialized subclass of DataFrame that is designed to handle mixed data types.
DataFrameMnp is a specialized subclass of DataFrame designed to handle
multi-nucleotide polymorphisms (MNPs).
DataFrameOther is a subclass of DataFrame that is specifically tailored for handling
variant type counting and categorization.
The DataFrameRow class represents a row in a DataFrame.
The DataFrameSnp class extends the DataFrame class and represents a specific type of data frame
that deals with single nucleotide polymorphisms (SNPs).
Use a file as a 'marker' database.
DbNSFP database:
Reference https://sites.google.com/site/jpopgen/dbNSFP
DbNSFP database entry:
Reference https://sites.google.com/site/jpopgen/dbNSFP
Added lazy parsing of key/value pairs
Use a VCF file as a database for annotations
A VCF database consists of a VCF file and an index.
Loads a VCF file into memory.
Use an uncompressed sorted VCF file as a database for annotations
Note: Assumes that the VCF database file is sorted and uncompressed.
Use a bgzip-compressed, tabix indexed VCF file as a database for annotations
And expression
The EnumArray class is a specialized array for storing and managing strings as enumerated values.
Equal
Exists operator (true if a field exists)
A generic expression
Expressions have values (VcfInfoType)
Binary condition
An expression that can be negated
Get a region from a fasta file
A field:
E.g.: 'DP', 'CHROM'
A 'constant' field: e.g.
A 'constant' field: e.g.
An 'EFF' field from SnpEff:
E.g.: 'EFF[2].GENE'
A field:
E.g.: 'GEN[2].GT'
A field:
E.g.: 'GEN[2].PL[3]'
Iterates on fields / sub-fields
It's a singleton
A LOF field from SnpEff:
E.g.: 'LOF[2].GENE'
An NMD field from SnpEff:
E.g.: 'NMD[2].GENE'
The Fields class represents a collection of VCF (Variant Call Format) header information fields.
A field that has sub fields (e.g.
A function that returns an expression (i.e.
A function that returns a bool type (i.e.
Greater equal
Greater equal
GWAS catalog table.
Entry from a GWAS-catalog
References:
http://www.genome.gov/gwastudies/#download
http://www.genome.gov/Pages/About/OD/OPG/GWAS%20Catalog/Tab_delimited_column_descriptions_09_27.pdf
Iterate on each line of a GWAS catalog (TXT, tab separated format)
Equal
Is an expression in a set?
An individual in the pedigree
Individuals are like TfamEntries but have drawing info (coordinates, color, etc.)
Is 'genotypeNum' heterozygous?
Is 'genotypeNum' homozygous?
Is 'genotypeNum' reference?
Is 'genotypeNum' reference?
Creates objects from an AST
Less than or equal
Greater equal
Represents a marker in a file (located at 'fileIdx' bytes since the beginning of the file)
Match a regular expression (string)
And expression
And expression
Exists operator (true if a field exists)
Not equal
Not expression
Match a regular expression (string)
Or expression
Draws a pedigree using SVG
Conservation score
And expression
The PosIndex class is designed to manage and index chromosome positions efficiently, converting chromosome positions to zero-based indices.
A database query and a result.
Generic SnpSift tool caller
This class provides an empty implementation of SnpSiftListener, which can be extended to create a listener which only needs to handle a subset of the available methods.
This class provides an empty implementation of SnpSiftVisitor, which can be extended to create a visitor which only needs to handle a subset of the available methods.
Convert VCF file to allele matrix
Note: Only use SNPs
Note: Only variants with two possible alleles.
Annotate a VCF file with ID from another VCF file (database)
Annotate a VCF file from another VCF file (database) by creating "DataFrames" for each chromosome.
Count number of cases and controls
Summarize a VCF annotated file
Calculate genotyping concordance between two VCF files.
Convert allele 'matrix' file into Covariance matrix
Note: Only variants with two possible alleles.
Annotate a VCF file with dbNSFP.
Extract fields from VCF file to a TXT (tab separated) format
Generic SnpSift filter
Filter out data based on VCF attributes:
- Chromosome, Position, etc.
Filter using CHROM:POS only
Generic SnpSift genotype filter
Removes genotypes matching the filter:
e.g.
Annotate a VCF file using Gene sets (MSigDb) or gene ontology (GO)
Add genotype information to INFO fields
Annotate a VCF file using GWAS catalog database
Loads GWAS catalog in memory, thus it makes no assumption about order.
Calculate Hardy-Weinberg equilibrium and goodness of fit for each entry in a VCF file
Intersect intervals
Filter variants that hit intervals
Filter variants that hit intervals
Use an indexed VCF file.
Annotate a VCF file with ID from another VCF file (database)
Draws a pedigree using SVG according to a VCF file
Annotate using PhastCons score files
Annotate if a variant is 'private'.
Removes reference genotypes.
Removes INFO fields
Sort VCF file/s by chromosome & position
Split a large VCF file by chromosome or by number of lines
Calculate Ts/Tv ratios per sample (transitions vs transversions)
Annotate a VCF file with variant type
Transform a VCF to a TPED file
Check VCF files (run some simple checks)
Annotate a field based on an operation (max, min, etc.) of other VCF fields
This interface defines a complete listener for a parse tree produced by SnpSiftParser.
This interface defines a complete generic visitor for a parse tree produced by SnpSiftParser.
This is a class that reads a VCF file and returns the variants in sorted order.
The StringArray class implements a memory-efficient array of strings.
Abstract base class for a string array that supports iteration and serialization.
Summarize a VCF annotated file
And expression
The `VariantCategory` enum represents different categories of genetic variants.
A counter for genotypes
A database of variant's data used to annotate a VCF file (i.e.
A DataFrame of variant's data that is indexed by "variant type AND chromosome position".
VariantTypeCounter is a class that counts the number of variants in a VCF (Variant Call Format) file.
The VariantTypeCounters class is responsible for counting the number of variants in VCF (Variant Call Format) files.
Calculate Hardy-Weinberg equilibrium and goodness of fit.
An index for a VCF file
Represents a set of VCF entries stored in an (uncompressed) file
All entries belong to the same chromosome
Interval tree structure for a 'VcfIndexChromo'
The whole tree is stored in a single class as a set of arrays.
Calculate Linkage Disequilibrium
Reference: "Principles of population genetics (4th edition)" Hartl & Clark, pages 73 to 81
Note: I try to follow the same notation as the book.
Or expression
Test: This class loads a "database" VCF file and then annotates another VCF file.