Class VariantDatabase
java.lang.Object
org.snpsift.annotate.mem.database.VariantDatabase
A database of variant data used to annotate a VCF file (i.e. its VCF entries).
The database only loads one chromosome at a time, to speed up the process while fitting in memory.
The database is a collection of VariantDatabaseChr objects, one per chromosome, each stored in its own file.
VariantDatabase manages the VariantDatabaseChr files (loading, saving, etc.).
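The one-chromosome-at-a-time strategy can be illustrated with a minimal sketch. This is not the SnpSift implementation: `ChromosomeCacheSketch`, its loader function and its load counter are hypothetical names, and the sketch assumes entries arrive grouped by chromosome so that each chromosome is loaded at most once per pass.

```java
import java.util.function.Function;

/** Hypothetical sketch: keeps only one chromosome's data in memory at a time. */
class ChromosomeCacheSketch<T> {
    private final Function<String, T> loader; // loads the data for one chromosome (e.g. from its file)
    private String currentChr;                // chromosome currently in memory, or null if none
    private T currentData;                    // data for currentChr
    private int loadCount;                    // number of loads performed so far

    ChromosomeCacheSketch(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Return the data for chr, loading it only when the chromosome changes. */
    T get(String chr) {
        if (!chr.equals(currentChr)) {
            currentData = loader.apply(chr); // replaces the previously loaded chromosome
            currentChr = chr;
            loadCount++;
        }
        return currentData;
    }

    int getLoadCount() {
        return loadCount;
    }
}
```

With sorted input, consecutive lookups on the same chromosome reuse the data already in memory, and a load happens only at chromosome boundaries.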
This class provides methods to:
- Create a variant database from a VCF file or its contents.
- Annotate VCF entries using the database.
- Add VCF entries to the database.
- Check the presence of required fields in the database.
- Load and save the database and its components.
- Handle database directories and file names.
The class maintains the current chromosome and interval being processed, and uses a VariantDataFrame to store
the data for the current chromosome. It also uses VariantTypeCounters to count variants per chromosome.
The class supports verbose logging and progress display during database creation and annotation.
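As an illustration of counting variants by type, here is a minimal sketch of a counter of the kind that could be kept per chromosome. The class name `VariantTypeCountSketch`, the four categories, and the length-based classification rule are assumptions made for this example; they are not taken from VariantTypeCounters itself.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch: counts variants by type for one chromosome. */
class VariantTypeCountSketch {
    enum Type { SNP, MNP, INS, DEL }

    private final Map<Type, Integer> counts = new EnumMap<>(Type.class);

    /** Classify a variant by comparing REF and ALT lengths (assumed rule). */
    static Type classify(String ref, String alt) {
        if (ref.length() == alt.length()) return ref.length() == 1 ? Type.SNP : Type.MNP;
        return ref.length() < alt.length() ? Type.INS : Type.DEL;
    }

    /** Increment the counter for the type of this variant. */
    void count(String ref, String alt) {
        counts.merge(classify(ref, alt), 1, Integer::sum);
    }

    /** Number of variants of the given type seen so far. */
    int get(Type type) {
        return counts.getOrDefault(type, 0);
    }
}
```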
Field Summary
Fields
- VARIANT_DATABASE_EXT
- VARIANT_DATAFRAME_EXT
- FIELDS_EXT
Constructor Summary
Constructors
- VariantDatabase(String databaseVcfFileName, String dbDir, String[] fieldNamesCreate): Constructor used to create a database.
- VariantDatabase(String databaseVcfFileName, String dbDir, String[] fieldNamesAnnotate, String prefix, boolean emptyIfNotFound): Constructor used to annotate with a database.
Method Summary
- protected void add(org.snpeff.vcf.VariantVcfEntry variantVcfEntry): Add a VCF entry to the database.
- int annotate(org.snpeff.vcf.VcfEntry vcfEntry): Annotate a VCF entry. The annotations are added to the INFO field of the VCF entry.
- boolean checkFields(String[] fieldNames, boolean throwExceptionOnError): Check that all `fieldNames` are available in `fields`.
- void create(): Create a database from a VCF file.
- void create: Create a database from VCF file content (as a string). This is used for testing.
- static String dbDirFromVcfFileName(String dbFileName): Get the database directory name from a "VCF database" file name by appending ".snpsift.vardb".
- get: Get the database for a chromosome.
- getDatabaseVcfFileName
- getDbDir()
- getFields
- getVariantTypeCounters
- void load()
- protected Fields parseVcfHeaderFields(String databaseVcfFileName): Read the VCF header and get the column data types.
- protected Fields parseVcfHeaderFields(org.snpeff.fileIterator.VcfFileIterator vcfFileIterator): Read the VCF header and get the column data types.
- saveCurrentDataFrame
- void saveFields()
- setDbDir
- void setFieldNamesAnnotate(String[] fieldNamesAnnotate)
- setPrefix
- void setVerbose(boolean verbose)
- toString()
- Collection<org.snpeff.vcf.VcfHeaderEntry> vcfHeaders
Field Details
VARIANT_DATABASE_EXT
VARIANT_DATAFRAME_EXT
FIELDS_EXT
Constructor Details
Method Details
dbDirFromVcfFileName
Get the database directory name from a "VCF database" file name by appending ".snpsift.vardb".
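The naming rule described for dbDirFromVcfFileName amounts to a string concatenation. The sketch below follows that description only; the class name `DbDirNameSketch` and the constant `EXT` are hypothetical, and the real method may differ in details not documented here.

```java
/** Hypothetical sketch of the directory naming rule described above. */
class DbDirNameSketch {
    // Extension taken from the method description
    static final String EXT = ".snpsift.vardb";

    /** Derive the database directory name by appending the extension to the file name. */
    static String dbDirFromVcfFileName(String dbFileName) {
        return dbFileName + EXT;
    }
}
```

Under this rule, a database VCF file named `db.vcf.gz` maps to a directory named `db.vcf.gz.snpsift.vardb`.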
add
protected void add(org.snpeff.vcf.VariantVcfEntry variantVcfEntry)
Add a VCF entry to the database.
annotate
public int annotate(org.snpeff.vcf.VcfEntry vcfEntry)
Annotate a VCF entry. The annotations are added to the INFO field of the VCF entry.
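To show what adding annotations to the INFO field can look like, here is a minimal sketch that works on plain strings rather than on VcfEntry objects. `InfoAnnotateSketch` and `addToInfo` are hypothetical names, and the way the prefix is applied (prepended to each key) is an assumption; only the VCF conventions of `;`-separated `key=value` pairs and `.` for an empty INFO column are standard.

```java
import java.util.Map;

/** Hypothetical sketch: appends key=value annotations to a VCF INFO string. */
class InfoAnnotateSketch {
    /** Return info with each annotation appended as prefix+key=value, separated by ';'. */
    static String addToInfo(String info, Map<String, String> annotations, String prefix) {
        StringBuilder sb = new StringBuilder();
        // In VCF, an INFO column of "." means "no data", so start from empty in that case
        if (info != null && !info.isEmpty() && !info.equals(".")) sb.append(info);
        for (Map.Entry<String, String> e : annotations.entrySet()) {
            if (sb.length() > 0) sb.append(';');
            sb.append(prefix).append(e.getKey()).append('=').append(e.getValue());
        }
        // Keep the "." placeholder when there is still nothing to report
        return sb.length() > 0 ? sb.toString() : ".";
    }
}
```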
checkFields
Check that all `fieldNames` are available in `fields`.
Returns:
- true if all fieldNames are present, false otherwise
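The check described for checkFields can be sketched as a membership test over the available field names. This is an illustration under assumptions: `CheckFieldsSketch` is a hypothetical class, the available fields are modeled as a plain set of strings, and the exception type thrown when `throwExceptionOnError` is true is a guess (the documentation above does not name it).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: checks requested field names against the available ones. */
class CheckFieldsSketch {
    private final Set<String> available; // field names present in the database

    CheckFieldsSketch(String... availableFields) {
        this.available = new HashSet<>(Arrays.asList(availableFields));
    }

    /** Return true if every name is available; otherwise return false or throw. */
    boolean checkFields(String[] fieldNames, boolean throwExceptionOnError) {
        for (String name : fieldNames) {
            if (!available.contains(name)) {
                // Exception type is an assumption made for this sketch
                if (throwExceptionOnError) throw new IllegalArgumentException("Field not found: " + name);
                return false;
            }
        }
        return true;
    }
}
```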
create
Create a database from VCF file content (as a string). This is used for testing.
create
public void create()
Create a database from a VCF file.
get
Get the database for a chromosome.
getDatabaseVcfFileName
getDbDir
getFields
getVariantTypeCounters
load
public void load()
parseVcfHeaderFields
Read the VCF header and get the column data types.
Parameters:
- databaseVcfFileName
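Reading data types from a VCF header can be sketched with the standard `##INFO=<ID=...,Number=...,Type=...,Description="...">` meta-information lines. The sketch below is not the SnpSift parser: `VcfHeaderTypesSketch` and `parseInfoTypes` are hypothetical names, the result is a plain map instead of a Fields object, and it assumes the keys appear in the usual ID, Number, Type order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: extracts INFO field names and types from VCF header text. */
class VcfHeaderTypesSketch {
    // Matches e.g. ##INFO=<ID=AF,Number=A,Type=Float,Description="...">
    private static final Pattern INFO_LINE =
        Pattern.compile("^##INFO=<ID=([^,]+),Number=[^,]+,Type=([^,>]+)");

    /** Return a map from INFO field ID to its declared type, in header order. */
    static Map<String, String> parseInfoTypes(String header) {
        Map<String, String> types = new LinkedHashMap<>();
        for (String line : header.split("\\R")) {
            Matcher m = INFO_LINE.matcher(line);
            if (m.find()) types.put(m.group(1), m.group(2));
        }
        return types;
    }
}
```

Lines that are not INFO declarations, such as `##fileformat` or the `#CHROM` column line, are skipped.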
parseVcfHeaderFields
Read the VCF header and get the column data types.
Parameters:
- databaseVcfFileName
saveFields
public void saveFields()
saveCurrentDataFrame
setDbDir
setFieldNamesAnnotate
setPrefix
setVerbose
public void setVerbose(boolean verbose)
toString
vcfHeaders