Class VariantDatabase

java.lang.Object
org.snpsift.annotate.mem.database.VariantDatabase

public class VariantDatabase extends Object
A database of variant data used to annotate a VCF file (i.e. VCF entries). The database loads only one chromosome at a time, which speeds up processing while keeping the data within memory. The database is a collection of VariantDatabaseChr objects, one per chromosome, and each 'VariantDatabaseChr' is stored in its own file. 'VariantDatabase' manages the 'VariantDatabaseChr' files (loading, saving, etc).

This class provides methods to:
- Create a variant database from a VCF file or its contents.
- Annotate VCF entries using the database.
- Add VCF entries to the database.
- Check the presence of required fields in the database.
- Load and save the database and its components.
- Handle database directories and file names.

The class maintains the current chromosome and interval being processed, and uses a VariantDataFrame to store the data for the current chromosome. It also uses VariantTypeCounters to count variants per chromosome. The class supports verbose logging and progress display during database creation and annotation.
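The "one chromosome at a time" strategy described above can be illustrated with a minimal, self-contained sketch. It is not the SnpSift implementation: `PerChromosomeCache`, its `loader` function and `getLoadCount()` are hypothetical names introduced only for this example, and the loader stands in for reading a per-chromosome file from disk.

```java
import java.util.function.Function;

/** Hypothetical sketch: keeps the data for only one chromosome in memory at a time. */
class PerChromosomeCache<T> {
    private final Function<String, T> loader; // Stand-in for reading one per-chromosome file
    private String currentChr;                // Chromosome currently held in memory
    private T currentData;                    // Data for 'currentChr'
    private int loadCount;                    // Number of times the loader was invoked

    PerChromosomeCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Return the data for 'chr', loading it only when the chromosome changes. */
    T get(String chr) {
        if (!chr.equals(currentChr)) {
            currentData = loader.apply(chr); // The previous chromosome's data is dropped
            currentChr = chr;
            loadCount++;
        }
        return currentData;
    }

    int getLoadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        PerChromosomeCache<String> cache = new PerChromosomeCache<>(chr -> "data for " + chr);
        for (String chr : new String[] {"chr1", "chr1", "chr2", "chr2", "chr2"}) {
            System.out.println(chr + " -> " + cache.get(chr));
        }
        System.out.println("Loads: " + cache.getLoadCount()); // prints "Loads: 2"
    }
}
```

In this sketch, input sorted by chromosome triggers one load per chromosome, whereas input that alternates between chromosomes forces a reload on every switch.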
  • Field Details

  • Constructor Details

    • VariantDatabase

      public VariantDatabase()
    • VariantDatabase

      public VariantDatabase(String databaseVcfFileName, String dbDir, String[] fieldNamesCreate)
      Constructor used to create a database
    • VariantDatabase

      public VariantDatabase(String databaseVcfFileName, String dbDir, String[] fieldNamesAnnotate, String prefix, boolean emptyIfNotFound)
      Constructor used to annotate VCF entries using an existing database
  • Method Details

    • dbDirFromVcfFileName

      public static String dbDirFromVcfFileName(String dbFileName)
      Get the database directory name from a "VCF database" file name by just appending ".snpsift.vardb"
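The naming rule amounts to plain string concatenation. The following is a stand-alone sketch of that rule, not the library's source: the class `DbDirName`, the constant `DB_DIR_SUFFIX` and the example file name are assumptions made for illustration, and only the ".snpsift.vardb" suffix comes from the description above.

```java
/** Hypothetical stand-alone sketch of the directory naming rule. */
class DbDirName {
    // Suffix quoted in the method description
    static final String DB_DIR_SUFFIX = ".snpsift.vardb";

    /** Derive the database directory name by appending the suffix to the file name. */
    static String dbDirFromVcfFileName(String dbFileName) {
        return dbFileName + DB_DIR_SUFFIX;
    }

    public static void main(String[] args) {
        // 'clinvar.vcf.gz' is an example file name, not one taken from the documentation
        System.out.println(dbDirFromVcfFileName("clinvar.vcf.gz"));
        // prints "clinvar.vcf.gz.snpsift.vardb"
    }
}
```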
    • add

      protected void add(org.snpeff.vcf.VariantVcfEntry variantVcfEntry)
      Add a VCF entry to the database
    • annotate

      public int annotate(org.snpeff.vcf.VcfEntry vcfEntry)
      Annotate a VCF entry. The annotations are added to the INFO field of the VCF entry.
    • checkFields

      public boolean checkFields(String[] fieldNames, boolean throwExceptionOnError)
      Check that all `fieldNames` are available in `fields`
      Returns:
      true if all fieldNames are present, false otherwise
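A check of this kind can be sketched as follows. This is an illustration under assumptions rather than the actual method body: the class `FieldCheck`, the `available` set (standing in for the database's `fields`) and the use of `RuntimeException` are all hypothetical, since the documentation does not state which exception type is thrown.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a "check or throw" field validation. */
class FieldCheck {
    /**
     * Return true if every name in 'fieldNames' is in 'available'.
     * On the first missing name, either throw (assumed RuntimeException) or return false.
     */
    static boolean checkFields(Set<String> available, String[] fieldNames, boolean throwExceptionOnError) {
        for (String name : fieldNames) {
            if (!available.contains(name)) {
                if (throwExceptionOnError) {
                    throw new RuntimeException("Field '" + name + "' not found in database");
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Example field names chosen for illustration only
        Set<String> available = new HashSet<>(Arrays.asList("AF", "DP"));
        System.out.println(checkFields(available, new String[] {"AF"}, false));       // prints "true"
        System.out.println(checkFields(available, new String[] {"AF", "XX"}, false)); // prints "false"
    }
}
```

The boolean flag lets a caller choose between failing fast, for example when a required annotation field is missing, and handling the missing field itself.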
    • create

      public void create(String vcfContents)
      Create a database from the contents of a VCF file (as a string). This is used for testing.
    • create

      public void create()
      Create a database from a VCF file
    • get

      public VariantDataFrame get(String chr)
      Get the database for a chromosome
    • getDatabaseVcfFileName

      public String getDatabaseVcfFileName()
    • getDbDir

      public String getDbDir()
    • getFields

      public Fields getFields()
    • getVariantTypeCounters

      public VariantTypeCounters getVariantTypeCounters()
    • load

      public void load()
    • parseVcfHeaderFields

      protected Fields parseVcfHeaderFields(String databaseVcfFileName)
      Read the VCF header and get the data type of each column
      Parameters:
      databaseVcfFileName - name of the database VCF file
    • parseVcfHeaderFields

      protected Fields parseVcfHeaderFields(org.snpeff.fileIterator.VcfFileIterator vcfFileIterator)
      Read the VCF header and get the data type of each column
      Parameters:
      vcfFileIterator - iterator over the VCF file whose header is read
    • saveFields

      public void saveFields()
    • saveCurrentDataFrame

      public String saveCurrentDataFrame()
    • setDbDir

      public void setDbDir(String dbDir)
    • setFieldNamesAnnotate

      public void setFieldNamesAnnotate(String[] fieldNamesAnnotate)
    • setPrefix

      public void setPrefix(String prefix)
    • setVerbose

      public void setVerbose(boolean verbose)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • vcfHeaders

      public Collection<org.snpeff.vcf.VcfHeaderEntry> vcfHeaders()