Outputs
This is a list of all the outputs produced after a successful run of each of haystac
’s modules. All these outputs will be found under the --output
path specified by the user when running any of haystac
’s modules. We advise to create separate output directories for each one of the modules (database
, sample
and analyse
).
Expected outputs for haystac database
db_taxa_accessions.tsv: includes all the taxon/accession pairs that are included in a database.
idx_database.done: file indicating that all the individual bowtie2 indices for each taxon have been prepared
entrez: directory containing all the results from querying the NCBI, including the nucleotide and taxonomy databases
bowtie: directory containing the bowtie2 index files for all out database, that will be used for the filtering alignment
database_inputs: directory containing the representative RefSeq table that is downloaded from NCBI.
Expected outputs for haystac sample
fastq_inputs: folder containing the outputs of the sample module.
fastq_inputs/meta: directory that includes the read count file.
fastq_inputs/(SE | PE | COLLAPSED): directory containing the trimmed reads produced by AdapteRremoval
Expected outputs for haystac analyse
bam: directory containing the bam file of the filtering alignment
fastq: directory containing the filtered reads in fastq format along with their average read length
alignments: directory where all the individual alignment bam files for each taxon in a database of the filtered reads are outputted
ts_tv_counts: directory where all the transition and transversion counts are stored per taxon
probabilities: directory where the likelihood matrix and final posterior abundances/probabilities are stored. The final output for abundance calculation has the suffix posterior_abundance.tsv,
mapdamage: directory that includes all the mapdamage profiles for every taxon in our Database
reads: directory including all the Dirichlet reads for each taxon in out database.