AutoBA / Git / Diff of /softwares_config/hisat2

Models:
AlyssaS/
AutoBA
Downloads: 1
Diff of /softwares_config/hisat2_2.2.1.config [000000] .. [014e6e]
Switch to side-by-side view

--- a
+++ b/softwares_config/hisat2_2.2.1.config
@@ -0,0 +1,140 @@
+HISAT2 version 2.2.1 by Daehwan Kim (infphilo@gmail.com, www.ccb.jhu.edu/people/infphilo)
+Usage: 
+  hisat2 [options]* -x <ht2-idx> {-1 <m1> -2 <m2> | -U <r>} [-S <sam>]
+
+  <ht2-idx>  Index filename prefix (minus trailing .X.ht2).
+  <m1>       Files with #1 mates, paired with files in <m2>.
+             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
+  <m2>       Files with #2 mates, paired with files in <m1>.
+             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
+  <r>        Files with unpaired reads.
+             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
+  <sam>      File for SAM output (default: stdout)
+
+  <m1>, <m2>, <r> can be comma-separated lists (no whitespace) and can be
+  specified many times.  E.g. '-U file1.fq,file2.fq -U file3.fq'.
+
+Options (defaults in parentheses):
+
+ Input:
+  -q                 query input files are FASTQ .fq/.fastq (default)
+  --qseq             query input files are in Illumina's qseq format
+  -f                 query input files are (multi-)FASTA .fa/.mfa
+  -r                 query input files are raw one-sequence-per-line
+  -c                 <m1>, <m2>, <r> are sequences themselves, not files
+  -s/--skip <int>    skip the first <int> reads/pairs in the input (none)
+  -u/--upto <int>    stop after first <int> reads/pairs (no limit)
+  -5/--trim5 <int>   trim <int> bases from 5'/left end of reads (0)
+  -3/--trim3 <int>   trim <int> bases from 3'/right end of reads (0)
+  --phred33          qualities are Phred+33 (default)
+  --phred64          qualities are Phred+64
+  --int-quals        qualities encoded as space-delimited integers
+
+ Presets:                 Same as:
+   --fast                 --no-repeat-index
+   --sensitive            --bowtie2-dp 1 -k 30 --score-min L,0,-0.5
+   --very-sensitive       --bowtie2-dp 2 -k 50 --score-min L,0,-1
+
+ Alignment:
+  --bowtie2-dp <int> use Bowtie2's dynamic programming alignment algorithm (0) - 0: no dynamic programming, 1: conditional dynamic programming, and 2: unconditional dynamic programming (slowest)
+  --n-ceil <func>    func for max # non-A/C/G/Ts permitted in aln (L,0,0.15)
+  --ignore-quals     treat all quality values as 30 on Phred scale (off)
+  --nofw             do not align forward (original) version of read (off)
+  --norc             do not align reverse-complement version of read (off)
+  --no-repeat-index  do not use repeat index
+
+ Spliced Alignment:
+  --pen-cansplice <int>              penalty for a canonical splice site (0)
+  --pen-noncansplice <int>           penalty for a non-canonical splice site (12)
+  --pen-canintronlen <func>          penalty for long introns (G,-8,1) with canonical splice sites
+  --pen-noncanintronlen <func>       penalty for long introns (G,-8,1) with noncanonical splice sites
+  --min-intronlen <int>              minimum intron length (20)
+  --max-intronlen <int>              maximum intron length (500000)
+  --known-splicesite-infile <path>   provide a list of known splice sites
+  --novel-splicesite-outfile <path>  report a list of splice sites
+  --novel-splicesite-infile <path>   provide a list of novel splice sites
+  --no-temp-splicesite               disable the use of splice sites found
+  --no-spliced-alignment             disable spliced alignment
+  --rna-strandness <string>          specify strand-specific information (unstranded)
+  --tmo                              reports only those alignments within known transcriptome
+  --dta                              reports alignments tailored for transcript assemblers
+  --dta-cufflinks                    reports alignments tailored specifically for cufflinks
+  --avoid-pseudogene                 tries to avoid aligning reads to pseudogenes (experimental option)
+  --no-templatelen-adjustment        disables template length adjustment for RNA-seq reads
+
+ Scoring:
+  --mp <int>,<int>   max and min penalties for mismatch; lower qual = lower penalty <6,2>
+  --sp <int>,<int>   max and min penalties for soft-clipping; lower qual = lower penalty <2,1>
+  --no-softclip      no soft-clipping
+  --np <int>         penalty for non-A/C/G/Ts in read/ref (1)
+  --rdg <int>,<int>  read gap open, extend penalties (5,3)
+  --rfg <int>,<int>  reference gap open, extend penalties (5,3)
+  --score-min <func> min acceptable alignment score w/r/t read length
+                     (L,0.0,-0.2)
+
+ Reporting:
+  -k <int>           It searches for at most <int> distinct, primary alignments for each read. Primary alignments mean 
+                     alignments whose alignment score is equal to or higher than any other alignments. The search terminates 
+                     when it cannot find more distinct valid alignments, or when it finds <int>, whichever happens first. 
+                     The alignment score for a paired-end alignment equals the sum of the alignment scores of 
+                     the individual mates. Each reported read or pair alignment beyond the first has the SAM ‘secondary’ bit 
+                     (which equals 256) set in its FLAGS field. For reads that have more than <int> distinct, 
+                     valid alignments, hisat2 does not guarantee that the <int> alignments reported are the best possible 
+                     in terms of alignment score. Default: 5 (linear index) or 10 (graph index).
+                     Note: HISAT2 is not designed with large values for -k in mind, and when aligning reads to long, 
+                     repetitive genomes, large -k could make alignment much slower.
+  --max-seeds <int>  HISAT2, like other aligners, uses seed-and-extend approaches. HISAT2 tries to extend seeds to 
+                     full-length alignments. In HISAT2, --max-seeds is used to control the maximum number of seeds that 
+                     will be extended. For DNA-read alignment (--no-spliced-alignment), HISAT2 extends up to these many seeds
+                     and skips the rest of the seeds. For RNA-read alignment, HISAT2 skips extending seeds and reports 
+                     no alignments if the number of seeds is larger than the number specified with the option, 
+                     to be compatible with previous versions of HISAT2. Large values for --max-seeds may improve alignment 
+                     sensitivity, but HISAT2 is not designed with large values for --max-seeds in mind, and when aligning 
+                     reads to long, repetitive genomes, large --max-seeds could make alignment much slower. 
+                     The default value is the maximum of 5 and the value that comes with -k times 2.
+  -a/--all           HISAT2 reports all alignments it can find. Using the option is equivalent to using both --max-seeds 
+                     and -k with the maximum value that a 64-bit signed integer can represent (9,223,372,036,854,775,807).
+  --repeat           report alignments to repeat sequences directly
+
+ Paired-end:
+  -I/--minins <int>  minimum fragment length (0), only valid with --no-spliced-alignment
+  -X/--maxins <int>  maximum fragment length (500), only valid with --no-spliced-alignment
+  --fr/--rf/--ff     -1, -2 mates align fw/rev, rev/fw, fw/fw (--fr)
+  --no-mixed         suppress unpaired alignments for paired reads
+  --no-discordant    suppress discordant alignments for paired reads
+
+ Output:
+  -t/--time          print wall-clock time taken by search phases
+  --un <path>           write unpaired reads that didn't align to <path>
+  --al <path>           write unpaired reads that aligned at least once to <path>
+  --un-conc <path>      write pairs that didn't align concordantly to <path>
+  --al-conc <path>      write pairs that aligned concordantly at least once to <path>
+  (Note: for --un, --al, --un-conc, or --al-conc, add '-gz' to the option name, e.g.
+  --un-gz <path>, to gzip compress output, or add '-bz2' to bzip2 compress output.)
+  --summary-file <path> print alignment summary to this file.
+  --new-summary         print alignment summary in a new style, which is more machine-friendly.
+  --quiet               print nothing to stderr except serious errors
+  --met-file <path>     send metrics to file at <path> (off)
+  --met-stderr          send metrics to stderr (off)
+  --met <int>           report internal counters & metrics every <int> secs (1)
+  --no-head             suppress header lines, i.e. lines starting with @
+  --no-sq               suppress @SQ header lines
+  --rg-id <text>        set read group id, reflected in @RG line and RG:Z: opt field
+  --rg <text>           add <text> ("lab:value") to @RG line of SAM header.
+                        Note: @RG line only printed when --rg-id is set.
+  --omit-sec-seq        put '*' in SEQ and QUAL fields for secondary alignments.
+
+ Performance:
+  -o/--offrate <int> override offrate of index; must be >= index's offrate
+  -p/--threads <int> number of alignment threads to launch (1)
+  --reorder          force SAM output order to match order of input reads
+  --mm               use memory-mapped I/O for index; many 'hisat2's can share
+
+ Other:
+  --qc-filter        filter out reads that are bad according to QSEQ filter
+  --seed <int>       seed for random number generator (0)
+  --non-deterministic seed rand. gen. arbitrarily instead of using read attributes
+  --remove-chrname   remove 'chr' from reference names in alignment
+  --add-chrname      add 'chr' to reference names in alignment 
+  --version          print version information and quit
+  -h/--help          print this usage message