a b/softwares_config/htseq_2.0.1.config
usage: htseq-count [-h] [--version] [-f {sam,bam,auto}] [-r {pos,name}]
                   [--max-reads-in-buffer MAX_BUFFER_SIZE]
                   [-s {yes,no,reverse}] [-a MINAQUAL] [-t FEATURE_TYPE]
                   [-i IDATTR] [--additional-attr ADDITIONAL_ATTRIBUTES]
                   [--add-chromosome-info]
                   [-m {union,intersection-strict,intersection-nonempty}]
                   [--nonunique {none,all,fraction,random}]
                   [--secondary-alignments {score,ignore}]
                   [--supplementary-alignments {score,ignore}] [-o SAMOUTS]
                   [-p {SAM,BAM,sam,bam}] [-d OUTPUT_DELIMITER]
                   [-c OUTPUT_FILENAME] [--counts-output-sparse]
                   [--append-output] [-n NPROCESSES]
                   [--feature-query FEATURE_QUERY] [-q] [--with-header]
                   samfilenames [samfilenames ...] featuresfilename

This script takes one or more alignment files in SAM/BAM format and a feature
file in GFF format and calculates for each feature the number of reads mapping
to it. See http://htseq.readthedocs.io/en/master/count.html for details.
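
The usage synopsis above can be turned into a concrete call. The following is a minimal sketch, not part of HTSeq: the helper name `build_htseq_count_command` and the file names `sample1.bam`, `sample2.bam` and `genes.gtf` are made up for illustration. It only assembles an argument list from flags that appear in the synopsis (`-s`, `-i`, `--feature-query`) and prints the equivalent shell command, so that the quoting of the `--feature-query` expression is visible. It does not run htseq-count.

```python
import shlex


def build_htseq_count_command(alignment_files, features_file, stranded="yes",
                              id_attrs=("gene_id",), feature_query=None):
    """Assemble an htseq-count argument list (hypothetical helper, not part of HTSeq)."""
    cmd = ["htseq-count", "-s", stranded]
    # -i may be given several times, once per attribute.
    for attr in id_attrs:
        cmd += ["-i", attr]
    if feature_query is not None:
        cmd += ["--feature-query", feature_query]
    # Positional arguments: one or more alignment files, then the features file.
    cmd += list(alignment_files)
    cmd.append(features_file)
    return cmd


cmd = build_htseq_count_command(
    ["sample1.bam", "sample2.bam"],   # illustrative file names
    "genes.gtf",                      # illustrative file name
    stranded="reverse",
    feature_query='gene_name == "ACTB"',
)
# shlex.join shows how the list would be typed in a POSIX shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` would avoid shell quoting altogether, since each list element reaches the program as a single argument.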

positional arguments:
  samfilenames          Path to the SAM/BAM files containing the mapped reads.
                        If '-' is selected, read from standard input
  featuresfilename      Path to the GTF file containing the features

optional arguments:
  -h, --help            show this help message and exit
  --version             Show software version and exit
  -f {sam,bam,auto}, --format {sam,bam,auto}
                        Type of <alignment_file> data. DEPRECATED: file format
                        is detected automatically. This option is ignored.
  -r {pos,name}, --order {pos,name}
                        'pos' or 'name'. Sorting order of <alignment_file>
                        (default: name). Paired-end sequencing data must be
                        sorted either by position or by read name, and the
                        sorting order must be specified. Ignored for single-
                        end data.
  --max-reads-in-buffer MAX_BUFFER_SIZE
                        When <alignment_file> is paired end sorted by
                        position, allow only so many reads to stay in memory
                        until the mates are found (raising this number will
                        use more memory). Has no effect for single end or
                        paired end sorted by name
  -s {yes,no,reverse}, --stranded {yes,no,reverse}
                        Whether the data is from a strand-specific assay.
                        Specify 'yes', 'no', or 'reverse' (default: yes).
                        'reverse' means 'yes' with reversed strand
                        interpretation
  -a MINAQUAL, --minaqual MINAQUAL
                        Skip all reads with MAPQ alignment quality lower than
                        the given minimum value (default: 10). MAPQ is the 5th
                        column of a SAM/BAM file and its usage depends on the
                        software used to map the reads.
  -t FEATURE_TYPE, --type FEATURE_TYPE
                        Feature type (3rd column in GTF file) to be used, all
                        features of other type are ignored (default, suitable
                        for Ensembl GTF files: exon)
  -i IDATTR, --idattr IDATTR
                        GTF attribute to be used as feature ID (default,
                        suitable for Ensembl GTF files: gene_id). All features
                        of the right type (see -t option) within the same GTF
                        attribute will be added together. The typical way of
                        using this option is to count all exonic reads from
                        each gene and add the exons but other uses are
                        possible as well. You can call this option multiple
                        times: in that case, the combination of all attributes
                        separated by colons (:) will be used as a unique
                        identifier, e.g. for exons you might use -i gene_id -i
                        exon_number.
  --additional-attr ADDITIONAL_ATTRIBUTES
                        Additional feature attributes (default: none, suitable
                        for Ensembl GTF files: gene_name). Use multiple times
                        for more than one additional attribute. These
                        attributes are only used as annotations in the output,
                        while the determination of how the counts are added
                        together is done based on option -i.
  --add-chromosome-info
                        Store information about the chromosome of each feature
                        as an additional attribute (e.g. column in the TSV
                        output file).
  -m {union,intersection-strict,intersection-nonempty}, --mode {union,intersection-strict,intersection-nonempty}
                        Mode to handle reads overlapping more than one feature
                        (choices: union, intersection-strict, intersection-
                        nonempty; default: union)
  --nonunique {none,all,fraction,random}
                        Whether and how to score reads that are not uniquely
                        aligned or ambiguously assigned to features (choices:
                        none, all, fraction, random; default: none)
  --secondary-alignments {score,ignore}
                        Whether to score secondary alignments (0x100 flag)
  --supplementary-alignments {score,ignore}
                        Whether to score supplementary alignments (0x800 flag)
  -o SAMOUTS, --samout SAMOUTS
                        Write out all SAM alignment records into SAM/BAM files
                        (one per input file needed), annotating each line with
                        its feature assignment (as an optional field with tag
                        'XF'). See the -p option to use BAM instead of SAM.
  -p {SAM,BAM,sam,bam}, --samout-format {SAM,BAM,sam,bam}
                        Format to use with the --samout option.
  -d OUTPUT_DELIMITER, --delimiter OUTPUT_DELIMITER
                        Column delimiter in output (default: TAB).
  -c OUTPUT_FILENAME, --counts_output OUTPUT_FILENAME
                        Filename to output the counts to instead of stdout.
  --counts-output-sparse
                        Store the counts as a sparse matrix (mtx, h5ad, loom).
  --append-output       Append counts output to an existing file instead of
                        creating a new one. This option is useful if you have
                        already created a TSV/CSV/similar file with a header
                        for your samples (with additional columns for the
                        feature name and any additional attributes) and want to
                        fill in the rest of the file.
  -n NPROCESSES, --nprocesses NPROCESSES
                        Number of parallel CPU processes to use (default: 1).
                        This option is useful to process several input files
                        at once. Each file will use only 1 CPU. It is
                        possible, of course, to split a very large input
                        SAM/BAM file into smaller chunks upstream to make use
                        of this option.
  --feature-query FEATURE_QUERY
                        Restrict to features described in this expression.
                        Currently supports a single kind of expression:
                        attribute == "one attr" to restrict the GFF to a
                        single gene or transcript, e.g. --feature-query
                        'gene_name == "ACTB"' - notice the single quotes
                        around the argument of this option and the double
                        quotes around the gene name. Broader queries might
                        become available in the future.
  -q, --quiet           Suppress progress report
  --with-header         Whether to add a column header to the output TSV file
                        indicating which column corresponds to which input BAM
                        file. Only used if output to console or tsv or csv
                        file. Defaults to False.
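
The read-level options above (`-a`, `--secondary-alignments`, `--supplementary-alignments`) can be pictured as a filter applied to each alignment record. The sketch below is an illustration only and is not HTSeq's implementation: the function name `is_scored` is hypothetical, and it works on plain tab-separated SAM text lines. It relies on three facts: MAPQ is the 5th SAM column (as stated above), FLAG is the 2nd SAM column (from the SAM format), and the 0x100 and 0x800 flag bits named above.

```python
SECONDARY = 0x100      # SAM flag bit named by --secondary-alignments
SUPPLEMENTARY = 0x800  # SAM flag bit named by --supplementary-alignments


def is_scored(sam_line, *, secondary, supplementary, minaqual=10):
    """Return True if a SAM record passes the filters (illustrative sketch only)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the 2nd column of a SAM record
    mapq = int(fields[4])  # MAPQ is the 5th column of a SAM record
    if mapq < minaqual:
        return False       # below the -a threshold
    if secondary == "ignore" and flag & SECONDARY:
        return False
    if supplementary == "ignore" and flag & SUPPLEMENTARY:
        return False
    return True


def make_record(flag, mapq):
    """Build a made-up SAM line with the given FLAG and MAPQ for the demo."""
    return "\t".join(["read1", str(flag), "chr1", "100", str(mapq),
                      "50M", "*", "0", "0", "*", "*"])


primary = make_record(flag=0, mapq=60)
low_quality = make_record(flag=0, mapq=3)
secondary_aln = make_record(flag=0x100, mapq=60)

print(is_scored(primary, secondary="ignore", supplementary="ignore"))
print(is_scored(low_quality, secondary="ignore", supplementary="ignore"))
print(is_scored(secondary_aln, secondary="ignore", supplementary="ignore"))
print(is_scored(secondary_aln, secondary="score", supplementary="ignore"))
```

The `secondary` and `supplementary` arguments are required keywords here because the help text above does not state a default for either option.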
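The way `-t` and repeated `-i` options select and name features can be sketched as follows. This is an assumption-laden illustration and not HTSeq code: the helpers `parse_gtf_attributes` and `feature_id` are hypothetical, the GTF line is made up, and the attribute parser is deliberately naive (it would break on values that contain semicolons). It shows the feature type being read from the 3rd GTF column and several attributes being joined with colons into one identifier.

```python
def parse_gtf_attributes(attr_field):
    """Parse a GTF attribute column such as 'gene_id "G1"; exon_number "2";'."""
    attrs = {}
    for item in attr_field.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def feature_id(gtf_line, id_attrs=("gene_id",), feature_type="exon"):
    """Return the colon-joined ID for a GTF line, or None if the type differs."""
    fields = gtf_line.rstrip("\n").split("\t")
    if fields[2] != feature_type:  # 3rd column holds the feature type (-t)
        return None
    attrs = parse_gtf_attributes(fields[8])  # 9th column holds the attributes
    return ":".join(attrs[name] for name in id_attrs)


# Made-up GTF record for illustration.
line = "\t".join(["chr1", "ensembl", "exon", "100", "200", ".", "+", ".",
                  'gene_id "G1"; exon_number "2"; gene_name "ACTB";'])

print(feature_id(line))                                        # G1
print(feature_id(line, id_attrs=("gene_id", "exon_number")))   # G1:2
print(feature_id(line, feature_type="gene"))                   # None
```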
Written by Simon Anders (sanders@fs.tum.de), European Molecular Biology
Laboratory (EMBL), Givanna Putri (g.putri@unsw.edu.au) and Fabio Zanini
(fabio.zanini@unsw.edu.au), UNSW Sydney. (c) 2010-2021. Released under the
terms of the GNU General Public License v3. Please cite the following paper if
you use this script: G. Putri et al. Analysing high-throughput sequencing data
in Python with HTSeq 2.0. Bioinformatics (2022).
https://doi.org/10.1093/bioinformatics/btac166. Part of the 'HTSeq' framework,
version 2.0.1.