Diff of /CHANGELOG.rst [000000] .. [d7cf27]

Changelog
=========

0.10.2 (2021-09-29)
-------------------

- Fixed typos in documentation

0.10.1 (2021-06-28)
-------------------

- Adapted pysam dependency
- Fixed bug in coverage loading from BAM files

0.10.0 (2020-10-01)
-------------------

- Added predict_variant_effect as a stand-alone function.
- Adapted predict_variant_effect to accept a genome in FASTA format in addition to the previously required Bioseq object.
- Added idx argument to input_attribution to allow selecting the i-th sequence of a Bioseq object.

0.9.9 (2020-08-04)
-------------------

- Added compatibility and tests for tensorflow 2.2.
- Adapted setup.py to enable use with tensorflow 1.14 or tensorflow 2.2 and the respective compatible keras versions.
- Expanded mode option for Cover.create_from_bed. New options: 'bedgraph', 'score_category' (synonym for 'categorical'), 'name_category'.
- Fixed mode='categorical'/'score_category' behaviour for Cover.create_from_bed.
- Mode 'bedgraph' enables compatibility with the bedgraph format.
- Fixed issue with intervals overhanging the chromosome ends and intervals on non-existing chromosomes for Cover.create_from_bigwig and Cover.create_from_bam.
- Fixed issue in building the DnaConv2D wrapper if the forward and revcomp layers were already built.

0.9.8 (2020-07-13)
------------------

- Improved JangguSequence, which can be used with keras and tensorflow 2.0

0.9.7 (2020-05-07)
------------------

- Performance improvement for loading BED files.
- If no binsize is supplied, the binsize is detected automatically as the longest interval in the roi. Previously, automatic detection was only offered if all intervals were of equal length; otherwise, a binsize had to be specified manually.

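The automatic binsize detection described above can be illustrated with a minimal sketch. The helper name ``detect_binsize`` and the tuple layout of the intervals are assumptions made for this example only; they are not part of the janggu API.

```python
# Hypothetical helper (not janggu code): choose the bin size as the
# length of the longest interval in a region of interest (ROI).
def detect_binsize(roi):
    """Return the length of the longest (chrom, start, end) interval."""
    if not roi:
        raise ValueError("ROI must contain at least one interval")
    return max(end - start for _chrom, start, end in roi)


roi = [("chr1", 0, 200), ("chr1", 500, 1000), ("chr2", 100, 350)]
print(detect_binsize(roi))  # 500
```

With intervals of unequal length, the longest one (here 500 bp) determines the bin size.
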
0.9.6 (2020-02-26)
------------------

- Retired support for python 2.7
- RandomShift wrapper for data augmentation applied to DNA/Protein sequences was added (thanks to @remomomo).
- Bedgraph files can be read using Cover.create_from_bed
- Improved efficiency of Cover.export_to_bigwig
- Improved efficiency of Cover.create_from_bed
- Internal refactoring

0.9.5 (2019-10-17)
------------------

- Variant effect prediction: added annotation argument which enables strand-specific variant effect predictions using the strandedness of features in the annotation file.
- Variant effect prediction: added ignore_reference_match argument which allows mismatching nucleotides between the VCF reference base and the reference genome to be ignored. By default, variant effects are only evaluated if the nucleotides agree in the reference genome and the VCF file.
- Added file validity check
- Added option to control verbosity
- Improved efficiency for reading BAM and BIGWIG files
- Create a new cache file with random_state only when the whole genome is not stored
- Relaxed constraint for using resolution > 1 with ROI intervals. The interval starts still have to be divisible by the resolution; otherwise, rounding errors might occur.
- Fixed issue due to different numbers of network output layers.
- Added separate dataversion to better control when cache files need to be reloaded from scratch.

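The requirement that interval starts be divisible by the resolution can be checked before loading data. The following is a minimal sketch; ``misaligned_starts`` is a hypothetical helper written for this note and does not exist in janggu.

```python
# Hypothetical helper (not janggu code): list the ROI intervals whose
# start coordinate is not a multiple of the chosen resolution.
def misaligned_starts(roi, resolution):
    """Return the (chrom, start, end) intervals with start % resolution != 0."""
    if resolution < 1:
        raise ValueError("resolution must be a positive integer")
    return [interval for interval in roi if interval[1] % resolution != 0]


roi = [("chr1", 0, 100), ("chr1", 250, 350), ("chr2", 400, 500)]
print(misaligned_starts(roi, 100))  # [('chr1', 250, 350)]
```

An empty result means every interval start falls on a bin boundary for the given resolution.
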
0.9.4 (2019-07-15)
------------------

- Added SqueezeDim wrapper for compatibility with sklearn
- Added Transpose wrapper, replaces channel_last option of the datasets
- Loading paired-end BAM files with the pairedend='5pend' option now counts both ends.
- resolution option added to create_from_array
- Relaxed restriction for sequence feature order
- Cover access via interval now returns nucleotide-resolution data regardless of the store_whole_genome option to ensure consistency.
- Refactoring

0.9.3 (2019-07-08)
------------------

- View mechanism added which allows reusing the same dataset for different purposes, e.g. training set and test set.
- Added a dataset randomization which randomizes the data internally in order to avoid having to use shuffle=True with the fit method. This allows fetching randomized data in coherent chunks from hdf5 format files, which improves access time.
- Added lazy loading mechanism for DNA and BED files, which defers the determination of the genome size to the dataset creation phase and skips it when loading cached files to improve reload time.
- Caching logic improved in order to maximize the reusability of datasets. For example, when the whole genome is loaded, the data can later be reloaded with different binsizes.
- Variant effect prediction functionality added.
- Improved efficiency for loading coverage from an array.
- Added axis option to ReduceDim
- Added Track classes to improve flexibility of plotGenomeTrack

0.9.2 (2019-05-04)
------------------

- Bugfix: Bioseq caching mechanism fixed.

0.9.1 (2019-05-03)
------------------

- Removed HTSeq dependence in favour of pybedtools for parsing BED, GFF, etc. This also introduces the requirement to have bedtools installed on the system, but it allows parsing BED-like files faster and more conveniently.
- Internal rearrangements for GenomicArray store_whole_genome=False. The data is now stored as one array in a dict-like handle with the dummy key 'data', rather than in a fragmented fashion with the genomic intervals as keys and the associated coverages as values. This makes storage and processing more efficient.
- Bugfix: added conditions property to wrapper datasets.

0.9.0 (2019-03-20)
------------------

Added various features and bug fixes:

Changes in janggu.data

- Added new dataset wrapper to remove NaNs: NanToNumConverter
- Added new dataset wrappers for data augmentation: RandomOrientation, RandomSignalScale
- Adapted ReduceDim wrapper: added aggregator argument
- plotGenomeTrack: added figsize option
- plotGenomeTrack: added other plot types, including heatmap and seqplot.
- plotGenomeTrack: refactoring of internal code
- Bioseq bugfix: Fixed issue for reverse complementing N's in the sequence.
- GenomicArray: condition, order, resolution are not read from the cache anymore, but from the arguments to avoid inconsistencies
- Normalization of Cover can handle a list of normalizer callables which are applied in turn
- Normalization and Transformation: Added PercentileTrimming, RegionLengthNormalization, LogTransform
- ZScore and ZScoreLog do not apply RegionLengthNormalization by default anymore.
- Included version-aware caching of datasets in janggu.data
- Added copy method for janggu datasets.
- split_train_test refactored
- Removed obsolete transformations attribute from the datasets
- Adapted the documentation
- Refactoring according to suggestions from isort and pylint

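As an illustration of the Bioseq bugfix listed above, the sketch below shows a reverse complement that handles N's. It assumes that N maps to itself, as in the IUPAC convention; the function is written for this note and is not the janggu implementation.

```python
# Illustrative sketch (not janggu code): reverse complement a DNA string
# while keeping the ambiguity code N unchanged (assumed behaviour).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement("ANGT"))  # ACNT
```
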
Changes in janggu

- Added input_attribution via integrated gradients for feature importance assignment
- Performance scoring by name for Janggu.evaluate for a number of common metrics, including ROC, PRC, correlation, variance explained, etc.
- training.log is stored by default for each model
- Added model_from_json, model_from_yaml wrappers
- inputlayer decorator only instantiates Input layers if inputs == None, which makes the use of inputlayer less restrictive when using nested functions
- Added create_model method to create a keras model directly
- Adapted the documentation
- Refactoring according to suggestions from isort and pylint

0.8.6 (2019-03-03)
------------------

- Bugfix for ROIs that reach beyond the chromosome when loading Bioseq datasets. Now, zero-padding is performed for intervals that stretch over the sequence ends.

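The zero-padding behaviour described above can be sketched on a plain list of per-position values. ``padded_window`` is a hypothetical helper for illustration only and says nothing about how janggu stores sequences internally.

```python
# Hypothetical helper (not janggu code): extract the window [start, end)
# and fill positions that fall outside the sequence with zeros.
def padded_window(values, start, end):
    """Return values[start:end], using 0 for out-of-range positions."""
    return [values[i] if 0 <= i < len(values) else 0 for i in range(start, end)]


chrom = [1, 2, 3, 4]
print(padded_window(chrom, 2, 7))  # [3, 4, 0, 0, 0]
```

The returned window always has length ``end - start``, so intervals that overhang the sequence end keep the same shape as all others.
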
0.8.5 (2019-01-09)
------------------

- Updated abstract, added logo
- Utility: janggutrim command line tool for cutting bed file regions to avoid unwanted rounding effects. If rounding issues are detected, an error is raised.
- Caching mechanism revisited. Caching of datasets is based on determining the sha256 hash of the dataset. If the data or some parameters change, the files are automatically reloaded. Consequently, the arguments overwrite and datatags become obsolete and have been marked for deprecation.
- Refactored access of GenomicArray
- Added ReduceDim wrapper to convert a 4D Cover object to a 2D table-like object.

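The idea behind hash-based caching can be sketched as follows. This is a minimal illustration under stated assumptions: the ``cache_key`` function, the JSON serialization of the parameters and the choice of what goes into the hash are all invented for this example and may differ from what janggu actually hashes.

```python
import hashlib
import json


# Hypothetical helper (not janggu code): derive a cache key from the raw
# data and the loading parameters, so that a change in either one yields
# a different key and therefore triggers a reload.
def cache_key(data, params):
    """Return the sha256 hex digest of the data bytes plus the parameters."""
    digest = hashlib.sha256()
    digest.update(data)
    digest.update(b"\0")  # separator between data and parameters
    digest.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


bed = b"chr1\t0\t100\n"
key_a = cache_key(bed, {"binsize": 200, "resolution": 1})
key_b = cache_key(bed, {"binsize": 100, "resolution": 1})
print(key_a == key_b)  # False
```

Because the key changes whenever the data or a parameter changes, a stale cache file is never matched, which is why explicit overwrite flags become unnecessary.
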
0.8.4 (2018-12-11)
------------------

- Updated installation instructions in the readme

0.8.3 (2018-12-05)
------------------

- Fixed issues for loading SparseGenomicArray
- Made GenomicIndexer.filter_by_region aware of flank
- Fixed BedLoader issue with partially overlapping ROI and BED files using filter_by_region.
- Adapted classifier, license and keywords in setup.py
- Fixed hyperlinks

0.8.2 (2018-12-04)
------------------

- Bugfix for zero-padding functionality
- Added ndim for keras compatibility

0.8.1 (2018-12-03)
------------------

- Bugfix in GenomicIndexer.create_from_region

0.8.0 (2018-12-02)
------------------

- Improved test coverage
- Resolved linter issues
- Bugs fixed
- Improved documentation for scorers
- Removed kwargs for scorers and exporters
- Adapted exporters to classes

0.7.0 (2018-12-01)
------------------

- First public version