Diff of /CHANGELOG.md [000000] .. [1aa732]

Switch to side-by-side view

--- a
+++ b/CHANGELOG.md
@@ -0,0 +1,180 @@
+# Changelog
+
+## v2.2.0 - 15.01.2025
+- Fixed bug in the kekulization of molecules with radicals (thanks Olabisi-Aishat-Bello for reporting, thanks Robert Pollice for fixing)
+- Fixed constraints for validity of molecules with changed C, P or S, to align with validity-definition of RDKit.
+  
+## v2.1.2 - 15.07.2024
+- Fixed recursion bug for very long molecules (thanks haydn-jones)
+- Added warning when dot-symbol (".") exists in peculiar cases (thanks vandrw)
+
+## v2.1.1 - 14.07.2022
+- Fixed index bug in attribution 
+
+## v2.1.0 - 17.05.2022
+
+### Changed:
+- Dropped support for Python 3.5-3.6 and will continue to support only current Python versions.
+
+### Added:
+- optional attribution to map encoder/decoder output string back to input string (Issue #48, #79)
+
+## v2.0.0 - 21.10.2021
+
+### Changed:
+- Improved SMILES parsing (by using adjacencey lists internally), with tighter error handling
+  (e.g. issues #62 and #60).
+- Faster and improved kekulization algorithm (issue #55 fixed).
+- Support for symbols that are constrained to 0 bonds (e.g., `[CH4]`) or >8 bonds
+  (users can now specify custon bond constraints with over 8 bonds).
+- New `strict=True` flag to `selfies.encoder`, which raises an error if the input
+  SMILES violates the current bond constraints. `True` by default, can be `False` for speed-up (if
+  SMILES are guaranteed to be correct).
+- Added bond constraints for B (max. 3 bonds) to the default and preset constraints.
+- Updated the syntax of SELFIES symbols to be cleaner and more readable.
+    - Removing `expl` from atomic symbols, e.g., `[C@@Hexpl]` becommes `[C@@H]`
+    - Cleaner branch symbols, e.g., `[BranchL_2]` becomes `[=BranchL]`
+    - Cleaner ring symbols, e.g., `[Expl=RingL]` becomes `[=RingL]`
+    - If you want to use the old symbols, use the `compatible=True` flag to `selfies.decoder`,
+      e.g., `sf.decoder('[C][C][Expl=Ring1]',compatible=True)` (not recommended!)
+- More logically consistent behaviour of `[Ring]` symbols.
+- Standardized SELFIES alphabet, i.e., no two symbols stand for the same atom/ion (issue #58), e.g.,
+  `[N+1]` and `[N+]` are equivalent now.
+- Indexing symbols are now included in the alphabet returned by `selfies.get_semantic_robust_alphabet`.
+
+### Removed
+- Removed `constraints` flag from `selfies.decoder`; please use `selfies.set_semantic_constraints()`
+  and pass in `"hypervalent"` or `"octet_rule"` instead.
+- Removed `print_error` flag in `selfies.encoder` and `selfies.decoder`,
+  which now raise errors `selfies.EncoderError` and `selfies.DecoderError`, respectively.
+
+### Bug Fixes
+- Potential chirality inversion of atoms making ring bonds (e.g. ``[C@@H]12CCC2CCNC1``):
+  fixed by inverting their chirality in ``selfies.encoder`` such that they are decoded with
+  the original chirality preserved.
+- Failure to represent mismatching stereochemical specifications at ring bonds
+  (e.g. ``F/C=C/C/C=C\C``): fixed by adding new ring symbols (e.g. ``[-/RingL]``, ``[\/RingL]``, etc.).
+
+---
+
+## v1.0.4 - 23.04.2021
+### Added:
+ * decoder option for relaxed hypervalence rules, `decoder(...,constraints='hypervalent')`
+ * decoder option for strict octet rules, `decoder(...,constraints='octet_rule')`
+### Bug Fix:
+ * Fixed constraint for Phosphorus
+
+ ---
+
+## v1.0.3 - 13.01.2021
+### Added:
+ * Support for aromatic Si and Al (is not officially supported by Daylight SMILES, but RDKit supports it and examples exist in PubChem).
+
+ ---
+
+## v1.0.2 - 14.10.2020
+### Added:
+ * Support for aromatic Te and triple bonds.
+ * Inbuild SELFIES to 1hot encoding, and 1hot encoding to SELFIES
+
+### Changed:
+ * Added default semantic constraints for charged atoms (single positive/negative charge of `[C]`, `[N]`, `[O]`, `[S]`, `[P]`)
+ * Raised the bond capacity of `P` to 7 bonds (from 5 bonds).
+
+### Bug Fixes:
+ * Fixed bug: `selfies.decoder` did not terminate for malformed SELFIES
+   that are missing the closed bracket `']'`.
+
+---
+
+## v1.0.1 - 25.08.2020
+### Changed:
+ *  Code so that is compatible with python >= 3.5.
+ *  More descriptive error messages.
+
+### Bug Fixes:
+ *  Minor bug fixes in the encoder for SMILES ending in branches (e.g. `C(Cl)(F)`),
+    and SMILES with ring numbers between branches (e.g. `C(Cl)1(Br)CCCC1`)
+ *  Minor bug fix with ring ordering in decoder (e.g. `C1CC2CCC12` vs `C1CC2CCC21`).
+
+---
+
+## v1.0.0 - 17.08.2020:
+### Added:
+ *  Added semantic handling of aromaticity / delocalization (by kekulizing SMILES with aromatic symbols before
+    they are translated into SELFIES by `selfies.encoder`).
+ *  Added semantic handling of charged species (e.g. `[CH+]1CCC1`).
+ *  Added semantic handling of radical species (`[CH]1CCC1`) or any species with explicit hydrogens (e.g. `CC[CH2]`).
+ *  Added semantic handling of isotopes (e.g. `[14CH2]=C` or `[235U]`).
+ *  Improved semantic handling of explicit atom symbols in square brackets, e.g. Carbene (`[C]=C`).
+ *  Improved semantic handling of chirality (e.g. `O=C[Co@@](F)(Cl)(Br)(I)S`).
+ *  Improved semantic handling of double-bond configuration (e.g. `F/C=C/C=C/C`).
+ *  Added new functions to the library, such as `selfies.len_selfies` and
+    `selfies.split_selfies`.
+ *  Added advanced-user functions to the library to customize the SELFIES semantic constraints, e.g.
+    `selfies.set_semantic_constraints`. Allows to encode for instance diborane, `[BH2]1[H][BH2][H]1`.
+ *  Introduced new padding `[nop]` (no operation) symbol.
+
+### Changed:
+ *  Optimized the indexing alphabet (it is base-16 now).
+ *  Optimized the behaviours of rings and branches to fix an issue with specific non-standard molecules that could not be translated.
+ *  Changed behaviour of Ring/Branch, such that states `X9991-X9993` are not necessary anymore.
+ *  Significantly improved encoding and decoding algorithms, it is much faster now.
+
+---
+
+## v0.2.4 - 01.10.2019:
+### Added:
+ *  Function ``get_alphabet()`` which returns a list of 29 selfies symbols
+    whose arbitrary combination produce >99.99% valid molecules.
+
+### Bug Fixes:
+ *  Fixed bug which happens when three rings start at one node, and two of
+    them form a double ring.
+ *  Enabled rings with sizes of up to 8000 SELFIES symbols.
+ *  Bug fix for tiny ring to RDKit syntax conversion, spanning multiple
+    branches.
+
+We thank Kevin Ryan (LeanAndMean@github), Theophile Gaudin and Andrew Brereton
+for suggestions and bug reports.
+
+---
+
+## v0.2.2 - 19.09.2019:
+
+### Added:
+ *  Enabled ``[C@],[C@H],[C@@],[C@@H],[H]`` to use in a semantic
+    constrained way.
+
+We thank Andrew Brereton for suggestions and bug reports.
+
+---
+
+## v0.2.1 - 02.09.2019:
+
+### Added:
+ *  Decoder: added optional argument to restrict nitrogen to 3 bonds.
+    ``decoder(...,N_restrict=False)`` to allow for more bonds;
+    standard: ``N_restrict=True``.
+ *  Decoder: added optional argument make ring-function bi-local
+    (i.e. confirms bond number at target).
+    ``decoder(...,bilocal_ring_function=False)`` to not allow bi-local ring
+    function; standard: ``bilocal_ring_function=True``. The bi-local ring
+    function will allow validity of >99.99% of random molecules.
+ *  Decoder: made double-bond ring RDKit syntax conform.
+ *  Decoder: added state X5 and X6 for having five and six bonds free.
+
+### Bug Fixes:
+ * Decoder + Encoder: allowing for explicit brackets for organic atoms, for
+   instance ``[I]``.
+ * Encoder: explicit single/double bond for non-canonical SMILES input
+   issue fixed.
+ * Decoder: bug fix for ``[Branch*]`` in state X1.
+
+We thank Benjamin Sanchez-Lengeling, Theophile Gaudin and Zhenpeng Yao
+for suggestions and bug reports.
+
+---
+
+## v0.1.1 - 04.06.2019:
+ * initial release