Diff of /README.md [000000] .. [bf1564]

--- a
+++ b/README.md
@@ -0,0 +1,50 @@
+# wsi_preprocessing
+
+## Processing and tiling of histological slides
+
+OpenSlide-based processing and filtering of slides (only tissue filtering right now; more will follow).
+The process is configured through a JSON config file.
+
+Tissue detection runs on a higher (downscaled) level to speed up the process. Coarse tiles are sampled at that level and
+discarded if they don't have enough tissue coverage. The remaining tiles are then divided into patches for training etc.
+
+Supported annotation formats are .xml (Camelyon17 and some other public datasets) and .geojson (QuPath).
+Right now only binary annotations are supported (tumor vs. non-tumor).
+
+Supported slide formats are currently .tif and .svs.
+
+### Usage:
+
+This script is designed to be used together with QuPath in case there are no annotations.
+The main file is `tile_generator.py`. Configure the process via the config file, then execute this file to start the process.
+
+### Additional information:
+
+NOTE:
+Right now there is an openslide bug on Unix systems where image data isn't loaded properly. To fix this, follow:
+https://github.com/openslide/openslide-python/issues/58#issuecomment-883446558
+
+### Config Explanation:
+
+| Dictionary Entry                                        | Explanation                                                                                                                                                                                                                      |
+|---------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| tissue_coverage                                         | Threshold [0,1] for how much tissue coverage is necessary, default is 0.75                                                                                                                                                       |
+| keep_annotated_tiles_despite_too_little_tissue_coverage | Legacy option. Old behaviour: Keep annotated tiles even if not covered by tissue. New behaviour (to allow easier tile clean-up around the edges): discard tiles with too little tissue coverage regardless of annotation status. |
+| processing_level                                        | Level of downscaling by OpenSlide. Lowering the level increases precision but takes more time; default is 5                                                                                                                      |
+| blocked_threads                                         | Number of threads that won't be used by the program                                                                                                                                                                              |
+| patches_per_tile                                        | Number of patches used for lower resolution operations like tissue detection                                                                                                                                                     |
+| overlap                                                 | Value [0,1[ to set the overlap between neighbouring unannotated patches                                                                                                                                                          |
+| annotation_overlap                                      | Value [0,1[ to set the overlap between neighbouring annotated patches                                                                                                                                                            |
+| patch_size                                              | Output pixel size of the square patches                                                                                                                                                                                          |
+| slides_dir                                              | Directory where the different slides and subdirs are located                                                                                                                                                                     |
+| slides_file                                             | txt file containing paths to all slides to process (absolute paths)                                                                                                                                                              |
+| annotation_dir                                          | Directory where the annotations are located                                                                                                                                                                                      |
+| annotation_file_format                                  | File format of the input annotations ("xml","geojson")                                                                                                                                                                           |
+| output_path                                             | Output directory to where the resulting images will be stored                                                                                                                                                                    |
+| skip_unlabeled_slides                                   | Boolean to skip slides without an annotation file                                                                                                                                                                                |
+| save_annotated_only                                     | Boolean to only save annotated patches                                                                                                                                                                                           |
+| output_format                                           | Image output format; default is "png"                                                                                                                                                                                            |
+| show_mode                                               | Boolean to enable plotting of some intermediate results/visualizations                                                                                                                                                           |
+| label_dict                                              | Structure to set up the operator and the threshold for checking the coverage of a certain class. Up to one unannotated tissue type (e.g. non-tumor) is possible and must go first for implementation reasons.                    |
+| type                                                    | Operator type [ "==", ">=", "<="]                                                                                                                                                                                                |
+| threshold                                               | Coverage threshold for the individual class                                                                                                                                                                                      |
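
As a rough illustration of the config table above, here is a minimal Python sketch that builds such a config, checks the documented value ranges, and serializes it to JSON. The key names come from the table; every value, every path, the nesting of `label_dict` (class name mapped to a `type`/`threshold` pair), and the helper names `validate` and `passes` are assumptions made for illustration, not the layout `tile_generator.py` necessarily expects.

```python
import json
import operator

# Key names are taken from the config table; all values and paths are
# hypothetical placeholders, and the label_dict nesting is an assumption.
config = {
    "tissue_coverage": 0.75,
    "keep_annotated_tiles_despite_too_little_tissue_coverage": False,
    "processing_level": 5,
    "blocked_threads": 1,
    "patches_per_tile": 4,
    "overlap": 0.0,
    "annotation_overlap": 0.5,
    "patch_size": 256,
    "slides_dir": "/data/slides",
    "slides_file": "/data/slides.txt",
    "annotation_dir": "/data/annotations",
    "annotation_file_format": "geojson",
    "output_path": "/data/output",
    "skip_unlabeled_slides": False,
    "save_annotated_only": False,
    "output_format": "png",
    "show_mode": False,
    # The unannotated class (non-tumor) goes first, as the table requires.
    "label_dict": {
        "non_tumor": {"type": "==", "threshold": 0.0},
        "tumor": {"type": ">=", "threshold": 0.5},
    },
}

# The three operator types listed in the table.
OPERATORS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def validate(cfg):
    """Return a list of human-readable problems; empty if none were found."""
    errors = []
    if not 0.0 <= cfg["tissue_coverage"] <= 1.0:
        errors.append("tissue_coverage must be in [0, 1]")
    for key in ("overlap", "annotation_overlap"):
        if not 0.0 <= cfg[key] < 1.0:
            errors.append(f"{key} must be in [0, 1[")
    if cfg["annotation_file_format"] not in ("xml", "geojson"):
        errors.append("annotation_file_format must be 'xml' or 'geojson'")
    for label, rule in cfg["label_dict"].items():
        if rule["type"] not in OPERATORS:
            errors.append(f"label_dict[{label!r}] has an unknown operator type")
    return errors


def passes(coverage, rule):
    """Apply a label rule (operator plus threshold) to a coverage value."""
    return OPERATORS[rule["type"]](coverage, rule["threshold"])


# Serialized form that could be written to a config file.
config_json = json.dumps(config, indent=4)
```

Calling `validate(config)` before starting a run catches out-of-range overlaps or an unsupported annotation format early, and `passes(0.6, config["label_dict"]["tumor"])` shows how an operator/threshold pair could be applied to a class coverage value under these assumptions.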