Diff of /docs/tutorial.tasks.rst [000000] .. [dc40d0]

Switch to side-by-side view

--- a
+++ b/docs/tutorial.tasks.rst
@@ -0,0 +1,184 @@
+Adding Tasks
+####################################
+
+This is a tutorial on adding new machine learning tasks using ``lavis.tasks`` module.
+
+The LAVIS library includes a standard task module that centralizes the model training and evaluation procedure of machine learning tasks. 
+The ``lavis.tasks`` module is designed such that any new tasks can be added and integrated, catering to any customization in the training and testing procedures. 
+In this tutorial, we will replicate the steps to add a new task into LAVIS for the `video-grounded dialogue tasks <https://arxiv.org/pdf/1901.09107.pdf>`_. 
+
+Base Task ``lavis.tasks.base_task``
+********************************************************************************
+
+Note that any new model definition should inherit the base task class ``BaseTask``:
+
+.. code-block:: python
+
+    import logging
+    import os
+    
+    import torch.distributed as dist
+    from lavis.common.dist_utils import get_rank, get_world_size, is_main_process
+    from lavis.common.logger import MetricLogger, SmoothedValue
+    from lavis.common.registry import registry
+    from lavis.datasets.data_utils import prepare_sample
+    
+    class BaseTask:
+        def __init__(self, **kwargs):
+            super().__init__()
+    
+            self.inst_id_key = "instance_id"
+    
+        @classmethod
+        def setup_task(cls, **kwargs):
+            return cls()
+    
+        def build_model(self, cfg):
+            model_config = cfg.model_cfg
+    
+            model_cls = registry.get_model_class(model_config.arch)
+            return model_cls.from_config(model_config)
+    
+        def build_datasets(self, cfg):
+            """
+            Build a dictionary of datasets, keyed by split 'train', 'valid', 'test'.
+            Download dataset and annotations automatically if not exist.
+    
+            Args:
+                cfg (common.config.Config): _description_
+    
+            Returns:
+                dict: Dictionary of torch.utils.data.Dataset objects by split.
+            """
+    
+            datasets = dict()
+    
+            datasets_config = cfg.datasets_cfg
+    
+            assert len(datasets_config) > 0, "At least one dataset has to be specified."
+    
+            for name in datasets_config:
+                dataset_config = datasets_config[name]
+    
+                builder = registry.get_builder_class(name)(dataset_config)
+                dataset = builder.build_datasets()
+    
+                datasets[name] = dataset
+    
+            return datasets
+    
+        def train_step(self, model, samples):
+            loss = model(samples)["loss"]
+            return loss
+    
+        ...
+
+In this base task, we already declare and standardize many common methods such as ``train_step``, ``build_model``, and ``build_datasets``. 
+Inheriting this base task class allows us to standardize operations of tasks across all task classes.
+We recommend users not change the implementation of the base task class as this will have an impact on all existing task subclasses.
+
+Dialogue Task ``lavis.tasks.dialogue``
+********************************************************************************
+
+In this step, we can define a new task class, e.g. under ``lavis.tasks.dialogue``, for video-grounded dialogues.
+For instance, we define a new task class ``DialogueTask`` that inherits the super task class ``BaseTask``.
+
+.. code-block:: python
+
+    import json
+    import os
+    
+    from lavis.common.dist_utils import main_process
+    from lavis.common.logger import MetricLogger
+    from lavis.common.registry import registry
+    from lavis.tasks.base_task import BaseTask
+    from lavis.datasets.data_utils import prepare_sample
+    
+    import numpy as np 
+    
+    @registry.register_task("dialogue")
+    class DialogueTask(BaseTask):
+        def __init__(self, num_beams, max_len, min_len, evaluate, report_metric=True):
+            super().__init__()
+    
+            self.num_beams = num_beams
+            self.max_len = max_len
+            self.min_len = min_len
+            self.evaluate = evaluate
+    
+            self.report_metric = report_metric
+    
+        @classmethod
+        def setup_task(cls, cfg):
+            run_cfg = cfg.run_cfg
+    
+            num_beams = run_cfg.num_beams
+            max_len = run_cfg.max_len
+            min_len = run_cfg.min_len
+            evaluate = run_cfg.evaluate
+    
+            report_metric = run_cfg.get("report_metric", True)
+    
+            return cls(
+                num_beams=num_beams,
+                max_len=max_len,
+                min_len=min_len,
+                evaluate=evaluate,
+                report_metric=report_metric,
+            )
+    
+        def valid_step(self, model, samples):
+            results = []        
+            loss = model(samples)["loss"].item() 
+            
+            return [loss] 
+        ...
+
+Note that for any new task, we advise the users to review carefully the functions implemented within ``BaseTask`` and consider which methods should be modified. 
+For instance, the base task class already contains a standard implementation of model training steps that are common among machine learning steps. 
+Some major methods we want to emphasize and should be customized by each task are the ``valid_step`` and ``evaluation``. 
+These operations were not fully implemented in the base task class due to the differences in evaluation procedures among many machine learning tasks. 
+Another method that should be considered is the ``setup_task`` method. 
+This method will receive configurations that set task-specific parameters to initialize any task instance.
+
+Registering New Task ``lavis.tasks.__init__`` 
+********************************************************************************
+
+Any new task must be officially registered as part of the ``lavis.tasks`` module. For instance, to add a new task for video-grounded dialogues, we can modify the ``__init__.py`` as follows:
+
+.. code-block:: python
+
+    from lavis.tasks.dialogue import DialogueTask
+    
+    ...
+    __all__ = [
+        ...
+        "DialogueTask"
+    ]
+
+Assigning Task 
+***************
+
+From the above example of task class, note that we define a ``setup_task`` method for each task class. 
+This method will process a configuration file and pass specific parameters e.g. ``num_beams`` (for beam search generative tasks during the inference stage), to initialize the task classes properly. 
+To assign and associate any task, we need to specify the correct registry of task classes in a configuration file. 
+For instance, the following should be specified in a configuration file e.g. ``dialogue_avsd_ft.yaml``:
+
+.. code-block:: yaml
+
+    run:
+      task: dialogue # name of the task 
+      
+      # optimizer
+      ...
+    
+      max_len: 20
+      min_len: 5
+      num_beams: 3    
+      ...
+    
+Subsequently, any processes (e.g. training) should load this configuration file to assign the correct task.
+
+.. code-block:: sh
+
+    python train.py --cfg-path dialogue_avsd_ft.yaml
\ No newline at end of file