--- a
+++ b/README.md
@@ -0,0 +1,54 @@
+# AI-Genomics: Genetics with Artificial Intelligence Project
+
+## Description
+Implemented AI for genetic analysis using PyTorch (machine learning framework) with two colleagues, mentored by a PhD geneticist from Universidad de los Andes and a Master's student in Applied Mathematics at Universidad Nacional de Colombia. This project excels in analyzing DNA sequences and classifying them based on discernible motifs.
+
+## Requirements
+- Python 3
+- Jupyter Notebook (recommended for running in Google Colab)
+
+## Instructions to Run the Code
+1. Clone the repository to your local machine:
+   ```bash
+   git clone https://github.com/anjimenezp/AI-Genetics.git
+   cd AI-Genetics
+
+## Project Overview
+This repository contains code for a genomics project utilizing artificial intelligence for the classification of DNA sequences. The code includes the following components:
+
+### 1. Data Extraction:
+- Gene sequence data is extracted from a CSV file using Pandas.
+
+### 2. Simulated Sequence Generation (commented out):
+- The code provides functionality for generating simulated DNA sequences, but it is not used in the main code.
+
+### 3. Label Quantification:
+- Sequence labels are encoded using scikit-learn's LabelEncoder.
+
+### 4. One-Hot Encoding:
+- DNA sequences are cleaned and converted to one-hot encoding using PyTorch.
+
+### 5. Training Splits:
+- The data is split into training, validation, and test sets for model training and evaluation.
+
+### 6. DataLoader Preparation:
+- PyTorch DataLoaders are prepared for efficient batch processing during training.
+
+### 7. CNN Model Definition:
+- A Convolutional Neural Network (CNN) is defined for classifying DNA sequences.
+
+### 8. Training Loop Functions:
+- Functions for training and validation loops are defined.
+
+### 9. Model Evaluation:
+- The trained model is evaluated on a test set, and performance metrics are displayed.
+
+### 10. Plotting:
+- Matplotlib is used to plot training and validation loss curves.
+
+### 11. Example Prediction:
+- An example DNA sequence is provided, and the trained model predicts its class.
+
+Feel free to explore the code and adapt it to your genomics classification tasks. If you have any questions or suggestions, please open an issue.
+
+Note: The code assumes the availability of PyTorch, scikit-learn, pandas, and matplotlib libraries. Make sure to install these dependencies before running the code.