{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "IqM-T1RTzY6C" }, "source": [ "To run this, press \"*Runtime*\" and press \"*Run all*\" on a **free** Tesla T4 Google Colab instance!\n", "
\n", "\n", "To install Unsloth on your own computer, follow the installation instructions on our Github page [here](https://github.com/unslothai/unsloth#installation-instructions---conda).\n", "\n", "You will learn how to do [data prep](#Data), how to [train](#Train), how to [run the model](#Inference), & [how to save it](#Save) (eg for Llama.cpp).\n", "\n", "**[NEW] Llama-3 8b is trained on a crazy 15 trillion tokens! Llama-2 was 2 trillion.**" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "2eSvM9zX_2d3" }, "outputs": [], "source": [ "%%capture\n", "import torch\n", "major_version, minor_version = torch.cuda.get_device_capability()\n", "# Must install separately since Colab has torch 2.2.1, which breaks packages\n", "!pip install \"unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git\"\n", "if major_version >= 8:\n", " # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)\n", " !pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes\n", "else:\n", " # Use this for older GPUs (V100, Tesla T4, RTX 20xx)\n", " !pip install --no-deps xformers trl peft accelerate bitsandbytes\n", "pass\n", "# Llama 3 Video Tutorial https://www.youtube.com/watch?v=aQmoog_s8HE" ] }, { "cell_type": "markdown", "metadata": { "id": "r2v_X2fA0Df5" }, "source": [ "* We support Llama, Mistral, CodeLlama, TinyLlama, Vicuna, Open Hermes etc\n", "* And Yi, Qwen ([llamafied](https://huggingface.co/models?sort=trending&search=qwen+llama)), Deepseek, all Llama, Mistral derived archs.\n", "* We support 16bit LoRA or 4bit QLoRA. Both 2x faster.\n", "* `max_seq_length` can be set to anything, since we do automatic RoPE Scaling via [kaiokendev's](https://kaiokendev.github.io/til) method.\n", "* [**NEW**] With [PR 26037](https://github.com/huggingface/transformers/pull/26037), we support downloading 4bit models **4x faster**! [Our repo](https://huggingface.co/unsloth) has Llama, Mistral 4bit models." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 731, "referenced_widgets": [ "0c2790d97ffc43aeabfb64387ae839dd", "2079f37f7dca43bba98c2455aff7f140", "26c7ef4799d7408192d5467538996bc0", "1cd9a77da50343f8b4e5766fffbaae6e", "784397bd1881417a832af6958441d521", "19ab2550416d4fd5a9e77f41ec026d2b", "d6b80c8a1af6410abe9e1be645296ba8", "c87d81a6a3c14d6d89a77c6044e1a2b3", "8d275321b340448584b4c9a06a8673ff", "5255c4d84fab451486bfb3318e0ce9cb", "f3d2e7e29e8047719525f02d6f47d338", "70a00f18fd1c48ab92cb0e4116e664b7", "b5cf4a917b0f43c48b1ae7101e2e12b4", "427de6dc1ff64273be181f99137b6522", "ea07b009f331438fa1572c077bfd71eb", "b4ee02b4ff6d4cda813eb29195c87e46", "114ef5c665c64e40927ab860e82ac0f8", "aa7a505cb0084db981d24424a490b51a", "8209535d83c64faa9a229cc31bb92191", "6a47eb4baacc4553abda62e5dac80e55", "fbe2e70d370e4d63be1b37a3c77defc7", "08d4883410bf420fa2a8e838a135a1a0", "ad17cac782f845afa435632ef48efa09", "77140591cda64bf1a28d4a1b75e2a27a", "8fbdd42ecd8140f59ff18f0af05b9fa6", "94e57da585464e23b16cc03a662b2d8e", "72b03747ecda48fd96b8ff83ee0ff49f", "69829f29648749619500ce3ac1b85258", "c1a6929e5ffc43709d212aab8978ef98", "6bfdfa87f2ae4d648fd131a3d44d138b", "40eacf125fd14f608305dc08851e7128", "d6502ccd73b9414b8ec68e3fa3ad03b6", "0b3e118d566e4a8e92499a45ede7879a", "e38d905b07a041f98e743b7d3c71aa25", "e15906d685014d23bcc0bac23f927fa7", "a5bdab92e623410e86cb2e32c614aa8a", "fec91f513381482baf59bc0cc6936fc6", "0b3e357a6f0446569a456828855e0caa", "b220fa1b6486446994f3ba2c894cb51b", "50970637e429405193419be8b6ffbed2", "8622f38ba818499a82e5bb4562aaba8e", "ae8d90af8052406f9f8d790632239e16", "e99b1589bed249d98946fe584246652e", "e3166b45009a425a871c11684ec0a926", "f8956563b28e4414b1e1d59b1e1eb9c6", "21d3d43e079b4573b81288c32bd80ddd", "16a482bcdf6b4295bb741f1f8c3f0e47", "7c8aca8babbb450882c01c8ec912bd7e", "6427107d1bb94968b96d596aeead083b", "fab804e5ddc545908c0bfa2c76a58c8d", "dd4544830fb442d99d7565f1b359ce22", "fc5c3c1a94864d81b62f71e1a555e668", "9b587792516943aea23b0d050f2a67bc", "b8d9c2f99e944bfebc900f9ecc0d2c51", "2daf031450914c82942a5506130277d0", "ca4e04ed71104524a6ca6909f5ab42a0", "99668d6caf4746d088fe515086ad8adc", "1d5f07791af642b48e81952ec36736b9", "34abf22120594152b8b8b5f9a8da7d99", "6e8f1eeabea742b3adbaf889505fcf9d", "55b4ef75b5494472906a7dd83fec6fcb", "b555eacd68e54346aa11de416b9c6f5a", "3b9dad253e52445aa6e5b2c2876e3e14", "7b696d1445ad4f24b99b2cada4299071", "8753e5bb94844ce9ae1bab6f1f2dbf96", "4e0afc8334da4451bbf58c7924c412f6", "0929c74c4bca4c59aba4ad73d08cea28", "4d33eb51d68d4c179051463d47d854b0", "b392a077bf244e85b7df619c493a438a", "d2e2a4a6a38a4d45b7e635e5aeb5e5a0", "0c00d271a15b4a21a238662115979c2d", "f8bdf78259bb4ab1aa26b5c39b586bfb", "fc8705cae64c423e8e730df441a96ed5", "244efa0c13d84c32b3db3608eb6a5aa5", "53479251af864348b3a3a31e9b54b506", "14cf1c0708ea4c6e901ba69b740ddc6b", "abdc59e997704ae29d65c0fea613f55c", "9815f4bb2cf443d49790f161885049a4", "b30dfb04b33841eba0e0d6bf39d026aa", "7701d57ed2b04db6b2942c6ccbfc1754", "221f45f35d514c3f952267dacb1b276a", "4641a55dd17443019f8a432d5b3eaef7", "098950430429446586f6716abc585786", "715356872bb94ac49d81cb51373e8a2e", "e087f0f0f46d441f9adc78675a396a18", "b8f1fb5bf43e44b881142c15c5676b2b", "127de142d4b94565b2ad157c9c8e7b04", "539e7c2e704e449bbc9d1d2246ee446b", "b1cbaadba52546e1a120a10713daf739", "8bd91889a993440c9bf476fcdb2c5e8c", "3570003ce154404492edd7018ede4a20", "3dd2d103beab40fbbd37429091ef9d5d", "c749239f95144f45bf3cccfaa93e2dee", "74ca05e441ca4b6e9e396c730eec984f", 
"44d6507aad7d49848a7798a9c726e3de", "a5c347e9b6ac4c778a83f7bf174d717f", "9640284656554505bb48f7f7424f1c01", "947fbf91b34d4b219e11708bea96c7b1", "368423d073874cffbdbb2dabe2a48f53", "d1de217c698342468799f5a525aecdf0", "58d404b54be2411c9858a6ba480b11e3", "83eb67b48e41418eb80f93a211c81866", "d3a7869b20a244eaa7a9fb40cc3402c5", "95395e88c5bc4dd6862f71b75339a7d2", "e75c884fbd6b49ffb7521827b150247c", "384df88630ae4824a372f469731ad140", "39337b5ffa3c40eea60ca8b8e7b5049d", "0a1daa6cdc054290862a585630cf427e", "49da30a099f54658b2c1d64d78947b18", "e69ea44dd69a4d1da475ab5710d945f9", "7a206c6511324680b82a398dc2248d3c", "90ad5d79ca794f828d69c0d477ebaf54", "5c8ee631d38f47d098f8aa5102dce1ce", "e262ce0f287d4742b25242e53bf8a812", "7d318a37954c489ba962ea3f92c2c8ce", "90ab5c48207b4f79a34111a5548074fc", "81f2cc850cd448308c803140e6b198f9", "3a4b22722c5d415b8734bd98e4bfeed5", "516102ccb4aa421f85a592deecfa2da2", "b33f384ec52c4ebda252948334622817", "1f5d2fb310224c34a1d73cc296b7e154", "d49276d392da4a729dcfce2fc894b538", "e4bf641a71c046acba0003566f280c27", "35d28e0c6048460e8b29f67e444a4c82", "9de0270e7e6a42b0b401f56898ec7c08", "b8d2262d3d004db6acbb810fac7c3576", "c1c097dac2aa43829a420a52cc76d855", "584d80a80ff146219ea10da2c4a9edcc", "350aca79988e4851a52596837cd700b6", "ed5ba80a383e426fababce8c262360be", "cebbca06469546f78229971285493e58", "224cf4cee9e44f0bb3875abebb01a9ab" ] }, "id": "QmUBVEnvCDJv", "outputId": "5dc6569e-79dc-48f9-9b28-e23b5dbd089b" }, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "WARNING:xformers:WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:\n", " PyTorch 2.3.0+cu121 with CUDA 1201 (you have 2.2.1+cu121)\n", " Python 3.10.14 (you have 3.10.12)\n", " Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)\n", " Memory-efficient attention, SwiGLU, sparse and more won't be available.\n", " Set XFORMERS_MORE_DETAILS=1 for more details\n", "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.\n", " warnings.warn(\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ "config.json: 0%| | 0.00/719 [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "0c2790d97ffc43aeabfb64387ae839dd" } }, "metadata": {} }, { "output_type": "stream", "name": "stdout", "text": [ "==((====))== Unsloth: Fast Llama patching release 2024.4\n", " \\\\ /| GPU: NVIDIA A100-SXM4-40GB. Max memory: 39.564 GB. Platform = Linux.\n", "O^O/ \\_/ \\ Pytorch: 2.2.1+cu121. CUDA = 8.0. CUDA Toolkit = 12.1.\n", "\\ / Bfloat16 = TRUE. Xformers = 0.0.26.post1. 
FA = True.\n", " \"-____-\" Free Apache license: http://github.com/unslothai/unsloth\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ "model.safetensors.index.json: 0%| | 0.00/23.9k [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "70a00f18fd1c48ab92cb0e4116e664b7" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Downloading shards: 0%| | 0/4 [00:00, ?it/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "ad17cac782f845afa435632ef48efa09" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "model-00001-of-00004.safetensors: 0%| | 0.00/4.98G [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "e38d905b07a041f98e743b7d3c71aa25" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "model-00002-of-00004.safetensors: 0%| | 0.00/5.00G [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "f8956563b28e4414b1e1d59b1e1eb9c6" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "model-00003-of-00004.safetensors: 0%| | 0.00/4.92G [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "ca4e04ed71104524a6ca6909f5ab42a0" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "model-00004-of-00004.safetensors: 0%| | 0.00/1.17G [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "0929c74c4bca4c59aba4ad73d08cea28" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Loading checkpoint shards: 0%| | 0/4 [00:00, ?it/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "9815f4bb2cf443d49790f161885049a4" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "generation_config.json: 0%| | 0.00/194 [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "b1cbaadba52546e1a120a10713daf739" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "tokenizer_config.json: 0%| | 0.00/51.0k [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "d1de217c698342468799f5a525aecdf0" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "tokenizer.json: 0%| | 0.00/9.09M [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "7a206c6511324680b82a398dc2248d3c" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "special_tokens_map.json: 0%| | 0.00/301 [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "d49276d392da4a729dcfce2fc894b538" } }, "metadata": {} }, { "output_type": "stream", "name": "stderr", "text": [ "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n", "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n", "gradientai/Llama-3-8B-Instruct-Gradient-1048k does not have a padding or unknown token!\n", "Will use the EOS 
token of id 128001 as padding.\n" ] } ], "source": [ "from unsloth import FastLanguageModel\n", "import torch\n", "max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!\n", "dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+\n", "load_in_4bit = False # Set to True to use 4bit quantization and reduce memory usage.\n", "\n", "# 4bit pre quantized models we support for 4x faster downloading + no OOMs.\n", "# fourbit_models = [\n", "# \"unsloth/mistral-7b-bnb-4bit\",\n", "# \"unsloth/mistral-7b-instruct-v0.2-bnb-4bit\",\n", "# \"unsloth/llama-2-7b-bnb-4bit\",\n", "# \"unsloth/gemma-7b-bnb-4bit\",\n", "# \"unsloth/gemma-7b-it-bnb-4bit\", # Instruct version of Gemma 7b\n", "# \"unsloth/gemma-2b-bnb-4bit\",\n", "# \"unsloth/gemma-2b-it-bnb-4bit\", # Instruct version of Gemma 2b\n", "# \"unsloth/llama-3-8b-bnb-4bit\", # [NEW] 15 Trillion token Llama-3\n", "# ] # More models at https://huggingface.co/unsloth\n", "\n", "model, tokenizer = FastLanguageModel.from_pretrained(\n", " model_name = \"gradientai/Llama-3-8B-Instruct-Gradient-1048k\",\n", " max_seq_length = max_seq_length,\n", " dtype = dtype,\n", " load_in_4bit = load_in_4bit,\n", " token = \"hf_\", # use one if using gated models like meta-llama/Llama-2-7b-hf\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "SXd9bTZd1aaL" }, "source": [ "We now add LoRA adapters so we only need to update 1 to 10% of all parameters!" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 0 }, "id": "6bZsfBuZDeCL", "outputId": "6e9c4ab6-d8ed-49ba-cb32-a7a6a9a9b4fa" }, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "Unsloth 2024.4 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.\n" ] } ], "source": [ "model = FastLanguageModel.get_peft_model(\n", " model,\n", " r = 1, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128\n", " target_modules = [\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n", " \"gate_proj\", \"up_proj\", \"down_proj\",],\n", " lora_alpha = 5,\n", " lora_dropout = 0, # Supports any, but = 0 is optimized\n", " bias = \"none\", # Supports any, but = \"none\" is optimized\n", " # [NEW] \"unsloth\" uses 30% less VRAM, fits 2x larger batch sizes!\n", " use_gradient_checkpointing = \"unsloth\", # True or \"unsloth\" for very long context\n", " random_state = 3407,\n", " use_rslora = False, # We support rank stabilized LoRA\n", " loftq_config = None, # And LoftQ\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "vITh0KVJ10qX" }, "source": [ "\n", "### Data Prep\n", "We now use the `description_guided_molecule_design` split of [zjunlp/Mol-Instructions](https://huggingface.co/datasets/zjunlp/Mol-Instructions), formatted with the prompt template of the original [Alpaca dataset](https://crfm.stanford.edu/2023/03/13/alpaca.html). You can replace this code section with your own data prep.\n", "\n", "**[NOTE]** To train only on completions (ignoring the user's input) read TRL's docs [here](https://huggingface.co/docs/trl/sft_trainer#train-on-completions-only).\n", "\n", "**[NOTE]** Remember to add the **EOS_TOKEN** to the tokenized output!! 
Otherwise you'll get infinite generations!\n", "\n", "If you want to use the `ChatML` template for ShareGPT datasets, try our conversational [notebook](https://colab.research.google.com/drive/1Aau3lgPzeZKQ-98h69CCu1UJcvIBLmy2?usp=sharing).\n", "\n", "For text completions like novel writing, try this [notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 430, "referenced_widgets": [ "f49688387e874bb89b61a7c414843df4", "955bda7aaba044cabc23be7fee70f048", "1e764f6b39394d33833c69b0aea99165", "b3580b2994f1448a86d3b0b60b237fcf", "9920b8f668cd487f9dd5bdb32894fb9d", "c928f7d6c7974f67a87c9b7e619dd791", "de0538163b5845c7825ded7c090091d1", "9f932bf6b940473083bc6fc9264b6bf4", "b076371f0f2544bc84391df71c5d4c57", "60252c7aa12c4d8b83cf1a173c1a14f3", "ed38499d875b4515bdfa5af0dc5e9204", "c9e0afe8d1cd4f49a5bafca9e6ee012b", "530c046765d44df9a20f2a9b0854df94", "b87210f62e9c4819a6a246a0f0c5c2a1", "7d7f674f96a54d00bdccf3390f1ea3e1", "d56c88ed6b4748d7abdee2db17449fc6", "9d28d0b2986a46e5b460a4ea663d6199", "f00f43364c664e959a9486535ab99d10", "66a041e68dc54dbf8011b5dd9d4f9a26", "34027a94bca8421d987900fe52537a88", "ee0914813f49433f922fad555e334a8f", "8fb0515dd726466e9c73f1e47032db1d", "746db02fb17e42979694a1dad9d81af2", "9f9ca4d05b1d426ba99b06dd21dccb77", "2452dbc0801542cfa3bb108bdd7633ae", "cee5d8963b1d4aac9df9732904670b95", "abcadfa4d18e4ea28db55698aa140269", "e02ee4ced9f64548b85083de4f575d80", "c4b94cbddc0f4bd0be580e52c48e2e41", "2ad22f0056d9409d9a1ac84866683729", "f0d40fa27b4c4bae86fe7d78689c62c6", "309ab7b61514413ebc4fb9cad7d2848b", "542b0dfe51174d3285bcf18dab0202d1", "3e07314f80874faebd7f1bd63a2eb3bf", "6e1d8c217f0e47f4b93f74a11eb4124c", "c0eb9bf8f9e94e3d8c0a50893f0dc2c2", "b753c8ad4857473b978ad7f015715703", "c76268d5167f46bda30107f61bfba6d7", "34531d180c624040b222ad2e91c0cb66", "aaa163859d474182b82491b3a21341a4", "00db815c2c894f9493b6a2ae04eff748", "8599d9d983034980a6a7689b6242eeab", "5432f0e87be14712911081d13e287c99", "06b99b3d1a804bca8e7d72effbb97fe6", "6b124cdccf27458ea80a61386f53b669", "2a8364c955394a4aa8c007f98016db04", "619f4d7cac9a43889e9cb1a461e79eb8", "d877d7d47db04427a50ffd904dfd2f05", "b97c9f4a33cb45cebe5885a68c0768ea", "131cdeec8306409282052ec6641ff86c", "7f9ca010469f4ce9876dc3faae1fda65", "aafba16851164bdeb48ae9ab1136f22e", "e8f75ed6981a47ac82245146652107e4", "e89ab5a69fdb4d6b98499aadd5c387e0", "81c52655cde142259f60ce861ff23091", "c2e5f47253ed467ea665dce6e791c4c2", "2441c3a41a7b4a41ba3acba28ce934d4", "d76174cc5b5c410d810d6a7c56ba2dff", "280222618e784d8eb765b6c13d6a9a5b", "a58f05ce60084156ac6e60ec591c5455", "295a0db9f2334f5fa6167a53b6b459a5", "d8da5d2961d94ba59cc387b9161d41da", "389fa4920be84d969b823021a73c615b", "3c5aef724ecb4bd4998937d5bed2c34f", "df40e0ddf3d642bba3de16c898245cad", "6daafe9a98154ad0be0868c98651e574", "4920cd99c70b4757b012fcd682e4fbbe", "a242a72e4be6440197afb19c6a66dae1", "08087558162244f9821efe4b5fe4d423", "d76847087b4c4424bb1d46473b265ac8", "72467708f5ff49b39bb87125b7a79bf5", "0d04f7c57a81451bacbe5a82d8537859", "06b1280e116d4d1a8735f6abe06a0139", "73f8a3473b564759b08e638c8710fe3c", "c8107a2294a6480da0680d390f9bc798", "16e2ff2aff8b4dc1aa67aab15f926915", "caf9a6acdf3040cd8e08e17363dfc149", "ed3b09541193405584e42977eecfe46a", "d7d35951278846a197cf8b23c056877e", "597c5afd4f244e3c92f051abb7a172b9", "eeba10e25abe4b3696889d4089a05571", "97cce2127a0a448f8b4e834d2d67adcc", "c7c24a04dac9451ab89889e89a3c2d01", 
"1f4391b8408749739e82b6bf31d8b423", "c7b90cad25b34230a07ad9be0aa9434a", "8c7ecc6307e14a68b4bc6e553614fedb", "25ea97d31ea14c06bbe9c052558b80c5", "d1d3beb05f6442e08a20fae88ae40c3c", "350c94fc933f4ab498606512219fc578", "44db8238143a4577bdb932ade78debf0", "bf8fcbec5a7e41349ade9485eea6de62", "a59ab39214294a1e975a4ee60bac2566", "ce2b9de243c34a83add3dd233d857783", "601b0b2bc81244f3856c4ed6daccd592", "386f8d4e4b38490b8e420dc35b53827f", "6f0078f22a634a06a3d702fbfd267db6", "a2e8e0b44ba641d6b49973cda25287e9", "9357c4ee3e634e3283cd195b49e59404", "6a95ea285fca4141b035a3fcc990afea", "9e51776b088e4179a08b87508113c3ae", "f749e5e503a845cd8f1944ccf9865bcf", "339fe2851003462caf3522d0d7e79231", "801238eafc9d44d1a1dbc07fb1a5edff", "ab45757b6d77440b9b17b6cd55f04158", "4849000bd77e4fb6b96c0bed4ed12fc4", "ac038235bdb44d1b84eb124b45480520", "9b4ca7e95b8f4e80b6672651c5ef2962", "a43c104685d5480b93ee87b5936b773b", "072a4ee5cfee47d8b5aa159342bf9620", "1f615329cfd54990be692a48e79a2282" ] }, "id": "LjY75GoYUCB8", "outputId": "cdd48e99-9be1-48b7-e9d8-7823fa70a249" }, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "/usr/local/lib/python3.10/dist-packages/datasets/load.py:1486: FutureWarning: The repository for zjunlp/Mol-Instructions contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/zjunlp/Mol-Instructions\n", "You can avoid this message in future by passing the argument `trust_remote_code=True`.\n", "Passing `trust_remote_code=True` will be mandatory to load this dataset from the next major release of `datasets`.\n", " warnings.warn(\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ "Downloading builder script: 0%| | 0.00/7.34k [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "f49688387e874bb89b61a7c414843df4" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Downloading readme: 0%| | 0.00/19.6k [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "c9e0afe8d1cd4f49a5bafca9e6ee012b" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Downloading data: 0%| | 0.00/73.2M [00:00, ?B/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "746db02fb17e42979694a1dad9d81af2" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Generating description_guided_molecule_design split: 0 examples [00:00, ? examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "3e07314f80874faebd7f1bd63a2eb3bf" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Generating forward_reaction_prediction split: 0 examples [00:00, ? examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "6b124cdccf27458ea80a61386f53b669" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Generating molecular_description_generation split: 0 examples [00:00, ? examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "c2e5f47253ed467ea665dce6e791c4c2" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Generating property_prediction split: 0 examples [00:00, ? 
examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "4920cd99c70b4757b012fcd682e4fbbe" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Generating reagent_prediction split: 0 examples [00:00, ? examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "ed3b09541193405584e42977eecfe46a" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Generating retrosynthesis split: 0 examples [00:00, ? examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "350c94fc933f4ab498606512219fc578" } }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "Map: 0%| | 0/298319 [00:00, ? examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "9e51776b088e4179a08b87508113c3ae" } }, "metadata": {} } ], "source": [ "alpaca_prompt = \"\"\"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n", "\n", "### Instruction:\n", "{}\n", "\n", "### Input:\n", "{}\n", "\n", "### Response:\n", "{}\"\"\"\n", "\n", "EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN\n", "def formatting_prompts_func(examples):\n", " instructions = examples[\"instruction\"]\n", " inputs = examples[\"input\"]\n", " outputs = examples[\"output\"]\n", " texts = []\n", " for instruction, input, output in zip(instructions, inputs, outputs):\n", " # Must add EOS_TOKEN, otherwise your generation will go on forever!\n", " text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN\n", " texts.append(text)\n", " return { \"text\" : texts, }\n", "pass\n", "\n", "from datasets import load_dataset\n", "dataset = load_dataset(\"zjunlp/Mol-Instructions\", \"Molecule-oriented Instructions\", split=\"description_guided_molecule_design\")\n", "dataset = dataset.map(formatting_prompts_func, batched = True,)" ] }, { "cell_type": "markdown", "metadata": { "id": "idAEIeSQ3xdS" }, "source": [ "\n", "### Train the model\n", "Now let's use Huggingface TRL's `SFTTrainer`! More docs here: [TRL SFT docs](https://huggingface.co/docs/trl/sft_trainer). We do 60 steps to speed things up, but you can set `num_train_epochs=1` for a full run, and turn off `max_steps=None`. We also support TRL's `DPOTrainer`!" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 124, "referenced_widgets": [ "362900a748bd4e749c7905ab91ece526", "597b9b093c424c76a9c846b360ef4907", "1d6b2cbc29524718ad5a58b8db8de5d0", "ba13ecb422e44bb19b825843c5e893be", "4d3f2990e5a846d3a70a663b386ba723", "d18b150834d74413a22663f3d26b2f62", "4cc7ad7a048d4617848b17a02c197031", "cb13ab93efac498c84ae1e94f3d9efe8", "d6d552d01b844eaa8562bb65c8e05182", "e9ccf9d1617c453eaf547a30caf616da", "ff7397d4a5c5498c84f55a77f26e5208" ] }, "id": "95_Nn-89DhsL", "outputId": "a8d0ca7f-dd97-4c40-b187-3b73d64980b8" }, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "/usr/local/lib/python3.10/dist-packages/multiprocess/popen_fork.py:66: RuntimeWarning: os.fork() was called. 
os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.\n", " self.pid = os.fork()\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ "Map (num_proc=2): 0%| | 0/298319 [00:00, ? examples/s]" ], "application/vnd.jupyter.widget-view+json": { "version_major": 2, "version_minor": 0, "model_id": "362900a748bd4e749c7905ab91ece526" } }, "metadata": {} }, { "output_type": "stream", "name": "stderr", "text": [ "max_steps is given, it will override any value given in num_train_epochs\n" ] } ], "source": [ "from trl import SFTTrainer\n", "from transformers import TrainingArguments\n", "\n", "trainer = SFTTrainer(\n", " model = model,\n", " tokenizer = tokenizer,\n", " train_dataset = dataset,\n", " dataset_text_field = \"text\",\n", " max_seq_length = max_seq_length,\n", " dataset_num_proc = 2,\n", " packing = False, # Can make training 5x faster for short sequences.\n", " args = TrainingArguments(\n", " per_device_train_batch_size = 2,\n", " gradient_accumulation_steps = 4,\n", " warmup_steps = 5,\n", " max_steps = 400,\n", " learning_rate = 2e-4,\n", " fp16 = not torch.cuda.is_bf16_supported(),\n", " bf16 = torch.cuda.is_bf16_supported(),\n", " logging_steps = 1,\n", " optim = \"adamw_8bit\",\n", " weight_decay = 0.01,\n", " lr_scheduler_type = \"linear\",\n", " seed = 3407,\n", " output_dir = \"outputs\",\n", " ),\n", ")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 0 }, "id": "2ejIt2xSNKKp", "outputId": "8b6bdda2-6667-4e91-8f1c-8f71c94e4688" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "GPU = NVIDIA A100-SXM4-40GB. Max memory = 39.564 GB.\n", "31.113 GB of memory reserved.\n" ] } ], "source": [ "#@title Show current memory stats\n", "gpu_stats = torch.cuda.get_device_properties(0)\n", "start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n", "max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)\n", "print(f\"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.\")\n", "print(f\"{start_gpu_memory} GB of memory reserved.\")" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 12705 }, "id": "yqxqAZ7KJ4oL", "outputId": "925325d3-7793-4d1d-a03f-026fd2de00be" }, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1\n", " \\\\ /| Num examples = 298,319 | Num Epochs = 1\n", "O^O/ \\_/ \\ Batch size per device = 2 | Gradient Accumulation steps = 4\n", "\\ / Total batch size = 8 | Total steps = 400\n", " \"-____-\" Number of trainable parameters = 2,621,440\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ "Step | Training Loss\n",
"--- | ---\n",
"1 | 2.063800\n",
"2 | 2.284100\n",
"3 | 2.350400\n",
"4 | 1.959100\n",
"5 | 2.012800\n",
"6 | 2.212900\n",
"7 | 1.806100\n",
"8 | 1.637100\n",
"9 | 1.620500\n",
"10 | 1.375400\n",
"11 | 1.337200\n",
"12 | 1.505800\n",
"13 | 1.208200\n",
"14 | 1.227500\n",
"15 | 1.099000\n",
"16 | 1.005600\n",
"17 | 1.076100\n",
"18 | 0.903000\n",
"19 | 0.915700\n",
"20 | 0.898300\n",
"21 | 0.842600\n",
"22 | 0.820800\n",
"23 | 0.822900\n",
"24 | 0.841900\n",
"25 | 0.657700\n",
"26 | 0.822500\n",
"27 | 0.682300\n",
"28 | 0.702500\n",
"29 | 0.839700\n",
"30 | 0.711300\n",
"31 | 0.697000\n",
"32 | 0.737800\n",
"33 | 0.706900\n",
"34 | 0.691200\n",
"35 | 0.626900\n",
"36 | 0.696300\n",
"37 | 0.640600\n",
"38 | 0.590500\n",
"39 | 0.734600\n",
"40 | 0.599000\n",
"41 | 0.577900\n",
"42 | 0.659300\n",
"43 | 0.565900\n",
"44 | 0.658700\n",
"45 | 0.751400\n",
"46 | 0.619700\n",
"47 | 0.602800\n",
"48 | 0.664300\n",
"49 | 0.593500\n",
"50 | 0.607200\n",
"51 | 0.605100\n",
"52 | 0.576300\n",
"53 | 0.627100\n",
"54 | 0.584200\n",
"55 | 0.581300\n",
"56 | 0.602500\n",
"57 | 0.510300\n",
"58 | 0.514600\n",
"59 | 0.580700\n",
"60 | 0.697100\n",
"61 | 0.579200\n",
"62 | 0.617400\n",
"63 | 0.583500\n",
"64 | 0.599700\n",
"65 | 0.539800\n",
"66 | 0.526200\n",
"67 | 0.732300\n",
"68 | 0.557800\n",
"69 | 0.543200\n",
"70 | 0.532700\n",
"71 | 0.684400\n",
"72 | 0.774700\n",
"73 | 0.495600\n",
"74 | 0.516500\n",
"75 | 0.466400\n",
"76 | 0.671500\n",
"77 | 0.528400\n",
"78 | 0.546200\n",
"79 | 0.479500\n",
"80 | 0.469000\n",
"81 | 0.517700\n",
"82 | 0.544300\n",
"83 | 0.508300\n",
"84 | 0.531700\n",
"85 | 0.552100\n",
"86 | 0.539000\n",
"87 | 0.529600\n",
"88 | 0.463000\n",
"89 | 0.571600\n",
"90 | 0.502300\n",
"91 | 0.449700\n",
"92 | 0.483700\n",
"93 | 0.524700\n",
"94 | 0.608000\n",
"95 | 0.538300\n",
"96 | 0.514600\n",
"97 | 0.459000\n",
"98 | 0.524500\n",
"99 | 0.521800\n",
"100 | 0.534600\n",
"101 | 0.564900\n",
"102 | 0.530300\n",
"103 | 0.487100\n",
"104 | 0.534700\n",
"105 | 0.619500\n",
"106 | 0.471700\n",
"107 | 0.538800\n",
"108 | 0.471100\n",
"109 | 0.443700\n",
"110 | 0.539100\n",
"111 | 0.621100\n",
"112 | 0.455100\n",
"113 | 0.620300\n",
"114 | 0.525600\n",
"115 | 0.507600\n",
"116 | 0.515900\n",
"117 | 0.501600\n",
"118 | 0.490900\n",
"119 | 0.459900\n",
"120 | 0.513900\n",
"121 | 0.601200\n",
"122 | 0.473700\n",
"123 | 0.480300\n",
"124 | 0.458400\n",
"125 | 0.585700\n",
"126 | 0.539100\n",
"127 | 0.567700\n",
"128 | 0.488700\n",
"129 | 0.475000\n",
"130 | 0.505900\n",
"131 | 0.541500\n",
"132 | 0.566000\n",
"133 | 0.706600\n",
"134 | 0.699000\n",
"135 | 0.459800\n",
"136 | 0.524000\n",
"137 | 0.523900\n",
"138 | 0.461800\n",
"139 | 0.518400\n",
"140 | 0.441500\n",
"141 | 0.421700\n",
"142 | 0.444300\n",
"143 | 0.571700\n",
"144 | 0.449400\n",
"145 | 0.581900\n",
"146 | 0.406000\n",
"147 | 0.448300\n",
"148 | 0.463800\n",
"149 | 0.442700\n",
"150 | 0.460600\n",
"151 | 0.453200\n",
"152 | 0.472400\n",
"153 | 0.403000\n",
"154 | 0.450000\n",
"155 | 0.522700\n",
"156 | 0.480400\n",
"157 | 0.501900\n",
"158 | 0.462600\n",
"159 | 0.533800\n",
"160 | 0.508600\n",
"161 | 0.432600\n",
"162 | 0.518300\n",
"163 | 0.358400\n",
"164 | 0.494800\n",
"165 | 0.437500\n",
"166 | 0.374400\n",
"167 | 0.489200\n",
"168 | 0.545400\n",
"169 | 0.425800\n",
"170 | 0.510000\n",
"171 | 0.486900\n",
"172 | 0.416500\n",
"173 | 0.565300\n",
"174 | 0.382600\n",
"175 | 0.559100\n",
"176 | 0.459400\n",
"177 | 0.431500\n",
"178 | 0.499000\n",
"179 | 0.466100\n",
"180 | 0.460100\n",
"181 | 0.536700\n",
"182 | 0.434600\n",
"183 | 0.475200\n",
"184 | 0.462500\n",
"185 | 0.429200\n",
"186 | 0.393700\n",
"187 | 0.579800\n",
"188 | 0.478900\n",
"189 | 0.460000\n",
"190 | 0.408600\n",
"191 | 0.492700\n",
"192 | 0.500800\n",
"193 | 0.541000\n",
"194 | 0.446000\n",
"195 | 0.623600\n",
"196 | 0.661400\n",
"197 | 0.508600\n",
"198 | 0.731800\n",
"199 | 0.483200\n",
"200 | 0.435100\n",
"201 | 0.511500\n",
"202 | 0.516100\n",
"203 | 0.539100\n",
"204 | 0.510300\n",
"205 | 0.400000\n",
"206 | 0.478500\n",
"207 | 0.432200\n",
"208 | 0.471600\n",
"209 | 0.541300\n",
"210 | 0.484400\n",
"211 | 0.487700\n",
"212 | 0.391600\n",
"213 | 0.546900\n",
"214 | 0.470200\n",
"215 | 0.465100\n",
"216 | 0.590800\n",
"217 | 0.396900\n",
"218 | 0.579500\n",
"219 | 0.424400\n",
"220 | 0.491600\n",
"221 | 0.476500\n",
"222 | 0.476000\n",
"223 | 0.459300\n",
"224 | 0.476200\n",
"225 | 0.501200\n",
"226 | 0.454000\n",
"227 | 0.666200\n",
"228 | 0.499800\n",
"229 | 0.397800\n",
"230 | 0.387800\n",
"231 | 0.630800\n",
"232 | 0.387300\n",
"233 | 0.397600\n",
"234 | 0.370100\n",
"235 | 0.499100\n",
"236 | 0.449100\n",
"237 | 0.385800\n",
"238 | 0.400700\n",
"239 | 0.373100\n",
"240 | 0.460500\n",
"241 | 0.594700\n",
"242 | 0.478000\n",
"243 | 0.371700\n",
"244 | 0.433300\n",
"245 | 0.481300\n",
"246 | 0.508000\n",
"247 | 0.485900\n",
"248 | 0.462200\n",
"249 | 0.472200\n",
"250 | 0.414300\n",
"251 | 0.367800\n",
"252 | 0.400900\n",
"253 | 0.460100\n",
"254 | 0.445400\n",
"255 | 0.404100\n",
"256 | 0.432600\n",
"257 | 0.408900\n",
"258 | 0.524300\n",
"259 | 0.464600\n",
"260 | 0.469600\n",
"261 | 0.505900\n",
"262 | 0.403900\n",
"263 | 0.596700\n",
"264 | 0.483400\n",
"265 | 0.502100\n",
"266 | 0.472300\n",
"267 | 0.380300\n",
"268 | 0.557800\n",
"269 | 0.460100\n",
"270 | 0.474400\n",
"271 | 0.498200\n",
"272 | 0.364500\n",
"273 | 0.393000\n",
"274 | 0.439800\n",
"275 | 0.376500\n",
"276 | 0.354800\n",
"277 | 0.423600\n",
"278 | 0.488700\n",
"279 | 0.509400\n",
"280 | 0.520900\n",
"281 | 0.447400\n",
"282 | 0.458200\n",
"283 | 0.460700\n",
"284 | 0.451100\n",
"285 | 0.524000\n",
"286 | 0.389100\n",
"287 | 0.672600\n",
"288 | 0.489500\n",
"289 | 0.528300\n",
"290 | 0.452500\n",
"291 | 0.510500\n",
"292 | 0.410800\n",
"293 | 0.479500\n",
"294 | 0.461500\n",
"295 | 0.505300\n",
"296 | 0.364300\n",
"297 | 0.399100\n",
"298 | 0.429900\n",
"299 | 0.377100\n",
"300 | 0.524800\n",
"301 | 0.422100\n",
"302 | 0.374100\n",
"303 | 0.392000\n",
"304 | 0.478400\n",
"305 | 0.468700\n",
"306 | 0.471200\n",
"307 | 0.422600\n",
"308 | 0.451800\n",
"309 | 0.371800\n",
"310 | 0.424000\n",
"311 | 0.547400\n",
"312 | 0.514700\n",
"313 | 0.335400\n",
"314 | 0.520900\n",
"315 | 0.628400\n",
"316 | 0.484000\n",
"317 | 0.536000\n",
"318 | 0.398500\n",
"319 | 0.416600\n",
"320 | 0.373100\n",
"321 | 0.583000\n",
"322 | 0.358400\n",
"323 | 0.560100\n",
"324 | 0.419000\n",
"325 | 0.407200\n",
"326 | 0.431500\n",
"327 | 0.395200\n",
"328 | 0.613900\n",
"329 | 0.463900\n",
"330 | 0.450400\n",
"331 | 0.488600\n",
"332 | 0.479900\n",
"333 | 0.457400\n",
"334 | 0.454200\n",
"335 | 0.424600\n",
"336 | 0.454400\n",
"337 | 0.386300\n",
"338 | 0.356700\n",
"339 | 0.486000\n",
"340 | 0.418800\n",
"341 | 0.538800\n",
"342 | 0.375200\n",
"343 | 0.448100\n",
"344 | 0.470400\n",
"345 | 0.431000\n",
"346 | 0.387600\n",
"347 | 0.557000\n",
"348 | 0.486200\n",
"349 | 0.400000\n",
"350 | 0.415500\n",
"351 | 0.422200\n",
"352 | 0.402900\n",
"353 | 0.389500\n",
"354 | 0.629100\n",
"355 | 0.377000\n",
"356 | 0.465900\n",
"357 | 0.345500\n",
"358 | 0.488300\n",
"359 | 0.395200\n",
"360 | 0.441200\n",
"361 | 0.443000\n",
"362 | 0.436100\n",
"363 | 0.402400\n",
"364 | 0.472100\n",
"365 | 0.504100\n",
"366 | 0.544000\n",
"367 | 0.375400\n",
"368 | 0.368100\n",
"369 | 0.345200\n",
"370 | 0.472400\n",
"371 | 0.464600\n",
"372 | 0.364100\n",
"373 | 0.416900\n",
"374 | 0.450700\n",
"375 | 0.418100\n",
"376 | 0.391600\n",
"377 | 0.448400\n",
"378 | 0.443500\n",
"379 | 0.405000\n",
"380 | 0.365200\n",
"381 | 0.518300\n",
"382 | 0.459900\n",
"383 | 0.428700\n",
"384 | 0.366800\n",
"385 | 0.379200\n",
"386 | 0.405400\n",
"387 | 0.397100\n",
"388 | 0.358600\n",
"389 | 0.461700\n",
"390 | 0.397100\n",
"391 | 0.394900\n",
"392 | 0.370900\n",
"393 | 0.407900\n",
"394 | 0.387200\n",
"395 | 0.376600\n",
"396 | 0.351800\n",
"397 | 0.513600\n",
"398 | 0.466700\n",
"399 | 0.505000\n",
"400 | 0.427000\n"
]
},
"metadata": {}
}
],
"source": [
"trainer_stats = trainer.train()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "pCqnaKmlO1U9",
"outputId": "d2f8f7eb-0295-49e2-981d-eed3e2db8778"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"610.1027 seconds used for training.\n",
"10.17 minutes used for training.\n",
"Peak reserved memory = 36.227 GB.\n",
"Peak reserved memory for training = 5.114 GB.\n",
"Peak reserved memory % of max memory = 91.566 %.\n",
"Peak reserved memory for training % of max memory = 12.926 %.\n"
]
}
],
"source": [
"#@title Show final memory and time stats\n",
"used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n",
"used_memory_for_lora = round(used_memory - start_gpu_memory, 3)\n",
"used_percentage = round(used_memory /max_memory*100, 3)\n",
"lora_percentage = round(used_memory_for_lora/max_memory*100, 3)\n",
"print(f\"{trainer_stats.metrics['train_runtime']} seconds used for training.\")\n",
"print(f\"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.\")\n",
"print(f\"Peak reserved memory = {used_memory} GB.\")\n",
"print(f\"Peak reserved memory for training = {used_memory_for_lora} GB.\")\n",
"print(f\"Peak reserved memory % of max memory = {used_percentage} %.\")\n",
"print(f\"Peak reserved memory for training % of max memory = {lora_percentage} %.\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ekOmTR1hSNcr"
},
"source": [
"\n",
"### Inference\n",
"Let's run the model! You can change the instruction and input - leave the output blank!"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "kR3gIAX-SM2q",
"outputId": "18b60f5e-e9ad-4c6d-d4f1-b83460394bde"
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\\n\\n### Instruction:\\nContinue the Fibonacci sequence.\\n\\n### Input:\\n1, 1, 2, 3, 5, 8\\n\\n### Response:\\n13, 21, 34, 55, 89, 144, 233, 377, 610, 985, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025']"
]
},
"metadata": {},
"execution_count": 9
}
],
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"Continue the fibonnaci sequence.\", # instruction\n",
" \"1, 1, 2, 3, 5, 8\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)\n",
"tokenizer.batch_decode(outputs)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CrSvZObor0lY"
},
"source": [
" You can also use a `TextStreamer` for continuous inference - so you can see the generation token by token, instead of waiting the whole time!"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "e2pEuRb1r2Vg",
"outputId": "4d4697e4-16c9-4985-983b-40b2a32660d9"
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"Continue the fibonnaci sequence.\n",
"\n",
"### Input:\n",
"1, 1, 2, 3, 5, 8\n",
"\n",
"### Response:\n",
"13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 632459\n"
]
}
],
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"Continue the fibonnaci sequence.\", # instruction\n",
" \"1, 1, 2, 3, 5, 8\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"from transformers import TextStreamer\n",
"text_streamer = TextStreamer(tokenizer)\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)"
]
},
{
"cell_type": "code",
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"What is a famous bridge in San Francisco bay area?\", # instruction\n",
" \"\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"from transformers import TextStreamer\n",
"text_streamer = TextStreamer(tokenizer)\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "Z5tWI7ivVjMg",
"outputId": "d5bb6863-f39e-43ec-dea4-ffc975367e6b"
},
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"What is a famous bridge in San Francisco bay area?\n",
"\n",
"### Input:\n",
"\n",
"\n",
"### Response:\n",
"The Golden Gate Bridge is a suspension bridge that spans the Golden Gate strait, which is the opening of San Francisco Bay into the Pacific Ocean. It connects the city of San Francisco, on the northern tip of the San Francisco Peninsula, to Marin County, across the strait. The bridge is painted a distinctive shade of orange-red known as \"International Orange,\" which was chosen to make the bridge more visible in the foggy climate. It has become an internationally recognized symbol of San Francisco and California. The bridge was completed in 1937 and was, at the time, the longest suspension bridge in the world, with a main span of\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"What is a famous university in San Francisco bay area?\", # instruction\n",
" \"\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"from transformers import TextStreamer\n",
"text_streamer = TextStreamer(tokenizer)\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "9fJZQcwlVlBq",
"outputId": "bd39de89-417a-455a-de31-34644bfa2e28"
},
"execution_count": 12,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"What is a famous university in San Francisco bay area?\n",
"\n",
"### Input:\n",
"\n",
"\n",
"### Response:\n",
"UC Berkeley\n",
"\n",
"### Reasoning:\n",
"The University of California, Berkeley is a public research university located in Berkeley, California. It is the oldest and one of the most prestigious state universities in the United States, and the flagship institution of the University of California system. It is consistently ranked as one of the top universities globally, and is highly regarded for its academic excellence, research opportunities, and strong sense of community. UC Berkeley is known for its strong programs in engineering, computer science, business, law, and public policy, among others. It is also home to several renowned research institutes and centers, including the Lawrence Berkeley National Laboratory and the Berkeley\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"What are the DNA bases?\", # instruction\n",
" \"\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"from transformers import TextStreamer\n",
"text_streamer = TextStreamer(tokenizer)\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "1TPeJxviVls8",
"outputId": "855ee150-24d7-4ab1-d833-0b54e3d16d89"
},
"execution_count": 13,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"What are the DNA bases?\n",
"\n",
"### Input:\n",
"\n",
"\n",
"### Response:\n",
"The DNA bases are adenine (A), cytosine (C), guanine (G), and thymine (T). \n",
"\n",
"### Note:\n",
"The sequence of DNA bases in a DNA molecule determines the genetic information encoded in that molecule. The sequence of bases is read in a particular order, known as the DNA sequence, which contains the instructions for synthesizing proteins and other molecules in the cell. The order of the bases in a DNA molecule is determined by the base pairing rules, which dictate that adenine always pairs with thymine and guanine always pairs with cytosine. The complementary base pairing of A and T and\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"What are all of the types of bonds found in DNA?\", # instruction\n",
" \"\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"from transformers import TextStreamer\n",
"text_streamer = TextStreamer(tokenizer)\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "GlPUCSprVnRK",
"outputId": "8148cbb9-985e-47d9-ed6d-1aadb3d3f6d7"
},
"execution_count": 14,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"What are all of the types of bonds found in DNA?\n",
"\n",
"### Input:\n",
"\n",
"\n",
"### Response:\n",
"1. Phosphodiester bond\n",
"2. Hydrogen bond\n",
"3. Covalent bond\n",
"4. Hydrogen bond\n",
"5. Covalent bond\n",
"6. Phosphodiester bond\n",
"7. Hydrogen bond\n",
"8. Covalent bond\n",
"9. Phosphodiester bond\n",
"10. Hydrogen bond\n",
"11. Covalent bond\n",
"12. Phosphodiester bond\n",
"13. Hydrogen bond\n",
"14. Covalent bond\n",
"15. Phosphodiester bond\n",
"16. Hydrogen bond\n",
"17. Covalent bond\n",
"18. Phosphodiester bond\n",
"19.\n"
]
}
]
},
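{
"cell_type": "markdown",
"metadata": {},
"source": [
"The response above degenerates into a loop, repeating the same three bond types until `max_new_tokens` runs out. A minimal mitigation sketch, assuming the standard `repetition_penalty` argument of `transformers`' `generate` (the value 1.3 is an illustrative guess, not a tuned setting):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: penalize already-generated tokens to discourage the repetition loop above.\n",
"# repetition_penalty is a standard transformers generate() argument; 1.3 is an untuned, illustrative value.\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128, repetition_penalty = 1.3)"
]
},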
{
"cell_type": "code",
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"What are the reaction names to create DNA?\", # instruction\n",
" \"\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"from transformers import TextStreamer\n",
"text_streamer = TextStreamer(tokenizer)\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "NelWwdJjVpqR",
"outputId": "e1d06d3e-27b2-4a71-c0cf-00b347737ba3"
},
"execution_count": 15,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"What are the reaction names to create DNA?\n",
"\n",
"### Input:\n",
"\n",
"\n",
"### Response:\n",
"[C][C@@H1][C@H1][Branch2][Ring1][=Branch1][C@@H1][Branch1][Ring2][O][Ring1][=Branch1][N][C@@H1][Branch1][#Branch2][C@H1][Branch1][Branch1][C][C][Ring1][=Branch1][O][N][C][=C][C][=C][C][=C][Ring1][=Branch1][C][=C][C][=C][C][=C][Ring1][=Branch1][O][\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# alpaca_prompt = Copied from above\n",
"FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"What is the structure for adenine?\", # instruction\n",
" \"\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"\n",
"from transformers import TextStreamer\n",
"text_streamer = TextStreamer(tokenizer)\n",
"_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "FQw11BVmrV-f",
"outputId": "0525e5be-f8b5-4f73-f95b-61e5f40015a9"
},
"execution_count": 16,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"What is the structure for adenine?\n",
"\n",
"### Input:\n",
"\n",
"\n",
"### Response:\n",
"[C][N][C][=Branch1][Ring2][=C][Ring1][Branch1][C][=Branch1][C][=O][N][C][=C][Ring1][Branch1][C][=C][C][=Branch1][Branch1][=C][Ring1][Branch2][N][C][=C][Ring1][=Branch1][C][=C][C][=C][C][=C][Ring1][=Branch1][N][C][=Branch1][C][=O][C][C@@H1][C@H\n"
]
}
]
},
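{
"cell_type": "markdown",
"metadata": {},
"source": [
"The bracketed strings in the last two responses are SELFIES molecular representations, the format Mol-Instructions uses for molecules. A minimal sketch of turning a SELFIES string back into SMILES with the `selfies` package (the example string below is illustrative, not the exact output above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: decode a SELFIES string to SMILES. Requires: pip install selfies\n",
"import selfies as sf\n",
"\n",
"selfies_str = \"[C][C][O]\" # illustrative example (ethanol), not the model output above\n",
"print(sf.decoder(selfies_str)) # -> CCO"
]
},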
{
"cell_type": "markdown",
"metadata": {
"id": "uMuVrWbjAzhc"
},
"source": [
"\n",
"### Saving, loading finetuned models\n",
"To save the final model as LoRA adapters, either use Huggingface's `push_to_hub` for an online save or `save_pretrained` for a local save.\n",
"\n",
"**[NOTE]** This ONLY saves the LoRA adapters, and not the full model. To save to 16bit or GGUF, scroll down!"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"id": "upcOlWe7A1vc",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"outputId": "1d6ec244-784f-4ea4-d946-e678cd5db414"
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.\n",
" warnings.warn(\n"
]
}
],
"source": [
"model.save_pretrained(\"lora_model\") # Local saving\n",
"# model.push_to_hub(\"Your-Model-Name\", organization=\"kevinkawchak\", token = \"Your HF writable token\", private=True) # ONLY saves the LoRA adapters"
]
},
{
"cell_type": "code",
"source": [
"if False:\n",
" from unsloth import FastLanguageModel\n",
" model, tokenizer = FastLanguageModel.from_pretrained(\n",
" model_name = \"lora_model\", # YOUR MODEL YOU USED FOR TRAINING\n",
" max_seq_length = max_seq_length,\n",
" dtype = dtype,\n",
" load_in_4bit = load_in_4bit,\n",
" )\n",
" FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n",
"# alpaca_prompt = You MUST copy from above!\n",
"inputs = tokenizer(\n",
"[\n",
" alpaca_prompt.format(\n",
" \"What is a famous tall tower in Paris?\", # instruction\n",
" \"\", # input\n",
" \"\", # output - leave this blank for generation!\n",
" )\n",
"], return_tensors = \"pt\").to(\"cuda\")\n",
"outputs = model.generate(**inputs, max_new_tokens = 128, use_cache = True)\n",
"tokenizer.batch_decode(outputs)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 0
},
"id": "RYh575LYOG5J",
"outputId": "2a1a37ed-9233-47f4-82c4-a37af00d27c2"
},
"execution_count": 18,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[\"<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\\n\\n### Instruction:\\nWhat is a famous tall tower in Paris?\\n\\n### Input:\\n\\n\\n### Response:\\nThe Eiffel Tower. \\n\\n### Reasoning:\\nThe Eiffel Tower is a famous tall tower in Paris, France. It was built in 1889 for the World's Fair and is now a symbol of Paris. The tower is made of iron and stands at 324 meters tall. It has three levels, and visitors can take an elevator or stairs to the top for a stunning view of the city. The tower is also known for its romantic ambiance, especially at night when it is lit up. It is a popular tourist attraction and has been featured in many movies and TV shows. \\n\\n### User:\\nWow, the E\"]"
]
},
"metadata": {},
"execution_count": 18
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AEEcJ4qfC7Lp"
},
"source": [
"Now if you want to load the LoRA adapters we just saved for inference, set `False` to `True`:"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QQMjaNrjsU5_"
},
"source": [
"You can also use Hugging Face's `AutoModelForPeftCausalLM`. Only use this if you do not have `unsloth` installed. It can be hopelessly slow, since `4bit` model downloading is not supported, and Unsloth's **inference is 2x faster**."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"id": "yFfaXG0WsQuE"
},
"outputs": [],
"source": [
"## If using model.save_pretrained(\"lora_model\") # Local saving\n",
"if False:\n",
" # I highly do NOT suggest - use Unsloth if possible\n",
" from peft import AutoPeftModelForCausalLM\n",
" from transformers import AutoTokenizer\n",
" model = AutoPeftModelForCausalLM.from_pretrained(\n",
" \"lora_model\", # YOUR MODEL YOU USED FOR TRAINING\n",
" load_in_4bit = load_in_4bit,\n",
" )\n",
" tokenizer = AutoTokenizer.from_pretrained(\"lora_model\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "f422JgM9sdVT"
},
"source": [
"### Saving to float16 for VLLM\n",
"\n",
"We also support saving to `float16` directly. Select `merged_16bit` for float16 or `merged_4bit` for int4. We also allow `lora` adapters as a fallback. Use `push_to_hub_merged` to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens."
]
},
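{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the options described above, assuming Unsloth's merged-save API (`save_pretrained_merged` / `push_to_hub_merged`); swap in your own repo name and token:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only - pick ONE save_method: \"merged_16bit\", \"merged_4bit\", or \"lora\".\n",
"if False: model.save_pretrained_merged(\"model\", tokenizer, save_method = \"merged_16bit\") # local 16bit merge\n",
"if False: model.push_to_hub_merged(\"your-name/model\", tokenizer, save_method = \"merged_16bit\", token = \"\") # upload to the Hub"
]
},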
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"id": "iHjt_SMYsd3P",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 695,
"referenced_widgets": [
"013688f032534805be1f74e9acd3f4ec",
"314bd956c92b45d48fd451e14adc92a8",
"300c424f64dd4b70a3da6bb379554af1",
"e9d5b4b245774e87b2b80092e71c8079",
"2712bb451b4f455086ca51a5f0abf1d5",
"aa2682fe6e4a47baba0369acf3c8a7d9",
"71c6ac271afb4bcf9ee0910fac6237ab",
"cb4b5e50f5ba497284d34864dfcf52c2",
"42d742fd466441929ba0fae8c8a3ecc2",
"ba47fb237fee40c6b5bf7f8b83406591",
"4b3da46579264f9ea12bbd5d3c459e39"
]
},
"outputId": "5873e54f-4f83-4fec-92fe-4dce8ee993df"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Unsloth: Merging 4bit and LoRA weights to 16bit...\n",
"Unsloth: Will use up to 61.35 out of 83.48 RAM for saving.\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 32/32 [00:56<00:00, 1.75s/it]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Unsloth: Saving to organization with address kevinkawchak/gradientai-Llama-3-8B-Instruct-Gradient-1048k-Molecule16\n",
"Unsloth: Saving tokenizer... Done.\n",
"Unsloth: Saving model... This might take 5 minutes for Llama-7b...\n",
"Unsloth: Saving to organization with address kevinkawchak/gradientai-Llama-3-8B-Instruct-Gradient-1048k-Molecule16\n",
"Unsloth: Uploading all files... Please wait...\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"model-00001-of-00004.safetensors: 0%| | 0.00/4.98G [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "013688f032534805be1f74e9acd3f4ec"
}
},
"metadata": {}
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Done.\n",
"Saved merged model to https://huggingface.co/None/gradientai-Llama-3-8B-Instruct-Gradient-1048k-Molecule16\n",
"Unsloth: Merging 4bit and LoRA weights to 4bit...\n",
"This might take 5 minutes...\n"
]
},
{
"output_type": "error",
"ename": "OutOfMemoryError",
"evalue": "CUDA out of memory. Tried to allocate 224.00 MiB. GPU 0 has a total capacity of 39.56 GiB of which 186.81 MiB is free. Process 2243 has 39.37 GiB memory in use. Of the allocated memory 37.89 GiB is allocated by PyTorch, and 1000.65 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mOutOfMemoryError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m