{"id":991,"date":"2023-08-07T15:15:44","date_gmt":"2023-08-07T15:15:44","guid":{"rendered":"https:\/\/ai-box.eu\/?p=991"},"modified":"2023-08-08T11:31:44","modified_gmt":"2023-08-08T11:31:44","slug":"chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook","status":"publish","type":"post","link":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/","title":{"rendered":"Chatbot Llama 2 70B &#8211; run locally in a jupyter notebook"},"content":{"rendered":"<p>With this article I would like to help you to run locally under Ubuntu 22.04 Llama 2 in the 70B version on a NVIDIA A6000. Of course this is not easy because the classic Llama 2 70B model needs a GPU memory of about 280GB in the Float32 bit version. But if you quantize the model to INT4 bit then it fits into the memory of the NVIDIA A6000 with its 48GB GPU Ram. The model then occupies about 40GB of ram and runs at a decent speed.<\/p>\n<p>I followed the video tutorial by James Briggs on YouTube which explains very well how to get the model running. However, this tutorial lacks the setup of the environment under Ubuntu where everything is set up. This is where I had my minor challenges until everything was up and running. Therefore my tutorial starts with setting up the runtime environment. 
After that, everything worked without problems.<\/p>\n<p>Here is the link to the YouTube video by James Briggs: <a href=\"https:\/\/www.youtube.com\/watch?v=6iHVJyX2e50\" target=\"_blank\" rel=\"noopener\">Llama 2 in LangChain \u2014 FIRST Open Source Conversational Agent!<\/a><\/p>\n<p style=\"padding-left: 40px;\"><strong>Hardware Note:<\/strong> To set up the Llama 2 70B model on your computer as described in this guide, you will need approximately 200GB of free disk space and a graphics card with 48GB of video RAM \/ GPU RAM.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Set_up_runtime_environment\"><\/span>Set up runtime environment:<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>First I created a conda environment named Llama_2_70B with Python 3.10.9. In this environment I repeatedly got error messages concerning CUDA. 
So I installed CUDA in the active conda environment Llama_2_70B.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Command:<\/strong> <code>conda create -n Llama_2_70B python=3.10.9<\/code><\/p>\n<p style=\"padding-left: 40px;\"><strong>Command:<\/strong> <code>conda activate Llama_2_70B<\/code><\/p>\n<h3><span class=\"ez-toc-section\" id=\"CUDA_installation\"><\/span>CUDA installation:<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>When installing CUDA, I followed the instructions linked below and simply copy &amp; pasted the commands one by one.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Link: <\/strong><a href=\"https:\/\/developer.nvidia.com\/cuda-12-2-0-download-archive?target_os=Linux&amp;target_arch=x86_64&amp;Distribution=Ubuntu&amp;target_version=22.04&amp;target_type=deb_network\" target=\"_blank\" rel=\"noopener\">CUDA Installation-Guide<\/a><\/p>\n<p>After CUDA was installed I still had to install <code>click<\/code>. I also rebooted, because my computer prompted me to do so after the CUDA installation.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Command:<\/strong> <code>conda install -c conda-forge click<\/code><\/p>\n<p>Then, after my computer had booted again and I had activated the conda environment Llama_2_70B in a terminal window, I installed Jupyter. 
Jupyter is needed to run the Jupyter notebook that James Briggs has put online.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Command:<\/strong> <code>pip install jupyter<\/code><\/p>\n<p>Here is the link to the Jupyter notebook you need.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Jupyter Notebook: <\/strong><a class=\"Link--primary\" href=\"https:\/\/github.com\/pinecone-io\/examples\/blob\/master\/learn\/generation\/llm-field-guide\/llama-2\/llama-2-70b-chat-agent.ipynb\" target=\"_blank\" rel=\"noopener\" aria-describedby=\"item-type-27\">llama-2-70b-chat-agent.ipynb<\/a><\/p>\n<p>Now everything is set up and Jupyter can be started with the following command. Before you execute it, change to the folder where you placed the notebook. You will then see it immediately and can run it.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Command:<\/strong> <code>jupyter notebook<\/code><\/p>\n<p>Remember that when you run the Jupyter notebook you will need a Hugging Face token to download the Llama 2 model. This only works if you have accepted META's license agreement and registered with META using the same email address that you use at Hugging Face. 
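For orientation, loading the model in 4-bit roughly follows the pattern below. This is only a sketch assuming recent transformers, accelerate and bitsandbytes versions; the notebook linked above is the authoritative version, and <code>hf_token<\/code> is a placeholder for your own Hugging Face access token.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

hf_token = "hf_..."  # placeholder: your own Hugging Face access token
model_id = "meta-llama/Llama-2-70b-chat-hf"

# 4-bit (NF4) quantization config -- this is what shrinks the 70B model
# from ~280GB in float32 to roughly 40GB of GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place the weights on the GPU
    token=hf_token,
)
```

Running this triggers the model download described below, so expect it to take a while on the first run.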
Then you can download the Llama 2 70B model.<\/p>\n<p>For me, the download of the model in the notebook looked like the following picture.<\/p>\n<div id=\"attachment_980\" style=\"width: 1034px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face-1024x638.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-980\" class=\"size-large wp-image-980\" src=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face-1024x638.png\" alt=\"Downloading llama 2 hugging face\" width=\"1024\" height=\"638\" srcset=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face-1024x638.png 1024w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face-300x187.png 300w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face-768x478.png 768w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face-400x250.png 400w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face-1080x673.png 1080w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Downloading_llama_2_hugging_face.png 1108w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><p id=\"caption-attachment-980\" class=\"wp-caption-text\">Downloading llama 2 hugging face<\/p><\/div>\n<p>If you look for the location of the downloaded Llama 2 model, you will find it in the .cache folder in your home directory under the following path.<\/p>\n<p style=\"padding-left: 40px;\"><strong>Path:<\/strong> ~<code>\/.cache\/huggingface\/hub\/models--meta-llama--Llama-2-70b-chat-hf\/blobs<\/code><\/p>\n<div id=\"attachment_984\" style=\"width: 1034px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/hugging_face_cache_folder-1024x950.png\"><img loading=\"lazy\" decoding=\"async\" 
aria-describedby=\"caption-attachment-984\" class=\"size-large wp-image-984\" src=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/hugging_face_cache_folder-1024x950.png\" alt=\"Hugging Face HUB cache folder\" width=\"1024\" height=\"950\" srcset=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/hugging_face_cache_folder-1024x950.png 1024w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/hugging_face_cache_folder-300x278.png 300w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/hugging_face_cache_folder-768x713.png 768w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/hugging_face_cache_folder-1080x1002.png 1080w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/hugging_face_cache_folder.png 1291w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><p id=\"caption-attachment-984\" class=\"wp-caption-text\">Hugging Face HUB cache folder<\/p><\/div>\n<p style=\"padding-left: 40px;\"><strong>Note:<\/strong> If you restart the conda environment or the computer and run the notebook again, the Llama 2 model will not be downloaded again, because it is already in the .cache folder on your computer and is loaded from there back into the GPU's memory.<\/p>\n<p>From now on, it&#8217;s best to follow James Briggs&#8217; video, which you can access via the following link: <a href=\"https:\/\/www.youtube.com\/watch?v=6iHVJyX2e50\" target=\"_blank\" rel=\"noopener\">Llama 2 in LangChain \u2014 FIRST Open Source Conversational Agent!<\/a><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Video_%E2%80%93_Llama_2_70B_lokal\"><\/span>Video &#8211; Llama 2 70B local<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Here is a short video of me showing my computer and the slightly customized notebook. In the video you can see how the model performs on my computer. 
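The cache location mentioned in the note above is predictable: huggingface_hub flattens a repo id <code>org\/name<\/code> into a directory <code>models--org--name<\/code>. A small sketch of that mapping (the helper name is my own, not a library function):

```python
from pathlib import Path

def hf_cache_dir(model_id: str) -> Path:
    """Return the Hugging Face hub cache directory for a model repo id.
    huggingface_hub flattens 'org/name' into 'models--org--name'."""
    hub = Path.home() / ".cache" / "huggingface" / "hub"
    return hub / ("models--" + model_id.replace("/", "--"))

# The model from this article ends up under .../hub/models--meta-llama--Llama-2-70b-chat-hf
print(hf_cache_dir("meta-llama/Llama-2-70b-chat-hf"))
```

Knowing this mapping makes it easy to check how much disk space a downloaded model occupies, or to delete it again.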
I show the start time and the end time as well as the runtime the model needed to generate the answer.<\/p>\n<div id=\"attachment_974\" style=\"width: 1034px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.youtube.com\/watch?v=WKqLF-VH23U\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-974\" class=\"wp-image-974 size-large\" src=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime-1024x575.jpg\" alt=\"Llama 2 70B Model local runtime\" width=\"1024\" height=\"575\" srcset=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime-1024x575.jpg 1024w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime-300x168.jpg 300w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime-768x431.jpg 768w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime-1080x606.jpg 1080w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime-1280x720.jpg 1280w, https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime.jpg 1283w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><p id=\"caption-attachment-974\" class=\"wp-caption-text\">Llama 2 70B Model local runtime<\/p><\/div>\n<h2><span class=\"ez-toc-section\" id=\"Summary\"><\/span>Summary<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>I am thrilled with how well everything worked and how easy it all was once I had configured the runtime environment correctly. Setting up the runtime environment actually took most of my time, because of a somewhat lengthy error message that only disappeared once I had installed the matching CUDA version in the conda environment. After everything was set up I had fun with LangChain, Hugging Face and the Llama 2 70B model. It is really impressive how well everything works and especially how easy it is. 
It&#8217;s also nice that I don&#8217;t have to download the Llama 2 model again after a reboot. So it only takes a few minutes until everything is loaded into the GPU&#8217;s memory.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>With this article I would like to help you to run locally under Ubuntu 22.04 Llama 2 in the 70B version on a NVIDIA A6000. Of course this is not easy because the classic Llama 2 70B model needs a GPU memory of about 280GB in the Float32 bit version. But if you quantize the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":975,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[162,8,50],"tags":[107,88,89,109],"class_list":["post-991","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-large-language-models-en","category-news","category-top-story-en","tag-anleitung-en","tag-howto-en-2","tag-installation-en","tag-lokal-en","et-has-post-format-content","et_post_format-et-post-format-standard"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Chatbot Llama 2 70B - run locally in a jupyter notebook - Exploring the Future: Inside the AI Box<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Chatbot Llama 2 70B - run locally in a jupyter notebook - Exploring the Future: Inside the AI Box\" \/>\n<meta property=\"og:description\" content=\"With this article I would like to help you to run locally under 
Ubuntu 22.04 Llama 2 in the 70B version on a NVIDIA A6000. Of course this is not easy because the classic Llama 2 70B model needs a GPU memory of about 280GB in the Float32 bit version. But if you quantize the [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/\" \/>\n<meta property=\"og:site_name\" content=\"Exploring the Future: Inside the AI Box\" \/>\n<meta property=\"article:published_time\" content=\"2023-08-07T15:15:44+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-08-08T11:31:44+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1283\" \/>\n\t<meta property=\"og:image:height\" content=\"720\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Maker\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Ingmar_Stapel\" \/>\n<meta name=\"twitter:site\" content=\"@Ingmar_Stapel\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Maker\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/\"},\"author\":{\"name\":\"Maker\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#\\\/schema\\\/person\\\/cc91d08618b3feeef6926591b465eab1\"},\"headline\":\"Chatbot Llama 2 70B &#8211; run locally in a jupyter notebook\",\"datePublished\":\"2023-08-07T15:15:44+00:00\",\"dateModified\":\"2023-08-08T11:31:44+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/\"},\"wordCount\":812,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/Llama_2_70B_Model_local_runtime.jpg\",\"keywords\":[\"Anleitung\",\"HowTo\",\"Installation\",\"lokal\"],\"articleSection\":[\"Large Language Models\",\"News\",\"Top story\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/\",\"url\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/\",\"name\":\"Chatbot Llama 2 70B - run locally in a jupyter notebook - Exploring the Future: Inside the AI 
Box\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/Llama_2_70B_Model_local_runtime.jpg\",\"datePublished\":\"2023-08-07T15:15:44+00:00\",\"dateModified\":\"2023-08-08T11:31:44+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#\\\/schema\\\/person\\\/cc91d08618b3feeef6926591b465eab1\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#primaryimage\",\"url\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/Llama_2_70B_Model_local_runtime.jpg\",\"contentUrl\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/Llama_2_70B_Model_local_runtime.jpg\",\"width\":1283,\"height\":720,\"caption\":\"Llama 2 70B Model local runtime\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\\\/991\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Start\",\"item\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Chatbot Llama 2 70B &#8211; run locally in a jupyter 
notebook\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#website\",\"url\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/\",\"name\":\"Exploring the Future: Inside the AI Box\",\"description\":\"Inside the AI Box, we share our experiences and discoveries in the world of artificial intelligence.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#\\\/schema\\\/person\\\/cc91d08618b3feeef6926591b465eab1\",\"name\":\"Maker\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g\",\"caption\":\"Maker\"},\"description\":\"I live in Bavaria near Munich. In my head I always have many topics and try out especially in the field of Internet new media much in my spare time. I write on the blog because it makes me fun to report about the things that inspire me. I am happy about every comment, about suggestion and very about questions.\",\"sameAs\":[\"https:\\\/\\\/ai-box.eu\"],\"url\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/author\\\/ingmars\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Chatbot Llama 2 70B - run locally in a jupyter notebook - Exploring the Future: Inside the AI Box","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/","og_locale":"en_US","og_type":"article","og_title":"Chatbot Llama 2 70B - run locally in a jupyter notebook - Exploring the Future: Inside the AI Box","og_description":"With this article I would like to help you to run locally under Ubuntu 22.04 Llama 2 in the 70B version on a NVIDIA A6000. Of course this is not easy because the classic Llama 2 70B model needs a GPU memory of about 280GB in the Float32 bit version. But if you quantize the [&hellip;]","og_url":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/","og_site_name":"Exploring the Future: Inside the AI Box","article_published_time":"2023-08-07T15:15:44+00:00","article_modified_time":"2023-08-08T11:31:44+00:00","og_image":[{"width":1283,"height":720,"url":"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime.jpg","type":"image\/jpeg"}],"author":"Maker","twitter_card":"summary_large_image","twitter_creator":"@Ingmar_Stapel","twitter_site":"@Ingmar_Stapel","twitter_misc":{"Written by":"Maker","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#article","isPartOf":{"@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/"},"author":{"name":"Maker","@id":"https:\/\/ai-box.eu\/en\/#\/schema\/person\/cc91d08618b3feeef6926591b465eab1"},"headline":"Chatbot Llama 2 70B &#8211; run locally in a jupyter notebook","datePublished":"2023-08-07T15:15:44+00:00","dateModified":"2023-08-08T11:31:44+00:00","mainEntityOfPage":{"@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/"},"wordCount":812,"commentCount":0,"image":{"@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#primaryimage"},"thumbnailUrl":"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime.jpg","keywords":["Anleitung","HowTo","Installation","lokal"],"articleSection":["Large Language Models","News","Top story"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/","url":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/","name":"Chatbot Llama 2 70B - run locally in a jupyter notebook - Exploring the Future: Inside the AI 
Box","isPartOf":{"@id":"https:\/\/ai-box.eu\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#primaryimage"},"image":{"@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#primaryimage"},"thumbnailUrl":"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime.jpg","datePublished":"2023-08-07T15:15:44+00:00","dateModified":"2023-08-08T11:31:44+00:00","author":{"@id":"https:\/\/ai-box.eu\/en\/#\/schema\/person\/cc91d08618b3feeef6926591b465eab1"},"breadcrumb":{"@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#primaryimage","url":"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime.jpg","contentUrl":"https:\/\/ai-box.eu\/wp-content\/uploads\/2023\/08\/Llama_2_70B_Model_local_runtime.jpg","width":1283,"height":720,"caption":"Llama 2 70B Model local runtime"},{"@type":"BreadcrumbList","@id":"https:\/\/ai-box.eu\/en\/news\/chatbot-llama-2-70b-run-locally-in-a-jupyter-notebook\/991\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Start","item":"https:\/\/ai-box.eu\/en\/"},{"@type":"ListItem","position":2,"name":"Chatbot Llama 2 70B &#8211; run locally in a jupyter notebook"}]},{"@type":"WebSite","@id":"https:\/\/ai-box.eu\/en\/#website","url":"https:\/\/ai-box.eu\/en\/","name":"Exploring the Future: Inside the AI Box","description":"Inside the AI Box, we share our experiences and discoveries in the world of artificial 
intelligence.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ai-box.eu\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/ai-box.eu\/en\/#\/schema\/person\/cc91d08618b3feeef6926591b465eab1","name":"Maker","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g","caption":"Maker"},"description":"I live in Bavaria near Munich. In my head I always have many topics and try out especially in the field of Internet new media much in my spare time. I write on the blog because it makes me fun to report about the things that inspire me. 
I am happy about every comment, about suggestion and very about questions.","sameAs":["https:\/\/ai-box.eu"],"url":"https:\/\/ai-box.eu\/en\/author\/ingmars\/"}]}},"_links":{"self":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts\/991","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/comments?post=991"}],"version-history":[{"count":2,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts\/991\/revisions"}],"predecessor-version":[{"id":993,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts\/991\/revisions\/993"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/media\/975"}],"wp:attachment":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/media?parent=991"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/categories?post=991"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/tags?post=991"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}