
Installing Ollama: local AI on your computer


Ollama is a way to run large language models (LLMs) directly on your computer: not in the browser, not in the cloud, and not through someone else's servers. The model lives next to your code and files. This is the foundation for projects where control, confidentiality, and independence from external services matter.


Why we need local AI in development

When working with cloud AI, you always depend on the provider. The service can change its pricing, restrict its API, become overloaded, or simply go down.

Local AI changes the paradigm:

  • Control: The model runs on your hardware. You decide when and how to use it.
  • Confidentiality: Source code and data do not leave your computer.
  • Independence: Work continues even without an internet connection.

In vibe coding, this is not just a matter of privacy but of workflow architecture. You are building an environment where AI is an integral and predictable part of your toolkit.

What's Ollama

Ollama is not just a program; it is a local runtime and manager for AI models.

Think of it as Docker for machine learning:

  • Downloads models (like container images) from public repositories.
  • Manages them on your computer.
  • Provides a simple API for interaction.

The key point is that Ollama does not generate text itself; it is a runtime environment. The quality and style of the response depend entirely on the chosen model.
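
The Docker analogy extends to the command line. A few core commands cover the whole model lifecycle (llama3.2 here is just an example model name; these commands assume Ollama is already installed):

```shell
ollama pull llama3.2   # download a model, like `docker pull`
ollama list            # show models stored on disk, like `docker images`
ollama run llama3.2    # load a model and start chatting, like `docker run`
ollama rm llama3.2     # delete a model from disk, like `docker rmi`
```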

Why Ollama

For getting started, Ollama is the perfect choice:

  • Simplicity: Install in a few clicks, without deep ML knowledge.
  • Cross-platform: It runs on macOS, Windows and Linux.
  • Self-sufficiency: It does not require accounts, API keys or payments.
  • Integrability: It has a simple API, making it ideal for connecting to code editors and other tools.

Again, let's lock in the idea: you are choosing not the “most powerful” platform, but the most practical and understandable one to integrate into your workflow.

Step 1: Installing Ollama

The process is unified for all OSes.

  1. Go to the official website: https://ollama.com
  2. Press the Download button.
  3. Download the installer for your operating system:
     • .exe for Windows,
     • .dmg for macOS,
     • .deb or .rpm for Linux.
  4. Run the installer. Follow the standard installation steps, leaving the default options checked (unless you know what they are for).


Important: Once installed, Ollama runs as a background service (daemon). You don’t have to open it as a separate app every time: it sits in the system tray and waits for your commands.

On Windows and macOS, Ollama is also installed as a regular application that can be opened for a simple chat interface.

Step 2: Verifying the installation and your first command

Open a terminal (Command Prompt, PowerShell, Terminal, iTerm2, etc.) and run the command:

code
ollama --version

In response, you should see the version number, for example:

code
ollama version 0.***

This confirms that Ollama is installed correctly and is available from the command line.


Step 3: Your First Local Model

Ollama without a model is an empty container. Let's download and run our first model. A great starting point is llama3.2: it is modern, efficient, and runs well on most hardware.

In the same terminal, run:

code
ollama pull llama3.2

What's going on

  • Ollama contacts its repository (the Model Library).
  • Downloads the llama3.2 model and all its dependencies to your computer.
  • The process can take a few minutes depending on internet speed and model size (usually a few gigabytes).

After downloading, the model is ready for use.
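
Before moving on, you can confirm that the model actually landed on disk (this assumes the daemon installed in Step 1 is running):

```shell
# The downloaded model should appear in the list with its name and size
ollama list
```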

Step 4: Basic Use: Command Line Interface (CLI)

The easiest way to communicate with a model is to run it interactively.

code
ollama run llama3.2

After executing the command, you will see the >>> prompt. This means the model is loaded into memory and ready for a dialogue. Type any question or request and press Enter.

Example:

code
Write a greeting for a new project called “VibeCode”

To exit interactive mode, enter /bye or press Ctrl+D.
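
Interactive mode is not the only option. `ollama run` also accepts a one-off prompt as an argument and reads piped input, which is handy in scripts (the file name below is just an example, and both commands assume the model is already pulled):

```shell
# One-shot: print a single answer and exit
ollama run llama3.2 "Write a greeting for a new project called VibeCode"

# Pipe a file into the model as context for the prompt
cat notes.txt | ollama run llama3.2 "Summarize this file in three bullet points"
```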

Step 5: Integration with IDE and code editors

The true power of Ollama is revealed when it is integrated directly into your development environment. Here are the main ways:

1. Plugins and extensions

Many popular editors have plugins that connect to Ollama’s local server.

  • VS Code: Install an extension such as genai or Continue. In the extension settings, set the endpoint to http://localhost:11434
  • Cursor: Perfectly designed for AI. In Settings > AI Models, select "Ollama" and specify the model (e.g., llama3.2). Cursor will automatically send requests to your local Ollama.
  • JetBrains IDE (IntelliJ, PyCharm, etc.): Look for plugins like CodeGeeX or Continue that support local models via Ollama.
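
Whichever editor you use, it's worth confirming that the endpoint the plugins expect is actually up before configuring them. The /api/tags route lists the models the local server can serve:

```shell
# Prints JSON with your downloaded models if the daemon is running on the
# default port 11434; falls back to a message if nothing is listening there
curl -s http://localhost:11434/api/tags || echo "Ollama server is not reachable"
```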

2. Directly through the API

Ollama runs a local server on port 11434. You can send it HTTP requests from your own scripts or tools.

An example of a simple request via curl:

code
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain what a Python decorator is.",
  "stream": false
}'

This opens up opportunities to create your own automation tools.
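
As a sketch of such a tool (assuming the daemon is running and llama3.2 is pulled), here is a tiny shell helper that wraps the same endpoint. Note the naive quoting: it only works for simple prompts without double quotes or newlines.

```shell
# ask "question" -> prints only the model's text response
ask() {
  curl -s http://localhost:11434/api/generate \
    -d "{\"model\": \"llama3.2\", \"prompt\": \"$1\", \"stream\": false}" \
    | python3 -c 'import json, sys; print(json.load(sys.stdin)["response"])'
}
```

Usage: `ask "Explain what a closure is in one sentence"`. For anything beyond quick experiments, build the JSON with a proper tool instead of string interpolation.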

What is important to understand at this stage

You didn't just install a program. You added a new, managed layer to your working ecosystem.

Now you have:

  • Code editor.
  • Your project.
  • A locally running AI assistant that understands the context of your project and can help without requiring internet access.