Microsoft Fara-7B learning experience

Over the weekend, I tried a learning project with Microsoft’s latest model, Fara-7B, which is a “Computer Use Agent”.

Fara-7B is a small, efficient, open-weight AI model with 7 billion parameters, designed to act as a Computer Use Agent (CUA). It performs tasks on a computer by visually interpreting the screen (through screenshots) and issuing mouse and keyboard actions (clicks, typing, scrolling) to automate web tasks such as booking travel, shopping, or filling in forms. Compared with larger models, it offers speed, privacy (it runs locally on the device), and lower cost.

Currently the model is still experimental, and what caught my attention is the promise that it can run on a PC. That is slightly exaggerated for a beginner, however: you need a GPU with at least 6 GB of memory. I have an NVIDIA RTX A3000, and after many attempts I managed to run it once on my laptop, but since then I have been stuck with memory errors.

“(EngineCore_DP0 pid=7153) ValueError: Free memory on device (4.97/6.0 GiB) on startup is less than desired GPU memory utilization (0.9, 5.4 GiB). Decrease GPU memory utilization or reduce GPU memory used by other processes.”
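The numbers in this error can be checked by hand: vLLM asked for 0.9 × 6.0 GiB = 5.4 GiB, but only 4.97 GiB was free at startup. Here is a minimal Python sketch of that arithmetic, which estimates the largest utilization fraction that would still pass the startup check. The helper name `max_gpu_memory_utilization` and the 0.25 GiB headroom are illustrative assumptions, not part of vLLM.

```python
import math


def max_gpu_memory_utilization(free_gib: float, total_gib: float,
                               headroom_gib: float = 0.25) -> float:
    """Return the largest utilization fraction, rounded down to two
    decimals, whose requested memory fits in the currently free memory.

    headroom_gib is an assumed safety margin (hypothetical default),
    left for other processes that may grab memory during startup.
    """
    if total_gib <= 0:
        raise ValueError("total_gib must be positive")
    usable_gib = max(free_gib - headroom_gib, 0.0)
    # Round down so the requested amount never exceeds what is free.
    return math.floor(usable_gib / total_gib * 100) / 100


# Values copied from the error message above.
requested_gib = 0.9 * 6.0  # what the default utilization of 0.9 asks for
print(f"requested: {requested_gib:.2f} GiB, free: 4.97 GiB")
print("suggested fraction:", max_gpu_memory_utilization(4.97, 6.0))
```

The resulting fraction can be passed to vLLM with its `--gpu-memory-utilization` option, for example by adding `--gpu-memory-utilization 0.78` to the `vllm serve` command. This only gets past the startup check. It does not guarantee that the model weights and KV cache fit in the memory that remains.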

I tried multiple ways to make this work, but fundamentally it should work with the same steps given by Microsoft and Hugging Face:

https://huggingface.co/microsoft/Fara-7B

https://www.microsoft.com/en-us/research/blog/fara-7b-an-efficient-agentic-model-for-computer-use/

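# get the code and set up an isolated Python environment
git clone https://github.com/microsoft/fara.git
cd fara
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
# download the browsers that Playwright drives
playwright install
# serve the model locally on port 5000
vllm serve "microsoft/Fara-7B" --port 5000 --dtype auto
# sample: run a task against the served model
fara-cli --task "{task prompt}"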
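Before running `fara-cli`, it can help to confirm that the vLLM server actually started and loaded the model. Here is a minimal sketch, assuming the server from the steps above is listening on `localhost:5000` and exposes vLLM's OpenAI-compatible `/v1/models` endpoint. The helper names `model_ids` and `list_served_models` are illustrative and not part of the fara repository.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(base_url: str = "http://localhost:5000",
                       timeout: float = 5.0) -> list[str]:
    """Ask the running server which models it is serving.

    Raises urllib.error.URLError if nothing is listening at base_url.
    """
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        return model_ids(json.load(resp))


# Usage, with the server running in another terminal:
#   print(list_served_models())
```

If the call raises a connection error, the server most likely never finished starting, which matches the memory error shown earlier.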
