The AI landscape has been dominated by Large Language Models (LLMs)—massive neural networks trained on trillions of tokens, spanning hundreds of billions of parameters. These models, such as GPT-4 or Claude, have shown remarkable general-purpose intelligence, but they come with steep costs: enormous compute requirements, GPU dependency, and operational overheads that make them inaccessible for […]

As enterprises rapidly adopt AI to improve efficiency, customer experience, and innovation, the choice of model architecture has become a critical factor. Whether it’s deploying a massive Large Language Model (LLM), serving one efficiently through a framework such as vLLM, or opting for a compute-friendly Small Language Model (SLM), organisations are increasingly strategic about balancing performance, cost, and accuracy. […]

Not long ago, I wrote about why Retrieval-Augmented Generation (RAG) is such a pivotal architecture in modern AI workflows, particularly when compared to fine-tuning and training from scratch. The core argument was simple: RAG enables models to stay up-to-date, grounded, and efficient without massive retraining costs. It was (and still is) a pragmatic solution to […]
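
To make that retrieve-then-generate flow concrete, here is a minimal, library-agnostic sketch: a toy keyword-overlap retriever grounds the prompt with the most relevant passages before any model is called. The in-memory DOCUMENTS list and the scoring function are illustrative placeholders, and the final call to a generation backend is left as a stub rather than tied to any particular API.

```python
# Minimal retrieve-then-generate sketch (illustrative only).
# DOCUMENTS, score() and the generation step are placeholders,
# not a specific product's API.
from collections import Counter

DOCUMENTS = [
    "vLLM uses PagedAttention to manage the KV cache in fixed-size blocks.",
    "RAG retrieves supporting passages and injects them into the prompt at inference time.",
    "Small Language Models trade general capability for lower serving cost.",
]

def score(query: str, doc: str) -> int:
    """Toy relevance score: number of shared lowercase tokens."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    return sorted(DOCUMENTS, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Ground the model by prepending retrieved context to the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    # In practice, build_prompt() output would be passed to whatever
    # generation backend is in use; here we just print it.
    print(build_prompt("How does RAG keep a model grounded?"))
```

In a production pipeline the keyword score would normally be replaced by dense embeddings and a vector index, but the grounding pattern stays the same.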

The rise of large language models (LLMs) has driven significant demand for efficient inference and fine-tuning frameworks. One such framework, vLLM, is optimised for high-performance serving with PagedAttention, allowing for memory-efficient execution across diverse hardware architectures. With the introduction of new AI accelerators such as Gaudi3, H200, and MI300X, optimising fine-tuning parameters is essential to […]
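
As a rough illustration of the serving path described above, the sketch below uses vLLM's offline Python API. The model name and the tensor_parallel_size and gpu_memory_utilization values are placeholders to be tuned for the target accelerator, and the exact arguments can differ between vLLM releases.

```python
# Minimal vLLM offline-generation sketch; model name and resource
# settings are illustrative placeholders, not tuned recommendations.
from vllm import LLM, SamplingParams

prompts = [
    "Summarise why PagedAttention reduces KV-cache fragmentation.",
    "List trade-offs between LLMs and SLMs for enterprise workloads.",
]

# tensor_parallel_size and gpu_memory_utilization are the usual knobs
# adjusted when moving a workload between accelerators.
llm = LLM(
    model="facebook/opt-125m",   # small model chosen for a quick local test
    tensor_parallel_size=1,
    gpu_memory_utilization=0.90,
)

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

for output in llm.generate(prompts, sampling):
    print(output.prompt)
    print(output.outputs[0].text.strip())
    print("-" * 40)
```

Tensor-parallel degree and KV-cache memory headroom are typically the first parameters revisited when the same workload is moved across hardware generations.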