A Silent Aisle, A Loud Truth

The data centre aisle was unnervingly quiet. High-density AI servers hummed with an intensity that pushed the limits of what traditional airflow could sustain. In the middle of this familiar landscape sat a single liquid-cooled node: compact, efficient and completely alone. It looked like a glimpse of the future […]

The AI landscape has been dominated by Large Language Models (LLMs): massive neural networks with hundreds of billions of parameters, trained on trillions of tokens. These models, such as GPT-4 or Claude, have shown remarkable general-purpose intelligence, but they come with steep costs: enormous compute requirements, GPU dependency, and operational overheads that make them inaccessible for […]

As enterprises rapidly adopt AI to improve efficiency, customer experience, and innovation, the choice of model architecture has become a critical factor. Whether deploying a massive Large Language Model (LLM), an efficient Very Large Language Model (VLLM), or a compute-friendly Small Language Model (SLM), organisations are increasingly strategic about how they balance performance, cost, and accuracy. […]

Not long ago, I wrote about why Retrieval-Augmented Generation (RAG) is such a pivotal architecture in modern AI workflows, particularly when compared to fine-tuning and training from scratch. The core argument was simple: RAG enables models to stay up to date, grounded, and efficient without massive retraining costs. It was (and still is) a pragmatic solution to […]
