Red Hat has announced its intention to acquire Neural Magic, the lead developer behind the open source vLLM project.
The acquisition is being positioned as a way for Red Hat and its parent IBM to lower the barrier to entry for organisations that want to run machine learning workloads without the need to deploy servers equipped with graphics processing units (GPUs). Reliance on such hardware hinders the widespread adoption of artificial intelligence (AI) across industries and limits its potential to revolutionise how we live and work.
The GitHub entry for vLLM describes the software as: “A high-throughput and memory-efficient inference and serving engine for LLMs [large language models].”
In a blog post discussing the deal, Red Hat president and CEO Matt Hicks said Neural Magic had developed a way to run machine learning (ML) algorithms without the need for expensive and often difficult-to-source GPU server hardware.
He said the founders of Neural Magic wanted to empower anyone, regardless of their resources, to harness the power of AI. “Their groundbreaking approach involved leveraging techniques like pruning and quantisation to optimise machine learning models, starting by allowing ML models to run efficiently on readily available CPUs without sacrificing performance,” he wrote.
Hicks spoke about the shift towards smaller, more specialised AI models, which can deliver exceptional performance with greater efficiency. “These models are not only more efficient to train and deploy, but they also offer significant advantages in terms of customisation and adaptability,” he wrote.
Red Hat is pushing the idea of sparsification, which, according to Hicks, “strategically removes unnecessary connections within a model”. This approach, he said, reduces the size and computational requirements of the model without sacrificing accuracy or performance. Quantisation is then used to reduce model size further, enabling the AI model to run on platforms with less memory.
“All of this translates to lower costs, faster inference and the ability to run AI workloads on a wider range of hardware,” he added.
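To make the two techniques Hicks describes more concrete, the sketch below applies them to a toy list of weights. It is an illustration only, not Neural Magic's or vLLM's implementation: it assumes magnitude-based pruning (zeroing the smallest weights) as the sparsification rule and a single shared scale for 8-bit quantisation, and the function names and sample values are invented for the example.

```python
# Illustrative only: magnitude pruning followed by symmetric 8-bit quantisation
# on a toy weight list. Function names and values are hypothetical.

def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune weights closest to zero
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    drop = set(smallest)
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantise_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale

def dequantise(q_weights, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in q_weights]

weights = [0.82, -0.03, 0.40, 0.007, -0.65, 0.12, -0.009, 0.29]
pruned = prune_by_magnitude(weights, sparsity=0.5)   # half the weights become zero
q_weights, scale = quantise_int8(pruned)             # each survivor fits in one byte
restored = dequantise(q_weights, scale)

print("pruned:   ", pruned)
print("quantised:", q_weights)
print("max rounding error:", max(abs(a - b) for a, b in zip(pruned, restored)))
```

In this toy case, half the weights no longer need to be stored or multiplied, and the rest shrink from floating-point values to single-byte integers plus one scale factor. Whether accuracy holds up on a real model depends on how the pruning and quantisation are carried out, which this sketch does not attempt to show.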
Red Hat’s intention to acquire Neural Magic fits into parent company IBM’s strategy to help enterprise customers use AI models.
In a recent interview with Computer Weekly, Kareem Yusuf, product management lead for IBM’s software portfolio, said the supplier has identified a business opportunity to support customers that want to “easily mash their data into the large language model”. This, he said, allows them to take advantage of large language models in a way that enables protection and control of enterprise data.
IBM has developed a project called InstructLab that provides the tools to create and merge changes to LLMs without having to retrain the model from scratch. It is available in the open source community, along with IBM Granite, a foundation AI model for enterprise datasets.
Dario Gil, IBM’s senior vice-president and director of research, said: “As our clients look to scale AI across their hybrid environments, virtualised, cloud-native LLMs built on open foundations will become the industry standard. Red Hat’s leadership in open source, combined with the choice of efficient, open source models like IBM Granite and Neural Magic’s offerings for scaling AI across platforms, empower businesses with the control and flexibility they need to deploy AI across the enterprise.”