Intelligent CIO Middle East Issue 106 | Page 78

MODERN AI AND ML CLUSTERS CONSIST OF HUNDREDS, SOMETIMES THOUSANDS, OF GPUS.
DISRUPTIVE TECH


In addition, other design concepts increase the reliability and efficiency of the entire fabric. These include properly sized fabric interconnects with the optimum number of links, and the ability to detect and correct flow imbalances to avoid congestion and packet loss. Explicit Congestion Notification (ECN), combined with Data Centre Quantised Congestion Notification (DCQCN) and priority-based flow control (PFC), ensures loss-free transfer.
is built with a constant network speed of 400Gbps to 800Gbps from the NIC to the leaf and spine layers. Depending on the model size and GPU scale, a two-layer, three-level non-blocking fabric or a three-layer, five-level non-blocking fabric can be used.
Dynamic and adaptive load balancing is used at the switch to reduce overloads. Dynamic load balancing redistributes flows locally at the switch so that they are spread evenly across the available links. Adaptive load balancing monitors flow forwarding and next-hop tables to identify bottlenecks and redirect traffic away from overloaded paths.
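The local redistribution described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular switch implements it: the function names, the link names and the use of queued bytes as the load measure are all assumptions made for the example.

```python
def pick_least_loaded_link(link_load_bytes):
    """Return the uplink with the fewest queued bytes (ties broken by name)."""
    # Iterating over sorted names makes tie-breaking deterministic.
    return min(sorted(link_load_bytes), key=lambda name: link_load_bytes[name])


def assign_flows(flows, links):
    """Place each (flow_id, size_bytes) on the least-loaded link at that moment.

    Hypothetical model: a real switch works from hardware counters,
    not from a Python dictionary.
    """
    load = {name: 0 for name in links}
    placement = {}
    for flow_id, size_bytes in flows:
        link = pick_least_loaded_link(load)
        placement[flow_id] = link
        load[link] += size_bytes
    return placement, load


if __name__ == "__main__":
    flows = [("a", 100), ("b", 50), ("c", 50)]
    placement, load = assign_flows(flows, ["up1", "up2"])
    print(placement)
    print(load)
```

With the three sample flows, the large flow lands on one uplink and the two smaller flows share the other, so both uplinks end up carrying the same number of bytes.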
Usage of Retrieval Augmented Generation
Thierry Nicault, AVP and General Manager, Salesforce Middle East
When a customer needs help with a recent purchase, they typically start the conversation with the company’s chatbot. For the experience to be both relevant and positive, the entire exchange needs to be grounded in that customer’s data, such as their recent product purchase, warranty information, and any past conversations they have had. The chatbot should also be tapping into company data, such as the latest learnings from other customers who have bought similar products and internal knowledge base articles.
Some of this information might reside in transactional databases as structured information, while the rest might be in unstructured files, such as warranty contracts or knowledge base articles. Both types of data need to be accessed, and the right data needs to be utilised. If not, the exchange with the chatbot will be at best frustrating and at worst inaccurate.
An effective way of making LLMs more accurate is with an AI framework called Retrieval Augmented Generation. This enables companies to use their structured and unstructured proprietary data to make generative AI more contextual, timely, trusted, and relevant.
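To make the idea concrete, the retrieval step can be sketched as follows. This is a deliberately simplified illustration: the function names are hypothetical, ranking by shared words stands in for the vector search that production systems typically use, and no language model is actually called.

```python
def score(query, document):
    """Count the distinct words that the query and the document share."""
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    return len(query_words & document_words)


def retrieve(query, documents, k=2):
    """Return the k documents that overlap most with the query."""
    # sorted() is stable, so equally scored documents keep their original order.
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Ground the question in retrieved company data before it goes to an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    knowledge = [
        "warranty covers the blender for two years",
        "store opening hours are nine to five",
        "blender purchase made in march",
    ]
    print(build_prompt("blender warranty", knowledge))
```

The prompt that results contains the warranty and purchase records but not the opening hours, which is the grounding effect the framework is meant to provide.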
Combining all your customer data, structured and unstructured, into a unified 360-degree view will ensure customers have the most relevant information for any enterprise scenario.
Many companies are exploring the use of Retrieval Augmented Generation technology to improve internal processes and provide accurate and up-to-date information to advisors and other employees. Offering contextual assistance, ensuring personalised support, and continuously learning will improve efficiency and decision-making across their organisation.
If an overload cannot be avoided, Explicit Congestion Notification (ECN) alerts the applications at an early stage. Leaf and spine switches mark ECN-enabled packets to inform the senders of the congestion so that they can slow down transmission and avoid packet loss. If the endpoints do not respond in time, priority-based flow control (PFC) enables Ethernet receivers to report buffer availability back to the senders.
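The feedback loop described above can be illustrated with a small Python model. It is a sketch under stated assumptions rather than the DCQCN algorithm itself: the queue threshold, the halving of the rate on congestion and the fixed recovery step are all hypothetical values chosen for readability.

```python
def mark_ecn(queue_depth_packets, threshold_packets):
    """Switch side: mark a packet once the egress queue reaches the threshold."""
    return queue_depth_packets >= threshold_packets


def next_send_rate(current_gbps, congestion_seen,
                   min_gbps=1.0, step_gbps=10.0, max_gbps=400.0):
    """Sender side: back off when a congestion mark is echoed, else recover.

    Assumed policy: halve the rate on congestion, otherwise add a fixed step.
    Real congestion-control schemes use more elaborate rate updates.
    """
    if congestion_seen:
        return max(min_gbps, current_gbps / 2)
    return min(max_gbps, current_gbps + step_gbps)


if __name__ == "__main__":
    rate = 400.0
    for depth in [10, 80, 90, 20, 5]:  # sampled queue depths, in packets
        rate = next_send_rate(rate, mark_ecn(depth, threshold_packets=64))
        print(f"queue={depth:3d} packets -> send rate {rate:.0f} Gbps")
```

The sender slows down as soon as marks arrive and ramps back up once the queue drains, which is why early notification can prevent the buffer overflow that would otherwise cause packet loss.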
Leaf and spine switches can also pause or reduce traffic on specific connections during times of overload to reduce congestion and eliminate packet loss, enabling lossless transfers for certain types of traffic.
Automation is the final element of an effective AI solution for data centres. It is used in the design, deployment and management of the AI data centre. It can automate the AI data centre lifecycle from Day 0 to Day 2+.
The result is repeatable and continuously validated AI data centre designs and deployments that not only eliminate human error, but also use telemetry and flow data to optimise performance, facilitate proactive troubleshooting and prevent downtime.
AI is becoming more mainstream, but businesses and society are still at the beginning of what will ultimately be possible. Either way, data centre networks will continue to play an important role in the coming decades as the frontiers of AI continue to be explored.
AI infrastructure solutions that deliver high performance to optimise GPU efficiency are essential. Ethernet fabrics with innovative networking technologies that accelerate data transfer and enable loss-free transfers are key enablers – and can help drive the AI revolution.