Ahmad Shakora, Group Vice President,
Emerging Markets, META, Cloudera
Arrival of open data lakehouse technologies
The technology landscape is witnessing significant advancements in open data lakehouse technologies, providing a robust foundation for AI and analytics. Let us delve into key technological developments and their advantages, focusing on the broader implications rather than specific products.
One of the primary hurdles in AI implementation lies in efficiently managing and accessing data scattered across disparate environments. Open data lakehouse technologies are evolving to address this concern. Adopting an open table format enables enterprises to apply mission-critical data to AI processes seamlessly.
This approach facilitates interoperability, allowing various compute engines like Spark, Flink, Impala, and NiFi to access and process datasets within the open data lakehouse concurrently.
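As a rough, non-authoritative sketch of what this looks like in practice, assume Apache Iceberg as the open table format and PySpark as one of the engines; the catalog, database, and table names below are hypothetical. Because the table metadata lives in a shared catalog, a Flink job or an Impala query can read the same table while Spark writes to it:

# Minimal sketch: Spark writing to an open-format (Iceberg) table that
# other engines such as Flink or Impala can read concurrently.
# Assumes a catalog named "lakehouse" is already configured for Iceberg.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("open-table-format-demo").getOrCreate()

# The table's schema and snapshots live in the shared catalog, so every
# compatible engine sees the same definition.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lakehouse.sales.orders (
        order_id BIGINT,
        amount   DECIMAL(10, 2),
        ts       TIMESTAMP
    ) USING iceberg
""")

# Each write commits an atomic snapshot: a concurrent reader on another
# engine sees either the previous snapshot or this one, never a partial write.
spark.sql("""
    INSERT INTO lakehouse.sales.orders
    VALUES (1001, 49.99, current_timestamp())
""")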
Technological strides are being made to enhance data lake management while upholding data integrity. Features such as time travel, schema evolution, and streamlined data discovery contribute significantly to the effectiveness of open data lakehouse technologies. Time travel capabilities empower data teams to traverse historical data states, aiding in trend analysis and historical performance evaluations.
Meanwhile, in-place schema evolution ensures adaptability to changing data structures, crucial for organisations striving to achieve regulatory compliance and adhere to data protection policies.
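Both features can be exercised with plain SQL against such a table. A brief sketch, reusing the hypothetical Iceberg table from the previous example, with an illustrative timestamp:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Time travel: query the table as it existed at an earlier point, which
# supports trend analysis and reproducing historical reports.
spark.sql("""
    SELECT * FROM lakehouse.sales.orders
    TIMESTAMP AS OF '2024-01-01 00:00:00'
""").show()

# In-place schema evolution: add a column without rewriting data files;
# earlier snapshots remain queryable under the schema they were written with.
spark.sql("""
    ALTER TABLE lakehouse.sales.orders
    ADD COLUMNS (region STRING)
""")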
As AI and advanced analytics continue to grow in scale, the demand for scalable data storage solutions has become paramount. Innovative technologies like improved replication, enhanced volume quotas, and support for cloud-native architectures contribute to greater scalability at a lower cost.
These advancements empower enterprises to accommodate the increasing volume of data and strengthen data security and readiness for enterprise-wide applications.
Recognising the need for seamless technology upgrades, open data lakehouse technologies are introducing Zero Downtime Upgrades (ZDU). This feature offers organisations a more convenient means of upgrading, supporting rolling upgrades for various components. ZDU aims to minimise workflow disruptions, reduce downtime, and ultimately enhance productivity.
By allowing one-stage and auto upgrades of large clusters, ZDU ensures that organisations can keep their systems up-to-date without experiencing lengthy and costly downtimes.
Open data lakehouse technologies incorporate powerful data security and governance layers, providing a fundamental framework for safeguarding sensitive company data. These layers ensure consistent policy application, support compliance with regulations such as the General Data Protection Regulation (GDPR), and reinforce overall data integrity.
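To make the idea of consistent policy application concrete, the sketch below expresses a tag-based masking rule as plain data and resolves it for a given role. It is purely illustrative and does not depict any specific product's policy engine:

# Illustrative only: a governance layer can attach a policy to a data
# classification tag so the same rule applies in every engine that
# reads the lakehouse.
pii_policy = {
    "tag": "PII",                                  # applied to columns holding personal data
    "allowed_roles": {"dpo", "compliance_analyst"},
    "default_action": "mask",                      # everyone else sees masked values
    "audit": True,                                 # access is logged, supporting GDPR accountability
}

def effective_action(role: str, policy: dict) -> str:
    """Resolve what a role may do with a column carrying the policy's tag."""
    return "read" if role in policy["allowed_roles"] else policy["default_action"]

print(effective_action("data_engineer", pii_policy))  # -> mask
print(effective_action("dpo", pii_policy))            # -> read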
The significance of a hybrid approach cannot be overstated when it comes to maximising business value from enterprise data. Open data lakehouse technologies are designed to offer portable, cloud-native analytics that can be deployed across diverse infrastructures.
training runs, and more complex deployments. The demand for bigger AI models has intensified a global chip shortage, caused by supply chain problems, energy demands of large-scale computing, and geopolitical issues.
Even cloud providers' claims of running on renewable energy often mask the reality that peak AI workloads force them to fall back on fossil fuel sources during high-demand periods.
This gap between intention and impact underscores a systemic issue: without tools to measure, verify, and mitigate AI's footprint, even well-meaning initiatives become performative.
AIOps platforms, originally designed to streamline IT operations, are evolving into indispensable tools for climate accountability. By integrating environmental metrics into their analytics, these systems offer transformative capabilities.
Modern AIOps platforms monitor emissions at the workload level, providing granular insights into which applications, processes, or services are the most carbon-intensive.
They integrate with energy meters, cloud providers, and hardware sensors to calculate emissions using industry-standard accounting frameworks such as the Greenhouse Gas Protocol.
This allows businesses to take immediate action, such as dynamically adjusting resource allocation, shifting workloads to renewable energy-powered data centres, or implementing low-power operation modes outside of peak hours.
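The arithmetic behind such metrics is straightforward: metered energy multiplied by the carbon intensity of the supplying grid, in the spirit of the Greenhouse Gas Protocol's activity-data-times-emission-factor approach. A minimal sketch with hypothetical workload figures, not any particular platform's API:

from dataclasses import dataclass

@dataclass
class WorkloadSample:
    name: str
    energy_kwh: float       # metered or estimated energy use over the period
    grid_intensity: float   # kg CO2e per kWh for the hosting region

def emissions_kg(sample: WorkloadSample) -> float:
    """Operational emissions: energy consumed times grid carbon intensity."""
    return sample.energy_kwh * sample.grid_intensity

# Hypothetical figures; a real platform would pull these from meters,
# cloud billing APIs, and regional grid-intensity feeds.
samples = [
    WorkloadSample("batch-training", energy_kwh=120.0, grid_intensity=0.45),
    WorkloadSample("inference-api", energy_kwh=30.0, grid_intensity=0.45),
    WorkloadSample("etl-nightly", energy_kwh=18.0, grid_intensity=0.12),
]

# Rank workloads by footprint so the most carbon-intensive candidates for
# rescheduling or relocation to a greener region surface first.
for s in sorted(samples, key=emissions_kg, reverse=True):
    print(f"{s.name}: {emissions_kg(s):.1f} kg CO2e")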