As we move deeper into 2024, it is imperative for data management leaders to look in their rear-view mirrors to assess and, if needed, refine their data management strategies. One thing is clear: if data-centric organizations want to succeed in 2024, they will need to prepare for an environment in which data is increasingly distributed.
Five trends emerge in 2024:
Data Anti-Gravity Will Prevail
While migrations to the cloud, cloud data lakes, and/or cloud data warehouses will remain big factors behind the success of modern data and analytics, it will become increasingly hard for any organization to rely on a single cloud provider, cloud data warehouse, or data lake to meet all end-to-end data and analytics needs.
This is why data anti-gravity will be the new norm in 2024 and beyond. Other factors contributing to data anti-gravity will be the rising costs of data replication, data sovereignty, local data governance laws and regulations, and the requirement for accelerated speed-to-insight. As the data anti-gravity trend continues, data management leaders should invest in technologies that are built on the premise of distributed data management.
Data Products Will Rule
Data mesh, a distributed data management approach, will begin to play a more prominent role as data becomes increasingly distributed. In a data mesh context, business stakeholders will need to be able to define and create data products and govern the data based on their domain needs. IT will need to deploy the right infrastructure to enable business users to be more self-sufficient.
In this data-centric era, it is not enough to merely package data attractively; organizations need to enhance the entire end-user experience. Echoing the best practices of e-commerce giants, contemporary data platforms must offer features like personalized recommendations and popular product highlights, while also building confidence through user endorsements and data lineage visibility. Moreover, these platforms should facilitate real-time queries directly from the data catalog and maintain an interactive feedback loop for user inquiries, data requests, and modifications. Just as timely delivery is essential in e-commerce, quick and dependable access to data is becoming indispensable for organizations.
Generative AI (GenAI) Will Redefine Business-as-Usual
GenAI will have a huge impact on data management and result in tools and technologies that are more business-friendly. However, in an increasingly distributed data landscape, a GenAI-enabled data management infrastructure will be of little or no use unless it can assure access to high-quality, trusted data.
Organizations are encountering several additional challenges as they attempt to implement GenAI and large language models (LLMs), including issues with data quality, governance, ethical compliance, and cost management. Each obstacle has direct or indirect ties to an organization’s overarching data management strategy, affecting the organization’s ability to ensure the integrity of distributed data as it is fed into AI models, abide by complex regulatory guidelines, or facilitate the model’s integration into existing systems.
Unpredictable Cloud Costs Will Rattle Nerves
Organizations have been moving infrastructure to the cloud, but many are discovering an inconvenient truth: that cloud costs can be extremely volatile and difficult to predict.
Efforts to bring these costs under control can be enhanced by adopting financial operations (FinOps) principles, which blend financial accountability with the cloud’s flexible spending model. By regularly monitoring expenditures, forecasting costs, and implementing financial best practices in cloud management, organizations can balance cost savings and operational efficacy, ensuring that their cloud or hybrid data strategies are economically and functionally robust.
Data Security and Governance Will Continue to Exert Pressure
Data will always have to be secured and governed. However, as data remains distributed across on-premises and cloud systems, data security and governance should not be an additional impediment to data access, collaboration, and innovation. In 2024, we will see an increase in solutions that simplify security and governance.
Organizations are leveraging global policies for data security and governance. Global data security policies can be based not only on user roles, but also on location, so that a person on vacation might not be able to access the data from the main office. Similarly, global data governance policies can automatically standardize the spelling of certain words across the different systems within a company.
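As a rough illustration of the two kinds of global policy described above, the following is a minimal Python sketch, not a depiction of any particular product. Every name in it (the role and location sets, the spelling map, `AccessRequest`, `is_access_allowed`, `standardize_spelling`) is a hypothetical assumption chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical global security policy: the roles and locations are illustrative only.
ALLOWED_ROLES = {"analyst", "data_steward"}
ALLOWED_LOCATIONS = {"main_office"}

# Hypothetical global governance rule: map variant spellings to one standard form.
STANDARD_SPELLINGS = {"colour": "color", "organisation": "organization"}


@dataclass(frozen=True)
class AccessRequest:
    user_role: str
    location: str  # where the user is connecting from


def is_access_allowed(request: AccessRequest) -> bool:
    """Grant access only when both the role and the location satisfy the policy."""
    return (
        request.user_role in ALLOWED_ROLES
        and request.location in ALLOWED_LOCATIONS
    )


def standardize_spelling(text: str) -> str:
    """Replace each known variant word with its standard spelling."""
    return " ".join(
        STANDARD_SPELLINGS.get(word.lower(), word) for word in text.split()
    )
```

In this sketch, the same analyst would be granted access when connecting from `main_office` and refused from any other location, and a single spelling map would be applied to text from every system rather than being maintained separately in each one.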
The Future is Logical
To overcome the challenges presented by each of these five trends, organizations will need to be able to leverage data management strategies that are designed from the ground up to support distributed data. Traditional data management approaches rely on the physical replication of data from multiple systems into a central repository, like a data warehouse or data lake, but such approaches, by definition and also in practice, do not support inherently distributed data. In contrast, logical data management approaches enable real-time connections to disparate data, without replication, which is exactly what is required.
- Author
Ángel Viña
CEO and Founder at Denodo