Authored by: Moti Uttam, Managing Director, Malaysia, Hitachi Vantara
The global AI-powered storage market is expected to grow from USD 10.4 billion in 2019 to USD 34.5 billion by 2024, a CAGR of 27.1% (ResearchAndMarkets.com).
The big data phenomenon has transformed the economic value of data, and business leaders globally now see the potential to monetise their data. Much like a business has movable and immovable assets, data has become the intellectual capital of an organisation.
Most assets depreciate with use. Data, however, appreciates, or gains value, with use: the more an organisation uses its data across use cases, the more valuable, complete and accurate that data becomes. The same characteristics apply to analytics, where analytics is essentially data that has been refined, or “curated”, into customer, product or operational insights.
South East Asian countries still lag behind the wider APAC region in embracing and applying AI and machine learning to their benefit. This is despite having a large pool of computable data driven by growing digitalisation, affordable compute and storage technology, newer algorithms, and a community mindset that has long been ready for AI adoption.
Companies in the region have to move beyond basic AI uses such as face recognition and multiplayer experiences to larger functions: understanding user patterns, dynamically changing work processes and delivering value to customers by unlocking dormant data. A prevalent concern among users is how to store and secure mass data before it can be effectively transformed by algorithms. This is where AI storage shows its value, as it is engineered specifically to serve machine learning over the long run.
Unique to AI Workloads
AI workloads are different from generic workloads, so traditional data storage may interfere with the efficiency of algorithms. AI and machine learning workloads require huge amounts of storage to extract, transform and load (ETL) input data; perform exploratory data analysis to see which data is relevant; create test data; build and train models; and retrain those models to reflect changing data patterns, inferences and new discoveries while keeping them running. AI storage is built for rapid matching, extraction and analysis at very large capacities.
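To illustrate how these phases lean on storage, here is a minimal Python sketch of the lifecycle using scikit-learn. The file name and the “amount” and “label” columns are hypothetical stand-ins, not part of any real workload:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Extract, transform and load (ETL): every phase below re-reads large inputs,
# so storage throughput and capacity directly bound end-to-end training time.
raw = pd.read_csv("raw_events.csv")  # hypothetical input file
features = raw.dropna().assign(amount_log=lambda d: np.log1p(d["amount"].abs()))

# Exploratory data analysis: decide which columns are relevant.
print(features.describe())

# Create test data, then build and train the model.
X, y = features[["amount_log"]], features["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LogisticRegression().fit(X_train, y_train)

# Retraining to reflect changing data patterns means the historical data
# above must stay retrievable, not just the trained model artefact.
print("holdout accuracy:", model.score(X_test, y_test))
```

Even in this toy example, every step reads back data written by an earlier step, which is why storage performance shapes the whole pipeline.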
As most data appreciates in value, it must be retained for as long as there is perceived value. There are also local regulatory requirements, such as the Personal Data Protection Act (PDPA), which may require retaining data in a form that explains the decisions made by AI. Data protection and data dispersion also create additional copies of data, making AI storage the best option for businesses looking to embrace deep learning and big data analytics.
Scalable AI Storage
Scalability not only requires the ability to scale capacities into the tens of petabytes, but also to scale connectivity to thousands of servers. It also includes the ability to minimise the footprint and cost of storage through the intelligent use of deduplication, compression, virtualisation, tiering, archiving, indexing, cataloguing and shredding.
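As a simple illustration of one of these techniques, the Python sketch below shows block-level deduplication in miniature: each unique block is stored once, keyed by its content hash. The file name and block size are hypothetical, and real systems add far more machinery around this idea:

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size (4 KiB)

def dedup_blocks(path, store):
    """Store each unique block once, keyed by its SHA-256 digest, and
    return the ordered list of digests that reconstructs the file."""
    recipe = []
    with open(path, "rb") as f:
        while block := f.read(BLOCK_SIZE):
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # keep only the first copy seen
            recipe.append(digest)
    return recipe

store = {}
recipe = dedup_blocks("dataset.bin", store)  # hypothetical file
print(f"{len(recipe)} blocks written as {len(store)} unique blocks "
      f"({len(recipe) - len(store)} duplicates eliminated)")
```

The copies created by data protection and data dispersion are exactly the kind of redundancy this technique collapses, which is how deduplication shrinks footprint without discarding data.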
Machine learning also does not take place at a single point. The algorithms in use require the storage infrastructure to be accessible to a large number of compute servers over high-bandwidth, low-latency connections to the data. While the data ingestion and data engineering phases may be sequential, the bulk of the work in training the models and doing the analysis is very random.
AI and ML data is stored with metadata, which the algorithms rapidly sort through to determine which data is relevant to the problem being solved. Much of the data may sit cold until it is suddenly called to action. This requires multiple tiers of storage with different performance and cost profiles, with metadata on the highest-performance tier and less active data on lower-performance, lower-cost tiers.
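A toy sketch of such a placement policy follows; the tier names, the seven-day threshold and the function itself are illustrative assumptions, not a real product API:

```python
from datetime import datetime, timedelta

# Illustrative tiers, fastest and costliest first.
TIERS = ["nvme", "ssd", "object_archive"]

def choose_tier(is_metadata, last_access, now=None):
    """Place metadata on the fastest tier; demote data as it goes cold."""
    now = now or datetime.utcnow()
    if is_metadata:
        return "nvme"             # metadata is scanned constantly by ML jobs
    if now - last_access < timedelta(days=7):
        return "ssd"              # warm data stays on flash
    return "object_archive"      # cold data waits here until called to action

print(choose_tier(True, datetime.utcnow()))                        # nvme
print(choose_tier(False, datetime.utcnow() - timedelta(days=30)))  # object_archive
```

The point of the sketch is the asymmetry: metadata is always hot because every query touches it, while bulk data can ride on cheaper tiers until an algorithm recalls it.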
Managing Data Processes
Machine learning develops models that are evaluated and rebuilt as new data is added; the data refines the model and must be updated consistently to remain a valuable data asset. There are many phases with different processing requirements, from data ingestion and data engineering through data discovery and visualisation, model development and training, to model deployment and retention, each with its own performance, tiering and protocol requirements. Intelligent storage systems that themselves utilise AI and ML are needed to continuously learn and adapt to the infrastructure environment to better manage and serve data.
AI and ML can automatically adjust the infrastructure for peak performance and uptime as well as maximum return on investment (ROI). With AI, leaders see the opportunity to spend more time on strategic initiatives while the data centre monitors itself. In effect, this allows an organisation to begin implementing an autonomous data centre that intelligently predicts and prescribes changes to increase uptime and operational efficiency for AI and ML workloads while meeting service-level agreements (SLAs).
Designing AI and ML capabilities is a major focus for Hitachi engineering teams. Across every area of the company, development teams are researching how to apply AI to deliver smarter, broader insights. The Hitachi VSP 5000 series provides the core data storage foundation for digital business operations, with the speed and scale to power existing workloads as well as the new, data-intensive workloads emerging in multicloud and AI-driven environments. It is agile enough to store block and file data, and it supports workload diversity ranging from traditional mission-critical business applications to containers and mainframes. The Hitachi VSP 5000 series enables workload consolidation that maximises operational efficiency.