When you look at the fundamentals of technology, the most vital piece is data, and timely access to it. We continue to generate huge amounts of data at an ever-increasing rate, and our businesses and daily lives rely on always being able to access it, irrespective of its age. However, storing data doesn’t come for free, and to keep costs under control we have started to categorise data hierarchically based on its business value, which is making cold storage cool again.
Data categories beyond cold data
Data has always been categorised by its immediate importance to the user, which allows some cost management. Across the IT world, there are numerous naming schemes, such as Tier 1/2/3, production/archive, or cold/warm/hot. Whatever the name, the terms are relative to the user: one person’s active data is another person’s low-access data. Putting such semantics to one side, we are collecting, processing, and storing ever-increasing amounts of data, and the cost of holding it keeps rising with the growing volumes and the price of the underlying components.
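A tiering policy of this kind can be sketched as a simple rule that maps a dataset’s last access time to a tier. This is a minimal illustration with hypothetical thresholds; real policies weigh business value, regulatory requirements, and access patterns, not just age.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds -- every organisation draws these lines
# differently, and age is only one of several possible criteria.
HOT_WINDOW = timedelta(days=30)
WARM_WINDOW = timedelta(days=365)

def classify_tier(last_accessed: datetime, now: datetime) -> str:
    """Map a dataset's last access time to a storage tier."""
    age = now - last_accessed
    if age <= HOT_WINDOW:
        return "hot"    # Tier 1: fast, expensive storage
    if age <= WARM_WINDOW:
        return "warm"   # Tier 2: slower, cheaper storage
    return "cold"       # Tier 3: archive / cold storage

now = datetime(2023, 6, 1)
print(classify_tier(datetime(2023, 5, 20), now))  # hot
print(classify_tier(datetime(2022, 1, 1), now))   # cold
```

In practice such a rule would run periodically and trigger migration of data between tiers, which is exactly what lifecycle policies in object stores automate.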
Storing ever-increasing amounts of data
A simple phrase that describes one of the most important jobs in the industry, because we, the users, expect data to be available whenever we want it. Balancing convenience of access against storage cost is not such a simple task.
Historically, cold data was often viewed as inactive data that would not require quick access and was probably being kept for regulatory reasons, e.g. historical accounts. Such data was likely to be held on tape and could well be stored or copied offsite. The insatiable desire for competitive advantage has changed that: we may describe data as cold, but access can often be demanded at very short notice to aid business decisions.
Accessing your data can be slow due to older storage technology or the selection, sorting, and reading steps required – just think tape – to make it available. As companies and other organisations intend to use their data with big-data concepts such as artificial intelligence (AI), deep learning (DL), and machine learning (ML), keeping cold data truly online has become a necessity. New and efficient systems and architectures have emerged to address this need. Cloud services such as IONOS Cloud S3 Object Storage, Amazon Glacier, or Google Coldline, to name prominent ones, offer cold storage in the cloud. On-prem storage combined with intelligent software can make cold data easier to access and read, providing another great solution. I think the latter is a smart route for companies whose storage requirements range from a few terabytes to petabytes. That opinion may not surprise you, given my work is creating hardware that empowers performant software to the fullest.
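The trade-off these cold-storage tiers embody is cheap storage paid for with costlier, slower retrieval. A rough sketch with hypothetical per-GB prices (illustrative only; real cloud pricing varies by provider, region, and retrieval speed) shows why an archive that is rarely read belongs in the coldest tier:

```python
# Hypothetical per-GB prices: monthly storage and per-retrieval.
# These numbers are made up for illustration, not quoted from any provider.
TIERS = {
    "hot":  {"storage": 0.0230, "retrieval": 0.00},
    "warm": {"storage": 0.0125, "retrieval": 0.01},
    "cold": {"storage": 0.0040, "retrieval": 0.03},
}

def monthly_cost(tier: str, stored_gb: float, retrieved_gb: float) -> float:
    """Total monthly cost for a tier: storage plus retrieval."""
    price = TIERS[tier]
    return stored_gb * price["storage"] + retrieved_gb * price["retrieval"]

# 10 TB of archive data, read about once a year (~85 GB/month on average):
for tier in TIERS:
    print(tier, round(monthly_cost(tier, 10_000, 85), 2))
```

With these assumed prices the cold tier wins by a wide margin, but the calculation flips as the retrieved volume grows, which is why knowing your access patterns matters before choosing a tier.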
Truly offline storage still has its use case, especially if the data should be kept away from the Internet and is mainly needed as a transfer medium that is accessed only when required. Think of Bitcoin as an example: any Bitcoin data should be kept in an offline wallet, e.g. on a USB stick, unless it is being used for trading.
Of course, there are several hybrid solutions that make sense depending on your data management strategy, which should also include data versioning. The most important and highly strategic decision to make before planning and executing, I believe, is to get clear on how you use and access your data.