For years it has been standard practice for organizations to store as much data as they can. Cheaper storage combined with the hype around big data encouraged data hoarding, on the assumption that value would be extracted at some point in the future.

With advances in data analysis, many companies are now successfully mining their data for useful business insights, but the sheer volume of data being produced, and the work required to prepare it for analysis, are prime reasons to reconsider your strategy. To balance cost and value, it's important to look beyond data hoarding and find ways of processing and reducing the data you're collecting.

Exponential data growth

The volume of data produced daily is growing fast. People generate enormous amounts of data, but machine-generated data is set to eclipse it. As the IoT grows from an estimated 23 billion connected devices this year to almost 31 billion by 2020, and a staggering 75 billion by 2025, according to IHS data at Statista, collecting and storing all that raw data is starting to look impractical.

We've kept pace with data generation so far by adopting better compression technologies and backing up incrementally, with a focus on what has changed, but as the volume increases we're going to fall woefully behind. We must find a way to reduce the amount of data we're collecting.

Identifying what you need

The most expensive way to store data is in its raw form, so we need to reduce it, extracting pertinent details such as averages or standard deviations. Streamlining the data we collect and processing it into a useful format seems an obvious answer; however, it's not as easy as it sounds.

In some cases it may be prudent to store raw data for future audits in the event of liability exposure.
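To make that kind of reduction concrete, here is a minimal sketch of a collector that keeps a running average and standard deviation using Welford's online algorithm, so the raw samples never need to be stored. The sensor readings are hypothetical placeholders.

```python
import math

class RunningStats:
    """Welford's online algorithm: tracks count, mean, and variance
    of a stream without storing the raw samples."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def std_dev(self) -> float:
        # population standard deviation; 0.0 until there are two samples
        return math.sqrt(self._m2 / self.count) if self.count > 1 else 0.0

# Hypothetical temperature readings from one sensor
stats = RunningStats()
for reading in [21.5, 21.7, 22.1, 21.9, 22.4]:
    stats.add(reading)

print(stats.count, round(stats.mean, 2), round(stats.std_dev, 2))  # → 5 21.92 0.31
```

Three numbers now stand in for an unbounded stream of readings, which is exactly the storage and bandwidth saving the reduction argument is after.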
Regulatory requirements must also be weighed when deciding what data to keep and what to let go.

Part of the difficulty with boiling data down is that analysis through machine learning and artificial intelligence is still maturing. That means we're betting on what will be valuable and what we can afford to discard. It's neither practical nor prudent to try to store all raw data, but there's a balance to be found, and much depends on your specific business.

Processing at the edge

Figuring out what data you want to keep, and how the remaining data should be processed, is just one piece of the puzzle. You also need to work out where the processing and data reduction will take place. There's a natural tendency to centralize data for analysis, but collecting the data and sending it to the cloud for processing takes time and costs money.

In many cases it will prove more cost-effective to reduce data at the edge, as close as possible to where it's generated. This cuts storage requirements and network traffic, because you only send forward what you need for analysis. The trick is accurately identifying what you need, but as machine learning advances we'll be able to progress beyond educated guessing.

Mapping the future

To mitigate the risk of discarding valuable data, draw up some projections and ask probing questions about the future of your business. Don't just look at what you use data for today; ask what you might use it for tomorrow. If there are new sources on the horizon, work out what they'll need to provide for effective business analytics.

There must be some kind of ROI calculation here: what is the cost of storing this data versus its potential future value?
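That comparison can start as a back-of-envelope calculation. The sketch below is one hypothetical framing: every figure (daily volume, price per GB-month, retention period, and estimated future value) is a placeholder to substitute with your own numbers, not a benchmark.

```python
# Back-of-envelope ROI check: is keeping a year of raw data worth the bill?
# All figures are hypothetical placeholders.

raw_gb_per_day = 500            # raw data generated daily
cost_per_gb_month = 0.023       # assumed object-storage price, $/GB-month
retention_months = 12

# Data accumulates: in month n you are paying to hold n months' output.
gb_per_month = raw_gb_per_day * 30
total_cost = sum(
    month * gb_per_month * cost_per_gb_month
    for month in range(1, retention_months + 1)
)

expected_future_value = 20_000  # estimated analytics value of the raw data, $

print(f"year-one storage cost: ${total_cost:,.2f}")
print("keep raw data" if expected_future_value > total_cost else "reduce at the edge")
```

Even with modest assumed prices, the accumulating term dominates quickly, which is why the raw-versus-reduced decision deserves an explicit calculation rather than a default to keeping everything.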
Work out your ideal topology and plan how you'll reduce data, forward it, store it, process it, and analyze it.

In the short term it may be necessary to err on the side of caution and make provisions to store more data points. The best strategy right now may be to process at the edge where you can, combined with more traditional centralization of data where there's less clarity about its value.

Being proactive

As the mountain of data grows ever larger, failing to act is asking for trouble. You need a smart cloud data management strategy to drive innovation, and it will rely on the data collection and processing foundation you build. The speed at which new data is accumulating, and its projected growth, mean that time is of the essence. Retrofitting a processing procedure or introducing a streamlined data topology will never be as easy as it is today.

Use your current business performance and future goals to identify the data you need, find ways to process that data at the edge where practical, and weigh the value of analysis against the cost of storage. The ideal data strategy will take time to figure out, and will differ from organization to organization, but what's certain is that data hoarding is no longer a viable approach.