Ever had a data problem? Ever thought it was an insurmountable big wave? On 29th April, just after Big Data Week but still under its banner, panelists from iVEC, SGI, NextGen, and Landgate discussed, debated, reflected and advised on the challenges and opportunities of big data. No data set was excluded – none was too large, too small, too complex or too simple. The lively discussion tackled the “data deluge”, asking whether “the petabyte is the new norm” and, indeed, what big data is anyway.
Big data seems to mean many things to many people. Is big data only data greater than 1 petabyte in size? Should it be defined by the number of days it takes to ingest into a data store? Or is data only big if it must be physically moved on an external device? Of course, data movement is an issue in itself, and often reflects cost rather than performance. Ultimately, big data seems subjective – big only in the eye of the beholder.
The panel also discussed the use and re-use of data analytics capabilities and tools that are either open source or were produced for other purposes. For some, especially those working in research, this could be an economically viable route to either ‘dip into’ big data or ‘attempt to harness the monster.’
Legacy data was also discussed, especially public data: how long should it be held, when does it no longer serve a useful purpose, and when can it be deleted? What is clear is that, for most, legacy data (including its jurisdictional issues, restrictions and requirements) is not well understood, and we are all still learning. More fuel for Big Data Week next year perhaps?