The Internet never forgets
1. Can we permanently delete our data from the Internet?
The short answer is “no”. Computer system don’t know an action that deletes data. When data is stored on a storage device and want’s to be removed, the computer just removes the file pointer. The original data still is left at the very same, original place. The next time data is saved on the same storage device, the computer eventually overwrites the old data. Once this is repeated many times, it becomes more and more difficult to reconstruct the original data, but in many cases this still can be done. Think of it like a footprint in the sand. The footprint is not deleted, but once you step many times over the same footprint, the original one can hardly be seen, but is still there.
The Internet is a very large network of computers, servers, nodes, gateways, routers, clusters, etc., which means that every time you copy data to the Internet it travels a long way from your device to the remote location, where your data is written to a storage device. When traveling all the way through the network, your data is copied, cached and backed on each node or gateway. The very same data is copied so many times that it becomes impossible to control on how many places the same data has been written to a storage device. Even at the remote endpoint a computer cluster is copying the data again for reliability and redundancy reasons.
Once you try to delete and remove such data from the Internet, you might not see it at the original location any longer, but the data will still be there in all copied places for a long time, until it is overwritten many many times.
Private, confidential data does not belong into the Internet. The Internet was not invented for that purpose. The origin of the Internet, the former arpa-net, was to publish and distribute data and information on scientific research, academic knowledge, information and content from libraries and book collections.
The Internet never forgets.
2. Can we recover Deleted or Lost Files?
Yes. Depending the resources, the disposability and the mechanisms one can more or less easily recover lost data. Again: all the data that travels through the Internet is copied and cached all over many places. For example there is a robot called “the Internet archiver”, which is a search spider that follows all the links it can find in the Internet and crawls through the network. Once it finds data it copies and stores it, where one can recover it again. Further there are large data crawlers that are gathering large amount of data for data mining purpose, copy and keep the data in historical archives. Once somebody has access to such resources, one can easily recover “lost” data.
Computer systems don’t know the concept of “loosing” data. Think of Cloud computing like fog. In nebulous computing data can slowly fade away, but is not completely removed, just overwritten. It’s like your memory in the brain works: if you don’t repeat the information from time to time again, you start to forget. Try to delete a memory from your brain. The more you try to delete it from your brain, the more you will keep it, because you repeat and continue to remember.
Content delivery networks work that way: the more the same data is requested, the more it is copied to the very surface of the network supply.
The question is: are you clever enough to ask the right question to retrieve your lost data?