About a year ago, IDC put out a document quantifying the amount of data generated by all sources in 2006 and came up with the astounding number of 161 exabytes. Now, the company has refastened its future-gazing goggles and come up with an even higher number for 2007: 281 exabytes. That’s 45GB for every man, woman, and child on the planet. Given that many people have yet to see even their first computer, some of the rest of us are clearly doing the digital equivalent of driving SUVs.
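The per-person figure is easy to sanity-check. The sketch below divides the 281-exabyte total by an assumed world population of about 6.6 billion (my round number for 2007, not one taken from the IDC document) and reports the result under both decimal and binary definitions of a gigabyte, since it isn't stated here which one IDC used. The function name `per_person_gb` is made up for illustration.

```python
# Back-of-the-envelope check of the per-person figure.
# ASSUMPTION: world population of roughly 6.6 billion in 2007
# (a round number chosen for illustration, not taken from the IDC study).
TOTAL_EXABYTES = 281
WORLD_POPULATION = 6.6e9


def per_person_gb(total_exabytes, population, binary=False):
    """Return the per-capita share in gigabytes.

    binary=False uses powers of ten (1 EB = 10**18 bytes, 1 GB = 10**9 bytes).
    binary=True uses powers of two (2**60 and 2**30 bytes respectively).
    """
    exa = 2**60 if binary else 10**18
    giga = 2**30 if binary else 10**9
    return total_exabytes * exa / population / giga


decimal_gb = per_person_gb(TOTAL_EXABYTES, WORLD_POPULATION)
binary_gb = per_person_gb(TOTAL_EXABYTES, WORLD_POPULATION, binary=True)

print(f"decimal units: {decimal_gb:.1f} GB per person")
print(f"binary units:  {binary_gb:.1f} GB per person")
```

Under these assumptions the answer lands somewhere between roughly 43 and 46 gigabytes per person, which is the same neighborhood as the 45GB quoted above. The exact value depends on the population estimate and the unit convention.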
One of the reasons that IDC raised its estimates of each person’s digital footprint is that it hadn’t initially accounted for much of what it calls a person’s “digital shadow,” the information about us that we don’t create ourselves. Examples are medical imagery, surveillance photos, and search history. So, in addition to the stuff we generate voluntarily, lots of information just gets created about us.
We all know about those inconvenient shadows that follow us around from sites like Facebook and MySpace. Those funky photos of you playing beer pong in college are awfully hard to get rid of, particularly when others, who perchance see in the pix an archetype of the carefree years, copy the images and put them on their sites all over the world. But think of all the purely involuntary data on us lying around on hard drives and passing through data channels everywhere. Take, for example, the NSA’s famous monitoring of big telecommunications trunks as part of the Bush Administration’s national security policy. You are probably in there somewhere, your inconsequential bits mixed together with everyone else’s, the NSA’s stealthy machinery sniffing over them for interesting morsels to pass along to the pudgy linguists at Fort Meade for further examination.
The bad part of the digital shadow is that some history you’d like to see disappear is nearly un-erasable. A recent story circulating in the news told of a woman whose former boyfriend posted a fake profile of her on a social network, with her real name, photos, and contact information as well as a fiction about the type of people she’d like to meet and what she’d like to do with them. She had to spend many months and engage the services of a firm specializing in Web reputation redemption to get rid of the hundreds of instances of these fake profile elements scattered around the Web.
Then there’s the idea that bad guys can piece together profiles of people from bits of information they mine here and there. The crooks, using nothing other than publicly available information, may be able to steal your identity, borrow money in your name, and otherwise create a financial disaster that will take you years to clean up. Some of this information you put there yourself on your home page at MySpace, but perhaps some came from items posted by companies from which you bought things online.
I’m reminded of my time living in Austria years ago. Bear with me for a paragraph, and I’ll tie it back in. One morning, eating breakfast with my Austrian girlfriend in our kitchen, I looked out the window and saw a man across the courtyard eating breakfast in his kitchen. He wasn’t a pretty sight: disheveled, not young, a bit on the heavy side, and wearing one of those t-shirts with spaghetti straps common among southern European men of a certain class. The bottom half of the shirt rounded over his paunch, creating a shelf that had managed over time to catch various morsels that never made it into his mouth. I asked, “Why do Austrians not have curtains on their windows?” I had noticed this phenomenon before. Of course, the bedrooms and bathrooms were in the back and made use of clerestory windows and mottled glass, but, notably, on most streets you could look right into people’s homes. She said, “In Austria, which is a Catholic country, people believe you should have nothing to hide. If you had curtains, people would wonder what you were doing behind there.” I’ve often thought about this idea, that you should have nothing to hide. A good adage for the Internet would be: don’t have anything to hide. Your reputation should be like that of a middle-class Austrian during the Cold War years: people can look in your window and see you there in your t-shirt, and while you might not be a movie star, at least you have nothing you would mind other people seeing.
One way people have found to aid their online reputations is to dilute bad stuff with good stuff. If you can get 1,000 positive hits to cover your beer pong episode, most people won’t find it. It’ll be on page 23 of their searches.
Not all of this digital shadow is bad, however; some of the data is beneficial. There’s a lot to be said for universal medical records in the sky. One thing American consumers suffer from is a lack of choice in medical care. This total absence of any semblance of market dynamics in the capitalist paradise might appear baffling at first to anyone who hasn’t looked into who benefits from the situation: all participants except the consumer, who has no vote. If you could assemble your entire medical history and all your resources in the cloud, you could move from employer to employer and health care provider to health care provider with little friction, a circumstance that would clearly benefit the consumer, but not necessarily the provider, which rather enjoys having captive customers.
Nonetheless, some companies, like EMC, the sponsor of the IDC study, have taken a more enlightened approach. They see the possibility of reducing overall health care costs (and therefore their own bills) if they free up their users and provide good preventative resources as well as care resources. This strategy does well by doing good. EMC’s Healthlink offers employees an interactive health portal, which allows them to find doctors, get second opinions, read up on various conditions, receive remote patient monitoring, obtain e-prescribing, and acquire other related health services in a way that helps workers take better care of themselves. So, this piece of the digital shadow could be thought of as pretty much benign.
There still remain issues of what’s actually in these records and who gets to see them. If an insurance company were able to access a record that said your DNA indicated a predisposition to cancer or heart disease, you might face discrimination in insurance costs or availability. These issues are still being hashed out, but in general, consumers are more, rather than less, empowered by having control over a coherent body of information about their health condition, whether they or someone else put the information there.
I’m a firm proponent of cloud computing, which involves having a lot of my precious data up on servers that I don’t own or control. I’m always moving between computers, and I like to have all my stuff available on all of them. There are various strategies for synchronization, but cloud computing is still relatively new, and kinks have yet to be worked out. One new twist is FolderShare, now in beta trial from Microsoft, which synchronizes files between clients (desktops or notebooks) without holding the data itself on the server. This trick is pretty neat. When connected, the program analyzes the two clients, reconciles the differences, and then clears the temporary files off the server, leaving no trace.
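To make the "reconciles the differences" step concrete, here is a minimal sketch of how two clients might compare file listings and decide what to copy in each direction. It is not FolderShare's actual algorithm, which isn't described here. The `FileInfo` record, the `reconcile` function, and the newest-copy-wins rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical per-file record each client reports."""
    mtime: float   # last-modified time, seconds since the epoch
    digest: str    # content hash, used to detect real changes


def reconcile(
    client_a: Dict[str, FileInfo], client_b: Dict[str, FileInfo]
) -> Tuple[List[str], List[str]]:
    """Return (files to copy A -> B, files to copy B -> A).

    ASSUMED policy: a file missing on one side is copied over; when the
    contents differ, the more recently modified copy wins (ties go to A).
    """
    a_to_b: List[str] = []
    b_to_a: List[str] = []
    for name in sorted(client_a.keys() | client_b.keys()):
        info_a = client_a.get(name)
        info_b = client_b.get(name)
        if info_a is None:
            b_to_a.append(name)          # only B has it
        elif info_b is None:
            a_to_b.append(name)          # only A has it
        elif info_a.digest != info_b.digest:
            if info_a.mtime >= info_b.mtime:
                a_to_b.append(name)      # A's copy is newer (or tied)
            else:
                b_to_a.append(name)      # B's copy is newer
        # identical digests: nothing to do
    return a_to_b, b_to_a


laptop = {
    "notes.txt": FileInfo(mtime=100.0, digest="aaa"),
    "draft.doc": FileInfo(mtime=200.0, digest="bbb"),
}
desktop = {
    "notes.txt": FileInfo(mtime=150.0, digest="ccc"),
    "photo.jpg": FileInfo(mtime=50.0, digest="ddd"),
}

to_desktop, to_laptop = reconcile(laptop, desktop)
print("copy to desktop:", to_desktop)
print("copy to laptop: ", to_laptop)
```

Only the listings and the differing files need to move between the machines. A real tool would also have to handle deletions, renames, and edits made on both sides at once, which this sketch ignores.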
Another way to deal with personal storage up on the server is to encrypt it. With your key, you can access your data, which looks like gibberish to anyone else who might come across it. Drawbacks are that encryption slows access down and you might lose the key. Faster pipes, drives, and processors will help with the slowness, but there’s no really elegant way to deal with the potential of a lost key.
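The idea can be shown in a few lines. What follows is a toy illustration only, built from Python's standard library: it stretches a key into a keystream with SHA-256 and XORs that against the data. The construction and the function names are my own assumptions for demonstration. It has no tamper detection and has not been vetted, so real data belongs in an established encryption library instead.

```python
import hashlib
import secrets

NONCE_SIZE = 16  # bytes of per-message randomness stored with the ciphertext


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key + nonce + a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        out += block
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """TOY cipher: XOR the plaintext with a key-derived keystream."""
    nonce = secrets.token_bytes(NONCE_SIZE)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse encrypt(); the wrong key silently yields garbage."""
    nonce, body = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


my_key = secrets.token_bytes(32)
message = b"all my precious data, up on someone else's server"

stored = encrypt(my_key, message)            # what the server would hold
recovered = decrypt(my_key, stored)          # what I get back with my key
gibberish = decrypt(secrets.token_bytes(32), stored)  # anyone without the key

print(recovered == message)
print(gibberish == message)
```

The last line is the lost-key problem in miniature: without the original key, the stored bytes decode to noise, for the owner as much as for a snoop.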
But someday, wherever you are, you’ll be able to call down all your stuff to any device, be it as small as a phone or as large as a home entertainment system. With better data management, you’ll not only have great access to that vast data footprint you’re creating, but also a coherent way to figure out what’s really there.
© 2008 Endpoint Technologies Associates, Inc. All rights reserved.
