Predictive analytics and privacy: the moral enemies?

Predictive analytics is built on data. Data is the very essence of what we care about. Personal data is not equal to a person; it’s much better. It weighs nothing, takes no space and lasts forever. Data about a person is not as valuable as the person, but it’s cheaper to manage and hence a better investment.

The fact that data is perceived as dangerous speaks to its power. If it were weak, it wouldn’t be a threat.

One of the top five health insurance companies is trying to predict whether an elderly policyholder will pass away in the next 18 months, based on the patient’s recent medical claims. This seems dubious at first, but it gives these companies a great deal of power. Could the company delay paying for treatment because it is 95% confident that the patient’s chances of survival are very slim? Questions like these, unfortunately, remain unanswered, as such experiments are carried out way below the surface.

The Battle Over Data

On one hand are the privacy advocates. They don’t want data collected; they want it deleted, contained or even prevented from being recorded in the first place. On the other hand, we have the entrepreneurs, managers and techies who are data hustlers. Data hustlers see the value in data, and that value excites them beyond its purely economic worth. In the recent past, tracking of GPS signals, cell phones and cars has surged, with some cases leading to trouble with law enforcement agencies as well. Tom Mitchell, a professor at Carnegie Mellon University and a leader in machine learning research and development, wrote that:

The potential benefits of mining such data are various: reducing traffic congestion and pollution, limiting the spread of disease, and better use of public places such as parks and buses. But the risks to privacy from aggregating such data are on a scale that humanity has never faced before.

The battle will go on for decades to come. Data is like a knife: it can be used for both good and evil, and simply outlawing it is not a solution. There is no single correct resolution here, and saying “Please click here to agree to our terms and privacy policy” is not the right way to educate the consumer, because organizations and consumers speak very different languages. There is a long way to go!

The Misconception

Predictive analytics, in general, is misinterpreted by many. It’s seen as a drill-down process for viewing and analyzing data at the level of an individual, when it is in fact the opposite. It’s a roll-up process where the objective is to aggregate the entire data set and observe the pattern from a bird’s-eye view.
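To make the roll-up idea concrete, here is a minimal sketch in Python. The records, the age bands and the function names are all made up for illustration; it only shows individual rows being aggregated into a per-segment pattern, which is then the only thing used for prediction.

```python
from collections import defaultdict

# Hypothetical, made-up records: (customer_id, age_band, bought_trekking_gear)
records = [
    ("c1", "18-30", True),
    ("c2", "18-30", True),
    ("c3", "18-30", False),
    ("c4", "31-50", False),
    ("c5", "31-50", True),
    ("c6", "31-50", False),
    ("c7", "31-50", False),
]

def roll_up(rows):
    """Aggregate individual rows into a purchase rate per age band."""
    counts = defaultdict(lambda: [0, 0])  # band -> [buyers, total]
    for _customer_id, band, bought in rows:
        counts[band][0] += int(bought)
        counts[band][1] += 1
    return {band: buyers / total for band, (buyers, total) in counts.items()}

# The aggregated pattern: no customer id survives the roll-up.
rates = roll_up(records)

def predict(age_band):
    """Score a new customer from the segment pattern, not from a personal file."""
    return rates[age_band]

print(rates)
print(predict("31-50"))
```

The customer ids are dropped during aggregation, so the prediction for a new customer depends only on the segment they fall into. A real model would use many more attributes, but the direction of travel is the same: from rows up to patterns.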

Insight or intrusion?

If I look into my friend’s shopping cart and conclude from the items in it that he’s going on a trek without telling me, have I committed a thoughtcrime? Another charge that data gets convicted of is racial discrimination. Crime departments in the U.S. are using predictive models to estimate whether a convict who is soon to be released will commit another crime. Because the model takes in factors such as the previous crimes committed, gender, age, neighborhood and the offender’s zip code, ethnicity creeps into the model indirectly. We are moving towards an era where race and ethnicity shouldn’t be criteria for judging someone, yet here the predictive model ends up correlating race with the probability of committing a crime. It’s agreed that by taking such factors into account predictive analytics can generate more false positives than true positives, but Ellen Kurtz, the champion of this crime model, argues: “If you wanted to remove everything correlated to race, you wouldn’t have anything”. The moral is that data shouldn’t be made a scapegoat!

Predictive analytics has the power to do more good than evil. Companies everywhere are tracking almost everything, from the Uber rides you take to the clicks you make. Every company that tracks such data knows how it can benefit from it, be it advertising, re-targeting, lead scoring or recommending the next movie to watch. Collecting data is not in itself a crime; what you do with it is what’s open to question.
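Returning to the recidivism example, the way ethnicity can creep into a model through zip code can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the records are invented, "group" stands in for a protected attribute, and the zip-only rule is a toy, not the model used by any crime department.

```python
from collections import defaultdict

# Hypothetical, made-up records: (zip_code, group, reoffended).
# "group" stands in for a protected attribute such as ethnicity.
history = [
    ("Z1", "A", True), ("Z1", "A", True), ("Z1", "A", False), ("Z1", "B", False),
    ("Z2", "B", False), ("Z2", "B", True), ("Z2", "B", False), ("Z2", "A", False),
]

def rate_by(rows, key_index, value_fn):
    """Average value_fn(row) within each distinct value of row[key_index]."""
    sums = defaultdict(lambda: [0, 0])  # key -> [sum of values, row count]
    for row in rows:
        sums[row[key_index]][0] += int(value_fn(row))
        sums[row[key_index]][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

# The toy "model" only ever sees zip code: the reoffence rate per zip.
zip_rate = rate_by(history, 0, lambda row: row[2])

def flagged(row, threshold=0.5):
    """Flag as high risk using zip code alone; group is never consulted."""
    return zip_rate[row[0]] >= threshold

# Audit: how often does each group end up flagged?
flag_rate_by_group = rate_by(history, 1, flagged)
print(flag_rate_by_group)
```

In this invented data, group A is flagged 75% of the time and group B 25% of the time, even though the rule never reads the group column. Zip code carries the group information on its own, which is the difficulty behind Kurtz’s remark.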

While we have new ethical problems, we don’t have new ethics – Michael Lotti

An era of Open Data

The open data movement is here. The data collected over decades was never meant for productive use. Free public data is now out; here’s a list of the sources where you can find data on anything from the average pay of a worker in the U.S. to earthquakes.


See how Accenture has set up a Data Ethics Strategy for itself.

Lastly, I believe every company using data for predictive analytics should put in place its own “Data Usage Terms & Ethics”; only then will predictive analytics and privacy co-exist peacefully.

Team Znbound