Over the last few years there has been an increased focus on data ethics. People are waking up to the fact that in many cases it is their data and that organisations should respect this ownership. This means asking for consent to collect, store, and utilise this data, and similarly, ensuring that the way this data is used is clearly articulated and fair. For example, no one would agree to sharing social media data if this were then used to increase their insurance premiums.
Given the volumes of data now being both generated and aggregated, and the recent examples of organisations abusing that data, the timing of this debate makes sense. However, in many organisations personally identifiable information (PII) is only a small part of the total data volume. Despite this fact, I think the question of ethics is an important one. Maybe not quite data ethics, but let us call it analytics ethics – the rules that should govern our analytics.
While analytics is the art of making better decisions, the challenge we face is that humans are not quite rational creatures. Even when reasoning based on facts, our decision-making processes are coloured by our emotions. This is one reason why storytelling is such an effective way of communicating data-driven findings: it allows us to package our work in an emotional envelope. The reason we need analytics ethics is that this tendency can be exploited.
For example, have you ever sat in a meeting where an analysis was presented that seemed so compelling as to make it impossible to disagree? Was it clearly explained how the presenter got to this outcome? The 1954 book How to Lie with Statistics catalogued many of the tricks involved, and as a saying usually attributed to the economist Ronald Coase goes: "If you torture the data long enough, it will confess to anything." As an analyst or data scientist, be transparent about your process. If you chose to leave some data out to fit your story, make this clear.
Another important topic is accountability. The idea of analytics is that it is objective, or as close to that as possible. Yet the number of times analysts disavow their work when it starts making waves in an organisation is disheartening. Instead, stand by your numbers. Impactful analyses are about truth, not convenience. In the scenario where your analysis was incorrect, take the time to understand and explain your error. This builds trust and credibility.
Finally, we should consider the morality of our work. Are we truly helping our stakeholders make better decisions? Humans are prone to biases and irrational decisions and we have difficulty with concepts such as exponential growth or randomness. Make sure you are consciously avoiding these biases, instead of exploiting them in order to achieve the outcome you want. This is similar to the idea of dark patterns in the field of user experience design.
In summary, data ethics is important, absolutely. However, we should also consider our analytics ethics. Transparency, Accountability, and Morality are just three aspects analysts and data scientists should keep in mind. Many organisations still do not believe in the transformative power of their data. In such a low-trust environment, building credibility has nothing to do with regulation, and everything to do with how we decide to act ourselves.