- Unbiased data, teams, and algorithms. This refers to the importance of managing the inherent biases that can arise when the development team lacks good representation across gender, race, and sex. Biases in the data and in the training methods must be clearly identified and addressed in the AI design. Gaining insights, and potentially making decisions, based on a model that is biased in some way (a tendency toward gender inequality or racist attitudes, for example) isn't something you want to happen.
- Algorithm performance. The outcomes of AI decisions should align with stakeholder expectations: the algorithm performs at the desired level of precision and consistency and doesn't deviate from the model objective. When a model is deployed in its target environment in a dynamic manner and continues to train and optimize its performance, it will adjust to new data patterns and preferences and might start deviating from the original goal. Setting sufficient policies to keep the model training on target is therefore vital; a minimal drift-monitoring sketch follows this list.
- Resilient infrastructure. Make sure that both the data used by the AI system components and the algorithm itself are secured from unauthorized access, corruption, and/or adversarial attack.
- Usage transparency and user consent. Users must be clearly notified when they are interacting with an AI and must be offered the opportunity to select a level of interaction or reject the interaction completely. This principle also covers the importance of obtaining user consent for the data that is captured and used. The introduction of the General Data Protection Regulation (GDPR) in the EU has prompted discussions in the US calling for similar measures, which means that awareness of the stakes involved in personal information, as well as of the need to protect that information, is slowly improving. So even if the data is collected in an unbiased manner and the models are built in an unbiased setup, you could still end up in ethically challenging situations (or even break the law) if you're using personal data without the right permissions.
- Explainable models. This refers to the need for an AI's training methods and decision criteria to be easily understood, documented, and readily available for human assessment and validation, so that the algorithm, as part of an intelligent machine, produces actions that can be trusted and easily understood by humans. The opposite of AI explainability is when the algorithm is treated as a black box, where even the designer of the algorithm cannot explain why the AI arrived at a specific insight or decision.
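As a rough illustration of the algorithm-performance point above, the sketch below shows one way a deployed model's live performance could be checked against the objective it was accepted on. It is a minimal sketch under assumptions, not a prescribed method: the baseline value, the allowed drop, and the fetch_labeled_batch helper are hypothetical placeholders you would replace with your own policy and data plumbing.

```python
# Minimal drift-monitoring sketch (illustrative only).
# Assumes a scikit-learn-style binary classifier and a hypothetical
# fetch_labeled_batch() helper that returns recent production inputs
# together with their eventual ground-truth labels.
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.88      # performance accepted at deployment (assumed value)
MAX_ALLOWED_DROP = 0.05  # policy threshold: alert if live AUC drops more than this

def check_model_drift(model, fetch_labeled_batch):
    features, labels = fetch_labeled_batch()
    live_auc = roc_auc_score(labels, model.predict_proba(features)[:, 1])
    if live_auc < BASELINE_AUC - MAX_ALLOWED_DROP:
        # Policy decision point: retrain, roll back, or stop serving predictions.
        raise RuntimeError(
            f"Model drifted from objective: live AUC {live_auc:.3f} "
            f"vs. baseline {BASELINE_AUC:.3f}"
        )
    return live_auc
```

Running a check like this on a schedule turns a "keep the model on target" policy into something a team can actually act on.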
An additional ethical consideration, which is more technical in nature, relates to the reproducibility of results outside the lab environment. AI is still immature, and most research and development is exploratory by nature. There is still little standardization in place for machine learning and artificial intelligence. De facto rules for AI development are emerging, but slowly, and they are still very much community-driven. Therefore, you must ensure that any results from an algorithm are actually reproducible, meaning that you get the same results in the real, target environment as in the lab environment, and also across different target environments (between different operators within the telecommunications sector, for example).
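To make the reproducibility point slightly more concrete, here is a minimal sketch of two common first steps: pinning the sources of randomness and recording the environment so that a result obtained in the lab can be re-run in a target environment. The seed value and the choice of libraries are assumptions for illustration, not a standard.

```python
# Reproducibility sketch: fix random seeds and record the environment.
import json
import platform
import random

import numpy as np
import sklearn

SEED = 42  # arbitrary fixed seed, chosen once and recorded

random.seed(SEED)
np.random.seed(SEED)

# Record the exact versions used, so a result can be compared across environments.
run_manifest = {
    "python": platform.python_version(),
    "numpy": np.__version__,
    "scikit-learn": sklearn.__version__,
    "seed": SEED,
}
print(json.dumps(run_manifest, indent=2))
```

Shipping a manifest like this alongside the model makes it easier to tell whether a deviation in results comes from the algorithm itself or from a difference in the environment.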
How to ensure trustworthy artificial intelligence
If the data you need access to in order to realize your business objectives can be considered ethically incorrect, how do you manage that? It's easy enough to say that applications should not collect data about race, gender, disabilities, or other protected classes. But the fact is that if you don't gather that type of data, you'll have trouble testing whether your applications are in fact fair to minorities.

Machine learning algorithms that learn from data become only as good as the data they're trained on. Unfortunately, many algorithms have proven to be quite good at figuring out their own proxies for race and other protected classes, in ways that run counter to what many would consider proper human ethical thinking. Your application would not be the first system to turn out to be unfair despite the best intentions of its developers. But, to be clear, at the end of the day your company will be held responsible for the performance of its algorithms, and (hopefully) bias-related legislation will become stricter than it is today. If a company doesn't follow laws, regulations, or ethical boundaries, the financial cost could be significant, and, perhaps even worse, people could lose trust in the company altogether. That could have serious consequences, ranging from customers abandoning the brand to employees losing their jobs to folks going to jail.
To avoid these types of scenarios, you need to put ethical principles into practice, and for that to happen, employees must be allowed and encouraged to be ethical in their daily work. They should be able to have conversations about what ethics actually means in the context of the business objectives and about what costs to the company can be weathered in their name. They must also be able to at least discuss what would happen if a solution cannot be implemented in an ethically correct manner. Would such a realization be enough to terminate it?

Data scientists generally find it important to share best practices and scientific papers at conferences, to write blog posts, and to develop open source technologies and algorithms. However, problems such as how to obtain informed consent aren't discussed quite as often. It's not that these problems aren't recognized or understood; they're merely seen as less worthy of discussion. Rather than let such a mindset persist, companies should actively encourage (rather than just allow) more discussions about fairness, the proper use of data, and the harm that can be done by the inappropriate use of data.
Recent scandals involving computer security breaches have shown the consequences of sticking your head in the sand: Many companies that never took the time to implement good security practices and safeguards are now paying for that neglect with damage to their reputations and their finances. It is important to exercise the same due diligence now accorded to security matters when thinking about issues like fairness, accountability, and the unintended consequences of your data use. It will never be possible to predict every unintended consequence of such usage; the ability to foresee the future is limited. But plenty of unintended consequences could easily have been foreseen. (Facebook's Year in Review feature, which seemed to go out of its way to remind Facebook users of deaths in the family and other painful events, is a prime example.)
Mark Zuckerberg's famous motto, "Move fast and break things," is unacceptable if no one has thought through what is likely to break. Company leaders should insist that employees be allowed to raise such concerns and to stop the production line whenever something goes wrong. This idea dates back to Toyota's andon manufacturing method: Any assembly-line worker can stop the line if they see something going wrong. The line doesn't restart until the problem is fixed. Workers don't have to fear consequences from management for stopping the line; they are trusted, and are expected to behave responsibly.
What would it mean if you could do this with product features or AI/ML algorithms? If anyone at Facebook could have said, "Wait, we're getting complaints about Year in Review," and pulled it out of production, Facebook would now be in a much better position from an ethical perspective. Of course, it's a big, complicated company, with a big, complicated product. But so is Toyota, and it worked there.

The issue lurking behind all these concerns is, of course, corporate culture. Corporate environments can be hostile to anything other than short-term profitability. However, in a time when public distrust and disenchantment are running at an all-time high, ethics is turning into a good corporate investment. Upper-level management is only starting to see this, and changes to corporate culture won't happen quickly, but it's clear that users want to deal with companies that treat them and their data responsibly, not just as potential profit or as engagements to be maximized.
The companies that will succeed with AI ethics are the ones that create space for ethics within their organizations. This means allowing data scientists, data engineers, software developers, and other data professionals to "do ethics" in practical terms. It isn't a question of hiring trained ethicists and assigning them to those teams; it's about living ethical values every single day, not just talking about them. That's what it means to "do good data science."
Introducing ethics by design for artificial intelligence and data science
What's the best way to approach implementing AI ethics by design? Might there be a checklist available to use? Now that you mention it, there is one, and you'll find it in the United Kingdom. The government there has launched a data ethics framework, featuring a data ethics workbook. As part of the initiative, they have isolated seven distinct principles around AI ethics. The workbook they came up with is built around a number of open-ended questions designed to probe your compliance with these principles. Admittedly, it's a lot of questions (46, to be exact), which is rather too many for a data scientist to continuously keep track of and incorporate efficiently into a daily routine. For such questions to be truly useful, then, they need to be embedded not only in the development ways of working but also in the data science infrastructure and systems support.

It isn't merely a question of making it practically possible to follow ethical principles in daily work and to prove how the company is ethically compliant; the company must also stand behind these ambitions and embrace them as part of its code of conduct. However, when a company talks about adding AI ethics to its code of conduct, the value doesn't come from the pledge itself, but rather emerges from the process people undergo in developing it. People who work with data are now starting to have discussions on a broad scale that would never have taken place just a decade ago. But discussions alone won't get the hard work done. It is vital not just to talk about how to use data ethically but also to actually use data ethically. Principles must be put into practice!
Here's a shorter list of questions to consider as you and your data science teams work together to gain a common and general understanding of what is needed to address AI ethical concerns:
- Hacking: To what extent is the intended AI technology vulnerable to hacking, and thus potentially open to abuse?
- Training data: Have you tested your training data to ensure that it is fair and representative?
- Bias: Does your data contain possible sources of bias?
- Team composition: Does the team composition reflect a diversity of opinions and backgrounds?
- Consent: Do you need user consent to collect and use the data? Do you have a mechanism for gathering consent from users? Have you explained clearly what users are consenting to?
- Compensation: Do you offer reimbursement if people are harmed by the results of your AI technology?
- Emergency brake: Can you shut down this software in production if it's behaving badly? (A minimal kill-switch sketch appears after this list.)
- Transparency and fairness: Do the data and AI algorithms used comply with corporate values for technology, such as moral behavior, respect, fairness, and transparency? Have you tested for fairness with respect to different user groups?
- Error rates: Have you tested for different error rates among diverse user groups? (See the per-group error-rate sketch after this list.)
- Model performance: Do you monitor model performance to ensure that your software remains fair over time? Can it be trusted to perform as intended, not just during the initial training or modeling but also throughout its ongoing "learning" and evolution?
- Security: Do you have a plan to protect and secure user data?
- Accountability: Is there a clear line of accountability to an individual and clarity on how the AI operates, the data that it uses, and the decision framework that is applied?
- Design: Did the AI design consider local and macro social impact, including its impact on the financial, physical, and mental well-being of humans and our natural environment?
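For the emergency-brake question, the technical support can be as simple as a feature flag that is checked before every prediction is served. The sketch below is one minimal way to do it; the flag file path and function names are hypothetical placeholders, not part of any particular framework.

```python
# Emergency-brake sketch: a kill switch checked before serving predictions.
import os

KILL_SWITCH_FILE = "/etc/myapp/ai_disabled"  # assumed location; adapt to your setup

def predictions_enabled() -> bool:
    # Operations staff can disable the model by creating this file,
    # without redeploying the application.
    return not os.path.exists(KILL_SWITCH_FILE)

def serve_prediction(model, features):
    if not predictions_enabled():
        # Fall back to a safe default or a human-in-the-loop process instead.
        raise RuntimeError("AI predictions are disabled by the emergency brake.")
    return model.predict([features])[0]
```

The important design choice is that anyone with the authority to pull the "andon cord" can flip the flag without waiting for a new release.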
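For the error-rate question, a small amount of code is often enough to surface differences between user groups. The sketch below assumes a pandas DataFrame with hypothetical group, label, and pred columns; it is an illustration of the idea, not a complete fairness audit.

```python
# Per-group error-rate check (illustrative sketch).
# Assumes hypothetical columns:
#   "group" - the user group each record belongs to
#   "label" - the true outcome
#   "pred"  - the model's predicted outcome
import pandas as pd

def error_rates_by_group(df: pd.DataFrame) -> pd.Series:
    errors = (df["label"] != df["pred"]).astype(float)
    return errors.groupby(df["group"]).mean()

# Example usage with made-up data:
df = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B"],
    "label": [1, 0, 1, 1, 0],
    "pred":  [1, 0, 0, 1, 1],
})
rates = error_rates_by_group(df)
print(rates)                                   # error rate per group
print("max gap:", rates.max() - rates.min())   # a large gap warrants investigation
```

A check like this can run as part of regular model monitoring, so that fairness is tested over time rather than only at launch.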