Several people have raised the concern that big data can become bad data. The main arguments for that claim draw on disparate impact theory, a doctrine in U.S. anti-discrimination law under which a policy may be considered discriminatory if it has an adverse impact on a group defined by race, religion, gender, sexual orientation or another protected status.
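To make the idea of adverse impact concrete, here is a minimal sketch of one common way to quantify it: compare approval rates across groups and take the ratio of the lowest rate to the highest. The function names, the group labels and the sample decisions are all hypothetical. The 0.8 threshold is an assumption borrowed from the "four-fifths" rule of thumb in U.S. employment guidelines. It is not a legal test for lending and not a description of any particular scoring system.

```python
from collections import defaultdict

# Assumption: rule-of-thumb threshold from U.S. employment guidelines,
# used here for illustration only.
THRESHOLD = 0.8


def approval_rates(decisions):
    """Return the share of approved applications per group.

    `decisions` is an iterable of (group, approved) pairs.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def adverse_impact_ratio(decisions):
    """Return the lowest group approval rate divided by the highest."""
    rates = approval_rates(decisions)
    if not rates:
        raise ValueError("no decisions to evaluate")
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so the groups do not differ
    return min(rates.values()) / highest


# Hypothetical, made-up decisions for two groups labelled "A" and "B".
sample = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)
ratio = adverse_impact_ratio(sample)
flagged = ratio < THRESHOLD
```

A ratio well below the chosen threshold does not prove discrimination. It only signals that the model's outcomes deserve a closer look.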
We can look at the bad data dispute from the point of view of credit rating. There are certainly different views on whether these software algorithms are biased under the legal definition of disparate impact. It is more useful to look at the issue from a rational standpoint with three stakeholders in mind: borrowers, society as a whole and financial institutions.
Let’s start with borrowers, who are the most important stakeholder group. Most lenders still rely on credit bureau scores and basic information from loan applications. This works more or less fine for people with a solid credit history. On the other hand, it leaves little chance for borrowers who are trying to improve their credit rating or who do not have one at all. Thanks to big data, the breadth of information that can potentially be used for credit scoring has expanded considerably, which gives everyone a fair chance at a credit score. This includes people with little or even no credit history, for example young people, the unbanked or underbanked, and recent immigrants. We are making intense efforts to include every one of these groups as important and permanent customers of financial institutions.
Society as a whole also benefits from the use of big data, because it increases the accuracy of scoring models. Not every person is creditworthy, and refusing a loan to those who are not reduces their risk of a downward spiral. A large number of non-creditworthy people receiving loans can even play a part in a catastrophic event like the subprime crisis. In addition, more accurate scoring helps financial institutions assess their credit risk and can therefore even allow them to lower interest rates. Better interest rates open the door for people who could not afford a loan before and make borrowing cheaper for others.
To conclude, in our experience the likelihood of receiving a loan increases. In some cases, we have even seen a drop in the interest rate. As a result, more people gain access to credit at a better interest rate, thanks to more accurate scoring models. We believe that designing systems in a discrimination-conscious way from the start will reduce the risk of machine-learning algorithms introducing unintentional bias, much as humans do. This should avoid the moral problem of discrimination. At the same time, some exclusions are justified: requiring drivers to pass an eye test discriminates against the blind, but eyesight is essential to driving a car safely. If that exclusion is justified, then declining to lend to people who are not creditworthy should be an acceptable exclusion as well.