
AI Versus HR: 4 Inclusion Problems And 5 Checks Before You Buy

The HR tech market is predicted to grow from $33B in 2021 to $77B by 2031. In the rush for AI-powered automation, HR professionals and business leaders need to apply caution. There are four fundamental problems that must be carefully unpicked to prevent standardization from undermining progress in diversity and inclusion, and five checks advised for those making procurement decisions.

Problem One: Population Differences

A foundation of statistics, the Central Limit Theorem, predicts that when we have a representative data sample of a population, it naturally organises itself around the normal distribution. This naturally occurring phenomenon holds for human height, for personality traits such as introversion, and more. However, different populations can form their own distributions, such as women’s height versus men’s height, or neurodivergent introversion versus neurotypical introversion. If we have trained our AI system mainly on men, the recommendations for women will be distorted.
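
A minimal sketch of this distortion, using simulated and assumed numbers rather than real HR data: if a system's idea of "typical" is learned from a sample that is mostly one population, that idea fits the other population poorly.

import numpy as np

rng = np.random.default_rng(0)

men = rng.normal(175, 7, 900)      # e.g. male heights in cm (assumed values)
women = rng.normal(162, 6, 100)    # female heights in cm (assumed values)

training_sample = np.concatenate([men, women])   # 90% men, 10% women
learned_average = training_sample.mean()

print(f"Learned 'typical' height: {learned_average:.1f} cm")
print(f"Share of women below the learned average: "
      f"{(women < learned_average).mean():.0%}")
# Almost every woman sits below the "typical" value the system has learned,
# so any rule anchored on that average systematically misreads one group.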

Problem Two: Unusual People

Under the normal distribution, the algorithm is premised on the most likely next correlation, and the most likely correlation after that. This proximity works for around 68% of people, whose preferences, behaviours and tendencies fall within one standard deviation of the numerical average for the population that has been mainly represented in the data sample. Those who do not possess average attributes are not served by the recommendations made by a correlation-based algorithm unless they have been categorized as a separate population. Those minoritized in the workplace are likely to be poorly catered for and misunderstood.

Problem Three: Correlation Is Not Causality

The numbers do not check, understand or predict based on rationality, only data proximity. For example, red wine was once lauded by the media as reducing heart disease, because there is a correlation between drinking one glass a day and lower rates of heart disease. However, it is unlikely that the red wine alone achieves this; more likely, the type of person who stops after one glass is also moderate and healthy in other lifestyle factors which actively reduce heart disease, such as exercise and low consumption of unhealthy fats and sugar. The computer doesn’t understand this sort of nuance: it assumes that x leads to y, and if it cannot see w and z, they do not exist.
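
A minimal sketch of the confounder problem, with entirely simulated data: a hidden "moderate lifestyle" factor (w) drives both the visible behaviour (x) and the outcome (y), so x and y correlate even though neither causes the other.

import numpy as np

rng = np.random.default_rng(1)
n = 10_000

moderation = rng.normal(0, 1, n)                   # w: unobserved lifestyle factor
one_glass_only = moderation + rng.normal(0, 1, n)  # x: visible behaviour, driven by w
heart_health = moderation + rng.normal(0, 1, n)    # y: outcome, driven by w, not by x

print(f"corr(one_glass_only, heart_health) = "
      f"{np.corrcoef(one_glass_only, heart_health)[0, 1]:.2f}")
# Roughly 0.5, even though the outcome never depends on the behaviour directly.
# A correlation-only system would still "recommend" the behaviour.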

Problem Four: The Good Employee

Not everything that is good about an employee can be quantified. For example, if an employee adds value to a team through good nature, or by being the person who spends hours unpicking small mistakes in a process to streamline sales activity, are these acts of service recorded accurately?

This is akin to the red wine issue. If we only record actual sales and let the non-discerning AI decide who is "best," we may end up mistakenly advantaging only the most goal-orientated, but potentially selfish, employees. This is not good decision-making for promotion, but the AI cannot predict what it cannot see. It doesn't know there are other attributes that it should be considering.
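
A minimal sketch with made-up figures: rank people only on the one thing that is recorded, and anyone whose contribution goes unrecorded disappears from view.

recorded_sales = {"A": 120, "B": 95, "C": 90}
unrecorded_service = {"A": 0, "B": 2, "C": 40}   # hours spent quietly fixing process errors

by_sales_only = sorted(recorded_sales, key=recorded_sales.get, reverse=True)
by_full_picture = sorted(recorded_sales,
                         key=lambda p: recorded_sales[p] + unrecorded_service[p],
                         reverse=True)

print("Ranking the tool sees:        ", by_sales_only)    # ['A', 'B', 'C']
print("Ranking with acts of service: ", by_full_picture)  # ['C', 'A', 'B']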

So, with all these potential flaws, here are some questions to ask yourself before you commission AI.

Question 1: Do You Have A Governance Framework?

Who is on it? To which principles do they subscribe? Are they sufficiently clued in to ensure that responsibility for confidentiality, data storage and equality compliance falls on the vendor, not the purchaser? If you don’t have sufficient capacity to understand and monitor what you are buying, you may find yourself in legal jeopardy if things go wrong.

Question 2: How Good Is The Training Data?

For example, Amazon trained a hiring AI on existing staff performance and HR records, and so it learned that being male was associated with success and promotion and started eliminating female resumes. When facial recognition software was being developed, it was trained using the internet during the period 2000-2010, when there were more images online of George Bush than of any Black woman. Disability campaigners have pointed out the flaws in using automated video-based hiring for people with facial disfigurement, tics or the after-effects of stroke.

These issues need to be understood, compensated for and challenged in any commissioning process for AI. If you don’t ask the questions, you may be liable for the biased answers produced by the tool.
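
A minimal sketch of the kind of audit a buyer can ask a vendor to walk through before commissioning: how is each group represented in the historical records, and how do past outcomes differ by group? The field names ("gender", "hired") and the records are purely hypothetical.

from collections import Counter

def audit_training_data(records, group_field="gender", outcome_field="hired"):
    # Report each group's share of the records and its historical outcome rate.
    counts = Counter(r[group_field] for r in records)
    total = sum(counts.values())
    for group, n in counts.items():
        positives = sum(1 for r in records if r[group_field] == group and r[outcome_field])
        print(f"{group}: {n/total:.0%} of records, "
              f"{positives/n:.0%} historical positive-outcome rate")

audit_training_data([
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": False},
    {"gender": "female", "hired": False},
])
# If one group dominates the records, or past outcomes already encode bias,
# the tool will reproduce both unless this is compensated for.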

Question 3: Will There Be A Transfer Issue?

For example, Xerox used AI to collect data on commute times as part of a well-being initiative, but there were no guardrails built in on access to other parts of the HR data store. The AI took a "machine learning" initiative and linked commute times to retention, finding that those living closer to the office were less likely to leave. Without critical reasoning from a human perspective, this link could have formed the basis of a hiring strategy. Yet the link is not benign: house prices were higher closer to town, so the AI inadvertently built a privilege loop into the system. What the machine learns needs to be carefully considered for risk.
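
A minimal sketch of the proxy problem, again with simulated and assumed numbers: commute time looks like a neutral, job-related feature, but it is tightly linked to house prices, so any rule built on it quietly imports a privilege signal.

import numpy as np

rng = np.random.default_rng(2)
n = 5_000

house_price = rng.normal(0, 1, n)                             # higher near the office (standardised)
commute_minutes = -0.8 * house_price + rng.normal(0, 0.6, n)  # short commutes cluster where prices are high
stays_longer = -0.5 * commute_minutes + rng.normal(0, 1, n)   # shorter commute -> better retention

print(f"corr(commute, retention):   {np.corrcoef(commute_minutes, stays_longer)[0, 1]:.2f}")
print(f"corr(commute, house price): {np.corrcoef(commute_minutes, house_price)[0, 1]:.2f}")
# A hiring rule of "prefer short commutes" would therefore also be a rule of
# "prefer people who can afford to live close to the office".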

Question 4: Are Humans In The Loop, On The Loop Or Out Of The Loop?

In the loop means that humans have to sanction every decision before it goes live. On the loop means humans have the ability to intervene if necessary. Out of the loop means humans are not involved. On the loop is considered the best compromise given the current issues in AI described above, and this must be weighed against the decision's potential for human harm or legal risk. On-balance assessments must be part of the governance structure, and there must be people guarding against bias and discrimination. Humans on the loop prevented Xerox from making a mistake; humans on the loop discovered the Amazon AI bias.
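
A minimal sketch of what "on the loop" can look like in practice, using hypothetical names and thresholds: low-risk actions proceed automatically, while anything consequential is held for a human reviewer who can intervene before it takes effect.

def route_decision(decision, risk_score, review_queue, risk_threshold=0.3):
    # Auto-apply only low-risk decisions; queue the rest for human sign-off.
    if risk_score < risk_threshold:
        return {"decision": decision, "status": "applied automatically"}
    review_queue.append(decision)
    return {"decision": decision, "status": "held for human review"}

queue = []
print(route_decision("suggest training course", risk_score=0.1, review_queue=queue))
print(route_decision("reject candidate", risk_score=0.9, review_queue=queue))
print("Awaiting human sign-off:", queue)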

Question 5: Are The AI Decisions Transparent?

Can you explain the decisions the AI is making? For example, if you are making performance, promotion or leaving decisions based on AI, can you justify them? There is a currently active case in the USA between Derek Mobley and Workday, in which Mobley alleges age, race and disability discrimination, having failed to secure a job after more than 80 applications made through the Workday candidate-screening AI. This loops back to the first question on governance and the need to understand what you are buying. Who is liable when things go wrong? How can a case for the defence be built if no one knows what the tool actually does?

Does AI Understand Its Own Limitations?

In preparation for this article, I asked ChatGPT what checks an employer should make before purchasing AI-based HR software.

It described planning, checking and reviewing. It understood that there needed to be good training data and that biased data was possible. It knew it had to have explicability and transparency.

But…

It missed the transfer issues and scope creep shown in the Xerox case. It assumed there would be good training data out there, when actually there may not be: the internet isn't used equally, and the world of work is already subject to much bias. It missed the problem of the good employee. It didn’t know which legislation required compliance.

We are still learning about the potential of AI in society, and we lack legislative guardrails equivalent to those in medicine, engineering standards or the distribution of chemicals. Until solid evidence is available, employers are advised to remain on the loop as a minimum, and to avoid delegating HR decisions to correlational data. The potential for human harm is real, and should not be ignored.
