One pressing challenge for machine learning researchers is reducing bias, which is often present in the raw data and can be amplified by the systems trained on it.
For example, a developer building an algorithm to identify the most suitable candidates for a job might use the company's existing employees as training data. The ML system will then inherit whatever distortions that workforce contains. If the company employs more men, for instance, men may be given more weight in the sample, while people with certain experiences or characteristics may be weeded out.
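The kind of skew described in this hiring example can be checked before any model is trained. The following is a minimal Python sketch, not Know Your Data's interface: the records, the field layout, and the function names `group_shares` and `positive_rates` are all hypothetical and invented for illustration. It measures how much of the sample each group makes up and how often each group carries the positive label.

```python
from collections import Counter

# Hypothetical training records: (gender, was_hired).
# These values are invented for illustration and come from no real dataset.
records = [
    ("male", True), ("male", True), ("male", True), ("male", False),
    ("male", True), ("male", True), ("female", True), ("female", False),
]


def group_shares(rows):
    """Return each group's share of all rows."""
    counts = Counter(group for group, _ in rows)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def positive_rates(rows):
    """Return the fraction of positive labels within each group."""
    totals = Counter(group for group, _ in rows)
    positives = Counter(group for group, label in rows if label)
    return {group: positives[group] / totals[group] for group in totals}


shares = group_shares(records)
rates = positive_rates(records)

for group in shares:
    print(f"{group}: share of sample={shares[group]:.2f}, "
          f"positive rate={rates[group]:.2f}")
```

A large gap in either number does not prove that a model trained on the data will be unfair, but it does flag where the sample deserves a closer look.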
Google intends to address this issue with its new Know Your Data (KYD) dataset analysis tool. It is meant to help developers identify existing biases in their data so they can minimize them, says SearchEngines.
At the moment, the new system is rather limited in how it can retrieve and parse sample data, notes NIXSolutions. Even so, it points toward better analysis of this kind and opens up more opportunities for reducing bias in machine learning systems.