Apple has clarified that its OpenELM artificial intelligence model is not used in any of the AI or machine learning features within the company’s commercial products, including Apple Intelligence. This statement was issued after it was discovered that data of questionable origin had been used to train OpenELM.
Previously, it was revealed that Apple, along with other technology giants, had used subtitles from YouTube videos to train AI models, including material from some of the platform’s largest creators. This data was included in the Pile, a public dataset published by the non-profit organization EleutherAI. The dataset contained subtitles downloaded from YouTube, essentially transcripts of the videos, which directly violates the platform’s rules.
Apple’s Stance on Data Usage
Apple has emphasized that OpenELM is its contribution to the research community, aimed at advancing the development of open large language models. The company told 9to5Mac that OpenELM was created solely for research purposes and does not power any functionality in the Apple Intelligence system. OpenELM is published as open source and is available to anyone, including through the Apple Machine Learning Research website.
Since OpenELM is not part of the Apple Intelligence system, the YouTube subtitles that were allegedly obtained illegally have no connection to the commercial system, notes NIX Solutions. Apple has previously stressed that Apple Intelligence was trained “on licensed data, including data selected to improve certain functions, and publicly available data collected by our web crawler.” Apple has no plans to develop new versions of OpenELM. We’ll keep you updated on any further developments regarding OpenELM.
By providing this clarification, Apple aims to address misunderstandings and concerns about the data used to train its AI models and to reaffirm its commitment to ethical practices in AI development.