I thank the committee for its invitation to attend and discuss the processing of personal data in the context of digital assistants. I am one of the deputy commissioners at the Data Protection Commission, with responsibility for the consultation, supervision and policy guidance functions of the office. Also in attendance are Cathal Ryan, assistant commissioner, who has responsibility for the supervision of and engagement with technology multinationals, and Ultan O'Carroll, assistant commissioner, who has responsibility for technology policy.
As the committee will be aware, the Data Protection Commission, DPC, is the lead supervisory authority in the EU under the general data protection regulation, GDPR, for most of the world’s largest technology, social media and Internet platform companies operating in the European market, given that their EU headquarters are based in this State. This responsibility brings a central role for the DPC in overseeing the compliance of these companies’ services and products with EU data protection requirements. The technology products and services of a number of these companies include digital or voice assistants. Those are the terms commonly used to describe a consumer or in-home device that operates by listening for and interpreting human voice commands or instructions. The more common examples are Google’s Google Assistant, Apple’s Siri, Amazon’s Alexa and Microsoft’s Cortana. In recent months, a number of international media reports on human reviews of voice recordings collected by voice assistant products have brought into focus the question of how technology companies are using voice data gathered via voice assistant technology. As the EU lead supervisory authority for a number of these companies, namely, Google, Apple, and Microsoft, the DPC is currently engaging with those organisations to establish the manner in which they are meeting their data protection requirements in this context. The Luxembourg data protection authority acts as EU lead supervisory authority for Amazon Alexa.
Before turning to the data protection issues arising in the use of voice assistants, I will briefly describe how such devices function in practice, which helps to identify where data protection requirements come into play. Voice assistants record user audio clips and convert those clips into a text form that acts as an input to online services such as search, weather, shopping, mapping and communications. In some cases where the devices are home-based, the instructions may also be used to control smart home devices, including those for lighting, TV and media, heating and security. Devices listen continuously for instructions and may in some cases also be set up to recognise individual users' voices. They listen for keywords such as "Hey Google" or "Hey Siri", which trigger recording of the user's voice. Voice recordings can be stored alongside their converted text forms, either on the device or in the cloud. Service providers may also record against a user's profile the preferences and choices that they derive from an analysis of the user's voice commands. They may use that to serve back the information sought by a user or to add to the user's profile for the purposes of advertising. Raw audio signals are converted into recognisable human words. Often, because of the variation in human voice, accent, tone or phrase, machine learning - in other words, artificial intelligence - is used with large volumes of sample voices to create a model of human speech. Different models may be needed for different languages. These models are updated over time to refine them and improve quality. In some cases, quality control will require some human review of voice snippets, especially where words are being incorrectly recognised, where background noises are incorrectly identified as human speech or to help reduce misactivations of the device.
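The listen-trigger-record-transcribe cycle described above can be sketched as follows. This is an illustrative toy only: a real assistant processes raw audio signals, whereas here each "frame" is simulated as a short text token, the wake-word list is an assumed example, and transcription is stubbed by joining the captured tokens.

```python
# Minimal sketch of the voice assistant loop: listen continuously,
# begin recording only on a wake word, stop at silence, then
# "transcribe" the captured utterance (stubbed here as text joining).

WAKE_WORDS = {"hey google", "hey siri", "alexa"}  # example trigger phrases
SILENCE = None  # marker standing in for detected end of speech


def run_assistant(frames):
    """Scan simulated audio frames; on a wake word, record until
    silence, then return each captured command as transcribed text."""
    commands = []
    recording = False
    buffer = []
    for frame in frames:
        if not recording:
            # The device listens continuously but stores nothing
            # until a trigger phrase is recognised.
            if frame is not SILENCE and frame.lower() in WAKE_WORDS:
                recording = True
        else:
            if frame is SILENCE:
                commands.append(" ".join(buffer))  # transcription stub
                buffer = []
                recording = False
            else:
                buffer.append(frame)
    return commands


stream = ["chatter", "hey google", "turn", "on", "the", "lights", SILENCE]
print(run_assistant(stream))  # -> ['turn on the lights']
```

Note that in this sketch, as on real devices, speech preceding the wake word is never stored; the data protection questions arise from what happens to the recording after the trigger.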
Human review of voice data collected and processed by automated means is a common method to review, improve and train the algorithms used in voice assistant technology. While not inherently problematic or contentious from a data protection perspective, this kind of processing has many data protection elements, which must be carefully considered and assessed by the companies providing such services to ensure that the use of user data is legitimate and appropriately protected.
I will briefly mention some of the key elements arising in this context. The first is ensuring an appropriate lawful basis to process personal data. Organisations need to identify a lawful basis under the GDPR which will permit the processing of voice data in the manner proposed, such as consent or legitimate interests, which are the most commonly used legal bases. Issues with the validity of consent arise where it is not demonstrably active, informed, specific, freely given and withdrawable. Likewise, reliance on a legitimate interests basis requires the company to demonstrate clearly that its legitimate interests are not outweighed by the rights of the users concerned.
The second element I would like to mention is the provision of adequate transparency to users about the type of processing taking place. Information must be provided in an understandable format which allows individuals to make informed choices as to how their data are processed and which facilitates the exercise of their data protection rights. Given the potential for voice processing to be invisible, particularly further processing for purposes not readily obvious to users, transparency measures need to be in place when devices are being installed, when they are in use and where a user wants to review what processing their device has undertaken.
The third element is the implementation of effective and integrated measures and safeguards to adhere to the principles of data protection. This element of compliance requires appropriate technical and organisational measures to be put in place to confirm that only personal data which are necessary for each specific purpose of the processing are actually processed. As I mentioned earlier, the human review of voice recordings is a common practice to improve the accuracy and effectiveness of algorithms designed to transcribe and translate voice data. However, user data must be adequately protected in this process, and indeed for all purposes for which voice recordings are used. Such safeguards and protections can include designing the process of evaluation of audio snippets, by either contractors or employees of the company, with data protection in mind from the start; being clear on what volume and size of audio snippets are necessary for each processing purpose; identifying clear conditions where it is necessary to recognise the person whose voice is processed; clear and plain transparency and privacy notices; technical security safeguards such as pseudonymisation and anonymisation of data; organisational measures; and opt-in features. There is a long list of safeguards which any company should take into account when processing such data.
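One of the technical safeguards named above, pseudonymisation, can be illustrated with a short sketch: before an audio snippet reaches a human reviewer, the user identifier attached to it is replaced with a keyed hash, so reviewers cannot link snippets back to an identifiable user, while the controller, who alone holds the key, retains the ability to do so where necessary. The key, field names and record structure here are hypothetical examples, not any company's actual implementation.

```python
import hashlib
import hmac

# Held separately by the controller and never shared with reviewers.
SECRET_KEY = b"example-key-held-by-controller"


def pseudonymise(record):
    """Return a copy of the record with the user id replaced by a keyed
    hash (HMAC-SHA256), so the snippet can be reviewed without exposing
    who spoke it. The same user always maps to the same pseudonym,
    which preserves per-user grouping for quality-control purposes."""
    token = hmac.new(SECRET_KEY,
                     record["user_id"].encode(),
                     hashlib.sha256).hexdigest()
    redacted = dict(record)
    redacted["user_id"] = token[:16]  # truncated pseudonym
    return redacted


snippet = {"user_id": "user-12345",
           "audio_ref": "clip-0007.ogg",
           "text": "turn on the lights"}
safe = pseudonymise(snippet)
assert safe["user_id"] != "user-12345"               # reviewer sees no real id
assert pseudonymise(snippet)["user_id"] == safe["user_id"]  # deterministic
```

Under the GDPR, pseudonymised data of this kind remain personal data, because the controller can re-identify the user with the key; full anonymisation would require the link to be irreversibly severed.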
The fourth and final element of compliance I wish to mention is the implementation of measures appropriate to the nature of the data being processed and the risks to users. This is a very important element of compliance because sometimes digital assistants are activated incorrectly, with the risk that private or sensitive conversations in the home or workplace are inadvertently recorded. While providers of voice assistant services have implemented preventative measures, organisations need to do more to reduce the incidence of such misactivations. Implementing adequate safeguards can balance out or minimise the data protection risks arising in this way.
The DPC is continuing to examine these issues in our ongoing engagement with the companies for which we are the EU lead supervisory authority. We acknowledge and welcome the recent changes made by several companies to enhance transparency for users concerning the practice of human review of voice data to improve voice assistant technology, as well as the implementation of greater user choice on the use of data in such contexts.
As lead supervisory authority, the DPC also continues to co-operate with our EU data protection colleagues to identify common areas of concern and to identify what further steps, including guidance, may be necessary to bring additional clarity to the application of data protection requirements in the use of voice assistant technology. I thank the committee for the opportunity to be here today. My colleagues and I will be happy to take questions.