Speech and voice are media through which we express ourselves. Speech communication can be used to command virtual assistants, to convey emotion or to identify oneself. How can we strengthen security and privacy for speech representation types in user-centric human/machine interaction?
Interdisciplinary exchange is in high demand. The need to better understand and develop user-centric security solutions and privacy safeguards in speech communication is of growing importance for commercial, forensic, and government applications. The SPSC Symposium is a platform to seek better designed services and products as well as better informed policy papers for legislators and governance. The symposium is organized by the ISCA SPSC special interest group and the VoicePrivacy Challenge Team.
The second edition of the Symposium on Security & Privacy in Speech Communication focuses on speech and voice as media through which we express ourselves. Because speech communication can be used to command virtual assistants, to convey emotion or to identify oneself, the symposium encourages participants to address how we can strengthen security and privacy for speech representation types in user-centric human/machine interaction. Interdisciplinary exchange is therefore in high demand, and the symposium aims to bring together researchers and practitioners across multiple disciplines, specifically signal processing, cryptography, security, human-computer interaction, law and anthropology.
The second edition of the VoicePrivacy Challenge Workshop is spearheading the effort to develop privacy preservation solutions for speech technology. It aims to consolidate the newly formed community to develop the task and metrics and to benchmark progress in anonymization solutions using common datasets, protocols and metrics. VoicePrivacy takes the form of a competitive challenge. Participants are required to develop anonymization algorithms which conceal speaker identity within speech signals. At the same time, they should preserve linguistic content and naturalness. VoicePrivacy 2022 Challenge participants are encouraged to submit to the SPSC Symposium papers related to their Challenge entry, as well as other scientific papers related to voice privacy and anonymization.
To strengthen the efforts of both events, ease joint discussions, and extend the interdisciplinary exchange, we decided to combine our teams and organize a joint event. For the general symposium, we welcome contributions on related topics, as well as progress reports, project dissemination, theoretical discussions and “work in progress”. In addition, guests from academia, industry and public institutions as well as interested students are welcome to attend the conference without having to make their own contribution.
Although we aim to meet all of you on-site, we also offer the option of virtual presentations during the workshop.
The Symposium is held at Incheon National University, in the Small Theatre, 2F, Shops & Service Centre (see the campus map).
It is approximately a 40-minute walk from the Interspeech convention center to INU.
The bus stop is called Incheon National Univ. College of Engineering (인천대학교공과대학). The following buses serve it:
We have a large lecture hall with 286 seats. The room is close to the campus dining halls for lunch.
Abstract: With the rise of deepfakes and synthetic media, the question as to what is real and what is not will become increasingly important and politicized. Deepfakes can be used to spread fake news, influence elections, introduce highly realistic fake evidence in courts and make fake pornographic videos. Each of these applications potentially has a big impact on society, social relationships, democracy and the rule of law. The question this talk shall assess is whether the current regulatory regime suffices to address these potential harms and, if not, which additional rules and principles should be adopted. It will discuss several potential amendments to the privacy and data protection regime, limitations to the freedom of expression and ex ante rules on the distribution and use of deepfake technologies.
Bio: Bart van der Sloot specializes in the area of Privacy and Big Data. He also publishes regularly on the liability of Internet Intermediaries, data protection and internet regulation. He has studied both philosophy (BA; MA) and law (BA; MA) in the Netherlands and Italy, also successfully completing the Honours Research Programme. He is an associate professor at the Tilburg Institute for Law, Technology, and Society of the University of Tilburg, Netherlands. Bart formerly worked for the Institute for Information Law, University of Amsterdam, where he wrote his PhD on privacy and virtue ethics, and for the Scientific Council for Government Policy (WRR) (part of the Prime Minister’s Office of the Netherlands) to co-author a report on the regulation of Big Data. Bart van der Sloot is the general editor of the international privacy journal European Data Protection Law Review. He also served as the director of the Privacy & Identity Lab from 2016 to 2021. From 2010 to 2020, he was the founder and coordinator of the Amsterdam Platform for Privacy Research (APPR), the minor Privacy Studies and the Amsterdam Privacy Conferences 2012, 2015 and 2018. Bart was awarded two highly prestigious research stipends by the Dutch Scientific Organisation. The Top Talent Research Grant fully covered his PhD project, and the Veni grant (2021-2025) covers a research project called: the right to be let alone ... by yourself.
Abstract: How big should voice be? How big can voice become? For some, conversational AI is a toy-like technology of convenience – an alarm, a music player, a teller of tales and news headlines. For businesses, conversational AI is most often a technology of automation and efficiency – one that alleviates call center burdens. But let’s pause and consider, for a moment, the possibilities. Here is an inclusive interface that will soon be resident on every digital device – in a world where every device is digital. Here’s a capability that will exist within every website, every smart system, every AI. We’re at the edge – and let’s dare to say it – of a worldwide voice web. Join Jon Stine, Executive Director of the Open Voice Network, a community of the Linux Foundation, for an exploration of where voice is, where it can go (for optimal societal and economic benefit), and what stands in its way. (Spoiler alert: he’s going to talk about interoperability and data ownership.)
Bio: Jon Stine is the Executive Director of The Open Voice Network, an open-source community of the Linux Foundation dedicated to developing technical standards and usage guidelines for the emerging world of artificial intelligence-enabled voice assistance. https://www.linkedin.com/in/jonstine/.
He brings to the task more than 30 years of executive leadership in the retail and technology industries. He led sales of a national apparel brand to better US department and specialty stores before joining the Intel Corporation in 2000 to head its first global outreach to the retail and consumer goods industries.
He joined Cisco Systems' retail-consumer goods consulting team in late 2006, and later headed Cisco’s North America consulting practice for retail-CPG. In 2014, he returned to Intel as the Global Enterprise Sales General Manager for the retail, hospitality, and consumer goods industries. He left Intel in 2019 to build The Open Voice Network.
He resides in Portland, Oregon, USA.
Abstract: Often overlooked, pseudonymisation can be an interesting alternative to anonymisation, especially in the context of speech data. Recognised as a safeguard for the rights and freedoms of data subjects, pseudonymisation can make GDPR compliance considerably easier to achieve. This talk will discuss the advantages of pseudonymisation, its special role for speech data (e.g. according to the European Data Protection Board's guidelines on voice assistants), and its seemingly bright future under the EU Data Governance Act.
Bio: Dr. iur. Paweł Kamocki is a legal expert at the Leibniz-Institut für Deutsche Sprache, Mannheim. He studied linguistics and law, and in 2017 obtained his doctorate in law from the universities of Paris and Münster for a thesis on legal aspects of data-intensive university research, with a focus on Knowledge Commons. He worked as a research and teaching assistant at the Paris Descartes university (now: Université de Paris), then also in the private sector. He is certified to work as an attorney in France. An active member of the CLARIN community since 2012, he currently chairs the CLARIN Legal and Ethical Issues Committee. He also worked with other projects and initiatives in the field of research data policy (RDA, EUDAT) and co-created several LegalTech tools for researchers. One of his main research interests is legal issues in Machine Translation.
Abstract: In recent years, we have witnessed the astonishing advancement of speech generation technology, thanks to the rapid development of deep learning. The state-of-the-art speech synthesis technology can clone a speaker’s voice with a few training samples and generate natural-sounding audio samples that the speaker never said. The technology can be misused to create misinformation, which spreads farther, faster, and more broadly than the truth and erodes our trust in online information. It can also be misused to attack voice biometric systems. This talk will first present a high-level overview of approaches to manipulate and synthesize audio. Then, it will highlight recent technical developments to detect manipulated and synthetic audio. This talk will also discuss some current challenges and the needs from a user point of view.
Bio: Zhizheng Wu is an associate professor in the School of Data Science, the Chinese University of Hong Kong, Shenzhen. He received his Ph.D. from Nanyang Technological University, Singapore in 2015 and worked for Meta (formerly known as Facebook), JD.COM, Apple, University of Edinburgh, and Microsoft Research Asia. Zhizheng was the creator of Merlin, an open-source speech synthesis toolkit. He co-initiated and co-organized the first Automatic Speaker Verification Spoofing and Countermeasures (ASVspoof) challenge at Interspeech 2015, the Voice Conversion Challenge 2016, and organized the Blizzard Challenge 2019. He is a member of the IEEE Speech and Language Processing Technical Committee (2021-2023).
We are glad to announce the program. All times are in KST:
|09:00 - 09:10||Opening Ceremony|
|09:10 - 10:10||Opening Keynote Deepfakes: regulatory challenges for the synthetic society with Bart van der Sloot|
|10:10 - 11:10||Tutorial 1 Anonymization Part 1 by Emmanuel Vincent|
|11:10 - 11:40||Break|
|11:40 - 12:00||Tutorial 2 Anonymization Part 2 by Xin Wang|
|12:00 - 12:20||Introduction of the VoicePrivacy Challenge with Natalia Tomashenko|
|12:20 - 13:00||Discussion about the VoicePrivacy Challenge with VPC Organizers|
|13:00 - 14:30||Lunch Break (self-service)|
|14:30 - 15:30||Privacy and security using speech and speaker recognition techniques Part 1|
|15:30 - 16:00||Invited Talk Detecting manipulated and synthetic audio with Zhizheng Wu|
|16:00 - 16:30||Break|
|16:30 - 17:00||Invited Talk Choose a pseudonym. Legal perspective on pseudonymisation of speech data. with Paweł Kamocki|
|17:00 - 18:00||SIG's SPSC Townhall Meeting What has been done, what will be done in SPSC with Tom Backström|
|18:00 - open||Social Event for on-site participants|
|09:00 - 11:00||VoicePrivacy Challenge|
|11:00 - 11:30||Break|
|11:30 - 12:30||Privacy and security using speech and speaker recognition techniques Part 2|
|12:30 - 13:30||Closing Keynote The Five Issues Between Voice and Its Value with Jon Stine|
|13:30 - 13:40||Closing Ceremony|
All times are given in Korea Standard Time (KST). You can use a time zone converter to check the times in your time zone.
Registration fees for the event:
Registration for the workshop can be completed using the Interspeech registration system. For event-only registration (without attending INTERSPEECH 2022), please use the following link: Event-only Registration
The event is open to everyone, regardless of their contribution to the VoicePrivacy challenge or SPSC symposium.
In addition, all VoicePrivacy challenge participants who submit results and system descriptions by 31 July are encouraged to present their work during the event (even if they did not submit papers to the SPSC symposium).
The VoicePrivacy initiative aims to promote the development of privacy preservation tools for speech technology and foster progress in the development of anonymization and pseudonymization solutions which suppress personally identifiable information contained within recordings of speech while preserving linguistic content and speech naturalness. VoicePrivacy takes the form of a competitive benchmarking challenge, with common datasets, protocols and metrics.
Ingo SIEGERT, Otto von Guericke University Magdeburg, Germany
Karla MARKERT, Fraunhofer AISEC, Germany
Tom BÄCKSTRÖM, Aalto University, Finland
Irina ILLINA, University of Lorraine, France
Hung-yi LEE, National Taiwan University, Taiwan
Jennifer WILLIAMS, University of Southampton, UK
Shri NARAYANAN, University of Southern California
Salima MEDHAFFAR, LIA - Avignon University, France
Gerald PENN, University of Toronto, Canada
Natalia TOMASHENKO, LIA - Avignon University, France
Nick EVANS, EURECOM, France