{{indexmenu_n>5}}
====== Whisper Guide ======
| |
===== Attention! =====
| |
**Use of Whisper is at the user's own responsibility. The setup illustrated in this guide runs a local instance of Whisper that does not send data outside of the local environment. Please make sure to handle your data correctly. If you have any doubts about how to do so, please start by following the steps on this page of our wiki: [[dcc:itsol:whisper:datamanage]].**
| |
**If you have any questions on Whisper that this guide does not answer, please feel free to send us a message at [[dcc@rug.nl|dcc@rug.nl]].**

**News Item (16-12-2025):** We have released a beta version of the Whisper interface with the option to add diarization (speaker recognition) to the transcription/translation job. This version of the interface is not yet final, so please keep in mind that not everything may work the way you want it to. We will complete the new interface in January; in the meantime, feel free to test it out using the default parameters.
| |
===== Introduction =====
| |
This guide takes you through the steps to set up a series of folders and a script to run speech-to-text transcription on the University of Groningen infrastructure (for UG staff and students), based on the [[https://openai.com/research/whisper|OpenAI Whisper automatic speech recognition (ASR) model]] running on the [[https://iris.service.rug.nl/tas/public/ssp/content/detail/service?unid=0d51dd1aa44f4cdcb4949f1702d1829f|Hábrók High Performance Computing]] (HPC) cluster.

The process of transcribing spoken audio to text is usually a very time-consuming manual process. The UG offers a licensed version of [[https://www.audiotranskription.de/en/f4transkript/|F4 Transkript]] on the University Workplace as an aid for manual transcription, but doesn't offer automatic speech recognition software.

This guide is offered by the DCC to help researchers process their research data as efficiently as possible, while optimizing data protection (keeping their audio files on UG storage instead of sending them to cloud services). For technical aspects, the service is supported by the Data Science and HPC team of the CIT. If you wish to read more on the detailed functionalities of Whisper, please refer to the [[https://github.com/openai/whisper|manual in their Git repository]].
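To give a first impression of what Whisper itself does, below is a minimal, illustrative Python sketch based on the openai-whisper package from the repository linked above. The model size ''base'' and the file name ''interview.mp3'' are placeholder assumptions, and this is **not** the Hábrók job script provided by the DCC; the actual folders and script for the cluster are set up in the next step of this guide.

<code python>
# Minimal, illustrative use of the openai-whisper package.
# Assumes the package is installed (e.g. "pip install openai-whisper")
# and that ffmpeg is available on the system.
import whisper

# "base" is one of the available model sizes; larger models are slower but more accurate.
model = whisper.load_model("base")

# "interview.mp3" is a placeholder name for your own audio recording.
result = model.transcribe("interview.mp3")

# The result is a dictionary; the full transcription is under the "text" key.
print(result["text"])
</code>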
| |
Should you have any further questions on the use or initial setup of Whisper on Hábrók HPC, please contact the DCC at [[dcc@rug.nl|dcc@rug.nl]].
| |
[[dcc:itsol:whisper:setup| → Move to the next step]]