dcc:itsol:whisper:scripts (current revision: 2025/02/03 08:49, giulio: added note on changed dependencies)
  
In order to run the script, you will first have to create it. Open your text editor of choice and copy the highlighted code below into the new file. Save the file with the name: ''whisper_runall.sh''.
**Note:** The PyTorch module needed to install Whisper has changed due to an update of Whisper's dependencies. The module shown in the screenshots is the previous version; please make sure to **use the version of the module you find in the text**.
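If the module version in the text ever changes again, you can check which versions the cluster currently offers before editing your script. This is a sketch assuming the cluster uses an Lmod-style module system; the commands are run in a terminal on the cluster, not inside the batch script.

```shell
# Sketch, assuming an Lmod-style module system on the cluster:
module avail PyTorch                               # list all installed PyTorch module versions
module load PyTorch/2.1.2-foss-2023a-CUDA-12.1.1   # the version used in this guide
module list                                        # confirm which modules are now loaded
```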
  
  
''#!/bin/bash''

''#SBATCH %%--%%time=08:00:00''

''#SBATCH %%--%%gpus-per-node=1''

''#SBATCH %%--%%mem=16000''
  
\\

''module load PyTorch/2.1.2-foss-2023a-CUDA-12.1.1''

''source $HOME/.envs/whisper/bin/activate''

''whisper $HOME/whisper_audio/* %%--%%model large-v2 %%--%%output_dir $HOME/whisper_output/''

----
The next three lines specify certain parameters for the batch script:
  
  * ''#SBATCH %%--%%time=08:00:00''
  
This line specifies the maximum time your job will run on the cluster. The format is ''hh:mm:ss''. The example asks for a maximum of 8 hours, which is plenty of time to cover about 15-20 hours of interviews. Should you run into longer processing times, this is the parameter you want to change.
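For example, to allow a job up to 24 hours (an illustrative value, not a recommendation), the line would become:

```shell
#SBATCH --time=24:00:00
```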
  
  * ''#SBATCH %%--%%gpus-per-node=1''
  
This line tells the cluster that the script is asking for 1 GPU to be allocated to this job. For Whisper, 1 GPU is more than enough to run the transcription; please do not modify this parameter.
  
  * ''#SBATCH %%--%%mem=16000''
  
This line specifies the amount of memory (RAM) requested for this job. In the default case, the script asks for 16GB of RAM to be allocated.
The next two lines make sure that the virtual environment and the dependencies that Whisper needs to run are correctly loaded:
  
  * ''module load PyTorch/2.1.2-foss-2023a-CUDA-12.1.1''
  
This line loads the program packages that Whisper needs to run. Please do not modify it; otherwise the script will not load the correct dependencies.
Finally, the last line is the actual command to run Whisper:
  
  * ''whisper $HOME/whisper_audio/* %%--%%model large-v2 %%--%%output_dir $HOME/whisper_output/''
  
If you wish to modify the location of the input audio, then you need to specify its ''PATH'' and replace ''$HOME/whisper_audio/*''. Please remember to add an ''*'' at the end of the PATH to let the program know that you wish to process all files present in the folder you selected. In the same way, modify the PATH after ''%%--%%output_dir'' if you wish to change the location of the output directory. Finally, if you wish to change the language model used, you need to change the value after ''%%--%%model''. Please consult the Whisper manual before changing the model.
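For illustration, a command with a different input folder, output folder, and model could look like the line below. The folder names ''my_interviews'' and ''my_transcripts'' and the ''medium'' model are hypothetical examples, not defaults of this guide.

```shell
# Hypothetical paths and model, shown only to illustrate the three
# parts of the command you may change:
whisper $HOME/my_interviews/* --model medium --output_dir $HOME/my_transcripts/
```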
  
++++

==== Specialized scripts ====

The script described above is a general use script. It relies on Whisper to make most of the decisions regarding the transcription. If you need to be stricter about what the program is allowed to do, you might want to use one of the scripts listed below.

It is good practice to create different scripts for different tasks, instead of modifying the same script based on your needs. This way, you do not have to modify a script again if you want to repeat a task you already set up in the past. This practice helps you keep order and is less prone to errors.

\\
 +
=== Forced English ===

This script forces Whisper to transcribe the audio into English. Use this script if the automatic language detection results in the wrong language (e.g. a strong English accent being recognized as Welsh instead of English). The same concept works for other supported languages, for example Dutch. To change which language is forced, simply substitute the string ''English'' with the desired language behind the ''%%--%%language'' option.

When you save the script, you can call it ''whisper_forcedEnglish.sh''. If you forced a different language, we advise you to label it accordingly. To execute it, simply type into the terminal ''sbatch whisper_forcedEnglish.sh'' and follow the same steps as the general script (see [[dcc:itsol:whisper:running|here]]).
 +
++++ Click to display the script |

''#!/bin/bash''

''#SBATCH %%--%%time=08:00:00''

''#SBATCH %%--%%gpus-per-node=1''

''#SBATCH %%--%%mem=16000''

\\

''module load PyTorch/2.1.2-foss-2023a-CUDA-12.1.1''

''source $HOME/.envs/whisper/bin/activate''

''whisper $HOME/whisper_audio/* %%--%%model large-v2 %%--%%language English %%--%%output_dir $HOME/whisper_output/''

++++
\\
=== Translate instead of transcribe ===

Whisper is also capable of translating audio in other supported languages into English. To let the program know that you wish to see a translation instead of a transcription, you need to specify which ''%%--%%task'' the program needs to perform. The script below is already edited to perform a translation. Please keep in mind that the transcript will then **only** be translated, and **the original text will not be displayed in the output files**. If you need to have the original as a means of comparison, you can either first run the general script on the audio, or you can run a forced language script (see above) before you run the translation.

When you save the script, you can call it ''whisper_translate.sh''. To execute it, simply type into the terminal ''sbatch whisper_translate.sh'' and follow the same steps as the general script (see [[dcc:itsol:whisper:running|here]]).

**Note**: Regardless of whether you run the transcription or the translation first, the file names of the output files will be exactly the same. For the second operation (translation or transcription) not to overwrite the first, you need to rename the output files before you run the second operation. This way, the output of your first operation will remain untouched.
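The renaming step above can be sketched as a small loop. The file names here are hypothetical examples, and the demo uses a temporary directory as a stand-in for ''$HOME/whisper_output/'' so you can try it safely; on the cluster you would point ''outdir'' at your real output folder.

```shell
# Sketch: prefix every file from the first run with "original_" so a
# second Whisper run cannot overwrite it. A temporary directory stands
# in for $HOME/whisper_output/; the file names are examples.
outdir=$(mktemp -d)
touch "$outdir/interview1.txt" "$outdir/interview1.srt"

for f in "$outdir"/*; do
  base=$(basename "$f")
  mv -- "$f" "$outdir/original_$base"   # keeps the first run's output safe
done

ls "$outdir"                            # the files now carry the original_ prefix
```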
 +
++++ Click to display the script |

''#!/bin/bash''

''#SBATCH %%--%%time=08:00:00''

''#SBATCH %%--%%gpus-per-node=1''

''#SBATCH %%--%%mem=16000''

\\

''module load PyTorch/2.1.2-foss-2023a-CUDA-12.1.1''

''source $HOME/.envs/whisper/bin/activate''

''whisper $HOME/whisper_audio/* %%--%%model large-v2 %%--%%task translate %%--%%output_dir $HOME/whisper_output/''

++++

\\
  
[[dcc:itsol:whisper:running| → Move to the next step]]