These exercises can be followed by anyone having a Hábrók account, as the required data is available to all users. They may therefore be useful if you need to get started with using the cluster, but are not able to follow the basic course in person.

The accompanying

The hostname of the Hábrók cluster to use for the exercises is: ''
The end goal of these exercises is to submit three jobs and study the results. The first job will run some R code that generates an animated GIF file of the Mandelbrot set. The second job will run a Python script on climate data and generate both text output and a plot of temperature data for a city. The third job will train a neural network based rice classifier.
In the first part of the exercises we are going to use the command-line to set up some directories and files for these three jobs. The directories will contain all the input files (scripts, data) for the different jobs that we are going to submit.
In the second part of the exercises we will write the job scripts for all three jobs, actually submit them, and, finally, study their results.
===== Exercises for Part I =====
You can again just copy and paste the code on the command line: <code>
[username@login1 ~]$ ls /
dataset.tar.gz
[username@login1 ~]$
</code>
</code>
==== Exercise 4 - Command-line: ====
=== a. Change back to the jobs directory ===
<code>
[username@login1 username]$ ls
climate.csv
[username@login1 username]$
</code>
==== Exercise 6 - Using R within a job ====

In this exercise we will generate an animated image file showing an iterative generation of the Mandelbrot fractal using some code in R. You can find more details on the Mandelbrot set on [[wp>Mandelbrot set|Wikipedia]].

The main purpose of the exercise is to learn how to submit R code using a job script to the cluster. Having a nice image as a result is a bonus.
=== a. Go to the job directory for the Mandelbrot job ===
  - Did you forget to include the Shebang! line at the top of the file? Better do it now, then.
<hidden solution>
You can write the jobscript with the text editor you prefer. Note that instructions for the batch scheduler have to be given in lines starting with ''#SBATCH''.

Here is the information you need to put in:
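As an illustration, here is a minimal sketch of what such a jobscript could look like. Every ''#SBATCH'' value below (job name, time limit, memory) and the exact R module name are placeholder assumptions, not the official course settings; fill in the values from the exercise instead.

```shell
#!/bin/bash
# Write out a minimal example jobscript. Every #SBATCH value below is a
# placeholder assumption, not the official course setting.
cat > jobscript.sh << 'EOF'
#!/bin/bash
#SBATCH --job-name=mandelbrot
#SBATCH --time=00:10:00
#SBATCH --mem=1GB

# Load an R module (the exact module name/version is an assumption)
module load R

# Run the R script that generates the animated GIF
Rscript ex1_mandelbrot.R
EOF

# Show the scheduler directives we just wrote
grep '^#SBATCH' jobscript.sh
```

The first line of the generated file is the shebang; the ''#SBATCH'' lines must come before any regular commands, since the scheduler only reads directives at the top of the script.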
When you open the file on your local computer you will see an animated version of the Mandelbrot fractal set.
Details on this calculation can be found on [[wp>Mandelbrot set|Wikipedia]].
</hidden>
==== Exercise 7 - Using Python within a job ====

In this exercise we will run a Python script that will analyze some temperature data for cities around the world, stored in a CSV file. The result will be a graph showing the average temperature over a period of time for the city of your choosing.
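The shape of that analysis can be sketched directly on the command line. The file name and the city,year,temperature column layout below are assumptions made for illustration; the real data file used in the exercise may be laid out differently.

```shell
#!/bin/bash
# Build a tiny sample file; the column layout (city,year,temperature) is an
# assumption for illustration, not the format of the real exercise data.
cat > sample_climate.csv << 'EOF'
city,year,temperature
Groningen,2000,9.4
Groningen,2001,9.8
Utrecht,2000,10.1
EOF

# Average temperature for one city (here: Groningen) -> prints 9.6
awk -F, '$1 == "Groningen" { sum += $3; n++ } END { printf "%.1f\n", sum / n }' sample_climate.csv
```

The Python script in the exercise does the same kind of grouping and averaging, and additionally draws the result as a plot.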
=== a. Go to the climate job directory ===
</code>
=== d. Make sure you have a valid city name and submit the job ===
Make sure that you have replaced the city name placeholder with the name of a major city (see exercise 3h of the first part), and submit the job.
</code>
==== Exercise 8 - Using local storage within a job ====

In this exercise we will train a neural network to recognize the type of rice from a picture of a rice grain. The training data set consists of 14,000 pictures for each of 5 types of rice grain: Jasmine, Basmati, Arborio, Ipsala and Karacadag. In addition, the data set contains 1,000 pictures per type to test the quality of the resulting neural network.

The data set can be found at: https://

Here are a few sample images:

^ Karacadag ^ Arborio ^ Jasmine ^

The main purpose of the exercise is to show you how to handle data sets containing many small files, in this case 75,000. If you were to extract the data set on the /scratch file system, you would notice that this is quite slow: /scratch handles many small files poorly, as it has been optimized for streaming large files.

We will therefore not go into more detail about how to use the resulting neural network.
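The pattern the exercise is after can be sketched as follows: keep the data as a single compressed archive on the shared file system, and only unpack it on node-local storage inside the job. The temporary directories below stand in for the job's local storage; the actual local storage path on a Hábrók compute node is an assumption left out here.

```shell
#!/bin/bash
# Simulate the pattern: extract an archive with many small files onto
# node-local storage instead of the shared /scratch file system.
# mktemp -d stands in for the node-local storage path; the real path on
# a compute node is an assumption left out of this sketch.
workdir=$(mktemp -d)

# Build a small stand-in archive (the real data set holds 75,000 images)
mkdir -p "$workdir/dataset"
for i in 1 2 3; do echo "pixels" > "$workdir/dataset/grain_$i.jpg"; done
tar czf "$workdir/dataset.tar.gz" -C "$workdir" dataset

# Extract onto "local storage" and work on the data there
localdir=$(mktemp -d)
tar xzf "$workdir/dataset.tar.gz" -C "$localdir"
ls "$localdir/dataset" | wc -l   # prints 3
```

Reading one big archive from the shared file system is a single streaming operation, which is exactly what /scratch is good at; the many small file operations then all happen on the fast local disk.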
=== a. Go to the rice_classifier job directory ===
</code>
  - Extract the compressed dataset to the right location on the local storage: <code>
tar xzf /
</code>
  - Run the training: <code>
# Extract the compressed data file to local storage
tar xzf /

echo Starting Python program
# Train the classifier
=== f. Study the output file ===
Study the SLURM output file and solve any errors, if necessary. Note that TensorFlow gives several warnings about not being able to use the CUDA library for an NVIDIA GPU; these can be ignored.
<hidden solution>
If everything went right, no error messages should appear in the output file.
This is generally not a good idea, since the results might be large / contain lots of files, but it is fine for this particular example.
Copy the contents of the ''
<hidden solution>
You can use the MobaXterm file browser for downloading the file to your desktop or laptop for inspection. For this you need to move the file browser to the rice_classifier job directory. Select the png file and click on the download button with the arrow pointing downwards.
</hidden>
There will be three plots. The first shows the accuracy over the training epochs for both the training and testing data sets. The second plot shows the loss function, which is another measure of the accuracy of the neural network. The third plot shows the confusion matrix, which counts how the images in the testing data set were labeled, comparing the predicted labels to the real labels.
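To make the confusion matrix idea concrete, here is a toy count of real versus predicted labels on the command line; the label pairs below are invented for illustration, not taken from the actual classifier output.

```shell
#!/bin/bash
# Toy confusion-matrix count: each line is "real predicted"; the label
# pairs below are invented for illustration.
cat > labels.txt << 'EOF'
Jasmine Jasmine
Jasmine Basmati
Basmati Basmati
Basmati Basmati
EOF

# Count how often each (real, predicted) combination occurs
sort labels.txt | uniq -c
```

Each output line is one cell of the confusion matrix: entries where the two labels match lie on the diagonal (correct predictions), and mismatched pairs such as "Jasmine Basmati" are misclassifications.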
==== The End ====
Congratulations! You have submitted your first jobs to the Hábrók cluster. With what you've learned you should be able to write your own job scripts and run calculations on the Hábrók cluster.