Differences

This shows you the differences between two versions of the page.

habrok:additional_information:course_material:exercises_solutions [2024/10/10 08:32] – [Exercise 8 - Using local storage within a job] fokke
habrok:additional_information:course_material:exercises_solutions [2024/11/26 10:29] (current) – Use dokuwiki wp links for wikipedia pedro
Line 610: Line 610:
  
 ==== Exercise 6 - Using R within a job ==== ==== Exercise 6 - Using R within a job ====
 +
 +In this exercise we will generate an animated image file showing an iterative generation of the Mandelbrot fractal using some code in R. You can find more details on the Mandelbrot set [[wp>Mandelbrot_set|here]].
 +
 +The main purpose of the exercise is to learn how to submit R code using a job script to the cluster. Having a nice image as a result is a bonus.
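
To make the iterative generation mentioned above more concrete, here is a minimal sketch of the escape-time iteration behind the Mandelbrot set. It is written in Python for illustration and is not the R code used in the exercise; the function names, the plotted region and the grid size are assumptions chosen for this example.

```python
def escape_count(c, max_iter):
    """Count iterations of z -> z*z + c, starting at z = 0, while |z| <= 2."""
    z = 0j
    n = 0
    while n < max_iter and abs(z) <= 2.0:
        z = z * z + c
        n += 1
    return n


def mandelbrot_grid(width, height, max_iter):
    """Return rows of escape counts for the region [-2, 1] x [-1.5, 1.5]."""
    grid = []
    for row in range(height):
        # Map the pixel row to the imaginary axis.
        imag = -1.5 + 3.0 * row / (height - 1)
        grid.append([
            # Map the pixel column to the real axis.
            escape_count(complex(-2.0 + 3.0 * col / (width - 1), imag), max_iter)
            for col in range(width)
        ])
    return grid


# One grid per iteration limit: each grid plays the role of one animation frame.
frames = [mandelbrot_grid(31, 15, k) for k in range(1, 6)]
```

Points whose count equals the iteration limit have not escaped yet; as the limit grows, fewer points remain, which is what gives an animation of this kind its step-by-step refinement.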
  
 === a. Go to the job directory for the Mandelbrot job === === a. Go to the job directory for the Mandelbrot job ===
Line 908: Line 912:
  
 When you open the file on your local computer you will see an animated version of the Mandelbrot fractal set. When you open the file on your local computer you will see an animated version of the Mandelbrot fractal set.
-Details on this calculation can be found at: https://en.wikipedia.org/wiki/Mandelbrot_set
+Details on this calculation can be found [[wp>Mandelbrot_set|here]].
 </hidden> </hidden>
  
Line 974: Line 978:
  
 ==== Exercise 7 - Using Python within a job ==== ==== Exercise 7 - Using Python within a job ====
 +
 +In this exercise we will run a Python script that analyzes temperature data for cities around the world, stored in a CSV file. The result will be a graph showing the average temperature over a period of time for the city of your choosing.
  
 === a. Go to the climate job directory === === a. Go to the climate job directory ===
Line 1149: Line 1155:
  
 ==== Exercise 8 - Using local storage within a job ==== ==== Exercise 8 - Using local storage within a job ====
 +
 +In this exercise we will train a neural network to recognize the type of rice from a picture of a rice grain. The training data set consists of 14,000 pictures for each of 5 types of rice: Jasmine, Basmati, Arborio, Ipsala and Karacadag. In addition, the data set has 1,000 pictures of each type to test the quality of the resulting neural network.
 +
 +The data set can be found at: https://www.kaggle.com/datasets/muratkokludataset/rice-image-dataset
 +
 +Here are a few sample images:
 +
 +|{{:habrok:additional_information:course_material:karacadag_10004_.jpg?nolink|}}|{{:habrok:additional_information:course_material:arborio_100_.jpg?nolink|}}|{{:habrok:additional_information:course_material:jasmine_10003_.jpg?nolink|}}|
 +^ Karacadag ^ Arborio ^ Jasmine ^
 +
 +The main purpose of the exercise is to show you how to handle data sets containing many small files, in this case 75,000. If you extract the data set on the /scratch file system, you will notice that this is quite slow: /scratch has been optimized for streaming large files and works poorly for many small files.
 +
 +We will therefore not go into more detail about how to use the resulting neural network.
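
The idea of unpacking one large archive onto job-local storage, instead of handling the small files on /scratch, can be sketched in Python as follows. This is an illustrative sketch only: the job script in this exercise uses `tar` directly, and the helper name `extract_to_local` and the fallback to the system temporary directory are assumptions made for this example.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def extract_to_local(archive_path, local_root=None):
    """Unpack a .tar.gz archive into a 'dataset' directory on local storage.

    If local_root is not given, the TMPDIR environment variable is used,
    falling back to the system temporary directory (an assumption for
    running this sketch outside a job).
    """
    root = Path(local_root or os.environ.get("TMPDIR", tempfile.gettempdir()))
    target = root / "dataset"
    target.mkdir(parents=True, exist_ok=True)
    # The archive is read as one large file; the many small files
    # are only created on the local disk.
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target)
    return target
```

Reading a single archive suits a file system tuned for streaming large files, while the many small files only ever exist on the local disk of the node.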
  
 === a. Go to the rice_classifier job directory === === a. Go to the rice_classifier job directory ===
Line 1265: Line 1284:
 tar xzf /scratch/$USER/dataset.tar.gz -C $TMPDIR/dataset tar xzf /scratch/$USER/dataset.tar.gz -C $TMPDIR/dataset
  
-echo Starting Pyton program
+echo Starting Python program
  
 # Train the classifier # Train the classifier
Line 1293: Line 1312:
  
 === f. Study the output file === === f. Study the output file ===
-Study the SLURM output file and solve any errors, if necessary. Note that tensorflow gives several warnings about not being able to use a Nvidia GPU, which can be ignored.
+Study the SLURM output file and solve any errors, if necessary. Note that TensorFlow gives several warnings about not being able to use the CUDA library for an Nvidia GPU, which can be ignored.
 <hidden solution> <hidden solution>
 If everything went right no error messages should appear in the output file.  If everything went right no error messages should appear in the output file. 
Line 1310: Line 1329:
 </hidden> </hidden>
  
 +There will be three plots. The first shows the accuracy per training epoch for both the training and testing data sets. The second shows the loss function, which is another measure of the accuracy of the neural network. The third shows the confusion matrix, which counts how the images in the testing data set were labeled, comparing the predicted labels to the real labels.
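
As a small illustration of how a confusion matrix is counted, here is a sketch in plain Python. It is not the code used by the exercise; the function names and the toy labels are made up for this example, and the layout (rows for real labels, columns for predicted labels) is one common convention that may differ from the plot produced by the job.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count label pairs: rows are real labels, columns are predicted labels."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, predicted in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[predicted]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy data, invented for illustration only.
classes = ["Arborio", "Basmati", "Jasmine"]
real = ["Arborio", "Arborio", "Basmati", "Jasmine", "Jasmine"]
predicted = ["Arborio", "Basmati", "Basmati", "Jasmine", "Arborio"]
matrix = confusion_matrix(real, predicted, classes)
```

Entries on the diagonal are correct predictions; every off-diagonal entry shows which type was mistaken for which other type.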
 ==== The End ==== ==== The End ====
  
 Congratulations! You have submitted your first jobs to the Hábrók cluster. With what you've learned you should be able to write your own job scripts and run calculations on the Hábrók cluster.  Congratulations! You have submitted your first jobs to the Hábrók cluster. With what you've learned you should be able to write your own job scripts and run calculations on the Hábrók cluster.