
For this assignment, you may use a Python program or a JupyterLab notebook. 

(0) Completion of Labs and Reading

  • If you have not yet completed the in-class work or the weekly reading, then you may want to finish that first. Recent lecture notes on Canvas may also be useful... 

(0.5) Check out the repository:


Use this link to accept the assignment and create your repository on GitHub:   https://classroom.github.com/a/NKjGQ23OwIjIwZVv

After you accept the assignment and the repository exists in your GitHub account, clone the repository into your working area on Rivanna.

Processing a data file (reading and writing)

For this example, we will examine the full Iris data set mentioned in class.

...

-- Iris Versicolour

-- Iris Virginica




1)

...

3 Points:

Write a program (or notebook) iris_parse.py (or .ipynb) that performs the following actions:

  • reads the data file, iris.data, one line at a time. Note that it is in your repository. 

  • prints it back to three different files depending on the class: "Setosa.out",  "Versicolour.out", and "Virginica.out". These files only need to include the 4 numbers. Don't include the name in the output file; it would be repetitive and complicate the next step.     
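The two steps above could be sketched as follows. This is a minimal sketch, not the required solution: the function name split_iris is hypothetical, and it assumes each line of iris.data holds four comma-separated numbers followed by a class label such as Iris-setosa, Iris-versicolor, or Iris-virginica (check your copy of the file before relying on those labels).

```python
from contextlib import ExitStack
from pathlib import Path

# Assumed mapping from the class labels in iris.data to the output file names.
OUTPUT_FILES = {
    "Iris-setosa": "Setosa.out",
    "Iris-versicolor": "Versicolour.out",
    "Iris-virginica": "Virginica.out",
}


def split_iris(data_path, out_dir="."):
    """Read data_path line by line and write the 4 numbers to one file per class.

    Returns a dict with the number of rows written for each class label.
    """
    counts = {label: 0 for label in OUTPUT_FILES}
    with ExitStack() as stack:
        infile = stack.enter_context(open(data_path))
        outputs = {
            label: stack.enter_context(open(Path(out_dir) / name, "w"))
            for label, name in OUTPUT_FILES.items()
        }
        for line in infile:
            fields = line.strip().split(",")
            if len(fields) != 5 or fields[-1] not in outputs:
                continue  # skip blank or unexpected lines
            *values, label = fields
            # Write only the numbers, separated by spaces, without the class name.
            outputs[label].write(" ".join(values) + "\n")
            counts[label] += 1
    return counts
```

Calling split_iris("iris.data") from the repository directory would then create the three .out files there. Writing the numbers separated by spaces is a design choice that lets np.loadtxt read the files later with its default settings.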


2)

...

2 Points:

Write a second program (or notebook) iris_loadtxt.py (or .ipynb) that reads each of these files into a NumPy array using the function np.loadtxt (you will have 3 NumPy arrays, each with 4 columns of values). Use these columns along with NumPy functions to print summary statistics (mean and standard deviation) to the screen. Make sure it is clear what you are printing to the screen.
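As a rough illustration of the idea, the sketch below loads one file and computes per-column statistics. The names column_stats and report are hypothetical, and the column order (sepal length, sepal width, petal length, petal width) is an assumption based on the example table in this assignment; confirm it against the data description.

```python
import numpy as np

# Assumed column order of the four measurements in each .out file.
COLUMNS = ["sepal length", "sepal width", "petal length", "petal width"]


def column_stats(path):
    """Return (means, standard deviations) for the columns of a whitespace-separated file."""
    data = np.loadtxt(path, ndmin=2)  # ndmin=2 keeps a 2-D shape even for a single row
    return data.mean(axis=0), data.std(axis=0)


def report(name, path):
    """Print a labelled mean and standard deviation for each column of one file."""
    means, sds = column_stats(path)
    print(f"Iris {name}")
    for col, mean, sd in zip(COLUMNS, means, sds):
        print(f"  {col:<13} mean = {mean:.3f}   std = {sd:.3f}")
```

A call such as report("Setosa", "Setosa.out") would print one labelled line per column. Note that np.std computes the population standard deviation by default (ddof=0); pass ddof=1 if you want the sample standard deviation instead.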

2b) Want to be an A student?  Yes? Then, try this part! (1 point)

Make a table including the summary statistics for each type of iris: average and standard deviation for the 4 attributes of each flower class.   Put the extra effort in to make your table look nice and easy to read and understand.  

Your output might look something like this:

Class               sepal length    sepal width    petal length    petal width
Iris Setosa         Avg +- SD       ...
Iris Versicolour    ...
Iris Virginica

Make sure your output is well-organized and easy to read, then write it to a file called summary.txt. Formatting matters!
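One possible way to get aligned columns is with f-string width specifiers. The sketch below is only an illustration: format_summary is a hypothetical helper, and it assumes you have already collected a (mean, SD) pair for each of the four attributes of each class.

```python
ATTRIBUTES = ["sepal length", "sepal width", "petal length", "petal width"]


def format_summary(stats, width=18):
    """Build a fixed-width table from {class name: [(mean, sd), ...]}.

    Each class is expected to carry one (mean, sd) pair per attribute.
    """
    header = f"{'Class':<{width}}" + "".join(f"{name:>{width}}" for name in ATTRIBUTES)
    lines = [header, "-" * len(header)]
    for cls, pairs in stats.items():
        row = f"{cls:<{width}}"
        for mean, sd in pairs:
            cell = f"{mean:.2f} +- {sd:.2f}"
            row += f"{cell:>{width}}"  # right-align each cell under its header
        lines.append(row)
    return "\n".join(lines)
```

The returned string can be printed and also written to summary.txt with an ordinary open("summary.txt", "w") block. A fixed column width keeps every row the same length, which is what makes the table easy to scan.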


(3) Pi again!   Plotting the summary of many MC integration results.  (3 Points)


Start with the program from your repository plotting_pi_mc.py or the equivalent notebook (note that they import pi_functions.py for the calculation as we did in class).   

Add a new function (or functions) that computes the pi estimate many times (N_MC) and plots a histogram of the N_MC results. For each estimate, use N = 1000 random x,y values (N is the number of random points thrown for the MC integration). On the figure, draw a line at the true value of pi, and put the following in the title: the estimated value of pi (the average of all estimates), the standard deviation, and the N_MC used.


NOTE:  You need to select the range and binning carefully to visualize your data well.  I used 30 bins and range of 2.85-3.40.


Make this figure 4 times and save it to a file:  N_MC=10; 100 ; 1000 ; 10,000.  pi_mc_10.png, pi_mc_100.png, pi_mc_1000.png, pi_mc_10000.png
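A hedged sketch of what such a function might look like is given below. The contents of pi_functions.py are not shown here, so estimate_pi is a hypothetical stand-in that assumes the hit-or-miss method on the unit square; plot_pi_histogram and its parameters are also illustrative names. The Agg backend is selected on the assumption that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so render straight to file
import matplotlib.pyplot as plt
import numpy as np


def estimate_pi(n, rng):
    """Hypothetical stand-in for the class function: hit-or-miss estimate of pi."""
    x = rng.random(n)
    y = rng.random(n)
    inside = np.count_nonzero(x * x + y * y <= 1.0)  # points inside the quarter circle
    return 4.0 * inside / n


def plot_pi_histogram(n_mc, n=1000, filename=None, seed=None):
    """Make n_mc estimates of pi, histogram them, and return (mean, standard deviation)."""
    rng = np.random.default_rng(seed)
    estimates = np.array([estimate_pi(n, rng) for _ in range(n_mc)])
    mean, sd = estimates.mean(), estimates.std()

    fig, ax = plt.subplots()
    ax.hist(estimates, bins=30, range=(2.85, 3.40))  # binning suggested in the note above
    ax.axvline(np.pi, color="red", linestyle="--", label="true pi")
    ax.set_title(f"pi = {mean:.4f}, SD = {sd:.4f}, N_MC = {n_mc}")
    ax.set_xlabel("pi estimate")
    ax.set_ylabel("count")
    ax.legend()
    if filename is not None:
        fig.savefig(filename)
    plt.close(fig)
    return mean, sd
```

Looping over the four N_MC values and passing a file name such as f"pi_mc_{n_mc}.png" would then produce one figure per value.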


You should see the Central Limit Theorem in action!  Think about that...  
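One way to think about it, offered as a sketch under the assumption that each estimate comes from the usual hit-or-miss method (counting how many of N random points in the unit square land inside the quarter circle):

```latex
\hat{\pi} = \frac{4}{N}\sum_{i=1}^{N} h_i,
\qquad h_i \in \{0, 1\},
\qquad P(h_i = 1) = p = \frac{\pi}{4}

\sigma_{\hat{\pi}} = 4\sqrt{\frac{p\,(1-p)}{N}}
= \sqrt{\frac{\pi\,(4-\pi)}{N}}
\approx 0.052 \quad \text{for } N = 1000
```

Under that assumption each estimate is an average of N independent 0/1 outcomes, so the histogram should approach a Gaussian centered on pi. Its width is set by N, not by N_MC; increasing N_MC only fills in the shape more smoothly.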


(4) The Gaussian Function (2 Points)

Start with the program from your repository named gaussian.py (or you may use the Jupyter notebook), which draws a Gaussian normal distribution (mean=0, sigma=1). Fill the regions < 1-sigma, > 1-sigma & < 2-sigma, > 2-sigma & < 3-sigma, and > 3-sigma with different colors using the function fill_between, axvspan, or something else. Use the documentation and Google to figure out how to fill regions under the curve! Your plot should look something like this, but maybe even better!


[Image: example Gaussian plot with the sigma regions filled in different colors]

Make sure to save this plot as PlotGaussian.png.
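As one possible approach, the sketch below shades each sigma band with fill_between. The starting code in gaussian.py is not shown here, so the gaussian and plot_gaussian_bands functions, the plotted range of -4 to 4, and the colors are all illustrative assumptions. The Agg backend is chosen on the assumption that no display is available.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so render straight to file
import matplotlib.pyplot as plt
import numpy as np


def gaussian(x, mean=0.0, sigma=1.0):
    """Normal probability density with the given mean and sigma."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def plot_gaussian_bands(filename="PlotGaussian.png", x_max=4.0):
    """Draw the unit Gaussian and shade the sigma bands in different colors."""
    x = np.linspace(-x_max, x_max, 801)
    fig, ax = plt.subplots()
    ax.plot(x, gaussian(x), color="black")

    # (lower edge, upper edge, color, legend label) in units of sigma
    bands = [
        (0.0, 1.0, "tab:blue", "< 1 sigma"),
        (1.0, 2.0, "tab:green", "1 to 2 sigma"),
        (2.0, 3.0, "tab:orange", "2 to 3 sigma"),
        (3.0, x_max, "tab:red", "> 3 sigma"),
    ]
    for lo, hi, color, label in bands:
        xs = np.linspace(lo, hi, 200)
        # Shade the band on the positive side, then its mirror image on the negative side.
        ax.fill_between(xs, gaussian(xs), color=color, alpha=0.6, label=label)
        ax.fill_between(-xs, gaussian(-xs), color=color, alpha=0.6)

    ax.set_xlabel("x")
    ax.set_ylabel("probability density")
    ax.legend()
    fig.savefig(filename)
    plt.close(fig)
```

Evaluating each band on its own grid avoids small gaps at the band edges. Passing a boolean mask through the where argument of fill_between is an alternative worth trying.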


Done? Make sure you answered any questions, then clean up your code and include some useful comments. Then push your programs and outputs to your repository on GitHub: Setosa.out, Versicolour.out, Virginica.out, summary.txt, iris_parse.py, and iris_loadtxt.py (or iris_parse.ipynb and iris_loadtxt.ipynb if you use a notebook), plotting_pi_mc.py, pi_mc_10.png, pi_mc_100.png, pi_mc_1000.png, pi_mc_10000.png, pi_mc_all4.png, gaussian.py, and PlotGaussian.png.


Start your work early, so you can get assistance if needed.