Please read ALL the instructions carefully, especially those on the main homework page.   

Read the whole assignment before you begin.

For this assignment, you may use a Python program or a JupyterLab notebook. 

(0) Completion of Labs and Reading

(0.5)  Check out the repository:

Use this link to accept the assignment and create your repository on GitHub:

After you accept the assignment and the repository exists in your GitHub account, clone the repository into your working area on Rivanna.

Processing a data file (reading and writing)

For this example, we will examine the full Iris data set mentioned in class.

In this data set, there are 5 columns of information (attributes):

1. sepal length in cm

2. sepal width in cm

3. petal length in cm

4. petal width in cm

5. class:

-- Iris Setosa

-- Iris Versicolour

-- Iris Virginica
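As a small illustration of the five attributes listed above, here is a minimal sketch of splitting one record into its four measurements and its class label. It assumes each record is a single comma-separated line; the sample line and the function name parse_iris_line are hypothetical, so check the actual data file before relying on this.

```python
def parse_iris_line(line):
    """Split one record into (list of four floats in cm, class label).

    Assumption: fields are comma-separated in the order listed above.
    """
    fields = line.strip().split(",")
    measurements = [float(value) for value in fields[:4]]
    label = fields[4]
    return measurements, label


# Hypothetical sample record in the assumed format.
sample = "5.1,3.5,1.4,0.2,Iris-setosa"
measurements, label = parse_iris_line(sample)
print(measurements, label)
```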

1) 3 Points:

Write a program (or a .ipynb notebook) that performs the following actions:

2) 2 Points:

Write a second program (or a .ipynb notebook) to read each of these files into a NumPy array using the function np.loadtxt (you will have 3 NumPy arrays with 4 columns of values in each).  Use these columns along with NumPy functions to print summary statistics (mean and standard deviation) to the screen.  Make sure it is clear what you are printing to the screen.
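One possible shape for this step is sketched below. It assumes the files hold four whitespace-separated numeric columns (pass a delimiter if yours differ); the helper names summarize and print_summary are hypothetical, and the demo reads from an in-memory buffer rather than a real file.

```python
import io

import numpy as np

COLUMNS = ["sepal length", "sepal width", "petal length", "petal width"]


def summarize(source, delimiter=None):
    """Load a 4-column table and return per-column (means, stds)."""
    data = np.loadtxt(source, delimiter=delimiter)
    # axis=0 collapses the rows, leaving one value per column.
    # np.std uses the population formula (ddof=0) by default.
    return data.mean(axis=0), data.std(axis=0)


def print_summary(name, means, stds):
    """Print labelled statistics so it is clear what each number is."""
    print(f"Summary statistics for {name}")
    for column, mean, std in zip(COLUMNS, means, stds):
        print(f"  {column:12s} (cm): mean = {mean:.3f}, std = {std:.3f}")


# Demo with made-up numbers; np.loadtxt also accepts a file name.
demo = io.StringIO("5.1 3.5 1.4 0.2\n4.9 3.0 1.4 0.2\n")
means, stds = summarize(demo)
print_summary("demo data", means, stds)
```

For the assignment itself, the same two calls would be repeated for Setosa.out, Versicolour.out, and Virginica.out.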

(3) Pi again!   Plotting the summary of many MC integration results.  (3 Points)

Start with the program from your repository or the equivalent notebook (note that they import for the calculation as we did in class).   

Add a new function (or functions) that repeats the pi estimate many times (N_MC) and plots a histogram of the N_MC estimates.  For each estimate, use N = 1000 random x,y values (N is the number of random values thrown for the MC integration).  On the figure, draw a line at the true value of pi, and write in the title: the estimated value of pi (the average of all estimates), the standard deviation, and the value of N_MC used.

NOTE:  You need to select the range and binning carefully to visualize your data well.  I used 30 bins and a range of 2.85 to 3.40.

Make this figure 4 times, once for each of N_MC = 10, 100, 1000, and 10,000, and save each one to its own file:  pi_mc_10.png, pi_mc_100.png, pi_mc_1000.png, pi_mc_10000.png
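The sketch below shows one way the pieces could fit together. It is not the class program: the estimate here assumes a hit-or-miss method on the unit quarter circle, and the names estimate_pi and plot_pi_histogram, the seed argument, and the Agg backend choice are all hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; figures go to files only
import matplotlib.pyplot as plt
import numpy as np


def estimate_pi(n, rng):
    """One MC estimate: fraction of n random points inside the quarter circle."""
    x = rng.random(n)
    y = rng.random(n)
    inside = np.count_nonzero(x**2 + y**2 <= 1.0)
    return 4.0 * inside / n


def plot_pi_histogram(n_mc, n=1000, filename=None, seed=None):
    """Make n_mc estimates of pi, histogram them, and return the estimates."""
    rng = np.random.default_rng(seed)
    estimates = np.array([estimate_pi(n, rng) for _ in range(n_mc)])
    mean, std = estimates.mean(), estimates.std()

    fig, ax = plt.subplots()
    ax.hist(estimates, bins=30, range=(2.85, 3.40))
    ax.axvline(np.pi, color="red", linestyle="--", label="true value of pi")
    ax.set_title(f"pi = {mean:.4f}, std = {std:.4f}, N_MC = {n_mc}")
    ax.set_xlabel("estimate of pi")
    ax.set_ylabel("number of estimates")
    ax.legend()
    if filename is not None:
        fig.savefig(filename)
    plt.close(fig)  # free the figure so repeated calls do not pile up
    return estimates
```

Calling plot_pi_histogram(n_mc, filename=f"pi_mc_{n_mc}.png") for each of the four N_MC values would then write the four requested files.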

You should see the Central Limit Theorem in action!  Think about that...  

(4) The Gaussian Function (2 Points)

Start with the program from your repository named ( - or you may use the Jupyter Notebook) that draws a Gaussian normal distribution (mean=0, sigma=1). Fill the < 1-sigma, 1-sigma to 2-sigma, 2-sigma to 3-sigma, and > 3-sigma regions with different colors using the function fill_between, axvspan, or something else. Use the documentation and Google to figure out how to fill regions under the curve! Your plot should look something like this, but maybe even better!

Make sure to save this plot as PlotGaussian.png.
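As a starting point, here is a minimal sketch of filling the sigma bands with fill_between and its where argument. It does not build on the repository program; the function names, colors, and x range are hypothetical choices, and axvspan would work as an alternative approach.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; the figure goes to a file only
import matplotlib.pyplot as plt
import numpy as np


def gaussian(x, mean=0.0, sigma=1.0):
    """Normal probability density."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def plot_gaussian_bands(filename=None):
    """Draw the standard normal curve with each sigma band in its own color."""
    x = np.linspace(-4.0, 4.0, 801)
    y = gaussian(x)

    fig, ax = plt.subplots()
    ax.plot(x, y, color="black")

    # (lower |x|, upper |x|, color, legend label) for each band.
    bands = [
        (0.0, 1.0, "tab:blue", "< 1 sigma"),
        (1.0, 2.0, "tab:green", "1 to 2 sigma"),
        (2.0, 3.0, "tab:orange", "2 to 3 sigma"),
        (3.0, np.inf, "tab:red", "> 3 sigma"),
    ]
    for low, high, color, label in bands:
        # Using |x| fills the matching region on both sides of the mean.
        mask = (np.abs(x) >= low) & (np.abs(x) <= high)
        ax.fill_between(x, y, where=mask, color=color, alpha=0.6, label=label)

    ax.set_xlabel("x (in units of sigma)")
    ax.set_ylabel("probability density")
    ax.set_title("Gaussian distribution (mean=0, sigma=1)")
    ax.legend()
    if filename is not None:
        fig.savefig(filename)
    plt.close(fig)
    return ax
```

Calling plot_gaussian_bands("PlotGaussian.png") would write the file named above.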

Done? Make sure you answered any questions, then clean up your code and include some useful comments, then push: Setosa.out, Versicolour.out, Virginica.out,, and (or iris_parse.ipynb and iris_loadtxt.ipynb if you use a Notebook), , pi_mc_10.png, pi_mc_100.png, pi_mc_1000.png, pi_mc_10000.png, pi_mc_all4.png,, and PlotGaussian.png to GitHub.

Start your work early, so you can get assistance if needed.