2008 Semester 1

2nd year Data Analysis

 

 

http://physics.uwa.edu.au/~hammond/DataAnalysis/

 

Recommended text: P. R. Bevington, “Data Reduction and Error Analysis for the Physical Sciences”, though any textbook on data analysis is likely to cover much of what is presented in lectures.

 

 

Assessment Project Solutions

 

Q1 ex 11

Q2 ex 8

Q3 ex 12

Student query and PH response (9/04/08)

 

I was wondering if you could please answer a couple of questions that I have about data analysis.

    I have written up an example (modified from Levy & Preidel’s question 3) that outlines methods appropriate to your questions below.

Firstly, when doing the method of weighted least squares, how do you determine sigma(i)?

    Estimate, perhaps by using an unweighted fit to get an idea of the sigma value.

Secondly, I'm not quite sure how to use the propagation of uncertainties formula for sigma(z).

    This is outlined in lecture 4 and in the example here.

Lastly, what process are we supposed to use to determine whether we need to use the weighted least squares method or not?

    I hope the example here will help with making these kinds of decisions.
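As a rough illustration of the weighted least squares method discussed in the questions above, the sketch below fits y = a + b*x when each point has its own uncertainty sigma(i). It uses the standard weighted-sum expressions with weights 1/sigma(i)^2; the function and variable names are assumptions made for this example and are not taken from the lecture material.

```python
import math

def weighted_linear_fit(x, y, sigma):
    """Weighted least-squares fit of y = a + b*x.

    sigma[i] is the (assumed known) uncertainty of y[i].
    Returns a, b and their uncertainties sigma_a, sigma_b.
    """
    w = [1.0 / s ** 2 for s in sigma]          # weights 1/sigma_i^2
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

    delta = S * Sxx - Sx ** 2                  # determinant of the normal equations
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    sigma_a = math.sqrt(Sxx / delta)
    sigma_b = math.sqrt(S / delta)
    return a, b, sigma_a, sigma_b
```

If all sigma(i) are equal, the weights cancel in a and b and the result reduces to the unweighted fit, which is one way to judge whether weighting matters for a given exercise.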

 

L&P modified example 3

Tutorial

Explicit expressions for a & b in a weighted least squares fit

Two analysed examples with calculation layouts from Bevington

 

Handwritten comments from overhead projector

 

 

Lecture 4

 

Web

 

Pdf

 

Pdf-pf

 

 

Assessment Project

 

Make a full analysis of exercises 1, 4 and 5 from Levy & Preidel (1944) (pdf) commenting explicitly on which exercises require the use of weighted least squares and which do not.  All figures showing graphs of the data should be of the form which includes a plot of residuals as outlined in lectures.

 

Each student should work alone, developing their own data analysis routine.  The answer to each exercise should be written up in the form of the data analysis section of a laboratory report, and the analysis method and discussion of the results should be in normal written English.  It MUST NOT be the raw printout of a Mathematica or Excel spreadsheet (or other software package).

 

Hand-in deadline (Physics Office) – Friday 11 April 10am (very strict deadline)

 

 

Lecture 3

 

Web

 

Pdf

 

Pdf-pf

 

3(a) Assume your data from lecture 1 follow s(i) = a + b i.  Determine values for a ± σa and b ± σb.  In the light of these values, review your responses to 2(b), 2(c) and 2(d).
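One possible way to approach 3(a) is sketched below: an unweighted fit of s(i) = a + b i in which the common uncertainty of the points is estimated from the scatter of the residuals, using N - 2 degrees of freedom. The function name and return layout are assumptions for illustration only, and the lecture notes may use different notation.

```python
import math

def linear_fit(i_values, s_values):
    """Unweighted least-squares fit of s = a + b*i.

    Returns a, b, sigma_a, sigma_b and the list of residuals.
    The point uncertainty is estimated from the residual scatter.
    """
    n = len(i_values)
    sx = sum(i_values)
    sy = sum(s_values)
    sxx = sum(x * x for x in i_values)
    sxy = sum(x * y for x, y in zip(i_values, s_values))

    delta = n * sxx - sx ** 2
    a = (sxx * sy - sx * sxy) / delta
    b = (n * sxy - sx * sy) / delta

    residuals = [y - (a + b * x) for x, y in zip(i_values, s_values)]
    variance = sum(r * r for r in residuals) / (n - 2)   # two fitted parameters
    sigma_a = math.sqrt(variance * sxx / delta)
    sigma_b = math.sqrt(n * variance / delta)
    return a, b, sigma_a, sigma_b, residuals
```

The returned residuals can be plotted against i to check for any remaining trend after the fit.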

 

3(b) Answer the 9 questions from "Elementary Statistics" by Levy & Preidel (1944) (pdf) (some of which will need transforming into z(k) = a + b k form – be careful about your use of nomenclature in these cases) and determine values for a ± σa and b ± σb.
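As a hedged illustration of what transforming into z(k) = a + b k form can involve, suppose (purely hypothetically, this is not one of the Levy & Preidel exercises) that the data follow y = A exp(b k). Taking z = ln(y) gives a straight line with intercept ln(A), and the propagation of uncertainties formula gives sigma(z) = |dz/dy| sigma(y) = sigma(y) / |y|. The sketch below applies that transformation; all names are assumptions for the example.

```python
import math

def linearise_exponential(y, sigma_y):
    """Transform y = A*exp(b*k) data to z = ln(y) (hypothetical model).

    Propagates each sigma_y[i] to sigma_z[i] = sigma_y[i] / |y[i]|.
    """
    z = [math.log(yi) for yi in y]
    sigma_z = [si / abs(yi) for yi, si in zip(y, sigma_y)]
    return z, sigma_z

# Equal uncertainties in y become unequal uncertainties in z.
z, sigma_z = linearise_exponential([10.0, 5.0, 2.5], [0.5, 0.5, 0.5])
```

In this hypothetical case the transformed points no longer share a common uncertainty, which is the kind of situation where a weighted fit becomes appropriate.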

Solutions from book! 

 

Do they agree with yours?

 

The book results have not been checked and may not be reliable!!

Lecture 2

 

pdf

 

pdf-pf

ERS data – using ±3σ distributions

2(a) Complete the set-work for Lecture 1 (i.e. calculate s and σS for your 50 point sample of the dataset)

 

2(b) Plot your 50 point dataset on a graph. 

  • Are there any trends in your sample? 
  • Is your analysis method in (a) safe, given the trends you observe?

 

2(c) Is the result reported by the student with the 1000 point dataset consistent with your values for s and σS, bearing in mind that his result comes from the full 1000 point dataset (of which yours is a 50 point sample) and the trend(s) you identified in (b)?

 

2(d) Calculate the standard deviation σD of your 50 point dataset (you may have already done this!!). 

  • Plot ( si / σD ) as a function of data point number i.  (i is the row number of each data point in the data file)
  • Based on your plot, is your value for σD statistically sensible?
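A minimal sketch of the calculation behind 2(d) is given below. It computes σD with the N - 1 form of the standard deviation, scales each point by σD, and counts what fraction of points lie within k standard deviations of the mean. The counting check is an assumption about what "statistically sensible" might mean here (for roughly normal data, about two thirds of points within 1 σD and almost all within 3 σD); the function names are illustrative only.

```python
import math

def sample_std(values):
    """Standard deviation using the N - 1 (sample) form."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def scaled_values(values):
    """Return s_i / sigma_D for each point, plus sigma_D itself."""
    sigma_d = sample_std(values)
    return [v / sigma_d for v in values], sigma_d

def fraction_within(values, k):
    """Fraction of points whose deviation from the mean is at most k * sigma_D."""
    sigma_d = sample_std(values)
    mean = sum(values) / len(values)
    inside = sum(1 for v in values if abs(v - mean) <= k * sigma_d)
    return inside / len(values)
```

The scaled values can then be plotted against the row number i with any plotting tool.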

 

Bring a hard copy of your worked solution to lecture 3.  Be prepared to discuss your solution.

 

Continue with working out how to minimize the sum of the squares of the residuals for a linear fit of y = a + b*x. 

 

What are the algebraic formulae for a and b based on the N data points (xi, yi)?
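For reference, one standard route through this minimisation is outlined below for the unweighted case, with all sums running over the N data points. The symbols S and Δ are introduced here for convenience and may differ from the notation used in lectures.

```latex
S(a,b) = \sum_{i=1}^{N} \left( y_i - a - b\,x_i \right)^2

\frac{\partial S}{\partial a} = -2 \sum_i \left( y_i - a - b\,x_i \right) = 0
\qquad
\frac{\partial S}{\partial b} = -2 \sum_i x_i \left( y_i - a - b\,x_i \right) = 0

a\,N + b \sum_i x_i = \sum_i y_i
\qquad
a \sum_i x_i + b \sum_i x_i^2 = \sum_i x_i y_i

\Delta = N \sum_i x_i^2 - \Bigl( \sum_i x_i \Bigr)^2

a = \frac{\sum_i x_i^2 \sum_i y_i - \sum_i x_i \sum_i x_i y_i}{\Delta}
\qquad
b = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{\Delta}
```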

 

Lecture 1

 

web

 

pdf

 

pdf-pf

 

Do this question before the next lecture. 

 

Each student should bring their calculated values of s and σS to fill in on an overhead before the lecture.

 

In your analysis DO NOT use built-in functions (e.g. of a calculator, spreadsheet or Mathematica) to perform the statistical analysis – write your own functions based on the formulae presented in lectures.
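A minimal sketch of what such hand-written functions might look like is given below. It assumes that s denotes the sample mean and that σS denotes the standard error of the mean (the standard deviation divided by the square root of N); check these definitions against the lecture formulae before relying on them. The function names are illustrative only.

```python
import math

def mean(values):
    """Arithmetic mean, accumulated with an explicit loop."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)

def standard_deviation(values):
    """Standard deviation using the N - 1 (sample) form."""
    m = mean(values)
    sum_sq = 0.0
    for v in values:
        sum_sq += (v - m) ** 2
    return math.sqrt(sum_sq / (len(values) - 1))

def standard_error(values):
    """Standard error of the mean: standard deviation / sqrt(N)."""
    return standard_deviation(values) / math.sqrt(len(values))
```

Only the square root comes from a library here; the statistical quantities themselves are built up from the defining sums.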

 

Each of you has a different sample of the complete dataset for the situation below.  The data files are available from the link in the next column – use the file next to your name. 

 

A student makes length measurements of 1000 rods produced in an industrial process.  The results are written down as the length difference s of each rod from the expected length.  The student then analyses the dataset and reports the overall difference as:

 

s = 0.65 ± 0.25 mm.

 

Is the average result reported by the student reasonable? 

 

Justify your opinion with your statistical analysis of your sample of 50 data points from his dataset.

 

Data 1