Sunday, 7 February 2016

Scatterplot Matrices to Analyse Water Parameters with R

So far, scatterplot matrices are the most useful function I have every seen in any software. Scatterplot matrices graphically summarize important relationships between vectors. Most impressively, scatterplot matrices can calculate the correlation coefficients between all possible combinations of vectors in a dataset. Also, matrices are easy to generate.

The goal for today's project is to identify physical water quality parameters with the strongest fit. The data is collected from River Avon, UK. Salinity and conductivity had a perfect fit, which was expected. Salinity and temperature had a moderate downhill (negative) linear relationship. Conductivity and temperature also had a moderate downhill (negative) linear relationship. Since conductivity, temperature and salinity likely influences each other, these parameters should be further analysed. Next steps could involve finding a regression plane between the three variables.

Water Parameters in River Avon, UK


 Water parameters measured are temperature (in Celsius), pH, Conductivity (mS), Dissolved Oxygen (%) and Salinity (ppt). The reading are conducted in different locations along the river during the summer season of 2015 (June, July and August).

Coding is as follows:

#River Avon Water Parameters 
#by Matthew Mano (matthewm3109@gmail.com)

water<-read.csv("waterQuality3.csv",header=T)
library("psych") #psych is a REALLY useful package 

pairs.panels(water[c(3, 4, 5, 6, 7)], gap = 0) #concatenation used to identify columns for regression 

As explained, coding is simple but powerful.

Link for data source: http://bit.ly/1LvM5XY
Link to download csv and r file: http://bit.ly/1LvMr0S

No comments:

Post a Comment