Monday, 7 September 2015

Statistical Tests


Parametric Tests
  • Assume a normal distribution
    • Therefore only usable with continuous data
    • i.e. if the data are not normally distributed, the test may give an erroneous result
Non-Parametric Tests
  • Used for continuous data that are not normally distributed
  • Used for non-continuous data
Parametric Tests
Student T Test Applications
  • Parametric data
  • Can be used for dependent (paired) or independent (unpaired) data
  • Compares two groups only
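
A rough sketch of what the t test actually computes, using only the Python standard library. The function names (`independent_t`, `paired_t`) and the sample numbers are made up for illustration, and the independent version assumes the two groups have equal variances (pooled variance). It returns only the t statistic; turning that into a p-value still needs a t table or a statistics package.

```python
import math
import statistics


def independent_t(a, b):
    """t statistic for two independent (unpaired) groups, pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def paired_t(first, second):
    """t statistic for dependent (paired) measurements on the same subjects."""
    diffs = [y - x for x, y in zip(first, second)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Made-up example values, not real data.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
print(independent_t(group_a, group_b))

before = [1, 2, 3, 4]
after = [2, 4, 5, 5]
print(paired_t(before, after))
```

The paired version works on the per-subject differences, which is why it only makes sense when each value in one list belongs with the value at the same position in the other.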
ANOVA
  • Analysis of Variance
  • Parametric Data
  • Compares 3 or more groups
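
To make "analysis of variance" a bit more concrete, here is a minimal sketch of the one-way ANOVA F statistic in plain Python. The function name `one_way_anova_f` and the three groups are invented for illustration. F is the variance between group means divided by the variance within groups; a large F suggests at least one group mean differs.

```python
import statistics


def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA across 3 or more groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations

    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Variability of each observation around its own group mean.
    ss_within = sum(
        (x - statistics.mean(group)) ** 2 for group in groups for x in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up example values, not real data.
print(one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5]))
```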
ANCOVA
  • Analysis of Covariance
  • Like ANOVA, but controls for a confounding variable (covariate)
Non-parametric Tests
Ordinal Data
  • Wilcoxon Rank Sum
  • Wilcoxon signed rank
  • Mann-Whitney
  • Kruskal-Wallis
  • Friedman
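
These tests work on ranks rather than on the raw values, which is why they suit ordinal data. As a sketch of the idea, the Mann-Whitney U statistic can be computed in plain Python as below. The helper names (`average_ranks`, `mann_whitney_u`) and the severity codes (1 = mild, 2 = moderate, 3 = severe) are assumptions made up for this example.

```python
def average_ranks(values):
    """Rank from 1..n; tied values share the mean of the ranks they occupy."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Made-up bleeding severity codes: 1 = mild, 2 = moderate, 3 = severe.
drug_a = [1, 1, 2]
drug_b = [2, 3, 3]
print(mann_whitney_u(drug_a, drug_b))
```

Only the ordering of the codes matters here. Recoding severe as 10 instead of 3 gives the same ranks and the same U, which matches the point that the distance between ordinal categories is not meaningful.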
Nominal Data
  • Chi-square
    • Use a Bonferroni adjustment when making multiple comparisons (e.g. pairwise tests across more than 2 groups)
  • Fisher's exact
  • McNemar
  • Mantel-Haenszel
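
A small sketch of the chi-square statistic for a table of counts, plus the Bonferroni adjustment, in plain Python. The function names (`chi_square_statistic`, `bonferroni_alpha`) and the counts are hypothetical. The statistic compares each observed count with the count expected if rows and columns were unrelated; Bonferroni simply divides the significance threshold by the number of comparisons.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni adjustment."""
    return alpha / n_comparisons


# Made-up counts: rows are two treatments, columns are side effect yes/no.
counts = [[10, 20], [20, 10]]
print(chi_square_statistic(counts))

# Three pairwise comparisons at an overall alpha of 0.05.
print(bonferroni_alpha(0.05, 3))
```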
Correlation
  • Correlation coefficient (r)
  • Coefficient of determination (R²)
    • How much of the variability in one variable is explained by the other
  • Determines how related two variables are
Pearson
  • Parametric Data
Spearman
  • Non-Parametric
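
One way to see the difference between the two: Spearman is just Pearson applied to the ranks of the data instead of the raw values. The sketch below uses plain Python; the function names (`pearson_r`, `average_ranks`, `spearman_rho`) and the numbers are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


def average_ranks(values):
    """Rank from 1..n; tied values share the mean of the ranks they occupy."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def spearman_rho(x, y):
    """Spearman correlation: Pearson computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson_r(x, y))
print(spearman_rho(x, y))
```

Because y always increases with x, the ranks agree perfectly and Spearman comes out at 1, while Pearson is a little below 1 because the relationship is curved rather than linear.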
Nominal data
Think "name". Data of this type has no "sortability" or ranking. Think green spoons vs. blue spoons vs. pink spoons. Which one is "higher" than the other? Answer: none. The question doesn't make sense. If you can't rank them, it's nominal data. Types of side effects would also be nominal data. You can't really rank nausea vs. headache vs. constipation.
Ordinal data
Think "order". Data of this type can be sorted, but the "distance" between categories isn't clear or consistent. Think of the severity of bleeding associated with anticoagulation therapy: mild, moderate and severe. Severe is worse than mild, certainly, but the difference cannot be described in any sort of units. Severe is usually defined as bleeding that requires either a hospitalization or a transfusion, which is something of an arbitrary definition.
Interval data
Can be ranked/sorted, AND the interval between the ranks is uniform. Extending the blood loss example: hemoglobin lab results are interval data. The difference between 10 g/dL and 11 g/dL is the same as the difference between 14 g/dL and 15 g/dL.
Interval vs Ratio
The classic comparison between interval data and ratio data is temperature in Celsius versus kelvins. A degree is the same size on both scales, but the zero point of Celsius (interval data) is the freezing point of water, an arbitrary choice. Kelvin (ratio data) starts at absolute zero (-273.15 degrees Celsius), a true zero. Most laboratory tests are either interval or ratio. Platelet counts are ratio data: zero means no platelets at all, and you can't have a negative platelet count.
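
A quick numeric illustration of why the zero point matters, as a small Python sketch (the helper name `celsius_to_kelvin` and the two temperatures are just picked for the example). Differences come out the same on both scales, but ratios only mean something on the scale with a true zero.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvins."""
    return celsius + 273.15


# Example temperatures chosen for illustration.
cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# The difference is 10 degrees on either scale (interval property).
difference_c = warm_c - cool_c
difference_k = warm_k - cool_k

# The ratio depends on where zero is, so only the kelvin ratio is meaningful.
ratio_c = warm_c / cool_c  # 2.0, but 20 C is not "twice as hot" as 10 C
ratio_k = warm_k / cool_k  # roughly 1.035

print(difference_c, difference_k)
print(ratio_c, ratio_k)
```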
Correlation
How strongly data are related. The scale is from -1 to +1. If the r value is less than zero, it is a negative relationship, i.e. as one goes up, the other goes down. The larger the absolute value of r (i.e. the closer to -1 for negative correlations or +1 for positive correlations), the stronger the association. For example, if variables x and y have an r of 0.15 and variables z and v have an r of 0.98, then z and v are strongly associated, while x and y are only weakly associated, if at all.
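
Squaring r gives the coefficient of determination, which puts the two example values in perspective. A tiny Python sketch (the function name `coefficient_of_determination` is made up for the example):

```python
def coefficient_of_determination(r):
    """R squared: share of variability in one variable explained by the other."""
    return r ** 2


# The two r values from the example above.
for pair, r in [("x and y", 0.15), ("z and v", 0.98)]:
    r_squared = coefficient_of_determination(r)
    print(f"{pair}: r = {r}, R^2 = {r_squared:.2%}")
```

So an r of 0.15 corresponds to only about 2% of the variability being explained, while an r of 0.98 corresponds to about 96%.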
