Abstract
The sample correlation coefficient R is almost universally used to estimate the
population correlation coefficient ρ. If the pair (X, Y) has a bivariate normal distribution, this
causes no difficulty. However, if the marginals are nonnormal, particularly if they
have high skewness and kurtosis, the estimated value from a sample may be quite different from
the population correlation coefficient ρ.
We take the bivariate lognormal distribution as the case study for this robustness investigation.
Two approaches are used: (i) simulation and (ii) numerical computation.
Our simulation analysis indicates that for the bivariate lognormal, the bias in estimating ρ can
be very large if ρ ≠ 0, and it is substantially reduced only with a very large number of
observations (three to four million). This phenomenon, though unexpected at first, is
consistent with the findings of our numerical analysis.
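
As an illustration only, the kind of simulation described above might be sketched as follows. This is not the authors' code. It assumes X = exp(Z1) and Y = exp(Z2), where (Z1, Z2) is bivariate normal with zero means, standard deviations s1 and s2, and correlation rho_n, and it uses the standard closed form for the correlation of a bivariate lognormal pair. The function names, parameter values, sample size, and replication count are all hypothetical choices.

```python
import numpy as np


def lognormal_rho(rho_n, s1, s2):
    """Population correlation of (exp(Z1), exp(Z2)) for bivariate normal
    (Z1, Z2) with standard deviations s1, s2 and correlation rho_n."""
    num = np.exp(rho_n * s1 * s2) - 1.0
    den = np.sqrt((np.exp(s1 ** 2) - 1.0) * (np.exp(s2 ** 2) - 1.0))
    return num / den


def mean_sample_r(rho_n, s1, s2, n, reps, seed=0):
    """Average the sample correlation R over `reps` independent samples
    of size `n` drawn from the bivariate lognormal."""
    rng = np.random.default_rng(seed)
    r_values = np.empty(reps)
    for i in range(reps):
        # Build correlated standard normals, then scale and exponentiate.
        u = rng.standard_normal(n)
        v = rng.standard_normal(n)
        z1 = s1 * u
        z2 = s2 * (rho_n * u + np.sqrt(1.0 - rho_n ** 2) * v)
        x, y = np.exp(z1), np.exp(z2)
        r_values[i] = np.corrcoef(x, y)[0, 1]
    return r_values.mean()


if __name__ == "__main__":
    # Assumed illustrative settings, not taken from the paper.
    rho_n, s1, s2 = 0.5, 1.0, 2.0
    rho = lognormal_rho(rho_n, s1, s2)
    r_bar = mean_sample_r(rho_n, s1, s2, n=100, reps=2000)
    print(f"population rho = {rho:.4f}, mean sample R = {r_bar:.4f}")
```

Comparing the mean of R with the population value for increasing n gives a rough sense of how slowly the bias shrinks; the sample sizes reported above (millions of observations) would require a vectorized or batched variant of this loop.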