Simulated Annealing for Continuous Distribution Parameter Estimation in R
Jarrett Meyer
Posted on April 27, 2022
This was a fun one!
I've been trying to find a distribution that described HbA1c in the general population. After a few hours of digging, the best I could find was this research from NIH. In this article, all we are given is a few quantiles.
5% | 10% | 25% | 50% | 75% | 90% | 95% |
---|---|---|---|---|---|---|
4.7 | 4.8 | 5.1 | 5.4 | 5.7 | 6.1 | 6.8 |
I want to try to fit a continuous distribution to these points. Simulated annealing lets us search the problem space for a global optimum. I know from the text of the research that the data is skewed (mean = 5.6, mode = 5.3). We need a distribution that allows for skewness. Let’s go with the gamma distribution.
The first thing we need to do is create a data frame of our known data set.
df <- data.frame(quantile = c(0.05, 0.1, 0.25, 0.50, 0.75, 0.9, 0.95),
value = c(4.7, 4.8, 5.1, 5.4, 5.7, 6.1, 6.8))
Next, we need to write a function that we want our SA algorithm to minimize.
fun <- function (x) {
shape <- x[1]
scale <- x[2]
df$est <- qgamma(df$quantile, shape, scale)
df$sq_err <- (df$est - df$value) ^ 2
error <- sum(df$sq_err)
error
}
We will use the GenSA package for the simulated annealing.
library(GenSA)
lower_bound <- 1e-6
upper_bound <- 100
fit <- GenSA(par = c(1, 1),
fn = fun,
lower = c(lower_bound, lower_bound),
upper = c(upper_bound, upper_bound))
This gives us gamma distribution parameters of shape = 89.4837 and scale = 16.2514. The SSE for this fit is 0.1747.
Lets see how this looks when plotted against the original data.
library(ggplot2)
df2 <- data.frame(x = seq(from = 3.0, to = 10.0, by = 0.01))
df2$y = pgamma(df2$x, fit$par[1], fit$par[2])
ggplot() +
geom_line(aes(x, y), df2, color = "red") +
geom_point(aes(value, quantile), df, size = 2, color = "blue")
There you go! If you only have a few data points and their quantiles, you can make a guess about their distribution. You can try this with multiple distributions and compare their resulting SSEs.
Posted on April 27, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 5, 2024