I want a function that is flat for the values x = 0 to 15, and that drops linearly for x > 15. I also want the function to be continuous (no abrupt jumps).
To fit this function, you need a special X matrix.
Surprisingly, even though the function you are trying to fit is not a linear function of x, you can still fit it with the lm function, as long as you have the correct independent variable matrix.
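Here is a minimal sketch of one way to build such a matrix. It assumes simulated data (the intercept of 10, slope of -0.5, and noise level are made up for illustration) and a hypothetical variable name `x_knot`. The column `pmax(x - 15, 0)` is zero up to x = 15 and rises linearly afterward, so an intercept plus this one column gives a fit that is flat, then linear, with no jump at 15.

```r
# Simulated data (illustrative values, not from the original example):
# flat at 10 for x <= 15, then dropping with slope -0.5
set.seed(1)
x <- seq(0, 30, by = 0.5)
y <- 10 - 0.5 * pmax(x - 15, 0) + rnorm(length(x), sd = 0.3)

# The "special" column: zero up to the knot at 15, linear afterward
x_knot <- pmax(x - 15, 0)

# lm adds the intercept column, so the X matrix is [1, x_knot]
fit <- lm(y ~ x_knot)
coef(fit)
```

The intercept estimates the height of the flat section and the coefficient on `x_knot` estimates the slope after x = 15.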
Notice that the quadratic model is smoother. It does not have a sharp elbow at the transition. Smoothness is often preferred in model fitting, though there are exceptions.
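The code for the quadratic model is not shown in this section. One plausible version, and this is an assumption rather than the original code, squares the truncated term. Because `pmax(x - 15, 0)^2` has a zero slope at x = 15, the fitted curve leaves the flat section without an elbow. The data and the name `x_knot_sq` are made up for illustration.

```r
# Simulated data (illustrative values only)
set.seed(2)
x <- seq(0, 30, by = 0.5)
y <- 10 - 0.5 * pmax(x - 15, 0) + rnorm(length(x), sd = 0.3)

# Squared truncated term: zero value AND zero slope at the knot
x_knot_sq <- pmax(x - 15, 0)^2

fit_quad <- lm(y ~ x_knot_sq)
coef(fit_quad)
```

The fitted values are still constant for x at or below 15, since the squared term is exactly zero there.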
Fit a cubic spline
Here’s a final example of very simple splines. There is a “magic” cubic polynomial, \(3x^2-2x^3\). It rises smoothly from 0 at x=0, to 1 at x=1. It is flat (zero first derivative) at x=0 and x=1. This is what the function looks like.
x <- seq(0, 1, length=100)
y <- 3*x^2 - 2*x^3
data.frame(x=x, y=y) %>%
  ggplot(aes(x, y)) +
  geom_line()
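The properties of this polynomial are easy to check by hand. Writing \(f(x) = 3x^2 - 2x^3\) (the name \(f\) is introduced here only for convenience):

\[
f(0) = 0, \qquad f(1) = 3 - 2 = 1, \qquad f'(x) = 6x - 6x^2 = 6x(1 - x), \qquad f'(0) = f'(1) = 0.
\]

Since \(f'(x) > 0\) for \(0 < x < 1\), the function rises steadily between the two flat endpoints.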
You can add a constant and/or multiply by a constant to get different starting and ending points. So, for example,
\(0.75 - 0.5(3x^2-2x^3)\)
will start at 0.75 and drop 0.5 units to end at 0.25.
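The shift and scale can be wrapped in a small helper. The function name `smooth_step` and its arguments are hypothetical, introduced only for this sketch. It reproduces the example above with a start of 0.75 and an end of 0.25.

```r
# Hypothetical helper: smooth transition from `start` at x = 0
# to `end` at x = 1, using the cubic 3x^2 - 2x^3
smooth_step <- function(x, start, end) {
  start + (end - start) * (3 * x^2 - 2 * x^3)
}

x <- c(0, 0.5, 1)
smooth_step(x, start = 0.75, end = 0.25)
# → 0.75 0.50 0.25
```

Setting `end` larger than `start` gives a curve that rises instead of drops.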
I will use this function to show transition probabilities in a graph.
In a study of dog walking, data were collected before and during the COVID pandemic. A “dog walker” was classified as someone who spent 150 minutes or more per week walking their dog. Did the proportion of dog walkers increase or decrease during the pandemic?
Source: Vucinic M, Vucicevic M, Nenadović K. The COVID-19 pandemic affects owners walking with their dogs. J Vet Behav. 2021 Oct 20. doi: 10.1016/j.jveb.2021.10.009. Epub ahead of print. PMID: 34690614; PMCID: PMC8527592.
Here’s the data.
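Rows show status before the pandemic; columns show status during the pandemic.

| Before pandemic | Dog walker (during) | Non-walker (during) | Total |
|:----------------|--------------------:|--------------------:|------:|
| Dog walker      |                  56 |                  72 |   128 |
| Non-walker      |                  40 |                  48 |    88 |
| Total           |                  96 |                 120 |   216 |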
Let’s convert these to cell percentages.
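Rows show status before the pandemic; columns show status during the pandemic. Each percentage is rounded separately, so the cells may not add exactly to the totals.

| Before pandemic | Dog walker (during) | Non-walker (during) | Total |
|:----------------|--------------------:|--------------------:|------:|
| Dog walker      |                 26% |                 33% |   59% |
| Non-walker      |                 19% |                 22% |   41% |
| Total           |                 44% |                 56% |  100% |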
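The conversion to cell percentages can be sketched in base R. The object names `counts`, `with_totals`, and `pct` are made up for this sketch. Only the four cell counts are entered; the margins are computed rather than typed in, which guards against arithmetic slips. Margins computed from unrounded proportions can differ by a point from sums of rounded cells.

```r
# The four cell counts: rows = before the pandemic, columns = during
counts <- matrix(
  c(56, 72,
    40, 48),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    before = c("Dog walker", "Non-walker"),
    during = c("Dog walker", "Non-walker")
  )
)

# Counts with row and column totals
with_totals <- addmargins(counts)
with_totals

# Cell percentages (each cell divided by the grand total), with totals
pct <- round(100 * addmargins(prop.table(counts)))
pct
```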
Vucinic M, Vucicevic M, Nenadović K. The COVID-19 pandemic affects owners walking with their dogs. J Vet Behav. 2021 Oct 20. doi: 10.1016/j.jveb.2021.10.009. Epub ahead of print. PMID: 34690614; PMCID: PMC8527592. Available in html format or pdf format.
Notice how the 19% slides smoothly upward and the 33% slides smoothly downward. So you can see that while some people took up dog walking during the pandemic, this was more than offset by the number who dropped dog walking during the pandemic.
Aris Perperoglou, Willi Sauerbrei, Michal Abrahamowicz, Matthias Schmid. A review of spline function procedures in R. BMC Medical Research Methodology, 2019-03-06, 19(46). Available in html format or pdf format.