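From the data and the model, the retention curve can be described by a power function, so the retention rates of the first 7 days can be used to predict retention on later days. The coefficients a and b of the power function y = a*x^b are estimated with the nls function.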
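As a side note on the model form y = a*x^b, taking logarithms on both sides turns it into a straight line in log-log coordinates. The short derivation below is only an illustration of that property (x is the day index, y the retention rate); it is not part of the original fitting procedure.

```latex
y = a\,x^{b}
\quad\Longleftrightarrow\quad
\ln y = \ln a + b\,\ln x ,
\qquad
y\big|_{x=1} = a
```

Read this way, a is the modelled day-1 retention rate, and a negative b means retention decays as the day index grows.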
# Actual retention rates for the first seven days
(day <- seq(1:7))  # day index
(ratio <- c(0.383, 0.268, 0.216, 0.187, 0.167, 0.156, 0.145))  # retention rate
# Use nls to estimate the coefficients a and b of the power function y = a*x^b
fit <- nls(ratio ~ a * day^b, start = list(a = 1, b = 1))
# Inspect the model results
summary(fit)
# Predict the daily retention rate of new users for days 1 to 365
predicted <- predict(fit, data.frame(day = seq(1, 365)))
# View the predictions
predicted
# Plot the predicted retention curve
library(dygraphs)
data <- as.data.frame(predicted)
data <- ts(data)
dygraph(data, main = "Predicted retention curve") %>%
  dySeries("predicted", label = "Retention rate", strokeWidth = 2) %>%
  dyOptions(colors = "green", fillGraph = TRUE, fillAlpha = 0.4) %>%
  dyHighlight(highlightCircleSize = 5,
              highlightSeriesBackgroundAlpha = 0.2,
              hideOnMouseOut = FALSE) %>%
  dyAxis("x", label = "Day", drawGrid = FALSE) %>%
  dyAxis("y", label = "Retention rate") %>%
  dyRangeSelector()

The results are as follows:
> (day <- seq(1:7)) # day index
[1] 1 2 3 4 5 6 7
> (ratio <- c(0.383,0.268,0.216,0.187,0.167,0.156,0.145)) # retention rate
[1] 0.383 0.268 0.216 0.187 0.167 0.156 0.145

# Parameter estimates
> summary(fit)

Formula: ratio ~ a * day^b

Parameters:
   Estimate Std. Error t value Pr(>|t|)
a  0.381911   0.002164  176.51 1.11e-10 ***
b -0.508544   0.005571  -91.29 2.99e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.002349 on 5 degrees of freedom

Number of iterations to convergence: 7
Achieved convergence tolerance: 9.236e-08

# Predicted retention rates:
> predicted
 [1] 0.38191061 0.26845690 0.21843606 0.18870675 0.16846294 0.15354554
 [7] 0.14196843 0.13264787 0.12493582 0.11841787 0.11281510 0.10793196

The chart can be zoomed in to inspect the curve in more detail:

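The nls call in this post starts from a = 1, b = 1, which happened to converge here. As a hedged alternative (not what the post itself does), the starting values could be taken from a log-log linear regression, since log(ratio) = log(a) + b*log(day) under the power model. The sketch below uses base R only; the names loglin, start_vals, fit2 and day30 are made up for illustration.

```r
# First seven days of observed retention (values copied from the post)
day   <- 1:7
ratio <- c(0.383, 0.268, 0.216, 0.187, 0.167, 0.156, 0.145)

# Log-log linear fit: log(ratio) = log(a) + b * log(day)
loglin <- lm(log(ratio) ~ log(day))

# Back-transform the intercept to get a; the slope is b
start_vals <- list(a = exp(coef(loglin)[[1]]), b = coef(loglin)[[2]])

# Refit the power model with nls, seeded from the linearised estimates
fit2 <- nls(ratio ~ a * day^b, start = start_vals)
cf <- coef(fit2)

# Hand-computed prediction for day 30 from the fitted coefficients;
# it should agree with predict(fit2, data.frame(day = 30))
day30 <- unname(cf["a"] * 30^cf["b"])
print(cf)
print(day30)
```

Because both fits minimise the same sum of squares, the coefficients should land on essentially the same values as the summary shown earlier; the seeded start mainly makes convergence less dependent on a lucky initial guess.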