Data Overview

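devtools::load_all()
# library(lossrx)

# Packages used by the code below
library(fplot)        # plot_distr(), plot_lines()
library(knitr)        # kable()
library(kableExtra)   # kable_styling()
library(summarytools) # dfSummary()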

data("claims_transactional")
data("losses")
data("exposures")

latest_eval <- losses |> dplyr::filter(eval_date == max(.data$eval_date))
wc_dat <- latest_eval |> dplyr::filter(coverage == "WC")
al_dat <- latest_eval |> dplyr::filter(coverage == "AL")
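
The filtering above keeps only the most recent evaluation of each claim and then splits it by coverage. A minimal base R sketch of the same idea on a made-up data frame (`toy_losses` and its values are hypothetical, not part of lossrx):

```r
# Toy loss run: two claims, each valued at two evaluation dates (made-up values)
toy_losses <- data.frame(
  eval_date = as.Date(c("2019-08-31", "2019-12-31", "2019-08-31", "2019-12-31")),
  occurrence_number = c("1", "1", "2", "2"),
  coverage = c("WC", "WC", "AL", "AL"),
  total_incurred = c(1000, 1500, 250, 300)
)

# Keep only rows from the most recent evaluation date
toy_latest <- toy_losses[toy_losses$eval_date == max(toy_losses$eval_date), ]

# Split the latest evaluation by coverage
toy_wc <- toy_latest[toy_latest$coverage == "WC", ]
toy_al <- toy_latest[toy_latest$coverage == "AL", ]

print(toy_latest)
```

Each claim appears once per evaluation date in the full data, so filtering to the latest date leaves one row per claim.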

lossrx Datasets

lossrx ships with several built-in datasets for example use:

  • a simulated transactional claims data.frame
  • a suite of example WC and AL loss runs combined into a single data.frame
  • sample exposure data for WC (payroll in dollars) and AL (vehicles or miles driven)

Loss Data

plot_distr(
  ~ total_incurred | coverage,
  latest_eval,
  mod.method = "split"
)

First 6 Rows:

head(losses) |>
  kable(format = "html", digits = 2) |>
  kable_styling()
eval_date | devt_age | occurrence_number | coverage | member | program_year | loss_date | rept_date | hire_date | report_lag | report_lag_group | day_of_week | claim_type | claimant_state | loss_state | cause | department | tenure | tenure_group | claimant_age | claimant_age_group | driver_age | driver_age_group | status | total_paid | total_incurred | count | open_count | close_count | incurred_group
2019-12-31 | 108 | 1 | WC | Member 2 | 2011 | 2011-01-03 | 2011-01-04 | 2000-09-11 | 1 | 1 to 3 Days | Monday | WCIN | CA | CA | Strain / Overexertion | Drivers | 19.30 | 10+ Years | 51 | 30+ Years Old | NA | (Missing) | C | 4149.66 | 4149.66 | 1 | 0 | 1 | $0-$50K
2019-08-31 | 104 | 1 | WC | Member 2 | 2011 | 2011-01-03 | 2011-01-04 | 2000-09-11 | 1 | 1 to 3 Days | Monday | WCIN | CA | CA | Strain / Overexertion | Drivers | 18.97 | 10+ Years | 51 | 30+ Years Old | NA | (Missing) | C | 4149.66 | 4149.66 | 1 | 0 | 1 | $0-$50K
2019-04-30 | 100 | 1 | WC | Member 2 | 2011 | 2011-01-03 | 2011-01-04 | 2000-09-11 | 1 | 1 to 3 Days | Monday | WCIN | CA | CA | Strain / Overexertion | Drivers | 18.63 | 10+ Years | 51 | 30+ Years Old | NA | (Missing) | C | 4149.66 | 4149.66 | 1 | 0 | 1 | $0-$50K
2018-12-31 | 96 | 1 | WC | Member 2 | 2011 | 2011-01-03 | 2011-01-04 | 2000-09-11 | 1 | 1 to 3 Days | Monday | WCIN | CA | CA | Strain / Overexertion | Drivers | 18.30 | 10+ Years | 51 | 30+ Years Old | NA | (Missing) | C | 4149.66 | 4149.66 | 1 | 0 | 1 | $0-$50K
2018-08-31 | 92 | 1 | WC | Member 2 | 2011 | 2011-01-03 | 2011-01-04 | 2000-09-11 | 1 | 1 to 3 Days | Monday | WCIN | CA | CA | Strain / Overexertion | Drivers | 17.97 | 10+ Years | 51 | 30+ Years Old | NA | (Missing) | C | 4149.66 | 4149.66 | 1 | 0 | 1 | $0-$50K
2018-04-30 | 88 | 1 | WC | Member 2 | 2011 | 2011-01-03 | 2011-01-04 | 2000-09-11 | 1 | 1 to 3 Days | Monday | WCIN | CA | CA | Strain / Overexertion | Drivers | 17.63 | 10+ Years | 51 | 30+ Years Old | NA | (Missing) | C | 4149.66 | 4149.66 | 1 | 0 | 1 | $0-$50K
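
The rows above follow one claim across successive evaluation dates, and only `eval_date`, `devt_age`, and `tenure` change. The sketch below shows one way those two derived columns could be computed. It assumes that `devt_age` counts months from January 1 of the program year and that `tenure` is years between `hire_date` and `eval_date` using a 365.25-day year. Both formulas are inferred from the displayed rows and are not confirmed as the package's own definitions.

```r
# Values copied from the first three rows shown above
eval_dates   <- as.Date(c("2019-12-31", "2019-08-31", "2019-04-30"))
program_year <- 2011
hire_date    <- as.Date("2000-09-11")

# Assumed rule: development age in months since January 1 of the program year
eval_year  <- as.integer(format(eval_dates, "%Y"))
eval_month <- as.integer(format(eval_dates, "%m"))
devt_age   <- 12 * (eval_year - program_year) + eval_month

# Assumed rule: tenure in years from hire date to evaluation date
tenure <- round(as.numeric(eval_dates - hire_date) / 365.25, 2)

print(data.frame(eval_date = eval_dates, devt_age = devt_age, tenure = tenure))
```

For these three rows the assumed formulas give development ages of 108, 104, and 100 and tenures of 19.30, 18.97, and 18.63, matching the table.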

Summary:

print(dfSummary(losses, 
                varnumbers   = FALSE, 
                valid.col    = FALSE, 
                graph.magnif = 0.76),
      method = 'render')

Data Frame Summary

losses

Dimensions: 79748 x 30
Duplicates: 0
Columns: Variable; Stats / Values; Freqs (% of Valid); Missing

eval_date [Date]
  min: 2011-04-30; med: 2017-08-31; max: 2019-12-31; range: 8y 8m 1d
  27 distinct values
  Missing: 0 (0.0%)

devt_age [numeric]
  Mean (sd): 40.9 (25.1)
  min ≤ med ≤ max: 4 ≤ 36 ≤ 108
  IQR (CV): 40 (0.6)
  27 distinct values
  Missing: 0 (0.0%)

occurrence_number [character]
  1. 1: 27 (0.0%)
  2. 10: 27 (0.0%)
  3. 100: 27 (0.0%)
  4. 101: 27 (0.0%)
  5. 102: 27 (0.0%)
  6. 103: 27 (0.0%)
  7. 104: 27 (0.0%)
  8. 105: 27 (0.0%)
  9. 106: 27 (0.0%)
  10. 107: 27 (0.0%)
  [ 6095 others ]: 79478 (99.7%)
  Missing: 0 (0.0%)

coverage [character]
  1. AL: 31449 (39.4%)
  2. WC: 48299 (60.6%)
  Missing: 0 (0.0%)

member [character]
  1. Member 9: 10878 (13.6%)
  2. Member 10: 10547 (13.2%)
  3. Member 4: 9347 (11.7%)
  4. Member 17: 8001 (10.0%)
  5. Member 6: 7165 (9.0%)
  6. Member 1: 5592 (7.0%)
  7. Member 2: 4508 (5.7%)
  8. Member 12: 3978 (5.0%)
  9. Member 18: 3707 (4.6%)
  10. Member 13: 3521 (4.4%)
  [ 8 others ]: 12504 (15.7%)
  Missing: 0 (0.0%)

program_year [character]
  1. 2011: 13978 (17.5%)
  2. 2012: 12249 (15.4%)
  3. 2013: 11750 (14.7%)
  4. 2014: 11281 (14.1%)
  5. 2015: 11018 (13.8%)
  6. 2016: 9068 (11.4%)
  7. 2017: 5541 (6.9%)
  8. 2018: 3380 (4.2%)
  9. 2019: 1483 (1.9%)
  Missing: 0 (0.0%)

loss_date [Date]
  min: 2011-01-02; med: 2014-02-28; max: 2019-12-31; range: 8y 11m 29d
  2430 distinct values
  Missing: 0 (0.0%)

rept_date [Date]
  min: 2011-01-04; med: 2014-03-11; max: 2019-12-31; range: 8y 11m 27d
  2240 distinct values
  Missing: 0 (0.0%)

hire_date [Date]
  min: 1973-09-01; med: 2011-01-31; max: 2019-11-15; range: 46y 2m 14d
  1727 distinct values
  Missing: 33290 (41.7%)

report_lag [numeric]
  Mean (sd): 8.6 (40.2)
  min ≤ med ≤ max: 0 ≤ 1 ≤ 1620
  IQR (CV): 3 (4.7)
  166 distinct values
  Missing: 0 (0.0%)

report_lag_group [factor]
  1. 0 Days: 18386 (23.1%)
  2. 1 to 3 Days: 37273 (46.7%)
  3. 3+ Days: 24089 (30.2%)
  Missing: 0 (0.0%)

day_of_week [character]
  1. Friday: 13428 (16.8%)
  2. Monday: 16559 (20.8%)
  3. Saturday: 2604 (3.3%)
  4. Sunday: 2745 (3.4%)
  5. Thursday: 14644 (18.4%)
  6. Tuesday: 16062 (20.1%)
  7. Wednesday: 13706 (17.2%)
  Missing: 0 (0.0%)

claim_type [character]
  1. WCMO: 27910 (35.0%)
  2. ALPD: 24365 (30.6%)
  3. WCIN: 20122 (25.2%)
  4. AUPD: 5172 (6.5%)
  5. ALBI: 1354 (1.7%)
  6. AUBI: 300 (0.4%)
  7. WCIO: 235 (0.3%)
  8. AUIO: 112 (0.1%)
  9. AUNA: 55 (0.1%)
  10. WCNA: 32 (0.0%)
  [ 9 others ]: 91 (0.1%)
  Missing: 0 (0.0%)

claimant_state [character]
  1. NY: 13361 (16.8%)
  2. IA: 10318 (13.0%)
  3. TX: 9429 (11.8%)
  4. CT: 9374 (11.8%)
  5. AL: 7874 (9.9%)
  6. CA: 6513 (8.2%)
  7. MD: 6053 (7.6%)
  8. HI: 4695 (5.9%)
  9. MO: 3753 (4.7%)
  10. NM: 2163 (2.7%)
  [ 34 others ]: 6040 (7.6%)
  Missing: 175 (0.2%)

loss_state [character]
  1. NY: 12310 (15.4%)
  2. TX: 9359 (11.7%)
  3. CT: 9059 (11.4%)
  4. IA: 8557 (10.7%)
  5. AL: 7067 (8.9%)
  6. CA: 6473 (8.1%)
  7. MD: 5663 (7.1%)
  8. HI: 4703 (5.9%)
  9. MO: 4118 (5.2%)
  10. NM: 2213 (2.8%)
  [ 32 others ]: 10201 (12.8%)
  Missing: 25 (0.0%)

cause [character]
  1. Burns: 262 (0.3%)
  2. Caught In / Under / Betwe: 2201 (2.8%)
  3. Collisions - Multi-Vehicl: 14613 (18.3%)
  4. Cuts / Punctures: 1955 (2.5%)
  5. Miscellaneous: 3002 (3.8%)
  6. Motor Vehicle Accident: 1066 (1.3%)
  7. Repetitive Motion: 731 (0.9%)
  8. Single Vehicle Accident: 16792 (21.1%)
  9. Slip / Trip / Fall: 10980 (13.8%)
  10. Strain / Overexertion: 19498 (24.4%)
  11. Struck By / Against: 8648 (10.8%)
  Missing: 0 (0.0%)

department [character]
  1. Drivers: 50634 (63.5%)
  2. Food Prep/Mfg: 2630 (3.3%)
  3. Inside Sales / Administra: 4744 (6.0%)
  4. Outside Sales: 2048 (2.6%)
  5. Warehouse: 19647 (24.7%)
  Missing: 45 (0.1%)

tenure [numeric]
  Mean (sd): 7.4 (6.2)
  min ≤ med ≤ max: 0 ≤ 5.8 ≤ 46.3
  IQR (CV): 6.6 (0.8)
  11551 distinct values
  Missing: 33290 (41.7%)

tenure_group [factor]
  1. Less than 1 Year: 2092 (2.6%)
  2. 1 to 3 Years: 8620 (10.8%)
  3. 3 to 5 Years: 9143 (11.5%)
  4. 5 to 10 Years: 15406 (19.3%)
  5. 10+ Years: 11197 (14.0%)
  6. (Missing): 33290 (41.7%)
  Missing: 0 (0.0%)

claimant_age [numeric]
  Mean (sd): 24.8 (20.2)
  min ≤ med ≤ max: 0 ≤ 27 ≤ 117
  IQR (CV): 40 (0.8)
  73 distinct values
  Missing: 1565 (2.0%)

claimant_age_group [factor]
  1. Less than 18 Years Old: 26584 (33.3%)
  2. 18 to 21 Years Old: 2561 (3.2%)
  3. 22 to 30 Years Old: 14784 (18.5%)
  4. 30+ Years Old: 34254 (43.0%)
  5. (Missing): 1565 (2.0%)
  Missing: 0 (0.0%)

driver_age [numeric]
  Mean (sd): 38.6 (10.9)
  min ≤ med ≤ max: 0 ≤ 38 ≤ 95
  IQR (CV): 16 (0.3)
  64 distinct values
  Missing: 63489 (79.6%)

driver_age_group [factor]
  1. Less than 18 Years Old: 125 (0.2%)
  2. 18 to 21 Years Old: 118 (0.1%)
  3. 22 to 30 Years Old: 4060 (5.1%)
  4. 30+ Years Old: 11956 (15.0%)
  5. (Missing): 63489 (79.6%)
  Missing: 0 (0.0%)

status [character]
  1. C: 71606 (89.8%)
  2. I: 335 (0.4%)
  3. O: 7288 (9.1%)
  4. R: 519 (0.7%)
  Missing: 0 (0.0%)

total_paid [numeric]
  Mean (sd): 6955.5 (34099)
  min ≤ med ≤ max: 0 ≤ 897 ≤ 1161358
  IQR (CV): 3154 (4.9)
  10119 distinct values
  Missing: 0 (0.0%)

total_incurred [numeric]
  Mean (sd): 8873.6 (42103.2)
  min ≤ med ≤ max: 0 ≤ 1077.5 ≤ 1166036
  IQR (CV): 3526 (4.7)
  9606 distinct values
  Missing: 0 (0.0%)

count [numeric]
  Min: 0; Mean: 1; Max: 1
  0: 335 (0.4%)
  1: 79413 (99.6%)
  Missing: 0 (0.0%)

open_count [numeric]
  Min: 0; Mean: 0.1; Max: 1
  0: 71941 (90.2%)
  1: 7807 (9.8%)
  Missing: 0 (0.0%)

close_count [numeric]
  Min: 0; Mean: 0.9; Max: 1
  0: 8142 (10.2%)
  1: 71606 (89.8%)
  Missing: 0 (0.0%)

incurred_group [factor]
  1. $0-$50K: 76789 (96.3%)
  2. $50K-$100K: 1557 (2.0%)
  3. $100K-$250K: 1021 (1.3%)
  4. $250K-$500K: 281 (0.4%)
  5. $500K+: 100 (0.1%)
  Missing: 0 (0.0%)

Generated by summarytools 1.0.1 (R version 4.4.2)
2024-11-11
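
The frequencies in the summary can be cross-checked against each other. The sketch below hard-codes counts copied from the output above and tests a few identities. These identities suggest, but do not confirm, that `close_count` flags status `C`, `open_count` flags statuses `O` and `R`, and `count` is zero only for status `I`. That mapping is an assumption read off the totals, not documented behavior.

```r
# Counts copied from the dfSummary output above
n_rows        <- 79748
status_freq   <- c(C = 71606, I = 335, O = 7288, R = 519)
coverage_freq <- c(AL = 31449, WC = 48299)
open_ones     <- 7807   # rows with open_count == 1
close_ones    <- 71606  # rows with close_count == 1
count_zeros   <- 335    # rows with count == 0

# Category frequencies should add up to the row count
status_total_ok   <- sum(status_freq) == n_rows
coverage_total_ok <- sum(coverage_freq) == n_rows

# Assumed mapping between status codes and the count flags
open_matches  <- unname(status_freq["O"] + status_freq["R"]) == open_ones
close_matches <- unname(status_freq["C"]) == close_ones
zero_matches  <- unname(status_freq["I"]) == count_zeros

print(c(status_total_ok, coverage_total_ok, open_matches, close_matches, zero_matches))
```

All five checks hold for the figures shown, so the totals are at least internally consistent with that reading of the status codes.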

Workers’ Compensation

Distribution of Claims

library(fplot)
fplot::plot_distr(wc_dat$total_incurred)

plot_distr(~ total_incurred | cause, wc_dat, cumul = TRUE)

# WC claims at the latest evaluation (this section covers WC only)
plot_lines(
  total_incurred ~ program_year,
  wc_dat
)
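
As a rough illustration of what the `plot_lines()` call summarises: fplot documents `plot_lines()` as displaying the mean of the left-hand variable conditional on the right-hand one. Assuming that default, the plotted quantity can be sketched in base R with `aggregate()`. The `toy` data frame and its values are made up for illustration.

```r
# Made-up claims with a program year and an incurred amount
toy <- data.frame(
  program_year   = c("2011", "2011", "2012", "2012", "2012"),
  total_incurred = c(1000, 3000, 600, 1500, 4500)
)

# Mean incurred amount per program year (assumed to be what the line chart shows)
avg_by_year <- aggregate(total_incurred ~ program_year, data = toy, FUN = mean)

print(avg_by_year)
```

A table like this is also a convenient sanity check on the chart when a single large claim dominates a program year.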