Title: | Actuarial Loss Development and Reserving with R |
---|---|
Description: | Actuarial Loss Development and Reserving Helper Functions and ShinyApp. |
Authors: | Jimmy Briggs [aut, cre] |
Maintainer: | Jimmy Briggs <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.4 |
Built: | 2025-02-16 05:48:47 UTC |
Source: | https://github.com/jimbrig/lossrx |
Transactional claims dataset. Convert to a static loss run using the loss_run() function.
claims_transactional
A data.frame with 80278 rows and 12 variables:
claim_num
integer. DESCRIPTION.
claim_id
character. DESCRIPTION.
accident_date
double. DESCRIPTION.
state
character. DESCRIPTION.
claimant
character. DESCRIPTION.
report_date
double. DESCRIPTION.
status
character. DESCRIPTION.
payment
double. DESCRIPTION.
case
double. DESCRIPTION.
transaction_date
double. DESCRIPTION.
trans_num
integer. DESCRIPTION.
paid
double. DESCRIPTION.
Coalesce Join
coalesce_join(
  x,
  y,
  by = NULL,
  suffix = c(".x", ".y"),
  join = dplyr::full_join,
  ...
)
x | x |
y | y |
by | by |
suffix | suffix |
join | join type |
... | passed to dplyr join function |
a tibble
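The idea behind a coalescing join can be sketched in base R as follows. This is an illustration only, not the package's implementation: `sketch_coalesce_join` is a hypothetical name, it uses `merge()` instead of a dplyr join, and it assumes that for columns present in both inputs the non-missing value from `x` takes precedence.

```r
# Hypothetical sketch: full join, then collapse each pair of suffixed
# columns into one column, preferring the value from `x` when it is not NA.
sketch_coalesce_join <- function(x, y, by, suffix = c(".x", ".y")) {
  joined <- merge(x, y, by = by, all = TRUE, suffixes = suffix)
  # columns that exist in both inputs but are not join keys
  shared <- setdiff(intersect(names(x), names(y)), by)
  for (col in shared) {
    left_name  <- paste0(col, suffix[1])
    right_name <- paste0(col, suffix[2])
    left  <- joined[[left_name]]
    right <- joined[[right_name]]
    joined[[col]] <- ifelse(is.na(left), right, left)
    joined[[left_name]]  <- NULL
    joined[[right_name]] <- NULL
  }
  joined
}

x <- data.frame(id = 1:3, v = c("a", NA, "c"), stringsAsFactors = FALSE)
y <- data.frame(id = c(2, 4), v = c("b", "d"), stringsAsFactors = FALSE)
result <- sketch_coalesce_join(x, y, by = "id")
print(result)
```

The missing `v` for `id` 2 is filled from `y`, and the row for `id` 4, which exists only in `y`, is kept because the join is a full join.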
Connect to the Actuarial Database Instance.
connect_db(pool = TRUE)
pool | Logical - should the connection be pooled? |
a database connection
Creates a database table given a connection, a table name, an SQL file specifying the table's schema, and a CSV file to seed the table's values.
create_tbl(
  conn,
  tbl_name,
  csv_path = "data-raw/database/CSV",
  sql_path = "data-raw/database/SQL",
  drop_if_exists = TRUE
)
conn | database connection |
tbl_name | character string representing table name |
csv_path | base path (excluding file) to the CSV file |
sql_path | base path (excluding file) to the SQL file |
drop_if_exists | Should the table be dropped (with CASCADE) if it already exists? |
The created database table, returned as an R data.frame().
It is assumed that tbl_name mirrors the basename of both the CSV file and the SQL file (excluding extensions).
Data Manipulation Utilities
Pull Unique Values from a dataframe
pull_unique(df, var)
df | a provided data.frame |
var | character/numeric - quoted name of a variable from df |
pull_unique returns a character vector of unique, sorted values from the specified column.
df <- data.frame(let = rep(letters, 2), num = rep(c(1:26), 2))
pull_unique(df, 1)
pull_unique(df, "num")
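For readers who want to see what "unique, sorted values" amounts to, here is a minimal base R sketch. `sketch_pull_unique` is a hypothetical name, and converting the result to character after sorting is an assumption based on the documented return type; the package's own function may differ.

```r
# Hypothetical sketch: select a column by name or position, de-duplicate,
# sort, and return the values as a character vector.
sketch_pull_unique <- function(df, var) {
  as.character(sort(unique(df[[var]])))
}

df <- data.frame(let = rep(letters, 2), num = rep(c(1:26), 2))
by_position <- sketch_pull_unique(df, 1)
by_name <- sketch_pull_unique(df, "num")
```

Sorting happens before the conversion to character, so numeric columns keep their numeric order in this sketch.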
A set of helper functions for dealing with dates in a typical actuarial analysis context.
Derive the number of months elapsed between two dates.
end_of_month(date)
start_of_month(date)
extract_date(string)
elapsed_months(end_date, start_date)
date | character or date representation of a date. |
string | string to extract a date from |
end_date | end date |
start_date | start date |
end_of_month returns the last day of the month given.
start_of_month returns the first day of the month given.
extract_date returns a date object extracted from the provided string.
elapsed_months returns a numeric value.
See also: as.Date(), lubridate::ceiling_date(), lubridate::floor_date(), flipTime::AsDate()
# character input
start_of_month("2020-08-13")
end_of_month("2017-10-20")

# can handle human-readable dates also
start_of_month("July 7, 1999")
end_of_month("February 5, 2019")

# date input
start_of_month(as.Date("2020-08-13"))
end_of_month(as.Date("2020-10-20"))
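The month arithmetic behind helpers like these can be sketched in base R. The `sketch_*` names are hypothetical and this is not the package's code: the sketch accepts only a single ISO-formatted string or Date (not human-readable strings such as "July 7, 1999"), and it assumes elapsed months are counted as whole calendar-month differences, ignoring the day of the month.

```r
# Hypothetical sketches of month-boundary helpers (scalar input only).
sketch_start_of_month <- function(date) {
  date <- as.Date(date)
  as.Date(format(date, "%Y-%m-01"))
}

sketch_end_of_month <- function(date) {
  first <- sketch_start_of_month(date)
  # first day of the following month, minus one day
  next_first <- seq(first, by = "month", length.out = 2)[2]
  next_first - 1
}

# Assumption: calendar-month difference, day of month ignored.
sketch_elapsed_months <- function(end_date, start_date) {
  ed <- as.POSIXlt(as.Date(end_date))
  sd <- as.POSIXlt(as.Date(start_date))
  12 * (ed$year - sd$year) + (ed$mon - sd$mon)
}

som <- sketch_start_of_month("2020-08-13")
eom <- sketch_end_of_month("2017-10-20")
leap <- sketch_end_of_month(as.Date("2020-02-05"))
months <- sketch_elapsed_months("2020-12-31", "2020-01-01")
```

In actuarial work the elapsed-month count is often used as a development age, so whether the convention adds one month or accounts for partial months should be checked against the package itself.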
Creates a skeleton to document datasets via roxygen2.
doc_data(
  obj,
  title = deparse(substitute(obj)),
  description = "DATASET_DESCRIPTION",
  write_to_file = TRUE,
  ...
)
obj | object to document |
title | Title |
description | Description |
write_to_file | Logical |
... | N/A |
silently returns the doc_string
library(lossrx)
data(losses)
string <- doc_data(losses, "Loss Data", "Claims Data", FALSE)
cat(string)
Exposure data.
exposures
A data.frame with 855 rows and 5 variables:
member
character. DESCRIPTION.
program_year
double. DESCRIPTION.
department
character. DESCRIPTION.
payroll
double. DESCRIPTION.
miles
double. DESCRIPTION.
Extract numbers from a string
extract_num(string)
string | String to pull numbers from |
String of numbers
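One simple way to pull digits out of a string is shown below. This is a sketch under assumptions, not the package's implementation: `sketch_extract_num` is a hypothetical name, and it assumes "numbers" means the digit characters 0-9 concatenated in their original order, with signs and decimal points dropped.

```r
# Hypothetical sketch: strip every non-digit character and return the
# remaining digits as a single string.
sketch_extract_num <- function(string) {
  gsub("[^0-9]", "", string)
}

digits <- sketch_extract_num("Claim 2021-00457")
```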
A set of helper functions for providing verbose feedback to the developer using this package's functions.
msg_field(x)
msg_value(x)
msg_done(x)
msg_bullet(x, bullet = cli::symbol$bullet)
msg_err(x)
msg_path(x)
msg_info(x)
msg_code(x)
msg_feedback(x)
x | The string passed to various msg_ functions |
bullet | What to use for the message's bullet |
Other Feedback Utilities: indent(), inform()
Indentation around various msg_ feedback functions.
indent(x, first = " ", indent = first)
x | The string passed to various msg_ functions |
first | what to indent with - defaults to " " |
indent | indentation of next line - defaults to first |
string
Other Feedback Utilities: feedback, inform()
A wrapper around rlang::inform() for providing feedback to developers using this package's functions.
inform(...)
... | Passed to rlang::inform() |
feedback in console
Other Feedback Utilities: feedback, indent()
interp - Actuarial Interpolation
Interpolate Cumulative Loss Development Factors (CDFs).
interp(new_age, cdf_array, age_array, cutoff = 450, method = 3)
interp.dblexp(new_age, age_high, age_low, cdf_high, cdf_low)
interp.exp(new_age, age_high, age_low, cdf_high, cdf_low)
interp.linear(new_age, age_high, age_low, cdf_high, cdf_low)
new_age | integer - value of the new age whose CDF is to be interpolated |
cdf_array | numeric vector of CDFs (usually representative of the selected factors) |
age_array | numeric vector of ages corresponding to the supplied cdf_array |
cutoff | the largest possible age, after which no interpolation is performed |
method | integer - must be 1, 2, or 3, where 1 represents linear, 2 represents exponential, and 3 represents double exponential. Defaults to 3, but falls back onto 1 if necessary. |
age_low, age_high | Low and High ages |
cdf_low, cdf_high | Low and High CDFs |
This generic function comes with three possible methods:
Linear Interpolation
Exponential Interpolation
Double Exponential Interpolation
Actuaries often have to interpolate between the selected Loss Development Factors (LDFs) / Cumulative Loss Development Factors (CDFs) in order to derive development factors at ages of maturity other than those at which factors were selected.
For example, an actuary will select factors by maturity or development age in months using actuarial triangles. Because the actuarial selections are limited to the maturities present in the triangle (i.e. ages 12, 24, etc.), the factors for ages before, after, and in-between the selection ages must be interpolated.
A comprehensive approach to deriving the interpolated values would follow a pattern similar to the following:
For ages of maturity <= First Selected Age of Maturity (i.e. <= 12), factors are derived using persistencies. A persistency is simply a percentage value representing the percent paid/reported at a given age compared to the age's ceiling and floor. For example, a persistency as of age 3 would represent the percent paid/reported at 3 months of development out of the total percent paid/reported at 12 months of development. The persistency as of age 15 would represent the percent paid/reported at age 15 compared to the total percent paid/reported between ages 12 and 24.
For ages of maturity Selected Age of Maturity Floor <= x <= Selected Age of Maturity Ceiling, i.e. in-between ages, the factors are derived using double-exponential interpolation using the selections at the floor and ceiling ages.
For ages of maturity >= Last Selected Age of Maturity (i.e. >= 106), a decay factor approach is used to decay the final selected factor across the ages beyond that final age.
derived numeric value for the supplied new_age's CDF
interp.dblexp(): Double Exponential Interpolation
interp.exp(): Exponential Interpolation
interp.linear(): Linear Interpolation
cdfs <- c(3.579, 2.866, 2.489, 2.121, 1.876, 1.543, 1.222, 1.150, 1.109, 1.005, 1.0025)
ages <- seq(from = 12, to = (length(cdfs) * 12), by = 12)
interp(14, cdfs, ages)
interp(12, cdfs, ages) == cdfs[[1]]
interp(27, cdfs, ages, method = 2)
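To make the three interpolation methods concrete, the sketch below shows one common way each can be written for a single pair of bracketing ages. These are assumptions, not the package's formulas: the `sketch_*` names are hypothetical, `cdf_low` is taken to be the CDF at `age_low` and `cdf_high` the CDF at `age_high`, exponential interpolation is taken to be linear in `log(cdf)`, and double exponential interpolation is taken to be linear in `log(log(cdf))`, which requires both CDFs to be greater than 1.

```r
# Share of the distance travelled from age_low to age_high.
sketch_weight <- function(new_age, age_high, age_low) {
  (new_age - age_low) / (age_high - age_low)
}

# Linear in the CDF itself.
sketch_linear <- function(new_age, age_high, age_low, cdf_high, cdf_low) {
  w <- sketch_weight(new_age, age_high, age_low)
  cdf_low + w * (cdf_high - cdf_low)
}

# Assumption: linear in log(CDF).
sketch_exp <- function(new_age, age_high, age_low, cdf_high, cdf_low) {
  w <- sketch_weight(new_age, age_high, age_low)
  exp(log(cdf_low) + w * (log(cdf_high) - log(cdf_low)))
}

# Assumption: linear in log(log(CDF)); undefined when a CDF is <= 1.
sketch_dblexp <- function(new_age, age_high, age_low, cdf_high, cdf_low) {
  stopifnot(cdf_low > 1, cdf_high > 1)
  w <- sketch_weight(new_age, age_high, age_low)
  ll_low <- log(log(cdf_low))
  ll_high <- log(log(cdf_high))
  exp(exp(ll_low + w * (ll_high - ll_low)))
}

# Interpolate at age 18 between the selections at ages 12 and 24.
lin <- sketch_linear(18, 24, 12, 2.866, 3.579)
ex <- sketch_exp(18, 24, 12, 2.866, 3.579)
dbl <- sketch_dblexp(18, 24, 12, 2.866, 3.579)
```

All three sketches reproduce the selected factors exactly at the bracketing ages. Under these assumptions the double exponential form cannot be evaluated once a CDF reaches 1, which may be one reason a fallback to the linear method is needed.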
View losses as of a specific date.
loss_run(val_date, trans_dat)
val_date | date - the valuation date of the loss run. Claim values from |
trans_dat | data frame of claims transactions |
data frame of claims (1 claim per row) valued as of the val_date
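The step from transactions to a one-row-per-claim snapshot can be sketched in base R as below. This is an illustration under assumptions, not the package's implementation: `sketch_loss_run` is a hypothetical name, the column names are borrowed from the `claims_transactional` dataset, and each transaction row is assumed to carry the claim's cumulative values as of that transaction, so the latest row on or before the valuation date is the claim's value at that date.

```r
# Hypothetical sketch: keep each claim's most recent transaction on or
# before the valuation date.
sketch_loss_run <- function(val_date, trans_dat) {
  val_date <- as.Date(val_date)
  # drop transactions dated after the valuation date
  kept <- trans_dat[trans_dat$transaction_date <= val_date, , drop = FALSE]
  # order so that each claim's latest transaction comes last
  ord <- order(kept$claim_num, kept$transaction_date, kept$trans_num)
  kept <- kept[ord, , drop = FALSE]
  # keep only the last row per claim
  out <- kept[!duplicated(kept$claim_num, fromLast = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy transactions (made-up values for illustration only).
trans <- data.frame(
  claim_num = c(1L, 1L, 2L, 2L),
  trans_num = c(1L, 2L, 1L, 2L),
  transaction_date = as.Date(c("2020-01-15", "2020-06-30",
                               "2020-03-01", "2021-02-01")),
  paid = c(0, 500, 100, 250)
)
snapshot <- sketch_loss_run("2020-12-31", trans)
```

Claim 2's second transaction is dated after the valuation date, so the snapshot carries its first transaction instead.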
losses
losses
A data.frame with 79748 rows and 30 variables:
eval_date
double. DESCRIPTION.
devt_age
double. DESCRIPTION.
occurrence_number
character. DESCRIPTION.
coverage
character. DESCRIPTION.
member
character. DESCRIPTION.
program_year
character. DESCRIPTION.
loss_date
double. DESCRIPTION.
rept_date
double. DESCRIPTION.
hire_date
double. DESCRIPTION.
report_lag
double. DESCRIPTION.
report_lag_group
integer. DESCRIPTION.
day_of_week
character. DESCRIPTION.
claim_type
character. DESCRIPTION.
claimant_state
character. DESCRIPTION.
loss_state
character. DESCRIPTION.
cause
character. DESCRIPTION.
department
character. DESCRIPTION.
tenure
double. DESCRIPTION.
tenure_group
integer. DESCRIPTION.
claimant_age
double. DESCRIPTION.
claimant_age_group
integer. DESCRIPTION.
driver_age
double. DESCRIPTION.
driver_age_group
integer. DESCRIPTION.
status
character. DESCRIPTION.
total_paid
double. DESCRIPTION.
total_incurred
double. DESCRIPTION.
count
double. DESCRIPTION.
open_count
double. DESCRIPTION.
close_count
double. DESCRIPTION.
incurred_group
integer. DESCRIPTION.
Open pkgdown site of the package
open_pkgdown()
# open_pkgdown()
A function to simulate transactional actuarial claims/loss data for Property Casualty Insurance.
simulate_claims(
  n_claims = 1000,
  start_date = "2015-01-01",
  end_date = Sys.Date(),
  seed = 12345,
  loss_distribution = "lnorm",
  params = list(mean_log = 7.5, sd_log = 1.5),
  status_prob_open = 0.96,
  cache = FALSE,
  ...
)
n_claims | Numeric - Number of claims to be simulated. |
start_date, end_date | Character/Date - Start and End dates for simulation to create claims within (experience_period). |
seed | Numeric - the seed is used to isolate randomness during statistical simulations. |
loss_distribution | Character - must be one of the distributions mentioned in the details below. Defaults to lognormal. |
params | Parameters associated with the specified loss_distribution |
status_prob_open | Numeric - must be within |
cache | Boolean/Logical - enable caching? |
... | If needed |
Severity/Loss Distributions:
Normal: norm
Lognormal: lnorm
Gamma: gamma
LogGamma: lgamma
Pareto: pareto
Weibull: weibull
Generalized Beta: genbeta
The return value, if any, from executing the function.
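The core of a seeded severity simulation can be sketched in base R. This is a simplified illustration, not the package's implementation: `sketch_simulate_claims` is a hypothetical name, it produces one row per claim instead of transactional data, it supports only the lognormal case, it draws accident dates uniformly over the experience period, and it treats `status_prob_open` as the probability that a claim's status is "Open". All of these are assumptions.

```r
# Hypothetical sketch: seeded lognormal severities with random accident dates.
sketch_simulate_claims <- function(n_claims = 1000,
                                   start_date = "2015-01-01",
                                   end_date = Sys.Date(),
                                   seed = 12345,
                                   mean_log = 7.5,
                                   sd_log = 1.5,
                                   status_prob_open = 0.96) {
  set.seed(seed)  # fixes the random stream so results are reproducible
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  n_days <- as.integer(end_date - start_date)
  # uniform accident dates within [start_date, end_date]
  accident_date <- start_date + sample.int(n_days + 1, n_claims, replace = TRUE) - 1
  # lognormal severities (stats::rlnorm uses meanlog / sdlog)
  severity <- rlnorm(n_claims, meanlog = mean_log, sdlog = sd_log)
  # assumption: status_prob_open is the probability of "Open"
  status <- sample(c("Open", "Closed"), n_claims, replace = TRUE,
                   prob = c(status_prob_open, 1 - status_prob_open))
  data.frame(claim_num = seq_len(n_claims), accident_date, status, severity,
             stringsAsFactors = FALSE)
}

sim <- sketch_simulate_claims(n_claims = 50, end_date = "2020-12-31")
```

Calling the sketch twice with the same seed returns identical data, which is the purpose the `seed` argument serves here.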
To Proper
toproper(
  string,
  replace_underscores = TRUE,
  underscore_replacement = " ",
  return_as = c("titlecase", "uppercase", "lowercase", "asis"),
  uppers = c("Tpa")
)
string | string to manipulate |
replace_underscores | Logical: if |
underscore_replacement | Character: if argument |
return_as | How should the string be returned? Options are: "titlecase", "uppercase", "lowercase", "asis" |
uppers | Abbreviations to keep upper-case. |
"Proper" string
s <- "variable_a is awesome"
toproper(s)
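The title-casing step can be sketched in base R as follows. This is not the package's implementation: `sketch_toproper` is a hypothetical name, and it covers only underscore replacement and title case, leaving out the `return_as` and `uppers` options.

```r
# Hypothetical sketch: replace underscores, then upper-case the first
# letter of every word.
sketch_toproper <- function(string, underscore_replacement = " ") {
  out <- gsub("_", underscore_replacement, string, fixed = TRUE)
  # with perl = TRUE, \\U upper-cases the captured first letter
  gsub("\\b([a-z])", "\\U\\1", tolower(out), perl = TRUE)
}

proper <- sketch_toproper("variable_a is awesome")
```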
This is a comprehensive function that allows the user to gain valuable insights about an individual claim's historical transactions.
view_claim_history(claim_id, claims_data = NULL)
claim_id | Claim ID |
claims_data | Dataset |
a list containing (a) claim details, (b) transactional history, and (c) an interactive timeline