Oct 19, 2014

pasteFromExcel

Suppose you have a triangle in Excel that you would like to analyze in R. In Excel, select the cells comprising the triangle including the labels identifying accident year and age and copy to the clipboard.

smalltri.png

Back in R, load the excelRio package and run the function that pastes content from the clipboard. Be sure to tell the function that your data has a “header” for the column names (development ages) and a “rowheader” for the row names (accident years).

library(excelRio)
tri <- pasteFromExcel(header = TRUE, rowheader = TRUE)

tri
       12   24   36   48   60
2010 1000 1200 1320 1386 1414
2011 1200 1452 1568 1631   NA
2012  800 1040 1113   NA   NA
2013 1100 1320   NA   NA   NA
2014  850   NA   NA   NA   NA

excelRio has special handling to strip out the dollar signs and commas when appropriate. excelRio’s interpretation of “appropriate” is, if after stripping out the dollar signs and commas the column of results can be successfully converted to numeric, then that’s what will happen. Without special handling, this would not normally happen, as below when the first two character elements lose numeric value.

as.numeric(c("$1,000", "1,200", "800"))
[1]  NA  NA 800
Warning message:
NAs introduced by coercion 

The argument that controls this special handling is convertFormattedNumbers and can be turned off, as in the code below. Note that I also set stringsAsFactors = FALSE because my installation of R has the default value TRUE for that global option.

tri <- pasteFromExcel(header = TRUE, rowheader = TRUE,
                      convertFormattedNumbers = FALSE,
                      stringsAsFactors = FALSE)

tri
     12         24         36         48         60        
2010 " $1,000 " " $1,200 " " $1,320 " " $1,386 " " $1,414 "
2011 " 1,200 "  " 1,452 "  " 1,568 "  " 1,631 "  NA        
2012 " 800 "    " 1,040 "  " 1,113 "  NA         NA        
2013 " 1,100 "  " 1,320 "  NA         NA         NA        
2014 " 850 "    NA         NA         NA         NA        

mode(tri)
[1] "character"

class(tri)
[1] "matrix"

The class of the above raises a subtle point about the value of the result from pasteFromExcel.

Data in Excel is often comprised of columns of data of many different types: numeric, character, and dates, to name a few. By default, therefore, pasteFromExcel will create a data.frame, an R object that allows differently typed columns. But when all the columns of data are “compatible,” pasteFromExcel will return a matrix object of the common “type” (or “mode”) by default, if possible, under the assumption that a matrix object would be “intuitively expected.” The argument that controls that option is simplify. By calling the function with simplify = FALSE, a data.frame will always be returned.

For more information about pasteFromExcel and its various optional arguments, see

?pasteFromExcel

Finally, Dear Reader, I leave you with a warning and a request for assistance! :)

Warning!

Formatted numbers in Excel are stripped of their “adornments” via a regular expression I found on the internet here. This expression will not work outside the U.S.

I repeat:

Warning! excelRio’s special currency handling will only work with dollar signs, with periods as decimal separator, and with commas separating triples. I apologize. :(

(For the curious, excelRio stores its current regex pattern for currency in the variable currencypattern on line 38 in the file excelRio.r in the R folder of the package on github. )

Request for Assistance!

If anyone is interested in suggesting a currency pattern for their location I would really appreciate the collaboration — woohoo!! You can reach me at danielmarkmurphy at gmail. Thank you sincerely in advance.

Dan


created (mostly) with RMarkdown in RStudio

Oct 6, 2014

AIM Matters

Props to my college roommate, Brian Conrey, who this summer celebrated the 20 year anniversary of the American Institute of Mathematics (AIM), a math "think tank" in Palo Alto, California. With the help of math enthusiast John Fry and a supportive board, AIM has discovered creative ways to generate mathematics research at all levels. For example, Math Teachers' Circles (MTCs) bring together middle school teachers and professional mathematicians around the country to work on interesting problems that help bolster the classroom skills of our early math instructors. AIM originated the SQuaREs program, an inventive way to gather researchers into concentrated week-long workshops, a fertile model that non-math disciplines could look to emulate. Science Nation recently highlighted AIM's partnership with Driscoll's (Watsonville CA) and other berry growers that modeled various water and land management techniques for balancing aquifer levels ("Strawberry fields forever - with some help from mathematicians!”).

This week the Frys.com Open golf tournament takes place at the Silverado Resort and Spa in Napa, California. Besides volunteering as Shotlink operators, AIM supporters will also be manning a booth. Come on by to learn more and meet some nice mathematicians!

Oct 5, 2014

Implementing CLFM with the ChainLadder Package

One of the requirements of the popular stochastic reserving method known as the Mack Method (see for example Dr. Mack's original paper, Murphy's original paper, Barnett & Zehnwirth, and others) is that the actuary select one of the "standard" averages, such as the simple or volume-weighted averages of the observed link ratios. If an actuary's selection differs from one of those then -- strictly speaking -- the results of the papers above do not apply.

Aug 28, 2013

IBNR with the Bornhuetter-Ferguson Method using the R ChainLadder package

This is the third of a trilogy of posts demonstrating how to implement three basic deterministic Property&Casualty/General/Non-Life insurance actuarial techniques using the ChainLadder package in R.

In simplest form, the Bornhuetter-Ferguson ("BF") Method estimates IBNR for an accident/policy/underwriting/origin year (tranch of exposure) as the product of an a-priori estimate of ultimate loss for that exposure and an estimate of the percent of that ultimate loss unknown/unreported/undeveloped at the time:

Jun 22, 2013

IBNR with the Loss Ratio Method using the ChainLadder package

ChainLadder is a package for the R statistical environment that contains various functions for performing loss reserving for Property/Casualty/General Insurance. Mostly, the package's functions are intended to implement sophisticated stochastic models, but many simpler, deterministic methods are relatively easy to perform using helper functions in the package. This post will demonstrate how to calculate IBNR with the Loss Ratio Method.

Jun 12, 2013

IBNR with the Chain-Ladder Method using the R ChainLadder package

ChainLadder is a package for the R statistical environment that contains various functions for performing loss reserving for Property/Casualty/General/Non-Life Insurance. Mostly, the package's functions are intended to implement sophisticated stochastic models, but many simpler, deterministic methods are relatively easy to perform using helper functions in the package. This post will demonstrate how to calculate IBNR with the Chain-Ladder Method using the ChainLadder package.