prettyglm

Lifecycle: experimental

One of the main advantages of Generalised Linear Models is their interpretability. The goal of prettyglm is to provide a set of functions that easily create beautiful coefficient summaries, which can readily be shared and explained.

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("jared-fowler/prettyglm")

A Simple Example

To explore the functionality of prettyglm we will use a data set sourced from https://www.kaggle.com/volodymyrgavrysh/bank-marketing-campaigns-dataset, which contains the results of a Portuguese bank's marketing campaigns. The campaigns were based mostly on direct phone calls offering clients a term deposit. The target variable y indicates whether the client agreed to place the deposit after the phone call.

library(prettyglm)
library(dplyr)
library(Hmisc)
data("bank")

Pre-processing

A critical step for this package to work well is to set all categorical predictors as factors.

# Convert all categorical predictors to factors in one step.
columns_to_factor <- c('job',
                       'marital',
                       'education',
                       'default',
                       'housing',
                       'loan')
bank_data <- bank_data %>%
  dplyr::mutate(dplyr::across(dplyr::all_of(columns_to_factor), factor)) %>% # convert the listed columns to factors
  dplyr::mutate(age_cat = Hmisc::cut2(age, g = 30, levels.mean = TRUE)) # cut age into 30 quantile groups, labelled by group mean
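Because the package depends on categorical predictors being factors, it can be worth checking the column types before fitting. The following is a minimal base R sketch on a made-up toy data frame (the `toy` data and the `categorical_cols` name are illustrative assumptions, not part of prettyglm or the bank data set):

```r
# Toy data frame, made up for illustration (not the bank data set).
toy <- data.frame(
  job     = c("admin", "technician", "admin", "services"),
  marital = c("married", "single", "single", "married"),
  age     = c(34, 51, 27, 45),
  stringsAsFactors = FALSE
)

# Convert the categorical columns to factors using base R only.
categorical_cols <- c("job", "marital")
toy[categorical_cols] <- lapply(toy[categorical_cols], factor)

# Flag which columns are now factors; numeric columns should stay FALSE.
is_factor <- vapply(toy, is.factor, logical(1))
print(is_factor)

# Levels are sorted alphabetically; the first one is the reference level
# that stats::glm() uses under the default treatment contrasts.
print(levels(toy$job))
```

The same `vapply(..., is.factor, ...)` check can be run on the real data after pre-processing to spot any predictor that was missed.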

Building a glm

For this example we will build a glm using stats::glm(); however, prettyglm also supports parsnip and workflow model objects that use the glm model engine.

Note that the point of this README is not to create the best model, but to highlight the features of this package.

deposit_model <- stats::glm(y ~ job +
                                marital +
                                education +
                                default +
                                loan +
                                age_cat,
                             data = bank_data,
                             family = binomial)
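To see how a model of this shape treats factor predictors, here is a small self-contained sketch on simulated data (the `toy` data frame and its effect sizes are invented for illustration). It shows that stats::glm() estimates one coefficient per non-reference level of each factor:

```r
set.seed(1)
n <- 500

# Simulated predictors, made up for illustration.
toy <- data.frame(
  loan    = factor(sample(c("no", "yes"), n, replace = TRUE)),
  marital = factor(sample(c("divorced", "married", "single"), n, replace = TRUE))
)

# Simulated binary outcome whose log-odds depend only on loan.
toy$y <- rbinom(n, size = 1, prob = plogis(-1 + 0.5 * (toy$loan == "yes")))

toy_model <- stats::glm(y ~ loan + marital, data = toy, family = binomial)

# One coefficient per non-reference level; "no" and "divorced" are the
# reference levels and are absorbed into the intercept.
print(names(stats::coef(toy_model)))
```

The reference levels do not appear among the coefficient names, which is why a summary that lists every level explicitly is easier to share.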

Create a table of model coefficients with pretty_coefficients()

pretty_coefficients(deposit_model, type_iii = 'Wald')
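For readers unfamiliar with term-level Wald tests, the sketch below computes a joint Wald chi-square test for all coefficients of one factor in base R. It assumes that this is the kind of test the `type_iii = 'Wald'` option refers to; it is not prettyglm's implementation, and `wald_term_test()` is a hypothetical helper written for this example on simulated data:

```r
set.seed(42)
n <- 400

# Simulated data, made up for illustration.
toy <- data.frame(
  loan    = factor(sample(c("no", "yes"), n, replace = TRUE)),
  marital = factor(sample(c("divorced", "married", "single"), n, replace = TRUE))
)
toy$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * (toy$loan == "yes")))
toy_model <- stats::glm(y ~ loan + marital, data = toy, family = binomial)

# Hypothetical helper: joint Wald test that every coefficient whose name
# matches term_pattern is zero. The statistic is b' V^{-1} b, compared with
# a chi-square distribution with one degree of freedom per coefficient.
wald_term_test <- function(model, term_pattern) {
  idx  <- grep(term_pattern, names(stats::coef(model)))
  b    <- stats::coef(model)[idx]
  V    <- stats::vcov(model)[idx, idx, drop = FALSE]
  stat <- as.numeric(t(b) %*% solve(V) %*% b)
  df   <- length(idx)
  list(statistic = stat,
       df        = df,
       p_value   = stats::pchisq(stat, df = df, lower.tail = FALSE))
}

marital_test <- wald_term_test(toy_model, "^marital")
loan_test    <- wald_term_test(toy_model, "^loan")
print(marital_test)
print(loan_test)
```

For a term with a single coefficient, such as loan here, the statistic reduces to the square of the usual z value, so the term-level test only adds information for factors with more than two levels.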

Create plots of the model relativities with pretty_relativities()

pretty_relativities(feature_to_plot = 'job',
                    model_object = deposit_model)
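As a rough guide to what a relativity is, the sketch below derives per-level relativities by hand, assuming a relativity is the exponentiated coefficient with the reference level fixed at 1 (for a logistic model this is an odds ratio). The data are simulated and the table layout is an illustrative assumption, not the output of pretty_relativities():

```r
set.seed(7)
n <- 600

# Simulated data, made up for illustration.
toy <- data.frame(
  job = factor(sample(c("admin", "services", "technician"), n, replace = TRUE))
)
toy$y <- rbinom(n, size = 1, prob = plogis(-1 + 0.6 * (toy$job == "services")))
toy_model <- stats::glm(y ~ job, data = toy, family = binomial)

est <- stats::coef(toy_model)
se  <- sqrt(diag(stats::vcov(toy_model)))

# Coefficient names for the non-reference levels, e.g. "jobservices".
job_terms <- paste0("job", levels(toy$job)[-1])

# Relativity of each level versus the reference level ("admin"), with an
# approximate 95% interval computed on the log scale and then exponentiated.
relativities <- data.frame(
  level      = levels(toy$job),
  relativity = c(1, unname(exp(est[job_terms]))),
  lower      = c(1, unname(exp(est[job_terms] - 1.96 * se[job_terms]))),
  upper      = c(1, unname(exp(est[job_terms] + 1.96 * se[job_terms])))
)
print(relativities)
```

Levels whose interval includes 1 are not clearly distinguishable from the reference level.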

pretty_relativities(feature_to_plot = 'age_cat',
                    model_object = deposit_model,
                    plot_factor_as_numeric = TRUE)
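Plotting a factor as numeric presumably relies on its levels being numeric-looking labels, which is what labelling the age groups by their means provides. The base R sketch below approximates that binning step on simulated ages and shows that the levels convert cleanly back to numbers. It is an approximation only: Hmisc::cut2() uses its own interval conventions, and the group count `g = 5` is chosen here for brevity:

```r
set.seed(3)

# Simulated ages, made up for illustration.
age <- round(runif(200, min = 18, max = 80))
g   <- 5  # number of quantile groups (an arbitrary choice for this sketch)

# Quantile-based break points; unique() guards against tied quantiles.
breaks <- unique(stats::quantile(age, probs = seq(0, 1, length.out = g + 1)))
bins   <- cut(age, breaks = breaks, include.lowest = TRUE)

# Replace each interval label with the mean age of the observations in it.
bin_means <- tapply(age, bins, mean)
age_cat   <- factor(round(bin_means[as.integer(bins)], 2))

# The levels are strings, but every one of them parses as a number, so the
# factor can be placed on a continuous axis.
numeric_levels <- as.numeric(levels(age_cat))
print(numeric_levels)
```

If the levels were interval labels such as "[18,30)" instead, the conversion would produce NA values and the factor could only be plotted as discrete categories.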