dialr is an R interface to Google’s libphonenumber Java library.
libphonenumber defines the PhoneNumberUtil class, a set of functions for extracting information from and processing a parsed Phonenumber object. A phone number must be parsed before any other operation (e.g. checking validity or formatting) can be performed.
dialr provides an interface to these functions so that phone numbers can be parsed and processed easily in R.
A phone class vector stores a parsed Java Phonenumber object for further processing, alongside the original raw text phone number and default region. This “default region” is required to determine the processing context for non-international numbers.
To create a phone vector, use the phone() function. It takes a character vector of phone numbers to parse and a default region for numbers that lack the leading “+” of an international format.
library(dialr)
# Parse phone number vector
x <- c(0, 0123, "0404 753 123", "61410123817", "+12015550123")
x <- phone(x, "AU")
is.phone(x)
#> [1] TRUE
print(x)
#> # Parsed phone numbers: 5 total, 4 successfully parsed
#> [1] 0 123 0404 753 123 61410123817 +12015550123
is_parsed(x) # Was the phone number successfully parsed?
#> [1] FALSE TRUE TRUE TRUE TRUE
is_valid(x) # Is the phone number valid?
#> [1] FALSE FALSE TRUE TRUE TRUE
is_possible(x) # Is the phone number possible?
#> [1] FALSE FALSE TRUE TRUE TRUE
get_region(x) # What region (ISO country code) is the phone number from?
#> [1] NA NA "AU" "AU" "US"
get_type(x) # Is the phone number a fixed line, mobile etc.
#> [1] NA "UNKNOWN" "MOBILE"
#> [4] "MOBILE" "FIXED_LINE_OR_MOBILE"
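The default region does not have to be a single value. The dplyr example in this document passes a whole column as the region, which suggests that phone() accepts one region per number. A minimal sketch under that assumption (the input numbers are made up for illustration):

```r
library(dialr)

# Illustrative inputs: one Australian and one US number in national format
raw    <- c("0404 753 123", "(201) 555-0123")
region <- c("AU", "US")   # one default region per number

p <- phone(raw, region)

is_parsed(p)                 # did each number parse?
get_region(p)                # region reported for each parsed number
format(p, format = "E164")   # canonical international form
```

Supplying the region per element is useful when the raw numbers come from records that each carry their own country.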
Equality comparisons for phone numbers ignore formatting differences and compare the underlying phone number.
phone("0404 753 123", "AU") == phone("+61404753123", "US")
#> [1] TRUE
phone("0404 753 123", "AU") == phone("0404 753 123", "US")
#> [1] FALSE
phone("0404 753 123", "AU") != phone("0404 753 123", "US")
#> [1] TRUE
Parsed phone numbers can also be compared to character phone numbers stored in an international format.
phone("0404 753 123", "AU") == c("+61404753123", "0404 753 123")
#> [1] TRUE FALSE
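Because comparisons look at the underlying number rather than its formatting, the same idea can be used to remove duplicates from raw input. The sketch below does not rely on any dedicated dialr helper. It simply uses the "E164" string from format() as a canonical key and applies base R's duplicated(). The input values are illustrative.

```r
library(dialr)

raw <- c("0404 753 123", "+61 404 753 123", "+12015550123")
p <- phone(raw, "AU")

# Canonical key: the E164 string, e.g. "+61404753123"
key <- format(p, format = "E164")

# Keep the first raw entry for each distinct underlying number
unique_raw <- raw[!duplicated(key)]
unique_raw
```

Numbers that fail to parse all share an NA key, so they would collapse into a single entry. Filter on is_parsed() first if unparsed input should be kept.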
Use is_match() for more customisable comparisons.
is_match(phone("0404 753 123", "AU"), c("+61404753123", "0404753123", "1234"))
#> [1] TRUE FALSE FALSE
is_match(phone("0404 753 123", "AU"), c("+61404753123", "0404753123", "1234"), detailed = TRUE)
#> [1] "EXACT_MATCH" "NSN_MATCH" "NO_MATCH"
is_match(phone("0404 753 123", "AU"), c("+61404753123", "0404753123", "1234"), strict = FALSE)
#> [1] TRUE TRUE FALSE
The phone class has a format() method implementing libphonenumber’s core formatting functionality.
There are four phone number formats used by libphonenumber (see “Further reading” for details): "E164", "NATIONAL", "INTERNATIONAL" and "RFC3966". These can be specified by the format argument, or a default can be specified in option dialr.format.
If clean = TRUE, all non-numeric characters are removed except for a leading +. clean = TRUE by default.
x <- phone(c(0, 0123, "0404 753 123", "61410123817", "+12015550123"), "AU")
format(x, format = "RFC3966")
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
format(x, format = "RFC3966", clean = FALSE)
#> [1] NA "tel:+61-123" "tel:+61-404-753-123"
#> [4] "tel:+61-410-123-817" "tel:+1-201-555-0123"
format(x, format = "E164", clean = FALSE)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
format(x, format = "NATIONAL", clean = FALSE)
#> [1] NA "123" "0404 753 123" "0410 123 817"
#> [5] "(201) 555-0123"
format(x, format = "INTERNATIONAL", clean = FALSE)
#> [1] NA "+61 123" "+61 404 753 123" "+61 410 123 817"
#> [5] "+1 201-555-0123"
format(x, format = "RFC3966", clean = FALSE)
#> [1] NA "tel:+61-123" "tel:+61-404-753-123"
#> [4] "tel:+61-410-123-817" "tel:+1-201-555-0123"
# Change the default
getOption("dialr.format")
#> [1] "E164"
format(x)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
options(dialr.format = "NATIONAL")
format(x)
#> [1] NA "123" "0404753123" "0410123817" "2015550123"
options(dialr.format = "E164")
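Changing dialr.format with options() affects every later format() call in the session. When a different default is only needed for one step, it can be set temporarily and restored afterwards. The helper below is hypothetical (it is not part of dialr) and relies only on base R's options() and on.exit():

```r
library(dialr)

# Hypothetical helper: format with a temporary default, then restore it
format_with_default <- function(x, default) {
  old <- options(dialr.format = default)  # options() returns the old values
  on.exit(options(old), add = TRUE)       # restore even if format() errors
  format(x)
}

x1 <- phone("0404 753 123", "AU")
before <- getOption("dialr.format")

format_with_default(x1, "NATIONAL")
identical(getOption("dialr.format"), before)  # the global default is unchanged
```

For a single call, passing the format argument directly is simpler. The helper is mainly useful when format() is called indirectly, for example through as.character(x, raw = FALSE).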
If the home argument is supplied, the phone number is formatted for dialling from the specified country.
format(x, home = "AU")
#> [1] NA "123" "0404753123" "0410123817"
#> [5] "001112015550123"
format(x, home = "US")
#> [1] NA "01161123" "01161404753123" "01161410123817"
#> [5] "12015550123"
format(x, home = "JP")
#> [1] NA "01061123" "01061404753123" "01061410123817"
#> [5] "01012015550123"
If strict = TRUE, invalid phone numbers (determined using is_valid()) return NA.
format(x)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
format(x, strict = TRUE)
#> [1] NA NA "+61404753123" "+61410123817" "+12015550123"
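To make the effect of strict concrete, the sketch below builds the same result by hand: format everything, then blank out the entries that is_valid() rejects. It is an illustration of the documented behaviour, not a description of how dialr implements it internally.

```r
library(dialr)

x <- phone(c(0, 0123, "0404 753 123", "61410123817", "+12015550123"), "AU")

# Manual equivalent of strict = TRUE: drop anything is_valid() rejects
manual <- format(x)
manual[!is_valid(x)] <- NA

strict <- format(x, strict = TRUE)

# Compare the two results element by element
data.frame(manual = manual, strict = strict)
```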
By default, as.character() returns the raw text phone number. Use raw = FALSE to use the format() method instead.
as.character(x)
#> [1] "0" "123" "0404 753 123" "61410123817" "+12015550123"
as.character(x, raw = FALSE)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
dialr functions are designed to work well in dplyr workflows.
# Use with dplyr
library(dplyr)
y <- tibble(id = 1:4,
            phone1 = c(0, 0123, "0404 753 123", "61410123817"),
            phone2 = c("03 9388 1234", 1234, "+12015550123", 0),
            country = c("AU", "AU", "AU", "AU"))

y %>%
  mutate_at(vars(matches("^phone")), ~phone(., country)) %>%
  mutate_at(vars(matches("^phone")),
            list(valid = is_valid,
                 region = get_region,
                 type = get_type,
                 clean = format))
#> # A tibble: 4 × 12
#> id phone1 phone2 country phone1_valid phone2_…¹ phone…² phone…³
#> <int> <phone> <phone> <chr> <lgl> <lgl> <chr> <chr>
#> 1 1 NA +61393881234 AU FALSE TRUE <NA> AU
#> 2 2 +61123 +611234 AU FALSE FALSE <NA> <NA>
#> 3 3 +61404753123 +12015550123 AU TRUE TRUE AU US
#> 4 4 +61410123817 NA AU TRUE FALSE AU <NA>
#> # … with 4 more variables: phone1_type <chr>, phone2_type <chr>,
#> # phone1_clean <chr>, phone2_clean <chr>, and abbreviated variable names
#> # ¹phone2_valid, ²phone1_region, ³phone2_region
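The example above uses mutate_at(), which dplyr has superseded in favour of across() (available in dplyr 1.0.0 and later). The following sketch rewrites the same pipeline with across(), assuming phone vectors behave in across() as they do in mutate_at(). Only two of the four summary functions are kept for brevity.

```r
library(dialr)
library(dplyr)

y <- tibble(id = 1:4,
            phone1 = c(0, 0123, "0404 753 123", "61410123817"),
            phone2 = c("03 9388 1234", 1234, "+12015550123", 0),
            country = c("AU", "AU", "AU", "AU"))

res <- y %>%
  # Parse every phone* column, using the row's country as the default region
  mutate(across(matches("^phone"), ~ phone(.x, country))) %>%
  # Derive <column>_<function> summaries, as mutate_at() did above
  mutate(across(matches("^phone"),
                list(valid = is_valid, region = get_region),
                .names = "{.col}_{.fn}"))

names(res)
```

The two steps are kept in separate mutate() calls so that the second across() only matches the original phone1 and phone2 columns.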
"E164"
: general format for international telephone numbers from ITU-T Recommendation E.164
"NATIONAL"
: national notation from ITU-T Recommendation E.123
"INTERNATIONAL"
: international notation from ITU-T Recommendation E.123
"RFC3966"
: “tel” URI syntax from the IETF tel URI for Telephone Numbers