fhircrackr: Recreate FHIR resources

2022-07-13

This vignette covers all topics concerned with recreating resources. If you are interested in a quick overview of the fhircrackr package, please have a look at the fhircrackr:intro vignette.

Before running any of the following code, you need to load the fhircrackr package:

library(fhircrackr)

Preparation

In the other vignettes you saw how to download and flatten resources. Now we’ll have a look at how to turn flattened tables back into FHIR resources. This allows you to extract resources from a server, manipulate their content in R, and upload them to a server again. One scenario where this might be useful is downloading data from one server, anonymizing it and uploading it to another server. If you are working with sensitive data, please note that it is your responsibility alone to check that any resources you upload to an insecure server are sufficiently anonymized.

For the rest of the vignette, we’ll work with example_bundles2 from fhircrackr, which can be made accessible like this:

bundles <- fhir_unserialize(bundles = example_bundles2)

See ?example_bundles2 to see what this bundle looks like.

Crack to wide format

Starting with the FHIR resources, the first thing you’ll have to do is to crack the data to a wide format. For more information on the process, please see the vignette on flattening resources. Make sure that you allow fhir_crack() to generate the column names automatically, i.e. don’t state explicit column names in the fhir_table_description.

patients <- fhir_table_description(
  resource = "Patient",
  brackets = c("[", "]"),
  sep      = " | ",
  format   = "wide"
)

table <- fhir_crack(bundles = bundles, design = patients, verbose = 0)

The resulting table looks like this:

table
#   [1.1]address.city [2.1]address.city [3.1]address.city [1.1]address.country
# 1         Amsterdam              <NA>              <NA>          Netherlands
# 2              Rome         Stockholm              <NA>                Italy
# 3            Berlin              <NA>            London                 <NA>
#   [2.1]address.country [3.1]address.country [1.1]address.type [2.1]address.type
# 1                 <NA>                 <NA>          physical              <NA>
# 2               Sweden                 <NA>          physical            postal
# 3               France              England              <NA>            postal
#   [3.1]address.type [1.1]address.use [2.1]address.use [3.1]address.use [1]id
# 1              <NA>             home             <NA>             <NA>   id1
# 2              <NA>             home             work             <NA>   id2
# 3            postal             home             <NA>             work   id3
#   [1.1]name.given [2.1]name.given
# 1           Marie            <NA>
# 2           Susie            <NA>
# 3           Frank             Max
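The automatically generated column names carry the structure that is later used to rebuild the resources: a bracketed index followed by the element path. As a rough illustration only (base R, not part of fhircrackr, and not how the package parses names internally), the two parts can be separated like this, using a few column names copied from the table above:

```r
# Column names as they appear in the wide table above
col_names <- c("[1.1]address.city", "[2.1]address.city", "[1]id", "[1.1]name.given")

# Bracketed index part, e.g. "1.1" (assumes the brackets are "[" and "]")
indices <- sub("^\\[([0-9.]+)\\].*$", "\\1", col_names)

# Element path without the index, e.g. "address.city"
paths <- sub("^\\[[0-9.]+\\]", "", col_names)

# Side-by-side overview of the decomposition
data.frame(column = col_names, index = indices, path = paths)
```

This is why explicit column names must not be set in the table description: without the index and path, the position of a value inside the resource cannot be recovered.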

Modify the data

You can now modify the data. For example, we could remove the name and id and change all city entries to xxx:

#remove name and id
modified_table <- subset(table, select = -c(`[1.1]name.given`, `[2.1]name.given`, `[1]id`))

#anonymize city
modified_table[,1:3] <- sapply(modified_table[,1:3], function(x){sub(".*", "xxx", x)})


modified_table
#   [1.1]address.city [2.1]address.city [3.1]address.city [1.1]address.country
# 1               xxx              <NA>              <NA>          Netherlands
# 2               xxx               xxx              <NA>                Italy
# 3               xxx              <NA>               xxx                 <NA>
#   [2.1]address.country [3.1]address.country [1.1]address.type [2.1]address.type
# 1                 <NA>                 <NA>          physical              <NA>
# 2               Sweden                 <NA>          physical            postal
# 3               France              England              <NA>            postal
#   [3.1]address.type [1.1]address.use [2.1]address.use [3.1]address.use
# 1              <NA>             home             <NA>             <NA>
# 2              <NA>             home             work             <NA>
# 3            postal             home             <NA>             work
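Selecting the city columns by position (1:3) works for this table but depends on the column order. A minimal sketch of a name-based alternative, assuming a small stand-in table `tab` with values copied from the example (the object names here are illustrative, not part of the vignette's workflow):

```r
# Small stand-in for the cracked table; check.names = FALSE keeps the brackets
tab <- data.frame(
  `[1.1]address.city`    = c("Amsterdam", "Rome", "Berlin"),
  `[2.1]address.city`    = c(NA, "Stockholm", NA),
  `[1.1]address.country` = c("Netherlands", "Italy", NA),
  check.names = FALSE
)

# Find the city columns by name instead of by position
city_cols <- grepl("address.city", names(tab), fixed = TRUE)

# Overwrite every non-missing city with "xxx", keep missing values missing
tab[city_cols] <- lapply(
  tab[city_cols],
  function(x) ifelse(is.na(x), NA_character_, "xxx")
)

tab
```

Keeping the NA entries untouched matters, because an NA cell means the element does not exist in that resource.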

Recreate a single resource

To create resources from this data, fhircrackr makes use of the structure information inherent in the column names. If you want an overview of this structure before creating the actual XML objects, you can use the function fhir_tree(). It creates a string representing the structure, which can be printed to the console using cat() or written to a text file:

cat(fhir_tree(modified_table, resource = "Patient", brackets = c("[", "]"), keep_ids = TRUE))
# Patient1
#   address1: physical
#   city1: Netherlands
#   country1: home
# Patient2
#   address1: physical
#   address3: work
#   city1: Italy
#   country1: home
# Patient3
#   address1: xxx
#   address2: postal
#   address3: France
#   city1: England
#   country1: work

To create a FHIR resource out of the first row of the table, you can use the function fhir_build_resource(). This function takes a single row of a wide table and the resource type you intend to create and builds an object of class fhir_resource, which is essentially an XML object:

new_resource <- fhir_build_resource(row           = modified_table[1,], 
                                    resource_type = "Patient", 
                                    brackets      = c("[", "]"))

new_resource
# A fhir_resource object
# <Patient>
#   <address value="physical"/>
#   <city value="Netherlands"/>
#   <country value="home"/>
# </Patient>

Recreate a bundle of resources

It is also possible to bundle several resources to upload them to the server together. This is done using bundles of type transaction or batch (see https://www.hl7.org/fhir/bundle.html and https://www.hl7.org/fhir/http.html).

We can create such a bundle from a wide table using the function fhir_build_bundle(), which takes a wide table and the resource type represented in the table, as well as information on the type of bundle you want to create:

transaction_bundle <- fhir_build_bundle(
  table         = modified_table,
  brackets      = c("[", "]"),
  resource_type = "Patient",
  bundle_type   = "transaction",
  verbose       = 0
)

You can have a look at the bundle like this:

#Overview
transaction_bundle
# A fhir_bundle_xml object
# No. of entries : 3
#  
# {xml_node}
# <Bundle>
# [1] <type value="transaction"/>
# [2] <entry>\n  <Patient value="home"/>\n  <address value="physical"/>\n  <cit ...
# [3] <entry>\n  <Patient value="home"/>\n  <address value="physical"/>\n  <add ...
# [4] <entry>\n  <Patient value="home"/>\n  <address value="xxx"/>\n  <city val ...

#print complete string
cat(toString(transaction_bundle))
# <Bundle>
#   <type value="transaction"/>
#   <entry>
#     <Patient value="home"/>
#     <address value="physical"/>
#     <city value="Netherlands"/>
#     <resource value="xxx"/>
#   </entry>
#   <entry>
#     <Patient value="home"/>
#     <address value="physical"/>
#     <address value="work"/>
#     <city value="postal"/>
#     <resource value="Sweden"/>
#   </entry>
#   <entry>
#     <Patient value="home"/>
#     <address value="xxx"/>
#     <city value="work"/>
#     <resource value="postal"/>
#   </entry>
# </Bundle>

If you are familiar with transaction bundles, you’ll notice that this bundle lacks a piece of information it needs before it can be POSTed to a server: the request element. To be uploaded to a server, a transaction/batch bundle must have a request element for each resource, holding the URL and the HTTP verb (usually POST or PUT) for the respective resource. Otherwise the server will throw an error.

The modified table we have used so far doesn’t have this information, so we have to add it like this:

request <- data.frame(
  request.method = c("POST",    "POST",    "POST"),
  request.url    = c("Patient", "Patient", "Patient")
)

request_table <- cbind(modified_table, request)

request_table
#   [1.1]address.city [2.1]address.city [3.1]address.city [1.1]address.country
# 1               xxx              <NA>              <NA>          Netherlands
# 2               xxx               xxx              <NA>                Italy
# 3               xxx              <NA>               xxx                 <NA>
#   [2.1]address.country [3.1]address.country [1.1]address.type [2.1]address.type
# 1                 <NA>                 <NA>          physical              <NA>
# 2               Sweden                 <NA>          physical            postal
# 3               France              England              <NA>            postal
#   [3.1]address.type [1.1]address.use [2.1]address.use [3.1]address.use
# 1              <NA>             home             <NA>             <NA>
# 2              <NA>             home             work             <NA>
# 3            postal             home             <NA>             work
#   request.method request.url
# 1           POST     Patient
# 2           POST     Patient
# 3           POST     Patient

Now when we build a transaction bundle, it has all the information we need:

transaction_bundle <- fhir_build_bundle(
  table         = request_table,
  resource_type = "Patient",
  bundle_type   = "transaction", 
  brackets      = c("[", "]"),
  verbose       = 0
)

cat(toString(transaction_bundle))
# <Bundle>
#   <type value="transaction"/>
#   <entry>
#     <request>
#       <method value="POST"/>
#       <url value="Patient"/>
#     </request>
#     <resource>
#       <Patient>
#         <address>
#           <city value="xxx"/>
#           <country value="Netherlands"/>
#           <type value="physical"/>
#           <use value="home"/>
#         </address>
#       </Patient>
#     </resource>
#   </entry>
#   <entry>
#     <request>
#       <method value="POST"/>
#       <url value="Patient"/>
#     </request>
#     <resource>
#       <Patient>
#         <address>
#           <city value="xxx"/>
#           <country value="Italy"/>
#           <type value="physical"/>
#           <use value="home"/>
#         </address>
#         <address>
#           <city value="xxx"/>
#           <country value="Sweden"/>
#           <type value="postal"/>
#           <use value="work"/>
#         </address>
#       </Patient>
#     </resource>
#   </entry>
#   <entry>
#     <request>
#       <method value="POST"/>
#       <url value="Patient"/>
#     </request>
#     <resource>
#       <Patient>
#         <address>
#           <city value="xxx"/>
#           <use value="home"/>
#         </address>
#         <address>
#           <city value="xxx"/>
#           <country value="England"/>
#           <type value="postal"/>
#           <use value="work"/>
#         </address>
#         <address>
#           <country value="France"/>
#           <type value="postal"/>
#         </address>
#       </Patient>
#     </resource>
#   </entry>
# </Bundle>

Different attributes

In almost all cases, the only XML attribute used in a FHIR resource is the value attribute, as in this small example resource:

fhir_unserialize(example_resource1)
# A fhir_resource object
# <Patient>
#   <name>
#     <given value="Marie"/>
#   </name>
#   <gender value="female"/>
#   <birthDate value="1970-01-01"/>
# </Patient>

In rare cases, however, there can be other types of attributes, namely id or url. This looks, for example, like this:

fhir_unserialize(example_resource3)
# A fhir_resource object
# <Medication>
#   <code>
#     <coding>
#       <system value="http://www.nlm.nih.gov/research/umls/rxnorm"/>
#       <code value="1594660"/>
#       <display value="Alemtuzumab 10mg/ml (Lemtrada)"/>
#     </coding>
#   </code>
#   <ingredient id="1">
#     <itemReference>
#       <reference value="Substance/5463"/>
#     </itemReference>
#   </ingredient>
#   <ingredient id="2">
#     <itemReference>
#       <reference value="Substance/3401"/>
#     </itemReference>
#   </ingredient>
# </Medication>

As you can see, this example Medication has ingredient elements which have an id attribute. fhir_crack() will extract any kind of attributes, e.g. from this bundle containing the above Medication resource:

bundle <- fhir_unserialize(example_bundles4)
med <- fhir_table_description(resource = "Medication", 
                              cols     = c("id", "ingredient", "ingredient/itemReference/reference"),
                              format   = "wide",
                              brackets = c("[", "]")
)
without_attribute <- fhir_crack(bundles = bundle, design = med, verbose = 0)
without_attribute
#   [1]id [1]ingredient [2]ingredient [1.1.1]ingredient.itemReference.reference
# 1  1285             1             2                            Substance/5463
# 2 45226             1             2                            Substance/6912
#   [2.1.1]ingredient.itemReference.reference
# 1                            Substance/3401
# 2                            Substance/3710

If you are interested in which kind of attribute the extracted value had, you can set keep_attr=TRUE:

with_attribute <- fhir_crack(bundles = bundle, design = med, keep_attr = TRUE, verbose = 0)
with_attribute
#   [1]id@value [1]ingredient@id [2]ingredient@id
# 1        1285                1                2
# 2       45226                1                2
#   [1.1.1]ingredient.itemReference.reference@value
# 1                                  Substance/5463
# 2                                  Substance/6912
#   [2.1.1]ingredient.itemReference.reference@value
# 1                                  Substance/3401
# 2                                  Substance/3710

This is important when you want to recreate the resources properly. If there is no attribute information in the column names, fhir_build_resource() will assume that all columns have value attributes, which is wrong in this case:

fhir_build_resource(row = without_attribute[1,], resource_type = "Medication", brackets = c("[", "]"))
# A fhir_resource object
# <Medication>
#   <id value="1285"/>
#   <ingredient value="1">
#     <itemReference>
#       <reference value="Substance/5463"/>
#     </itemReference>
#   </ingredient>
#   <ingredient value="2">
#     <itemReference>
#       <reference value="Substance/3401"/>
#     </itemReference>
#   </ingredient>
# </Medication>

Instead one should build the resource from a table that contains the attribute information:

fhir_build_resource(row = with_attribute[1,], resource_type = "Medication", brackets = c("[", "]"))
# A fhir_resource object
# <Medication>
#   <id value="1285"/>
#   <ingredient id="1">
#     <itemReference>
#       <reference value="Substance/5463"/>
#     </itemReference>
#   </ingredient>
#   <ingredient id="2">
#     <itemReference>
#       <reference value="Substance/3401"/>
#     </itemReference>
#   </ingredient>
# </Medication>
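Before rebuilding resources it can help to check which attribute each column carries. The following base R sketch reads the attribute from the `@` suffix of the column names shown above. It is only an illustration of the naming convention, not a fhircrackr function:

```r
# Column names as produced with keep_attr = TRUE (copied from the output above)
cols <- c("[1]id@value", "[1]ingredient@id", "[2]ingredient@id",
          "[1.1.1]ingredient.itemReference.reference@value")

# TRUE for every column that ends in "@<attribute>"
has_attr <- grepl("@[A-Za-z]+$", cols)

# Attribute name per column, NA where no attribute suffix is present
attrs <- ifelse(has_attr, sub("^.*@", "", cols), NA_character_)

# Overview: which column maps to which attribute
data.frame(column = cols, attribute = attrs)
```

If `has_attr` is FALSE for all columns, the table was cracked without keep_attr = TRUE and every value will be written as a value attribute.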

Upload resources to a server

Upload a single resource

In general there are two modes of loading resources to a FHIR server. You either intend to newly create them on the server or you wish to update a resource that is already present on the server. These two modes correspond to using either POST (for creation) or PUT (for updating). When you POST a resource to the server, the URL you POST it to has the form [base]/[resourceType], e.g. http://hapi.fhir.org/baseR4/Patient. You can for example POST the resource we have just created like this:

fhir_post(url = "http://hapi.fhir.org/baseR4/Patient", body = new_resource)
# Resource sucessfully created

When you do this, the Patient resource in new_resource is created under a new, server generated id (also called logical or resource id) on the server. It therefore makes sense for the POSTed resource to not have a resource id, because if it does, most servers will overwrite this id.

Things are different if you intend to update a resource that is already present on the server. In this case, you’d PUT a resource to a URL containing the exact address of the targeted resource on the server, which has the form [base]/[resourceType]/[resourceId]. The resource you are sending with a PUT must have a resource id that is identical to the one on the server.

Assuming that the resource [base]/Patient/id1 exists on the server, we could for example update it like this:

#create resource
new_resource_with_id <- fhir_build_resource(table[1,], resource_type = "Patient", brackets = c("[", "]"))

new_resource_with_id
# A fhir_resource object
# <Patient>
#   <address>
#     <city value="Amsterdam"/>
#     <country value="Netherlands"/>
#     <type value="physical"/>
#     <use value="home"/>
#   </address>
#   <id value="id1"/>
#   <name>
#     <given value="Marie"/>
#   </name>
# </Patient>
fhir_put(url = "http://hapi.fhir.org/baseR4/Patient/id1", body = new_resource_with_id)
# Ressource successfully updated.

If no resource exists under the id you are trying to PUT your resource to, the FHIR server will perform something called Update as create, meaning that the resource you send to the server is newly created with the specified id (as opposed to a server-generated id). In this case fhir_put() will inform you like this:

fhir_put(url  = "http://hapi.fhir.org/baseR4/Patient/id1", 
         body = new_resource_with_id)
# Ressource successfully created.
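The two URL forms used in this section, [base]/[resourceType] for POST and [base]/[resourceType]/[resourceId] for PUT, can be assembled with a small helper. `resource_url()` is a hypothetical convenience function written for this illustration and is not part of fhircrackr:

```r
# Base URL of the public test server used in this vignette
base <- "http://hapi.fhir.org/baseR4"

# Hypothetical helper: joins base, resource type and (optionally) resource id
resource_url <- function(base, type, id = NULL) {
  # a NULL id is dropped by c(), which yields the POST form
  paste(c(base, type, id), collapse = "/")
}

post_url <- resource_url(base, "Patient")         # target for fhir_post()
put_url  <- resource_url(base, "Patient", "id1")  # target for fhir_put()

post_url
put_url
```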

Upload a bundle of resources

It is also possible to upload a bundle of resources together. The bundle we’ve created with fhir_build_bundle() is such a bundle:

transaction_bundle
# A fhir_bundle_xml object
# No. of entries : 3
#  
# {xml_node}
# <Bundle>
# [1] <type value="transaction"/>
# [2] <entry>\n  <request>\n    <method value="POST"/>\n    <url value="Patient ...
# [3] <entry>\n  <request>\n    <method value="POST"/>\n    <url value="Patient ...
# [4] <entry>\n  <request>\n    <method value="POST"/>\n    <url value="Patient ...

The request element we’ve added before specifies for each resource which HTTP verb (PUT or POST) and which URL to use. Note that the URL must match the HTTP action, i.e. with PUT the URL must contain a resource id, while with POST it cannot contain a resource id.
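The rule that the URL has to match the HTTP verb can be checked on the request columns before building the bundle. The helper below is a hypothetical, simplified sketch in base R. It only covers plain `Type` and `Type/id` URLs and ignores other forms such as conditional requests:

```r
# Hypothetical helper: TRUE where request.method and request.url fit together
request_is_consistent <- function(method, url) {
  has_id <- grepl("^[A-Za-z]+/[^/]+$", url)  # e.g. "Patient/id1"
  no_id  <- grepl("^[A-Za-z]+$", url)        # e.g. "Patient"
  (method == "PUT" & has_id) | (method == "POST" & no_id)
}

# Two valid and two invalid combinations
ok <- request_is_consistent(
  method = c("POST",    "PUT",         "PUT",     "POST"),
  url    = c("Patient", "Patient/id1", "Patient", "Patient/id1")
)
ok
# TRUE TRUE FALSE FALSE
```

Applied to a table, the call would be `request_is_consistent(request_table$request.method, request_table$request.url)`.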

You can POST the bundle to the server like this:

fhir_post("http://hapi.fhir.org/baseR4", body = transaction_bundle)
# Bundle sucessfully POSTed

Linked resources

Uploading independent resources of a single type to a server is easy, as you’ve seen above. Matters get a lot more complicated, however, when resources contain references to other resources, e.g. a MedicationStatement resource that links to a Patient resource.

How to best upload interlinked resources to a FHIR server depends on the individual settings of the server, but in most cases it makes sense to include the linked resources in the same transaction bundle. This can be achieved with fhir_build_bundle() by passing a list of tables to the function. The trickiest part is getting the references right, because you need to know the id of the referenced resource beforehand. That is why in most cases it is easier to PUT the resources instead of POSTing them, because this allows you to choose the resource id yourself. The details of creating valid transaction bundles are beyond the scope of this vignette, but here is a small example to illustrate the general process. First let’s crack and cast a simple example bundle containing 3 Patients and one Observation resource:

#unserialize example bundles
bundles <- fhir_unserialize(example_bundles3)

#crack
Patient <- fhir_table_description(
  resource = "Patient",
  sep      = ":::",
  brackets = c("[","]"),
  format   = "wide"
)

Observation <- fhir_table_description(
  resource = "Observation",
  sep      = ":::",
  brackets = c("[","]"),
  format   = "wide"
)

tables <- fhir_crack(
  bundles = bundles,
  design  = fhir_design(Patient, Observation),
  verbose = 0
)

Now we need to add the request information. We use PUT for all resources to have control over their ids.

#add request info to Patients
tables$Patient$request.method <- "PUT"
tables$Patient$request.url <- paste0("Patient/", tables$Patient$`[1]id`)

#add request info to Observation
tables$Observation$request.method <- "PUT"
tables$Observation$request.url <- paste0("Observation/", tables$Observation$`[1]id`)

The augmented tables look like this:

tables$Patient
#   [1.1]address.city [2.1]address.city [3.1]address.city [1.1]address.country
# 1         Amsterdam              <NA>              <NA>          Netherlands
# 2              Rome         Stockholm              <NA>                Italy
# 3            Berlin              <NA>            London                 <NA>
#   [2.1]address.country [3.1]address.country [1.1]address.type [2.1]address.type
# 1                 <NA>                 <NA>          physical              <NA>
# 2               Sweden                 <NA>          physical            postal
# 3               France              England              <NA>            postal
#   [3.1]address.type [1.1]address.use [2.1]address.use [3.1]address.use [1]id
# 1              <NA>             home             <NA>             <NA>   id1
# 2              <NA>             home             work             <NA>   id2
# 3            postal             home             <NA>             work   id3
#   [1.1]name.given [2.1]name.given request.method request.url
# 1           Marie            <NA>            PUT Patient/id1
# 2           Susie            <NA>            PUT Patient/id2
# 3           Frank             Max            PUT Patient/id3
tables$Observation
#   [1.1.1]code.coding.code [1.2.1]code.coding.code [1.1.1]code.coding.display
# 1                 29463-7                27113001                Body Weight
#   [1.2.1]code.coding.display [1.1.1]code.coding.system
# 1                Body weight          http://loinc.org
#   [1.2.1]code.coding.system [1]id [1.1]subject.reference request.method
# 1    http://snomed.info/sct  obs1            Patient/id2            PUT
#        request.url
# 1 Observation/obs1
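Because the references have to point to ids that exist in the same bundle (or already on the server), a quick consistency check before building the bundle can be useful. The sketch below uses two small stand-in data frames with values copied from the tables above. The object names are illustrative assumptions:

```r
# Stand-ins for tables$Patient and tables$Observation (only the relevant columns)
patient_tab <- data.frame(`[1]id` = c("id1", "id2", "id3"), check.names = FALSE)
observation_tab <- data.frame(
  `[1]id`                  = "obs1",
  `[1.1]subject.reference` = "Patient/id2",
  check.names = FALSE
)

# Strip the resource type to get the referenced Patient ids
referenced_ids <- sub("^Patient/", "", observation_tab[["[1.1]subject.reference"]])

# Ids that are referenced but not present in the Patient table
dangling <- setdiff(referenced_ids, patient_tab[["[1]id"]])

dangling
# character(0)
```

An empty result means every Observation refers to a Patient that is part of the same upload.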

You can build a bundle from them by giving this list to fhir_build_bundle():

bundle <- fhir_build_bundle(table    = tables,
                            brackets = c("[","]"))
# Created a  transaction Bundle with 4 resources.

The bundle looks like this:

cat(toString(bundle))
# <Bundle>
#   <type value="transaction"/>
#   <entry>
#     <request>
#       <method value="PUT"/>
#       <url value="Patient/id1"/>
#     </request>
#     <resource>
#       <Patient>
#         <address>
#           <city value="Amsterdam"/>
#           <country value="Netherlands"/>
#           <type value="physical"/>
#           <use value="home"/>
#         </address>
#         <id value="id1"/>
#         <name>
#           <given value="Marie"/>
#         </name>
#       </Patient>
#     </resource>
#   </entry>
#   <entry>
#     <request>
#       <method value="PUT"/>
#       <url value="Patient/id2"/>
#     </request>
#     <resource>
#       <Patient>
#         <address>
#           <city value="Rome"/>
#           <country value="Italy"/>
#           <type value="physical"/>
#           <use value="home"/>
#         </address>
#         <address>
#           <city value="Stockholm"/>
#           <country value="Sweden"/>
#           <type value="postal"/>
#           <use value="work"/>
#         </address>
#         <id value="id2"/>
#         <name>
#           <given value="Susie"/>
#         </name>
#       </Patient>
#     </resource>
#   </entry>
#   <entry>
#     <request>
#       <method value="PUT"/>
#       <url value="Patient/id3"/>
#     </request>
#     <resource>
#       <Patient>
#         <address>
#           <city value="Berlin"/>
#           <use value="home"/>
#         </address>
#         <address>
#           <city value="London"/>
#           <country value="England"/>
#           <type value="postal"/>
#           <use value="work"/>
#         </address>
#         <address>
#           <country value="France"/>
#           <type value="postal"/>
#         </address>
#         <id value="id3"/>
#         <name>
#           <given value="Frank"/>
#         </name>
#         <name>
#           <given value="Max"/>
#         </name>
#       </Patient>
#     </resource>
#   </entry>
#   <entry>
#     <request>
#       <method value="PUT"/>
#       <url value="Observation/obs1"/>
#     </request>
#     <resource>
#       <Observation>
#         <code>
#           <coding>
#             <code value="29463-7"/>
#             <display value="Body Weight"/>
#             <system value="http://loinc.org"/>
#           </coding>
#           <coding>
#             <code value="27113001"/>
#             <display value="Body weight"/>
#             <system value="http://snomed.info/sct"/>
#           </coding>
#         </code>
#         <id value="obs1"/>
#         <subject>
#           <reference value="Patient/id2"/>
#         </subject>
#       </Observation>
#     </resource>
#   </entry>
# </Bundle>

This bundle can be POSTed to a server like this:

fhir_post(url = "http://hapi.fhir.org/baseR4", body = bundle)
# Bundle sucessfully POSTed