Exporting large FHIR datasets from a FHIR server through the regular RESTful API can be tricky: each resource type requires a separate API call and, once pagination is taken into account, an export can take hundreds of requests. Bulk Data Export is an operation designed to solve this problem.

The FHIR Bulk Export Service is intended to fulfill the 21st Century Cures bulk data export requirements. The bulk export operation lets a client configure and invoke a data export in a single API call, whether for all patients, for a subset (defined group) of patients, or for all FHIR data on the server. The export runs asynchronously to reduce the load on the system, and the results remain available for download from media storage for several days.

The FHIR Bulk Export service enables API consumers to export USCDI (United States Core Data for Interoperability) clinical data for all patients in a particular context.

This implementation is based on the Bulk Data Access IG.

Process overview

  1. The client invokes the Bulk Data Export process by sending a Kick-Off request.
  2. The server validates the Kick-Off request and responds with a link to a job in the Content-Location header.
  3. The client checks the job status by polling the URL from the Content-Location header. If the export process is complete, the response contains links to the generated files.
  4. The client downloads the generated files using the links provided in the job response.
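
Step 2 of the overview can be sketched in Python as follows. This is a minimal sketch, not the service's client library: the function name job_status_url is made up for illustration, and it assumes a successful kick-off is answered with 202 Accepted, as described in the Bulk Data Access IG.

```python
def job_status_url(status_code, headers):
    """Return the job polling URL from a kick-off response.

    Assumption: a successful kick-off returns 202 Accepted with a
    Content-Location header, as described in the Bulk Data Access IG.
    """
    if status_code != 202:
        raise ValueError(f"kick-off rejected with HTTP {status_code}")
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    location = normalized.get("content-location")
    if not location:
        raise ValueError("kick-off response has no Content-Location header")
    return location
```

The returned URL is the one used later for status and delete requests.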

HL7 provides a sequence diagram of the bulk export workflow in the Bulk Data Access IG.

Implementation notes

  • Exported files are hosted in a protected AWS S3-compatible bucket in .ndjson format.
  • The URLs returned in the job completion response manifest are AWS S3 pre-signed URLs. These URLs are valid for 7 days after manifest retrieval.
  • When polling for job status or canceling a job, the FHIR client must have a valid auth token with the same client ID as the one used to initiate the export job.
  • Group-level export targets the Patient compartment for resources required by USCDI v2. This means the server exports resources that are referenced by a resource within the patient compartment and excludes resources with no data available on the patient record. Additionally, the server provides Encounter, Location, Organization, and Practitioner resources, as they are referenced as must-support elements in required resources.
  • Bulk Export requires the client to use JWT assertion-based authentication, as described in the SMART Backend Services protocol details. For the authorization supported by the Elation EMR FHIR API, refer to the link.
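
The JWT assertion mentioned above can be prepared roughly as follows. This is a sketch under stated assumptions: the claim set follows the SMART Backend Services profile (iss and sub set to the client ID, aud set to the token endpoint, exp at most five minutes ahead, a unique jti), signing is left out because it needs a cryptographic library, and the function names, client ID, token URL, and scope value are placeholders rather than values defined by this service.

```python
import time
import uuid
from urllib.parse import urlencode


def build_assertion_claims(client_id, token_url, lifetime_seconds=300, now=None):
    """Claims for the client assertion; the caller still has to sign them."""
    issued = int(time.time() if now is None else now)
    return {
        "iss": client_id,                  # issuer: the registered client ID
        "sub": client_id,                  # subject: same as the issuer
        "aud": token_url,                  # audience: the token endpoint URL
        "exp": issued + lifetime_seconds,  # at most five minutes in the future
        "jti": str(uuid.uuid4()),          # unique ID to prevent replay
    }


def build_token_request_body(signed_jwt, scope):
    """Form-encoded body for the client_credentials token request."""
    return urlencode({
        "grant_type": "client_credentials",
        "scope": scope,  # placeholder; use the scopes granted to your client
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": signed_jwt,
    })
```

The access token returned by the token endpoint is then sent with the kick-off, status, and delete requests.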

Kick-Off request

The Bulk Data Export process can be invoked via a GET request for smaller sets of query parameters or, for larger ones, via a POST request that supplies the parameters in a FHIR Parameters resource.

Levels of export

There are three endpoints available to customize the export for a particular use case:

Name | Description | URL Syntax
System Level Export | Export data from a FHIR server, whether or not it is associated with a patient. | [base url]/$export
All Patients | FHIR Operation to obtain a detailed set of FHIR resources of diverse resource types pertaining to all patients. | [base url]/Patient/$export
Group of Patients | FHIR Operation to obtain a detailed set of FHIR resources of diverse resource types pertaining to all members of a specified Group. | [base url]/Group/[id]/$export
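
The three URL shapes above can be produced by a small helper. This is an illustrative sketch; the function name export_url and the level labels "system", "patient", and "group" are made up for the example and are not part of the API.

```python
def export_url(base_url, level="system", group_id=None):
    """Build the $export URL for one of the three export levels."""
    base = base_url.rstrip("/")  # avoid a double slash before the path
    if level == "system":
        return f"{base}/$export"
    if level == "patient":
        return f"{base}/Patient/$export"
    if level == "group":
        if not group_id:
            raise ValueError("group-level export requires a Group id")
        return f"{base}/Group/{group_id}/$export"
    raise ValueError(f"unknown export level: {level}")
```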

Parameters

  • _since - FHIR instant
    • Resources are included in the response if their state has changed after the supplied time (e.g. if Resource.meta.lastUpdated is later than the supplied _since time).
    • Example: _since=2022-07-13T00:00:00Z
  • _type - string of comma-delimited FHIR resource types
    • Response is filtered to only include resources of the specified resource type(s).
    • Example: _type=Observation,Condition
    • Note: for Patient and Group level export, referenced patients are always included in the response.
  • _elements - string of comma-delimited FHIR Elements
    • Unlisted, non-mandatory elements are omitted from the resources returned. Elements should be of the form [resource type].[element name] (e.g. Patient.id) or [element name] (e.g. id), and only root elements in a resource are permitted. If the resource type is omitted, the element is returned for all resources in the response where it is applicable. Mandatory elements are always returned whether they are requested or not.
  • patient - FHIR Reference - Applies only to POST requests
    • Returns resources in the patient compartments belonging to the listed patients.
    • Note: Not applicable to system level export requests.
    • Example
        POST Example: {"name":"patient",
        "valueReference": {"reference": "Patient/ec89d1cc-48f3-49ad-a80e-bc94f2f542af"}}
    
  • _typeFilter - string of comma-separated FHIR REST search queries. When provided, the server filters the data in the response to only include resources that meet the specified criteria.
    • Note: Currently not supported for the Patient resource
    • Example:
    POST Example:   {"name":"_typeFilter",
     "valueString": "Condition?asserter=Practitioner/d010b87b-6d7d-460c-a52d-8f741b9b0a72"}
    
    GET Example:
    _typeFilter=Observation%3Fcode%3Dhttp://loinc.org|718-7,Condition%3Fcategory%3Dhttp://terminology.hl7.org/CodeSystem/condition-category|encounter-diagnosis
    
  • _outputFormat - string - Can be set to application/fhir+ndjson or application/ndjson or ndjson.

Note that for Patient level and Group level export, patients are always exported regardless of search parameters if they are referenced by resources present in the response.
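
For GET requests, the parameters above have to be combined into a query string, with the "?" and "=" inside each _typeFilter query percent-encoded. The helper below is a sketch of one way to do that; the function name and keyword arguments are illustrative, and the set of characters left unencoded simply mirrors the GET example above.

```python
from urllib.parse import quote


def export_query(since=None, types=None, elements=None,
                 type_filters=None, output_format=None):
    """Build the query string for a GET kick-off request."""
    params = []
    if since:
        params.append(("_since", since))
    if types:
        params.append(("_type", ",".join(types)))
    if elements:
        params.append(("_elements", ",".join(elements)))
    if type_filters:
        params.append(("_typeFilter", ",".join(type_filters)))
    if output_format:
        params.append(("_outputFormat", output_format))
    # Keep ',', ':', '/' and '|' readable; '?' and '=' become %3F and %3D.
    return "&".join(f"{name}={quote(value, safe=',:/|')}" for name, value in params)
```

Requests with many filters are easier to express as a POST body, since each _typeFilter then becomes its own parameter entry.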

Headers

There are two required header parameters defined by the current $export specification:

  • Accept - application/fhir+json
  • Prefer - respond-async
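
A kick-off request carrying both headers could be prepared with the Python standard library as sketched below. The function name is illustrative, and sending the access token as a Bearer token in the Authorization header is an assumption based on common SMART practice, not something this section specifies.

```python
import urllib.request


def kickoff_request(url, access_token):
    """Prepare (but do not send) a GET kick-off request."""
    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",
        # Assumption: the access token is passed as a Bearer token.
        "Authorization": f"Bearer {access_token}",
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned object to urllib.request.urlopen sends the request.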

Examples

  • _typeFilter parameter use:

The example below demonstrates how to configure a kick-off request that exports data only for those patients in a group who had a reaction to an immunization. In this case, _typeFilter contains an Immunization search query with the :missing=false modifier.

curl --location --request GET 'https://kodjin-staging.edenlab.dev/fhir/Group/0a60d2a2-38ce-49f6-ac45-42347193af50/$export?_type=Immunization&_typeFilter=Immunization%3Freaction-date:missing%3Dfalse' \
--header 'accept: application/fhir+json' \
--header 'prefer: respond-async'
  • POST request use:
    Some requests may contain lots of filter parameters. In this case, it is convenient to use a POST request and supply filter parameters in the body. The example below demonstrates how to export resources filtered by practitioner ID. The request contains the _typeFilter parameter for each resource type.
curl --location --request POST 'https://kodjin-staging.edenlab.dev/fhir/$export' \
--header 'accept: application/fhir+json' \
--header 'content-type: application/json' \
--header 'prefer: respond-async' \
--data-raw '{
  "resourceType": "Parameters",
  "parameter": [
    {"name": "_since",
     "valueInstant": "2022-01-01T00:00:00Z"},
    {"name": "_type",
     "valueString": "Observation,Condition,Procedure,Immunization"},
    {"name": "_typeFilter",
     "valueString": "Observation?performer=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"},
    {"name": "_typeFilter",
     "valueString": "Condition?asserter=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"},
    {"name": "_typeFilter",
     "valueString": "Procedure?performer=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"},
    {"name": "_typeFilter",
     "valueString": "Immunization?performer=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"}
  ]
}'
  • _elements parameter use:
    In some cases, you will need only a short set of fields for analysis instead of the entire resource. The example below demonstrates how to export only condition and observation codes using an _elements parameter.
curl --location --request GET 'https://kodjin-staging.edenlab.dev/fhir/$export?_type=Observation,Condition&_since=2022-07-13T00:00:00Z&_elements=code' \
--header 'accept: application/fhir+json' \
--header 'prefer: respond-async'
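
The POST body used in the examples above can also be assembled programmatically. The helper below is a sketch with an illustrative name and illustrative keyword arguments; it produces a FHIR Parameters resource with one _typeFilter entry per query and one patient entry per patient ID.

```python
import json


def export_parameters(since=None, types=None, type_filters=(), patients=()):
    """Build the FHIR Parameters resource for a POST kick-off request."""
    parameter = []
    if since:
        parameter.append({"name": "_since", "valueInstant": since})
    if types:
        parameter.append({"name": "_type", "valueString": ",".join(types)})
    for search_query in type_filters:
        # Each search query becomes a separate _typeFilter parameter.
        parameter.append({"name": "_typeFilter", "valueString": search_query})
    for patient_id in patients:
        parameter.append({
            "name": "patient",
            "valueReference": {"reference": f"Patient/{patient_id}"},
        })
    return {"resourceType": "Parameters", "parameter": parameter}


body = json.dumps(export_parameters(
    since="2022-01-01T00:00:00Z",
    types=["Observation", "Condition"],
    type_filters=["Observation?performer=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"],
))
```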

Status Request

When the Data Export process is invoked, the server may need time to generate all the files. The client can check the status of the export job by polling the URL from the Content-Location header returned in the Kick-Off response.

Response can be one of:

  • 'In-progress' - Returned by the server while it is processing the $export request.

    • Response Headers and Body:
    Status: 202 Accepted
    
  • 'Error' - Returned by the server if the export operation fails.

    • Response Headers and Body:

    Status: 500 Internal Server Error
    Content-Type: application/json

    { 
        "resourceType": "OperationOutcome",
        "id": "1",
        "issue": [ {
            "severity": "error",
            "code": "processing",
            "details": {
                 "text": "An internal timeout has occurred"
             }
         } ]
    }
    
  • 'Complete' - Returned by the server when the export operation has completed.

    • Response Headers and Body:

    Status: 200 OK
    Expires: Mon, 22 Aug 2022 23:59:59 GMT
    Content-Type: application/json

    {
        "transactionTime": "2022-08-21T00:00:00Z",
        "request": "https://example.com/fhir/Patient/$export?_type=Patient,Observation",
        "output": [
            {
                "type": "Patient",
                "url": "https://example.com/output/patient_file_1.ndjson"
            },
            {
                "type": "Patient",
                "url": "https://example.com/output/patient_file_2.ndjson"
            },
            {
                "type": "Observation",
                "url": "https://example.com/output/observation_file_1.ndjson"
            }
        ],
        "deleted": [
            {
                "type": "Bundle",
                "url": "https://example.com/output/del_file_1.ndjson"
            }
        ],
        "error": [
            {
                "type": "OperationOutcome",
                "url": "https://example.com/output/err_file_1.ndjson"
            }
        ]
    }
    

Example of status request:

GET https://kodjin-staging.edenlab.dev/fhir/export/5df2f390-2285-4f0d-8917-f1064fb6479a
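
A polling loop over the three responses described above might look like the sketch below. It is not the service's client: fetch_status is a caller-supplied function (hypothetical here) that performs the status GET and returns the HTTP status code together with the parsed JSON body, and the polling interval and attempt limit are arbitrary defaults. Any status other than 200 or 202 is treated as a failure.

```python
import time


def wait_for_export(fetch_status, interval_seconds=5.0, max_attempts=120,
                    sleep=time.sleep):
    """Poll until the export completes and return the manifest.

    fetch_status() must return a (status_code, body) tuple.
    """
    for _ in range(max_attempts):
        status_code, body = fetch_status()
        if status_code == 200:   # 'Complete': body is the manifest
            return body
        if status_code != 202:   # anything but 'In-progress' is an error
            raise RuntimeError(f"export failed with HTTP {status_code}: {body}")
        sleep(interval_seconds)  # 'In-progress': wait and poll again
    raise TimeoutError("export still in progress after max_attempts polls")
```

Injecting sleep makes the loop easy to exercise without real delays.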

Retrieving Data

When the server completes file generation, the response contains links to the generated files, which the client can now download. The links are signed URLs. The files contain data in NDJSON format, with each resource type in a separate file.

Example of a response:

{
    "output": [
        {
            "type": "Observation",
            "url": "https://kodjin-staging.edenlab.dev/io/kodjin-export/5df2f390-2285-4f0d-8917-f1064fb6479a/Observation.ndjson?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=user%2F20220805%2F%2Fs3%2Faws4_request&X-Amz-Date=20220805T085248Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=fe04d160c9405d322cf511bb2fb47f5c584ae8bdb2fba962db2b6a1c7c19125a"
        },
        {
            "type": "Condition",
            "url": "https://kodjin-staging.edenlab.dev/io/kodjin-export/5df2f390-2285-4f0d-8917-f1064fb6479a/Condition.ndjson?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=user%2F20220805%2F%2Fs3%2Faws4_request&X-Amz-Date=20220805T085248Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=b0ed0b70defe60154574ca9fb7ea927efeaf8b6cefaf33132c936cce0977eca3"
        },
        {
            "type": "Patient",
            "url": "https://kodjin-staging.edenlab.dev/io/kodjin-export/5df2f390-2285-4f0d-8917-f1064fb6479a/Patient.ndjson?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=user%2F20220805%2F%2Fs3%2Faws4_request&X-Amz-Date=20220805T085248Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=b3655b6c577f4f7c5860587d3379a6f2a42f2129636091005eb43c62c75acf87"
        }
    ],
    "request": "curl -X GET 'https://elation-staging.edenlab.dev/fhir/Group/3a457da3-b10e-48f9-b78e-467c396f8092/$export?_type=Observation,Condition&_since=2022-07-13T00:00:00Z&_typeFilter=Observation%3Fcode%3Dhttp://loinc.org|718-7,Condition%3Fcategory%3Dhttp://terminology.hl7.org/CodeSystem/condition-category|encounter-diagnosis' -H 'prefer:respond-async'",
    "transactionTime": "2022-08-05T08:52:45.692Z"
}
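
A response like the one above can be processed as sketched below: the first helper groups the download links by resource type, and the second estimates when a link stops working. Both function names are illustrative, and the expiry calculation assumes the links carry the X-Amz-Date and X-Amz-Expires query parameters seen in the example.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def files_by_type(manifest):
    """Group the manifest's output URLs by resource type."""
    grouped = {}
    for entry in manifest.get("output", []):
        grouped.setdefault(entry["type"], []).append(entry["url"])
    return grouped


def signed_url_expiry(url):
    """Return the UTC time at which a signed link expires."""
    query = parse_qs(urlparse(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    # X-Amz-Expires is the validity period in seconds (604800 = 7 days).
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
```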

Example of NDJSON file:

{"id":"5c41cecf-cf81-434f-9da7-e24e5a99dbc2","name":[{"given":["Brenda"],"family":["Jackson"]}],"gender":"female","birthDate":"1956-10-14T00:00:00.000Z","resourceType":"Patient"}
{"id":"3fabcb98-0995-447d-a03f-314d202b32f4","name":[{"given":["Bram"],"family":["Sandeep"]}],"gender":"male","birthDate":"1994-11-01T00:00:00.000Z","resourceType":"Patient"}
{"id":"945e5c7f-504b-43bd-9562-a2ef82c244b2","name":[{"given":["Sandy"],"family":["Hamlin"]}],"gender":"female","birthDate":"1988-01-24T00:00:00.000Z","resourceType":"Patient"}
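
Since every line of an NDJSON file is a standalone JSON document, a downloaded file can be parsed line by line. The snippet below is a minimal sketch with an illustrative function name; it assumes the file content has already been read into a string.

```python
import json


def parse_ndjson(text):
    """Parse NDJSON text into a list of resources, skipping blank lines."""
    # Split on "\n" only: JSON strings may contain other Unicode line breaks.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


sample = (
    '{"id":"5c41cecf-cf81-434f-9da7-e24e5a99dbc2","gender":"female","resourceType":"Patient"}\n'
    '{"id":"3fabcb98-0995-447d-a03f-314d202b32f4","gender":"male","resourceType":"Patient"}\n'
)
patients = parse_ndjson(sample)
```

For very large files, reading and parsing one line at a time from the response stream avoids holding the whole file in memory.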

Delete Request

The Bulk Data Export process can be stopped by sending a DELETE request to the URL returned in the Content-Location header of the Kick-Off response.

Example:

DELETE https://kodjin-staging.edenlab.dev/fhir/export/da4438a9-e1d7-4400-9d80-4ed23fbbccc3
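
The same request can be prepared with the Python standard library, as in the sketch below. The function name is illustrative, and passing the access token as a Bearer token is an assumption; the token must belong to the client that started the job.

```python
import urllib.request


def cancel_request(job_url, access_token):
    """Prepare (but do not send) a DELETE request that cancels an export job."""
    return urllib.request.Request(
        job_url,
        # Assumption: the access token is passed as a Bearer token.
        headers={"Authorization": f"Bearer {access_token}"},
        method="DELETE",
    )
```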