Exporting large FHIR datasets from a FHIR server through the regular RESTful API can be tricky: each resource type requires a separate API call and, once pagination is taken into account, an export can require hundreds of requests. Bulk Data Export is an operation that aims to solve this issue.
The FHIR Bulk Export Service is intended to fulfill the 21st Century Cures Act bulk data export requirements. The bulk export operation lets a client configure and invoke a data export in a single API call, whether that is data for all patients, data for a subset (defined group) of patients, or all FHIR data on the server. The export runs asynchronously to reduce the load on the system, and the results remain available for download from media storage for several days.
The FHIR Bulk Export service enables API consumers to export USCDI (United States Core Data for Interoperability) clinical data for all patients in a particular context.
This implementation is based on the Bulk Data Access IG.
Process overview
- The client invokes the Bulk Data Export process by sending a Kick-Off request.
- The server validates the Kick-Off request and responds with a link to a job in the Content-Location header.
- The client checks the job status by polling the URL from the Content-Location header. If the export process is complete, the response contains links to the generated files.
- The client downloads the generated files using the links provided in the job.
HL7 provides a sequence diagram of the bulk export workflow.
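The four steps above can be sketched in Python as a single client routine. This is a minimal sketch under stated assumptions, not the service's reference client: `run_export`, the injected `send` transport, the bearer-token header, and the polling interval are hypothetical names and choices, and the 202 Accepted kick-off status is taken from the Bulk Data Access IG rather than from this page.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# A transport takes (method, url, headers) and returns
# (status, response_headers, json_body). It is a hypothetical seam so the
# flow can be exercised without a real server.
Response = Tuple[int, Dict[str, str], Optional[dict]]
Transport = Callable[[str, str, Dict[str, str]], Response]


def run_export(send: Transport, kickoff_url: str, token: str,
               poll_seconds: float = 5.0, max_polls: int = 100) -> List[str]:
    """Kick off an export, poll the job, and return the generated file URLs."""
    # Assumption: the access token is sent as a bearer token.
    auth = {"Authorization": f"Bearer {token}"}

    # Step 1: Kick-Off request.
    status, headers, _ = send("GET", kickoff_url, {
        **auth,
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",
    })
    if status != 202:
        raise RuntimeError(f"kick-off rejected with status {status}")

    # Step 2: the job link arrives in the Content-Location header.
    job_url = headers["Content-Location"]

    # Step 3: poll until the job completes (200) or fails.
    for _ in range(max_polls):
        status, _, body = send("GET", job_url, auth)
        if status == 200:
            # Step 4: the caller downloads each of these URLs.
            return [item["url"] for item in (body or {}).get("output", [])]
        if status != 202:
            raise RuntimeError(f"export failed with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("export did not complete in time")
```

Injecting `send` keeps the sketch independent of any particular HTTP library; in practice it would wrap whichever client the integration already uses.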
Implementation notes
- Exported files are hosted in a protected AWS S3 compatible bucket in .ndjson format.
- The URLs returned in the job completion response manifest are AWS S3 pre-signed URLs. These URLs are valid for a period of 7 days after manifest retrieval.
- When polling for job status or canceling a job, the FHIR client must have a valid auth token with the same client ID as the one used to initiate the export job.
- Group-level export targets the Patient compartment for resources required by USCDI v2. This means the export includes resources that are referenced by a resource within the patient compartment and excludes resources with no data available on the patient record. Additionally, the server provides Encounter, Location, Organization, and Practitioner resources, because they are referenced as Must Support elements in required resources.
- Bulk Export requires JWT assertion-based client authentication, as described in the SMART Backend Services protocol details. For the authorization supported by the Elation EMR FHIR API, refer to the Elation EMR FHIR API authorization documentation.
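Because the download URLs expire, a client may want to check how long a given link remains usable before scheduling a download. The sketch below reads the standard AWS Signature Version 4 query parameters `X-Amz-Date` (signing time) and `X-Amz-Expires` (lifetime in seconds), which appear in the example URLs later in this document. The function name `presigned_url_expiry` is hypothetical, and it is an assumption that every URL issued by the server carries both parameters.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url: str) -> datetime:
    """Return the UTC time at which a SigV4 pre-signed URL stops working."""
    query = parse_qs(urlsplit(url).query)
    # X-Amz-Date uses the compact ISO 8601 form, for example 20220805T085248Z.
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at.replace(tzinfo=timezone.utc) + lifetime
```

A lifetime of 604800 seconds corresponds to the 7-day validity period described above.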
Authorization
- Bulk exports respect the scopes provided to the authentication process. For instance, if the authentication request for the token used to trigger a bulk export to [base url]/$export only includes the scope patient/*.read, then the export will only include patient data. If it contains some combination of <entity>/*.read scopes, then the export will contain all data that the scopes permit that token to access (system/*.read will grant access to all data).
- There are two additional scopes that need to be specified in order to allow a bulk export to be triggered and accessed upon completion: system/Export.write and system/Export.read. system/Export.write permits requests to trigger bulk exports (but not to access data; clients need to include another scope in the request for the export to contain data). system/Export.read permits clients to check the status of bulk exports (this is required to access the final download URL of the export).
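The scope rules above lend themselves to a quick pre-flight check before a token is requested. The following is a minimal sketch; `missing_export_scopes` is a hypothetical helper, and treating any scope that ends in `.read` (other than `system/Export.read`) as a data scope is a simplifying assumption.

```python
from typing import Iterable, Set

EXPORT_WRITE = "system/Export.write"  # allows triggering a bulk export
EXPORT_READ = "system/Export.read"    # allows checking status and getting URLs


def missing_export_scopes(granted: Iterable[str]) -> Set[str]:
    """Return the scopes a token still lacks for a useful bulk export."""
    scopes = set(granted)
    missing = {s for s in (EXPORT_WRITE, EXPORT_READ) if s not in scopes}
    # Without at least one data scope (for example patient/*.read or
    # system/*.read) the export can be triggered but will contain no data.
    has_data_scope = any(
        s.endswith(".read") and s != EXPORT_READ for s in scopes
    )
    if not has_data_scope:
        missing.add("<entity>/*.read")
    return missing
```

An empty result means the token can trigger an export, poll it, and receive data for at least one entity.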
Kick-Off request
The Bulk Data Export process can be invoked via a GET request for smaller sets of query params, or, for larger ones, via a POST request that supplies the parameters in a FHIR Parameters resource.
Levels of export
There are three endpoints available to tailor the export to a particular use case:
Name | Description | URL Syntax |
---|---|---|
System Level Export | Export data from a FHIR server, whether or not it is associated with a patient. | [base url]/$export |
All Patients | FHIR Operation to obtain a detailed set of FHIR resources of diverse resource types pertaining to all patients. | [base url]/Patient/$export |
Group of Patients | FHIR Operation to obtain a detailed set of FHIR resources of diverse resource types pertaining to all members of a specified Group. | [base url]/Group/[id]/$export |
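The three URL patterns in the table can be produced by one small helper. This is an illustrative sketch: the function name `export_url` and the level labels "system", "patient", and "group" are hypothetical, chosen here only to mirror the table rows.

```python
from typing import Optional


def export_url(base_url: str, level: str, group_id: Optional[str] = None) -> str:
    """Build the Kick-Off URL for a system, patient, or group level export."""
    # Dropping a trailing slash avoids a double slash in the final URL.
    base = base_url.rstrip("/")
    if level == "system":
        return f"{base}/$export"
    if level == "patient":
        return f"{base}/Patient/$export"
    if level == "group":
        if not group_id:
            raise ValueError("group-level export needs a Group id")
        return f"{base}/Group/{group_id}/$export"
    raise ValueError(f"unknown export level: {level!r}")
```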
Parameters
- _since - FHIR instant
  - Resources are included in the response if their state has changed after the supplied time (e.g. if Resource.meta.lastUpdated is later than the supplied _since time).
  - Example: _since=2022-07-13T00:00:00Z
- _type - string of comma-delimited FHIR resource types
  - Response is filtered to only include resources of the specified resource type(s).
  - Example: _type=Observation,Condition
  - Note: for Patient and Group level export, referenced patients are always in the response.
- _elements - string of comma-delimited FHIR elements
  - Unlisted, non-mandatory elements are omitted from the resources returned. Elements should be of the form [resource type].[element name] (e.g. Patient.id) or [element name] (e.g. id), and only root elements in a resource are permitted. If the resource type is omitted, the element is returned for all resources in the response where it is applicable. Mandatory elements are always returned whether they are requested or not.
- patient - FHIR Reference - Applied only for POST requests
  - Return resources in patient compartments belonging to patients from the list.
  - Note: Not applicable to system level export requests.
  - POST Example: {"name":"patient", "valueReference": {"reference": "Patient/ec89d1cc-48f3-49ad-a80e-bc94f2f542af"}}
- _typeFilter - string of comma-delimited FHIR REST search queries
  - When provided, the server filters the data in the response to only include resources that meet the specified criteria.
  - Note: Currently not supported for the Patient resource.
  - POST Example: {"name":"_typeFilter", "valueString": "Condition?asserter=Practitioner/d010b87b-6d7d-460c-a52d-8f741b9b0a72"}
  - GET Example: _typeFilter=Observation%3Fcode%3Dhttp://loinc.org|718-7,Condition%3Fcategory%3Dhttp://terminology.hl7.org/CodeSystem/condition-category|encounter-diagnosis
- _outputFormat - string
  - Can be set to application/fhir+ndjson, application/ndjson, or ndjson.
Note that for Patient level export and Group level export, patients are always exported regardless of search params if they are referenced by resources present in the response.
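For GET requests, the parameters above have to be joined and percent-encoded into a query string. The sketch below shows one way to do that so the output matches the GET example for _typeFilter; `kickoff_query` is a hypothetical helper, and the choice of characters left unescaped is an assumption based on that example.

```python
from typing import Iterable, Optional
from urllib.parse import quote

# Characters left unescaped so the result matches the GET examples in this
# document (for instance "http://loinc.org|718-7").
_SAFE = ",:/|"


def kickoff_query(since: Optional[str] = None,
                  types: Iterable[str] = (),
                  elements: Iterable[str] = (),
                  type_filters: Iterable[str] = ()) -> str:
    """Build the query string for a GET Kick-Off request."""
    params = []
    if since:
        params.append(("_since", since))
    for name, values in (("_type", types),
                         ("_elements", elements),
                         ("_typeFilter", type_filters)):
        joined = ",".join(values)  # each parameter is comma-delimited
        if joined:
            params.append((name, joined))
    # quote() percent-encodes the "?" and "=" inside each search query.
    return "&".join(
        f"{name}={quote(value, safe=_SAFE)}" for name, value in params
    )
```

The patient parameter is left out on purpose, since it applies only to POST requests.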
Headers
There are two required header parameters defined by the current $export specification:
- Accept - application/fhir+json
- Prefer - respond-async
Examples
- _typeFilter parameter use:
The example below demonstrates how to configure a kick-off request to export only patients from a group who had a reaction to immunization. In this case, _typeFilter contains an Immunization search query with the :missing=false modifier.
curl --location --request GET 'https://kodjin-staging.edenlab.dev/fhir/Group/0a60d2a2-38ce-49f6-ac45-42347193af50/$export?_type=Immunization&_typeFilter=Immunization%3Freaction-date:missing%3Dfalse' \
--header 'accept: application/fhir+json' \
--header 'prefer: respond-async'
- POST request use:
Some requests may contain lots of filter parameters. In this case, it is convenient to use a POST request and supply filter parameters in the body. The example below demonstrates how to export resources filtered by practitioner ID. The request contains the _typeFilter parameter for each resource type.
curl --location --request POST 'https://kodjin-staging.edenlab.dev/fhir/$export' \
--header 'accept: application/fhir+json' \
--header 'content-type: application/json' \
--header 'prefer: respond-async' \
--data-raw '{"resourceType" : "Parameters",
"parameter" : [
{"name":"_since",
"valueInstant": "2022-01-01T00:00:00Z"},
{"name":"_type",
"valueString": "Observation,Condition,Procedure,Immunization"},
{"name":"_typeFilter",
"valueString": "Observation?performer=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"},
{"name":"_typeFilter",
"valueString": "Condition?asserter=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"},
{"name":"_typeFilter",
"valueString": "Procedure?performer=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"},
{"name":"_typeFilter",
"valueString": "Immunization?performer=Practitioner/9bac339d-ac3b-4715-bf9a-1dab1dec7fa2"}
]
}'
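When the request body is assembled in application code rather than typed by hand, the Parameters resource can be built from plain lists. The following is a minimal sketch; `build_export_parameters` is a hypothetical helper that covers only the parameters described in this document.

```python
from typing import Iterable, Optional


def build_export_parameters(since: Optional[str] = None,
                            types: Iterable[str] = (),
                            type_filters: Iterable[str] = (),
                            patients: Iterable[str] = ()) -> dict:
    """Build a FHIR Parameters resource for a POST Kick-Off request."""
    parameter = []
    if since:
        parameter.append({"name": "_since", "valueInstant": since})
    joined_types = ",".join(types)
    if joined_types:
        parameter.append({"name": "_type", "valueString": joined_types})
    # One _typeFilter entry per search query, as in the example above.
    for query in type_filters:
        parameter.append({"name": "_typeFilter", "valueString": query})
    # One patient entry per reference, for example "Patient/<id>".
    for reference in patients:
        parameter.append(
            {"name": "patient", "valueReference": {"reference": reference}}
        )
    return {"resourceType": "Parameters", "parameter": parameter}
```

Passing the result through `json.dumps` yields a body in the same shape as the one used in the curl example.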
- _elements parameter use:
In some cases, you will need only a short set of fields for analysis instead of the entire resource. The example below demonstrates how to export only condition and observation codes using an _elements parameter.
curl --location --request GET 'https://kodjin-staging.edenlab.dev/fhir/$export?_type=Observation,Condition&_since=2022-07-13T00:00:00Z&_elements=code' \
--header 'accept: application/fhir+json' \
--header 'prefer: respond-async'
Status Request
When the Data Export process is invoked, it can take time for the server to generate all the files. The client can check the status of the export job by polling the URL from the Content-Location header returned in response to the Kick-Off request.
The response can be one of:
- In-progress - Returned by the server while it is processing the $export request.
  - Response Headers and Body:
    Status: 202 Accepted
- Error - Returned by the server if the export operation fails.
  - Response Headers and Body:
    Status: 500 Internal Server Error
    Content-Type: application/json
    { "resourceType": "OperationOutcome", "id": "1", "issue": [ { "severity": "error", "code": "processing", "details": { "text": "An internal timeout has occurred" } } ] }
- Complete - Returned by the server when the export operation has completed.
  - Response Headers and Body:
    Status: 200 OK
    Expires: Mon, 22 Jul 2022 23:59:59 GMT
    Content-Type: application/json
    { "transactionTime": "2022-08-21T00:00:00Z", "request" : "https://example.com/fhir/Patient/$export?_type=Patient,Observation", "output" : [{ "type" : "Patient", "url" : "https://example.com/output/patient_file_1.ndjson" },{ "type" : "Patient", "url" : "https://example.com/output/patient_file_2.ndjson" },{ "type" : "Observation", "url" : "https://example.com/output/observation_file_1.ndjson" }], "deleted" : [{ "type" : "Bundle", "url" : "https://example.com/output/del_file_1.ndjson" }], "error" : [{ "type" : "OperationOutcome", "url" : "https://example.com/output/err_file_1.ndjson" }] }
Example of status request:
GET https://kodjin-staging.edenlab.dev/fhir/export/5df2f390-2285-4f0d-8917-f1064fb6479a
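A polling client has to map each status response to one of the three outcomes above. The sketch below does this and, for failures, pulls the human-readable text out of the OperationOutcome body. `job_state` is a hypothetical helper, and treating every status other than 200 and 202 as an error is an assumption that goes slightly beyond the 500 example shown here.

```python
from typing import Optional, Tuple


def job_state(status: int, body: Optional[dict]) -> Tuple[str, Optional[str]]:
    """Map a status response to (state, error_message)."""
    if status == 202:
        return "in-progress", None
    if status == 200:
        return "complete", None
    # Error: collect details.text from each OperationOutcome issue, if any.
    issues = (body or {}).get("issue", [])
    texts = [issue.get("details", {}).get("text", "") for issue in issues]
    message = "; ".join(text for text in texts if text) or f"HTTP {status}"
    return "error", message
```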
Retrieving Data
When the server completes file generation, the status response contains links to the generated files, which the client can now download. The links are pre-signed URLs. The files contain data in NDJSON format, with each resource type in a separate file.
Example of a response:
{
"output": [
{
"type": "Observation",
"url": "https://kodjin-staging.edenlab.dev/io/kodjin-export/5df2f390-2285-4f0d-8917-f1064fb6479a/Observation.ndjson?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=user%2F20220805%2F%2Fs3%2Faws4_request&X-Amz-Date=20220805T085248Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=fe04d160c9405d322cf511bb2fb47f5c584ae8bdb2fba962db2b6a1c7c19125a"
},
{
"type": "Condition",
"url": "https://kodjin-staging.edenlab.dev/io/kodjin-export/5df2f390-2285-4f0d-8917-f1064fb6479a/Condition.ndjson?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=user%2F20220805%2F%2Fs3%2Faws4_request&X-Amz-Date=20220805T085248Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=b0ed0b70defe60154574ca9fb7ea927efeaf8b6cefaf33132c936cce0977eca3"
},
{
"type": "Patient",
"url": "https://kodjin-staging.edenlab.dev/io/kodjin-export/5df2f390-2285-4f0d-8917-f1064fb6479a/Patient.ndjson?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=user%2F20220805%2F%2Fs3%2Faws4_request&X-Amz-Date=20220805T085248Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=b3655b6c577f4f7c5860587d3379a6f2a42f2129636091005eb43c62c75acf87"
}
],
"request": "curl -X GET 'https://elation-staging.edenlab.dev/fhir/Group/3a457da3-b10e-48f9-b78e-467c396f8092/$export?_type=Observation,Condition&_since=2022-07-13T00:00:00Z&_typeFilter=Observation%3Fcode%3Dhttp://loinc.org|718-7,Condition%3Fcategory%3Dhttp://terminology.hl7.org/CodeSystem/condition-category|encounter-diagnosis' -H 'prefer:respond-async'",
"transactionTime": "2022-08-05T08:52:45.692Z"
}
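Before downloading, a client may want to organize the links in such a response by resource type. The following is a small sketch with a hypothetical helper name, `files_by_type`; it reads only the type and url fields of each output entry and makes no assumption about how many files the server produces per type.

```python
from collections import defaultdict
from typing import Dict, List


def files_by_type(manifest: dict) -> Dict[str, List[str]]:
    """Group the download URLs of a completed export by resource type."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in manifest.get("output", []):
        grouped[item["type"]].append(item["url"])
    return dict(grouped)
```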
Example of NDJSON file:
{"id":"5c41cecf-cf81-434f-9da7-e24e5a99dbc2","name":[{"given":["Brenda"],"family":["Jackson"]}],"gender":"female","birthDate":"1956-10-14T00:00:00.000Z","resourceType":"Patient"}
{"id":"3fabcb98-0995-447d-a03f-314d202b32f4","name":[{"given":["Bram"],"family":["Sandeep"]}],"gender":"male","birthDate":"1994-11-01T00:00:00.000Z","resourceType":"Patient"}
{"id":"945e5c7f-504b-43bd-9562-a2ef82c244b2","name":[{"given":["Sandy"],"family":["Hamlin"]}],"gender":"female","birthDate":"1988-01-24T00:00:00.000Z","resourceType":"Patient"}
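Since an NDJSON file holds one JSON document per line, it can be read line by line without loading the whole file into memory. The sketch below shows this; the function name `iter_ndjson` is hypothetical, and skipping blank lines is a defensive assumption rather than documented server behavior.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one FHIR resource per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, such as a trailing newline
            yield json.loads(line)
```

An open text file can be passed directly, for example `for resource in iter_ndjson(open("Patient.ndjson", encoding="utf-8")): ...`, because iterating a file object yields its lines.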
Delete Request
The Bulk Data Export process can be stopped by sending a DELETE request to the URL returned in the Content-Location header of the Kick-Off response.
Example:
DELETE https://kodjin-staging.edenlab.dev/fhir/export/da4438a9-e1d7-4400-9d80-4ed23fbbccc3
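The same request can be prepared with Python's standard library. This is a minimal sketch: `cancel_request` is a hypothetical helper, and sending the access token as a bearer token is an assumption. The function only builds the request; passing the result to `urllib.request.urlopen` would send it.

```python
import urllib.request


def cancel_request(job_url: str, token: str) -> urllib.request.Request:
    """Prepare a DELETE request for the job URL from Content-Location."""
    return urllib.request.Request(
        job_url,
        method="DELETE",
        # The token must carry the same client ID that started the export.
        headers={"Authorization": f"Bearer {token}"},
    )
```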