Problem Statement
A client has asked me to build a solution that exports the police forces of the UK to a Storage Account in JSON format. The client asked me to use the following:
Police Data API - https://data.police.uk/docs/
Azure Storage Account for the Export of the Forces as JSON files
Azure Data Factory to coordinate the pipeline
Task:
Loop through each force and return its force info, using https://data.police.uk/docs/method/force/
Each sink file should be named Force Name + ".json"
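Before building this in Data Factory, it can help to see the same logic as a few lines of Python. This is only a rough sketch of the task, not the pipeline itself: the function names and the injectable `fetch` argument are my own assumptions, and the two URLs follow the Police Data API documentation linked above.

```python
import json
from typing import Callable
from urllib.request import urlopen

API_BASE = "https://data.police.uk/api"


def fetch_json(url: str):
    """Default fetcher (hypothetical helper): GET the URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def export_forces(fetch: Callable[[str], object] = fetch_json) -> dict:
    """Return {file name: force details} for every force listed by the API."""
    files = {}
    # /forces returns a list of objects with "id" and "name" keys.
    for force in fetch(f"{API_BASE}/forces"):
        # /forces/{id} returns the details of one specific force.
        details = fetch(f"{API_BASE}/forces/{force['id']}")
        # File name rule from the task: Force Name + ".json".
        files[force["name"] + ".json"] = details
    return files
```

Writing each entry of the returned dictionary to the storage container is left out here; in the actual solution that is the job of the Copy activities.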
Solution
To solve the client's problem, I designed the pipeline described below.
The pre-requisites are:
Have an Azure Storage Account set up with a container for the data.
Before we start working on the pipeline, we also need to set up the following:
2 linked services:
Police Forces API -> https://data.police.uk/api/forces
Azure Storage Account Container
Pipeline parameters:
@MasterFileName - String - "Police_Force_Master.json"
Pipeline Variables:
PoliceFileName - String
PoliceFileId - String
The logical progression of the pipeline is:
Copy Data Activity -> copies the list of police forces from the API into one master file
Source Dataset Configuration
Sink Dataset Configuration
I set up a dataset parameter called FileName to pass in the MasterFileName pipeline parameter that we defined at the start.
Lookup Activity -> reads the master file created in the first step and returns all of its items. "First row only" must be unticked so that the output contains the full value array that the ForEach loop iterates over.
ForEach Loop Activity -> for each item, set the PoliceFileName and PoliceFileId variables and create an individual file in the blob container
We pass the following expression as Items in the ForEach settings and tick Sequential, so that parallel iterations cannot overwrite the pipeline variables:
@activity('lookup police forces').output.value
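To make the expression above more concrete, here is a small Python sketch of what the ForEach loop receives. The sample output is an assumption trimmed to the relevant keys (two sample forces only), and `for_each_sequential` is a hypothetical stand-in for the ForEach activity, not a Data Factory API.

```python
# Assumed shape of the Lookup output when "First row only" is unticked:
# a row count plus a "value" array with one object per force.
lookup_output = {
    "count": 2,
    "value": [
        {"id": "avon-and-somerset", "name": "Avon and Somerset Constabulary"},
        {"id": "bedfordshire", "name": "Bedfordshire Police"},
    ],
}


def for_each_sequential(items, body):
    """Run body(item) one item at a time, like a ForEach with Sequential ticked."""
    return [body(item) for item in items]


# Equivalent of Items = @activity('lookup police forces').output.value;
# inside the loop, each element plays the role of item().
names = for_each_sequential(lookup_output["value"], lambda item: item["name"])
print(names)
```

Each element of the array carries both the id needed to call the API and the name needed for the file name, which is why the loop sets two variables.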
Set @PoliceFileName variable
Set @PoliceFileId variable
Copy Activity -> Police Specific Forces Extract
Source Dataset Configuration
Sink Dataset Configuration
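The exact values of the two variables and the dataset settings are only visible in the screenshots, so the following is a guess at the logic rather than the actual configuration. I am assuming the variables are set with expressions along the lines of `@item().id` and `@item().name`, and that the sink file name is built with something like `@concat(variables('PoliceFileName'), '.json')`. In Python, the per-force source URL and sink file name would be derived like this (`copy_settings` is a hypothetical helper):

```python
from urllib.parse import quote

# Base URL of the Police Forces API linked service.
BASE_URL = "https://data.police.uk/api/forces"


def copy_settings(item: dict) -> dict:
    """Derive the source URL and sink file name for one force."""
    police_file_id = item["id"]      # assumed value of the PoliceFileId variable
    police_file_name = item["name"]  # assumed value of the PoliceFileName variable
    return {
        # Specific force endpoint: base URL + "/" + force id.
        "source_url": f"{BASE_URL}/{quote(police_file_id)}",
        # Naming rule from the task: Force Name + ".json".
        "sink_file_name": police_file_name + ".json",
    }


settings = copy_settings({"id": "bedfordshire", "name": "Bedfordshire Police"})
print(settings["source_url"])
print(settings["sink_file_name"])
```

Keeping the id for the URL and the name for the file is the reason for having two separate variables instead of one.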
Result
The container is populated with one JSON file per police force, each containing that force's data. Each file is named police force name + ".json".