Source of Raw Data & API

This project derives its data from ClinicalTrials.gov through the site's own API, the ClinicalTrials.gov API. The website lists three categories of query URLs, which return API data at different levels of detail. This project uses the Study Fields query URL type to access the data. This is the link to the Builder API.




Build API URL Online

The Study Fields query URL type expects the user to specify a search expression, study fields, a minimum rank, a maximum rank, and a format. At most 20 study fields can be requested at once, and at most 1,000 study records are returned per request, as bounded by the minimum and maximum rank parameters. Based on the focus of the study, the search expression is "bacterial infection". Twenty study fields are selected and entered on the page by their API field names, following the ClinicalTrials.gov Data Element-to-API Field Crosswalks page. Minimum Rank and Maximum Rank are set to 1 and 1000, respectively, and the format is set to CSV. Clicking the 'Send Request' button then displays the generated URL, followed by the study records, in the Response section of the page.
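The same request can also be assembled in code rather than on the builder page. The following is a minimal Python sketch, not the project's own script. The base URL and the parameter names (`expr`, `fields`, `min_rnk`, `max_rnk`, `fmt`) are assumptions based on the legacy ClinicalTrials.gov Study Fields endpoint and should be checked against the URL that the builder generates. The helper name and the four example field names are illustrative only.

```python
from urllib.parse import urlencode

# Assumed base URL of the legacy Study Fields endpoint; verify it against
# the URL shown in the Response section of the builder page.
BASE_URL = "https://clinicaltrials.gov/api/query/study_fields"

MAX_FIELDS = 20      # limit described above: study fields per request
MAX_RECORDS = 1000   # limit described above: study records per request


def build_study_fields_url(expr, fields, min_rnk=1, max_rnk=1000, fmt="csv"):
    """Return a Study Fields query URL (hypothetical helper)."""
    if len(fields) > MAX_FIELDS:
        raise ValueError(f"at most {MAX_FIELDS} study fields per request")
    if min_rnk < 1 or max_rnk < min_rnk:
        raise ValueError("ranks must satisfy 1 <= min_rnk <= max_rnk")
    if max_rnk - min_rnk + 1 > MAX_RECORDS:
        raise ValueError(f"at most {MAX_RECORDS} study records per request")

    # Parameter names are assumptions taken from the legacy API.
    params = {
        "expr": expr,
        "fields": ",".join(fields),
        "min_rnk": min_rnk,
        "max_rnk": max_rnk,
        "fmt": fmt,
    }
    # safe="," keeps the commas between field names readable in the URL.
    return BASE_URL + "?" + urlencode(params, safe=",")


# Illustrative subset of fields; the project selects 20 of them.
example_fields = ["NCTId", "BriefTitle", "Condition", "OverallStatus"]
url = build_study_fields_url("bacterial infection", example_fields)
print(url)
```

Checking both limits before the URL is built catches an oversized request locally instead of relying on the server's response.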

More than 6,000 studies match the search, but each request returns at most 1,000 of them. This project therefore repeats the Send Request procedure seven times in total, changing the minimum and maximum rank values each time, to obtain the information for all study records.
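The rank values for the repeated requests follow a simple pattern: 1 to 1000, 1001 to 2000, and so on. As a rough sketch, they can be computed as shown here. The total of 6,500 is a made-up stand-in for "more than 6,000", and the function name is hypothetical.

```python
def rank_windows(total_records, page_size=1000):
    """Yield (min_rank, max_rank) pairs that cover ranks 1..total_records."""
    if total_records < 0 or page_size < 1:
        raise ValueError("total_records must be >= 0 and page_size >= 1")
    for min_rank in range(1, total_records + 1, page_size):
        # The last window is clipped so it does not run past the total.
        max_rank = min(min_rank + page_size - 1, total_records)
        yield min_rank, max_rank


# Hypothetical total; the text only states that there are more than 6,000.
windows = list(rank_windows(6500))
for min_rank, max_rank in windows:
    print(min_rank, max_rank)
```

Any total between 6,001 and 7,000 gives seven windows, which matches the seven requests described above.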

Once the API URLs covering all of the more than 6,000 studies have been obtained, the next step is to call the API from code in both Python and R.
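One way the Python side could look is sketched here. It is not the project's linked script. It assumes that each URL returns plain CSV text whose first row is the header. If the real response carries extra lines before the header, those would need to be stripped first. The function names are hypothetical, and the `fetch` argument exists only so the merging logic can be tried without network access.

```python
import csv
import io
from urllib.request import urlopen


def fetch_text(url):
    """Download one response body as text (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def collect_records(urls, fetch=fetch_text):
    """Fetch every URL and merge the CSV rows into one list of dicts.

    Assumes each response is CSV text with a header in its first row.
    """
    records = []
    for url in urls:
        reader = csv.DictReader(io.StringIO(fetch(url)))
        records.extend(reader)
    return records


# Offline demonstration with made-up responses instead of real API calls.
fake_pages = {
    "page1": "Rank,NCTId\n1,NCT00000001\n2,NCT00000002\n",
    "page2": "Rank,NCTId\n3,NCT00000003\n",
}
demo_records = collect_records(["page1", "page2"], fetch=fake_pages.__getitem__)
print(len(demo_records), demo_records[0]["NCTId"])
```

With real data, the list passed to `collect_records` would hold the seven generated URLs, and the default `fetch_text` would perform the downloads.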



Screenshot of raw data

Links to Datasets & Code:

Python API
R API
Dataset 1
Dataset 2
Dataset 3
Dataset 4
Dataset 5
Dataset 6
Dataset 7