Bacterial infections affect a large population worldwide and are among the most common disease conditions. Their prevalence drives a significant amount of research and development on antibacterial treatments in pharmaceutical companies.

Clinical trials evaluate the safety and effectiveness of drugs and are performed in both patients and healthy volunteers. Each new medicine is required to go through 4 phases of clinical trials, the first three before reaching final approval. The whole process is extremely expensive and takes 5-7 years to carry out.


Four phases of clinical trials

Phase 1 is carried out on healthy volunteers and usually takes one year. This phase establishes dose levels, from low to high doses, as well as the maximum tolerated dose. It is also a useful period for studying pharmacokinetics.

Phase 2 is carried out on patients as double-blind studies, in which neither the doctor nor the patient knows whether a placebo or the drug is administered. This phase commonly takes 2 years. Because these studies are conducted on patients, they can demonstrate whether a drug is therapeutically useful and identify side effects. The dosing regimen is established in this phase.

Phase 3 is carried out on a larger number of patients for approximately 3 years. This phase establishes statistical proof of efficacy, safety and optimum dose, and continues to identify side effects.

Phase 4 studies the long-term effects of medicines when used chronically. By this point the drug has been approved by the FDA and can be used commercially. This phase involves thousands of participants and continues after the drug reaches the market to monitor for any unusual side effects.


Cost of clinical trials

The average cost of phase 1, 2, and 3 clinical trials across therapeutic areas is around $4 million, $13 million, and $20 million, respectively. To secure a profit, investors and pharmaceutical companies need a large amount of information to make careful decisions about both which drugs to pursue and how to set up each trial optimally.

Clinical trials differ in length, depending not only on the properties of the drug itself but also on the settings of the trial. Provided that safety, quality and reliability are ensured, shorter trials are usually preferred because they save time, reduce the large amounts of money invested, and lessen the impact on patients.

However, a large proportion of clinical trials face delays, which carry both significant financial and human costs. Apart from their immediate impact on trial budgets, delays can shorten the patent windows of drugs and thus lower their long-term profitability. Delayed trials can also place unnecessary and unwanted burdens on patients, such as exacerbation of their symptoms and adverse effects of medicines. In addition, terminated trials can leave a big hole in investigators’ budgets.

This project is based on data gathered from the ClinicalTrials.gov website. It focuses on clinical trials of bacterial infections and tries to answer a set of data science questions.


Reasons for trial termination



Ten Data Science Questions:


  1. How does the age of participants affect the length of clinical trials?

  2. Does the gender of participants affect the length of clinical trials?

  3. What do trials focus on across the various types of bacterial infections?

  4. How does the health status of participants affect the length of clinical trials and the occurrence of adverse events?

  5. Does the enrolment size affect the length of the trial?

  6. How similar are the trials across the various types of bacterial infections?

  7. How does the number of adverse events affect the length of clinical trials?

  8. How likely is a treatment to cause adverse effects?

  9. How likely is a clinical trial to stop before completion?

  10. Can we predict whether a clinical trial will finish or stop based on its current attributes?
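
As a rough illustration of how questions 5 and 9 might be approached, the sketch below computes trial length in days, the fraction of trials that stopped early, and the mean length of small versus large trials. The records, field names, status labels, and the enrolment threshold are all hypothetical placeholders chosen for illustration; they are not taken from the project's dataset.

```python
from datetime import date
from statistics import mean

# Hypothetical records: field names and values are illustrative only,
# not actual data from the project.
trials = [
    {"id": "T1", "start": date(2010, 1, 1), "end": date(2012, 1, 1),
     "enrollment": 120, "status": "Completed"},
    {"id": "T2", "start": date(2011, 6, 1), "end": date(2012, 6, 1),
     "enrollment": 40, "status": "Terminated"},
    {"id": "T3", "start": date(2012, 3, 1), "end": date(2015, 3, 1),
     "enrollment": 900, "status": "Completed"},
    {"id": "T4", "start": date(2013, 1, 1), "end": date(2013, 7, 1),
     "enrollment": 25, "status": "Withdrawn"},
]

# Assumed set of status labels that count as "stopped without finishing".
STOPPED = {"Terminated", "Withdrawn", "Suspended"}


def duration_days(trial):
    """Length of a trial in days, from start date to end date."""
    return (trial["end"] - trial["start"]).days


def stopped_fraction(records):
    """Share of trials whose status is in the assumed STOPPED set."""
    stopped = sum(1 for t in records if t["status"] in STOPPED)
    return stopped / len(records)


def mean_duration_by_size(records, threshold):
    """Mean duration (days) of trials below and at/above an enrolment threshold."""
    small = [duration_days(t) for t in records if t["enrollment"] < threshold]
    large = [duration_days(t) for t in records if t["enrollment"] >= threshold]
    return (mean(small) if small else None, mean(large) if large else None)
```

Calling `stopped_fraction(trials)` gives a first estimate for question 9, and `mean_duration_by_size(trials, 100)` gives a crude comparison for question 5. A real analysis would need to handle missing dates and test whether any difference is statistically meaningful.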



The goal of this project is to provide more information to pharmaceutical professionals, investors and other related personnel so that they can make better decisions, speed up clinical trials and reduce costs. Ultimately, this should lead to better treatment options and healthcare services for patients.