The Study Similarity & Clinical Site Intelligence Web App is an interactive interface for querying clinical site intelligence with a study protocol. It distills the operational history of similar studies and their associated clinical sites, and presents that intelligence in easy-to-read charts. Users start by providing a study protocol and can then interact with each step and component. In the last step, users can export the list of selected clinical sites for further analysis.

There are three tabs in this Web App: **Study Summary**, **Studies and Sites**, and **Site Cards**.
1. Initiate each search by entering a study protocol in Tab 1, Study Summary.
2. Review and select similar studies and their associated clinical sites in Tab 2, Studies and Sites.
3. Read individual site cards, curate, and export the final list of site candidates in Tab 3, Site Cards.
 

# Tab 1: Study Summary
Start a new query with either an existing or a novel study protocol. For an existing study, users must select a valid National Clinical Trial (NCT) identification number. For a novel study, users can enter a self-defined study protocol. The fields include study title, summary, cohort age, sex, inclusion and exclusion criteria, healthy volunteers, and MeSH conditions.

![webapp-study_summary.jpg](IlWC2YULGhsq)
**Figure 2**. The Study Summary menu in the left panel offers two query methods: Existing Study and Novel Study. Each method has its own required fields for data entry.


After submitting the query, the right main panel will display the result of the study summary. The Web App splits the summary into four tabs at the top: **Summary**, **Patient eligibility**, **Study arms**, and **Study sites**. The Web App returns all four tabs for existing studies and the first two for novel studies. 



# Tab 2: Studies and Sites
For a given study protocol, the Web App queries the study similarity index prebuilt by the Dataiku application and returns the top 20 most similar study protocols. It then identifies the clinical sites recruited by these similar studies and shows the results in two tabs: **Similar studies** and **Candidate sites**.
The left panel of each tab is a filter that lets users drop studies or sites. Filtering in the **Similar studies** tab regenerates the list of sites in the **Candidate sites** tab, and filtering in the **Candidate sites** tab determines which sites receive site cards in Tab 3, Site Cards.
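The retrieve-then-filter flow described above can be illustrated with a minimal sketch. It assumes the similarity index can be treated as a mapping from study identifier to an embedding vector compared by cosine similarity; the source does not state how the Dataiku application builds or queries the index, so the scoring method, the function names (`top_k_similar`, `candidate_sites`), and the placeholder identifiers and site names are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query_vec, index, k=20):
    """Return the k (study_id, score) pairs most similar to the query vector."""
    scored = [(study_id, cosine(query_vec, vec)) for study_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def candidate_sites(selected_studies, study_sites):
    """Collect the distinct sites associated with the studies the user kept."""
    sites = set()
    for study_id in selected_studies:
        sites.update(study_sites.get(study_id, []))
    return sorted(sites)


# Toy data: placeholder identifiers, vectors, and site names (not real studies).
index = {
    "NCT00000001": [1.0, 0.0],
    "NCT00000002": [0.9, 0.1],
    "NCT00000003": [0.0, 1.0],
}
study_sites = {
    "NCT00000001": ["Site A", "Site B"],
    "NCT00000002": ["Site B", "Site C"],
    "NCT00000003": ["Site D"],
}

similar = top_k_similar([1.0, 0.0], index, k=2)

# Emulate the Similar studies filter: the user drops one study,
# and the candidate site list is regenerated from the remainder.
kept = [study_id for study_id, _ in similar if study_id != "NCT00000002"]
sites = candidate_sites(kept, study_sites)
print(sites)
```

Regenerating the site list from the kept studies, rather than editing it in place, mirrors how a change in the **Similar studies** filter propagates to the **Candidate sites** tab.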



![Screenshot 2024-01-19 at 16.48.35.png](flOATZYuV0n7)
**Figure 3A**. Similar Studies Tab. The right main panel presents the results of similar studies in cards. The filter on the left panel allows users to filter similar studies by study features. The filter result will be passed on to the Candidate Sites tab.

![Screenshot 2024-03-25 at 11.14.28.png](Yc2ugWiin253)
**Figure 3B**. Candidate Sites Tab. It lists all clinical sites associated with the selected similar studies, along with intelligence on competing studies. The filter on the left panel allows users to filter the candidate sites list by location. The filtered list is passed on to generate the site cards in Tab 3. The 'Has Competing Studies' column indicates whether the site is recruiting for other similar studies. (We consider studies with the status 'recruiting' or 'not yet recruiting' to be competing studies.)
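The competing-studies rule in the Figure 3B caption can be expressed as a small sketch. The record layout (`nct_id`, `status`) and the function name `has_competing_studies` are assumptions made for illustration; only the two status values come from the text above. Excluding the queried study itself is one reading of "other similar studies" and is likewise an assumption.

```python
# Statuses treated as competing, as stated in the caption above.
COMPETING_STATUSES = {"recruiting", "not yet recruiting"}


def has_competing_studies(site_studies, query_nct_id=None):
    """Return True if any study at the site, other than the queried one,
    has a status that counts as competing.

    site_studies: list of dicts with hypothetical keys 'nct_id' and 'status'.
    """
    for study in site_studies:
        if study["nct_id"] == query_nct_id:
            continue  # the queried study does not compete with itself
        if study["status"].strip().lower() in COMPETING_STATUSES:
            return True
    return False


# Placeholder records, not real studies.
site_a = [
    {"nct_id": "NCT00000001", "status": "Completed"},
    {"nct_id": "NCT00000002", "status": "Not yet recruiting"},
]
site_b = [
    {"nct_id": "NCT00000003", "status": "Completed"},
    {"nct_id": "NCT00000004", "status": "Terminated"},
]

flag_a = has_competing_studies(site_a)
flag_b = has_competing_studies(site_b)
print(flag_a, flag_b)
```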


# Tab 3: Site Cards
The Site Cards provide visual insights into individual clinical research sites, including geolocation, social determinants of health (SDOH), studies involved, and competing sponsors. The left panel logs the user's review history across the list of candidate sites and allows the user to drop locations. Finally, the user can export the curated list of sites for further analysis.

The ClinicalTrials.gov (CT.gov) database doesn't provide unique identifiers for clinical sites. Therefore, we implement a data-oriented approach that assigns unique identifiers to clinical sites based on their name variations and shared geolocation information. Please review this [article](article:9) for more details.
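To give a feel for what assigning identifiers from name variations and shared geolocation can look like, here is a minimal sketch. It is not the method used by the solution, which is described in the linked article. The sketch assumes that records sharing a city and country are merged when their normalized facility names are sufficiently similar under `difflib.SequenceMatcher`; the threshold, the record fields, the `SITE-` identifier format, and the facility names are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    name = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(name.split())


def assign_site_ids(records, threshold=0.85):
    """Assign one identifier per group of records that share a geolocation
    and have similar normalized facility names.

    records: list of dicts with hypothetical keys 'facility', 'city', 'country'.
    Returns a list of identifiers aligned with the input order.
    """
    clusters = {}  # (city, country) -> list of (representative name, site id)
    ids = []
    next_id = 1
    for rec in records:
        geo = (rec["city"].lower(), rec["country"].lower())
        name = normalize(rec["facility"])
        match = None
        for rep_name, site_id in clusters.setdefault(geo, []):
            if SequenceMatcher(None, name, rep_name).ratio() >= threshold:
                match = site_id
                break
        if match is None:
            # No similar name at this location yet: open a new cluster.
            match = f"SITE-{next_id:04d}"
            next_id += 1
            clusters[geo].append((name, match))
        ids.append(match)
    return ids


# Fictional facilities and locations, for illustration only.
records = [
    {"facility": "Riverside University Hospital", "city": "Springfield", "country": "Exampleland"},
    {"facility": "Riverside Univ. Hospital", "city": "Springfield", "country": "Exampleland"},
    {"facility": "Riverside University Hospital", "city": "Shelbyville", "country": "Exampleland"},
    {"facility": "Lakeside Clinic", "city": "Springfield", "country": "Exampleland"},
]

site_ids = assign_site_ids(records)
print(site_ids)
```

In this toy run, the two Springfield spellings of the same facility share an identifier, while the identically named facility in a different city and the unrelated clinic each receive their own. A greedy first-match pass like this is order-dependent, which is one reason a production approach would likely be more involved.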

Please note that the insights provided by the Site Cards are restricted to the study query set by the user during project setup using the Dataiku application.

![Screenshot 2024-03-25 at 11.15.08.png](pgnZpRMyG7lQ)
**Figure 4A**. Site Review - Geolocation and US SDOH. The SDOH metrics are available for US sites if the SDOH datasets are included in the Dataiku application.

![Screenshot 2024-03-25 at 11.25.54.png](Vlo6a7MH3jNg)
**Figure 4B-1**. Site Card - Charts and metrics. The metrics and charts summarize the studies that have recruited participants from the site, including the total count and average quarterly count for current active studies, as well as historical and new studies.

![Screenshot 2024-03-25 at 11.27.36.png](3GwvYv2eYJwi)
**Figure 4B-2**. Site Card - Charts and metrics. 

![Screenshot 2024-03-25 at 11.28.40.png](ooCePLajmV6i)
**Figure 4B-3**. Site Card - Competing sponsors and study timeline. We consider studies with the status 'recruiting' and 'not yet recruiting' to be enrolling.

![Screenshot 2024-03-25 at 11.29.20.png](gu54ZTq74cw9)
**Figure 4C**. Site Card - Statistics of historical study recruitment.