[SVI Factors Data Preparation](flow_zone:9raboQY) processes the data in [Census Data - SVI factors](flow_zone:D8dQK96) and generates the statistical metrics used for analytics in the follow-up steps.

![sviprep.png](9EXi9DQ0hlzD)

The variables are computed following the [CDC SVI Documentation 2020](https://www.atsdr.cdc.gov/placeandhealth/svi/documentation/SVI_documentation_2020.html). The definitions used throughout this solution, and the standard column prefixes that map each social vulnerability feature to its meaning, are listed below.

## Definitions
The Social Vulnerability Index (SVI) uses U.S. Census data to determine the social vulnerability of every census tract. The SVI ranks each census tract on 16 social factors and groups them into four related themes.

 **Themes** 
  1. Socioeconomic Status: Below 150% Poverty, Unemployed, Housing Cost Burden, No High School Diploma, No Health Insurance
  2. Household Characteristics: Aged 65 & Older, Aged 17 & Younger, Civilian with a Disability, Single-Parent Households, English Language Proficiency
  3. Racial & Ethnic Minority Status: Hispanic or Latino (of any race); Black and African American, Not Hispanic or Latino; American Indian and Alaska Native, Not Hispanic or Latino; Asian, Not Hispanic or Latino; Native Hawaiian and Other Pacific Islander, Not Hispanic or Latino; Two or More Races, Not Hispanic or Latino; Other Races, Not Hispanic or Latino
  4. Housing Type & Transportation: Multi-Unit Structures, Mobile Homes, Crowding, No Vehicle, Group Quarters
  
See the mapping of each social factor and the Census variable name in [census_svi_dictionary](dataset:census_svi_dictionary).


**Rankings** 
We rank census tracts for the entire United States against one another. This feature layer can be used for mapping and analysis of the relative vulnerability of census tracts in multiple states, or across the U.S. as a whole. Census tract rankings are based on percentiles: the percentage values are ranked, and each rank is converted to a quantile on a scale of 1000. Percentile ranking values range from 0 to 1, with higher values indicating greater vulnerability. For each census tract, we generate its percentile rank among all census tracts for 1) the 16 individual variables, 2) the four themes, and 3) its overall position.
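
The percentile ranking described above can be sketched in pandas as follows. This is a minimal illustration, not the solution's window recipe: the helper name `percentile_rank`, the tie-handling method (`"min"`), the rounding to four decimals, and the sample values are all assumptions made for the example.

```python
import pandas as pd


def percentile_rank(values: pd.Series, decimals: int = 4) -> pd.Series:
    """Rank each value among all records and scale the rank to (0, 1].

    Assumption: ties share the lowest rank (method="min"); the actual
    recipe may break ties differently.
    """
    ranks = values.rank(method="min", ascending=True)
    n_records = values.notna().sum()
    return (ranks / n_records).round(decimals)


# Hypothetical percentage values (EP_) for four tracts.
ep_pov = pd.Series(
    [5.0, 12.5, 12.5, 30.0],
    index=["tract_a", "tract_b", "tract_c", "tract_d"],
)
epl_pov = percentile_rank(ep_pov)
print(epl_pov)
```

Higher percentages receive higher percentile ranks, so the tract with the largest value ends up at 1.0.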

## Annotation Table

|    Column Code    |   Statistical Meaning | Computation | Name Mapping | Recipe |
|------------|-------------------|-------------------|-------------------|-------------------|
| E_ | Actual value estimate | | | |
| EP_ | Percentage of the total population | ```(E_ / Total population estimate)*100``` | Percent | [compute_svi_tracts_prepared](recipe:compute_svi_tracts_prepared) |
| EPL_ | Percentile rank of the percentage (EP_) | Quantile window recipe on [```(rank(EP)/Number of records)*1000```](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rank.html) | Percentile | [compute_svr_tracts_percentage_ntile](recipe:compute_svr_tracts_percentage_ntile) |
| SPL_ | Sum of the percentile ranks (EPL_) within each of the four themes | ```SUM{EPL_}``` | Percent Theme | [compute_svi_tracts_features](recipe:compute_svi_tracts_features) |
| RPL_ | Percentile ranking of a theme | Quantile window recipe on [```(rank(SPL)/Number of records)*1000```](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rank.html) | Theme Percentile | [compute_svi_tracts_window_themes](recipe:compute_svi_tracts_window_themes) |
| RPL_THEMES | Overall percentile ranking | | Social Vulnerability Index| [compute_svi_tracts_overall_vulnerability](recipe:compute_svi_tracts_overall_vulnerability)|
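
To make the chain of prefixes in the table concrete, here is a small pandas sketch that goes from `E_` to `EP_`, `EPL_`, `SPL_` and `RPL_` for a single theme. It follows the formulas in the table (percentage over total population, then rank divided by number of records). The column names, the sample values, and the two-variable theme are illustrative assumptions, not the content of the actual datasets or recipes.

```python
import pandas as pd

# Hypothetical estimates (E_) for four tracts; names and values are illustrative.
tracts = pd.DataFrame(
    {
        "E_TOTPOP": [1000, 2000, 4000, 500],
        "E_UNEMP": [50, 300, 240, 100],
        "E_NOHSDP": [100, 100, 800, 200],
    },
    index=["tract_a", "tract_b", "tract_c", "tract_d"],
)

theme1_vars = ["UNEMP", "NOHSDP"]  # assumed subset of one theme

for var in theme1_vars:
    # EP_: percentage over the total population estimate.
    tracts[f"EP_{var}"] = tracts[f"E_{var}"] / tracts["E_TOTPOP"] * 100
    # EPL_: rank of the percentage divided by the number of records.
    tracts[f"EPL_{var}"] = tracts[f"EP_{var}"].rank() / len(tracts)

# SPL_: sum of the EPL_ values belonging to the theme.
tracts["SPL_THEME1"] = tracts[[f"EPL_{v}" for v in theme1_vars]].sum(axis=1)

# RPL_: percentile rank of the theme sum.
tracts["RPL_THEME1"] = tracts["SPL_THEME1"].rank() / len(tracts)

print(tracts[["SPL_THEME1", "RPL_THEME1"]])
```

In this sample, `tract_d` has the highest percentages for both variables, so it gets the largest theme sum and a theme percentile of 1.0.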

The recipes [compute_svi_county_joined_prepared](recipe:compute_svi_county_joined_prepared), [compute_svi_county_oercentile_percentage](recipe:compute_svi_county_oercentile_percentage), [compute_svi_county_window_themes](recipe:compute_svi_county_window_themes), and [compute_svi_county_overall_vulnerability](recipe:compute_svi_county_overall_vulnerability) generate the same metrics at the county level.

