Unmapped Concepts: Multi Site, Anomaly Detection, Cross-Sectional Analysis
| dc.contributor | Patient-Centered Outcomes Research Institute |
| dc.contributor.author | Wieand, Kaleigh |
| dc.contributor.author | Razzaghi, Hanieh |
| dc.contributor.other | PEDSnet Data Coordinating Center |
| dc.date.accessioned | 2026-04-07T17:41:56Z |
| dc.date.created | 2026-03-27 |
| dc.description.abstract | This check provides raw data and visualizations to help a user evaluate whether unmapped values are present in a dataset of interest. It summarizes the proportion of rows and patients with unmapped values, as well as the median number of unmapped rows per patient. |
| dc.identifier.uri | https://hdl.handle.net/20.500.14642/1570 |
| dc.identifier.uri | https://doi.org/10.24373/pdsp-641 |
| dc.publisher | PEDSnet |
| dc.relation.uri | https://github.com/ssdqa/unmappedconcepts |
| dc.rights | Creative Commons Attribution 4.0 International License (CC BY 4.0) |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0 |
| dc.subject | Event-Level Analysis |
| dc.subject | Clinical Data Distributions |
| dc.subject | Expected Clinical Event Representation |
| dc.subject | Missing Expected Data |
| dc.title | Unmapped Concepts: Multi Site, Anomaly Detection, Cross-Sectional Analysis |
| dspace.entity.type | DQCheck |
| local.code.package | # install.packages("devtools")<br>devtools::install_github('ssdqa/unmappedconcepts') |
| local.description.raw | This check produces a raw data output containing 26 columns: |

| Column | Data Type | Definition |
|---|---|---|
| `site` | character | the name of the site being targeted |
| `variable` | character | the name of the variable being investigated for unmapped values |
| `total_rows` | numeric | the total number of rows associated with the variable |
| `total_pt` | numeric | the total number of patients associated with the variable |
| `unmapped_rows` | numeric | the number of unmapped rows associated with the variable |
| `unmapped_pt` | numeric | the number of patients with at least one unmapped row associated with the variable |
| `unmapped_row_prop` | numeric | the proportion of unmapped rows |
| `unmapped_pt_prop` | numeric | the proportion of patients with at least one unmapped row |
| `median_all_with0s` | numeric | the median number of unmapped rows per patient, for all patients, across all sites |
| `median_all_without0s` | numeric | the median number of unmapped rows per patient, for only patients with evidence of the variable, across all sites |
| `median_site_with0s` | numeric | the median number of unmapped rows per patient, for all patients, for a specific site |
| `median_site_without0s` | numeric | the median number of unmapped rows per patient, for only patients with evidence of the variable, for a specific site |
| `mean_val` | numeric | the mean proportion of patients or rows (based on user selection) for each group across sites |
| `median_val` | numeric | the median proportion of patients or rows (based on user selection) for each group across sites |
| `sd_val` | numeric | the standard deviation of the proportion of patients or rows (based on user selection) for each group across sites |
| `mad_val` | numeric | the median absolute deviation of the proportion of patients or rows (based on user selection) for each group across sites |
| `cov_val` | numeric | the coefficient of variation of the proportion of patients or rows (based on user selection) for each group across sites |
| `max_val` | numeric | the maximum proportion of patients or rows (based on user selection) for each group across sites |
| `min_val` | numeric | the minimum proportion of patients or rows (based on user selection) for each group across sites |
| `range_val` | numeric | the range of the proportion of patients or rows (based on user selection) for each group across sites |
| `total_ct` | numeric | the total number of group members |
| `analysis_eligible` | character | a string indicating whether the group is eligible for anomaly detection analysis |
| `lower_tail` | numeric | the lower bound used to identify low anomalies |
| `upper_tail` | numeric | the upper bound used to identify high anomalies |
| `anomaly_yn` | character | a string indicating whether the value is anomalous or not |
| `output_function` | character | a string indicating the type of visualization that should be generated by `uc_output` |
{.dqcheck-table}

| local.description.viz | This check outputs a dot plot of the proportion of patients or rows with unmapped values for a given variable at each site. Dot size represents the mean value for the variable, dot color represents the proportion of unmapped values, and anomalous values are drawn as stars instead of dots. On hover, a tooltip shows metadata for the variable and the site along with precise values for the proportion, mean proportion, median proportion, standard deviation, and MAD. |
| local.dqcheck.category | Conformance |
| local.dqcheck.measurement | Hotspots Outlier Detection |
| local.dqcheck.requirement | cohort |
| local.dqcheck.requirement | uc_input_file |
| local.dqcheck.requirement | omop_or_pcornet |
| local.dqcheck.requirement | single_or_multi_site |
| local.dqcheck.requirement | anomaly_or_exploratory |
| local.dqcheck.requirement | time |
| local.dqcheck.requirement | patient_level_tbl |
| local.dqcheck.requirement | output_level |
| local.dqcheck.type | Concept Set Testing |
| local.dqcheck.viz | Dot and Star Plot |
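
The column definitions under `local.description.raw` can be made concrete with a small base R sketch. This is not the package's implementation: the toy data frame `pt`, the helper `summarise_site`, and the reading of "patients with evidence of the variable" as patients with at least one row are all assumptions made for illustration. The rule used to derive `lower_tail` and `upper_tail` is not stated in this record, so it is left out.

```r
# Hypothetical patient-level counts for a single variable (illustrative values only).
pt <- data.frame(
  site          = c("site_a", "site_a", "site_a", "site_b", "site_b", "site_b"),
  patient_id    = 1:6,
  total_rows    = c(4, 0, 6, 5, 5, 0),
  unmapped_rows = c(1, 0, 0, 2, 3, 0)
)

# Per-site counts, proportions, and medians, named after the documented columns.
# Assumption: a patient has "evidence of the variable" when total_rows > 0.
summarise_site <- function(d) {
  has_var <- d$total_rows > 0
  data.frame(
    site                  = d$site[1],
    total_rows            = sum(d$total_rows),
    total_pt              = sum(has_var),
    unmapped_rows         = sum(d$unmapped_rows),
    unmapped_pt           = sum(d$unmapped_rows > 0),
    unmapped_row_prop     = sum(d$unmapped_rows) / sum(d$total_rows),
    unmapped_pt_prop      = sum(d$unmapped_rows > 0) / sum(has_var),
    median_site_with0s    = median(d$unmapped_rows),
    median_site_without0s = median(d$unmapped_rows[has_var])
  )
}

by_site <- do.call(rbind, lapply(split(pt, pt$site), summarise_site))

# Across-site summary statistics of the row proportion.
# Note: stats::mad() scales by 1.4826 by default; whether the package
# applies the same scaling is not stated in this record.
p <- by_site$unmapped_row_prop
across <- data.frame(
  mean_val   = mean(p),
  median_val = median(p),
  sd_val     = sd(p),
  mad_val    = mad(p),
  cov_val    = sd(p) / mean(p),   # coefficient of variation
  max_val    = max(p),
  min_val    = min(p),
  range_val  = max(p) - min(p)
)

print(by_site)
print(across)
```

Swapping `unmapped_row_prop` for `unmapped_pt_prop` when building `p` mirrors the "patients or rows (based on user selection)" wording in the column definitions.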
