Get information for a variable of interest (e.g., a clinical endpoint) from a long data frame of protocol- or result-related trial information, as returned by dfTrials2Long. The parameters `valuename`, `wherename` and `wherevalue` are matched as Perl regular expressions, ignoring case.

Usage

dfName2Value(df, valuename = "", wherename = "", wherevalue = "")

Arguments

df

A data frame (or tibble) with four columns (`_id`, `identifier`, `name`, `value`) as returned by dfTrials2Long

valuename

A character string for the name of the field that holds the value of the variable of interest (e.g., a summary measure such as "endPoints.*tendencyValue.value")

wherename

(optional) A character string to identify the variable of interest among those that repeatedly occur in a trial record (e.g., "endPoints.endPoint.title")

wherevalue

(optional) A character string with the value of the variable identified by `wherename` (e.g., "response")
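
The matching of these three parameters can be pictured with base R alone. The following is a minimal sketch, assuming that the matching behaves like `grepl()` with `perl = TRUE` and `ignore.case = TRUE`; it is not taken from the function's source, and the vectors `field_names` and `titles` are made up for illustration.

```r
# Hypothetical field names, similar to those found in the `name` column
field_names <- c(
  "endPoints.endPoint.title",
  "endPoints.endPoint.unit",
  paste0(
    "endPoints.endPoint.subjectAnalysisSetReportingGroups.",
    "subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value"
  ),
  "clinical_results.outcome_list.outcome.measure.title"
)

# A pattern as it could be passed to `valuename`; the case differs on purpose
pattern <- "endpoints.*TENDENCYVALUE.value"

# Assumption: Perl-style, case-insensitive matching as described above
matched <- grepl(pattern, field_names, perl = TRUE, ignore.case = TRUE)
field_names[matched]

# Hypothetical endpoint titles, matched the way `wherevalue` would be
titles <- c("Time to Overall Response", "Progression-free survival")
title_matched <- grepl("response", titles, perl = TRUE, ignore.case = TRUE)
titles[title_matched]
```

Because the strings are regular expressions, an unescaped `.` matches any character; write `\\.` in the pattern if only a literal dot should match.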

Value

A data frame (or tibble, if tibble is loaded) that includes the values of interest, with columns `_id`, `identifier`, `name`, `value` and `where` (with the contents of `wherevalue` found at `wherename`). The contents of `value` are strings unless all of its elements are numbers. The `identifier` is generated by function dfTrials2Long to identify matching elements, e.g., endpoint descriptions and measurements.

Examples


dbc <- nodbi::src_sqlite(
    dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
    collection = "my_trials"
)
#> RSQLite version has enabled accelerating docdb_create() and docdb_update() functions when used with value = <NDJSON file name>.

dfwide <- dbGetFieldsIntoDf(
    fields = c(
        ## ctgov - typical results fields
        # "clinical_results.baseline.analyzed_list.analyzed.count_list.count",
        # "clinical_results.baseline.group_list.group",
        # "clinical_results.baseline.analyzed_list.analyzed.units",
        "clinical_results.outcome_list.outcome",
        "study_design_info.allocation",
        ## euctr - typical results fields
        # "trialInformation.fullTitle",
        # "baselineCharacteristics.baselineReportingGroups.baselineReportingGroup",
        # "trialChanges.hasGlobalInterruptions",
        # "subjectAnalysisSets",
        # "adverseEvents.seriousAdverseEvents.seriousAdverseEvent",
        "endPoints.endPoint",
        "subjectDisposition.recruitmentDetails"
    ), con = dbc
)

dflong <- dfTrials2Long(df = dfwide)
#> clinical_results.outcome_list.outcome
#> study_design_info.allocation
#> endPoints.endPoint
#> subjectDisposition.recruitmentDetails
#> . 
#> . 
#> . 
#> . 
#> . 
#> . 
#> . 
#> . 
#> 
#> Total 7096 rows, 79 unique names of variables

## get values for the endpoint 'response'
dfName2Value(
    df = dflong,
    valuename = paste0(
        "clinical_results.*measurement.value|",
        "clinical_results.*outcome.measure.units|",
        "endPoints.endPoint.*tendencyValue.value|",
        "endPoints.endPoint.unit"
    ),
    wherename = paste0(
        "clinical_results.*outcome.measure.title|",
        "endPoints.endPoint.title"
    ),
    wherevalue = "response"
)
#> Returning values for 2 out of 12 trials
#>                  _id identifier
#> 1  2012-003632-23-CZ          1
#> 2  2012-003632-23-CZ          1
#> 3  2012-003632-23-CZ          2
#> 4  2012-003632-23-CZ          6
#> 5  2012-003632-23-CZ        6.1
#> 6  2012-003632-23-CZ        6.2
#> 7  2012-003632-23-CZ        6.3
#> 8  2012-003632-23-CZ        6.4
#> 9  2012-003632-23-CZ        6.5
#> 10 2012-003632-23-CZ          8
#> 11 2012-003632-23-CZ          8
#> 12 2012-003632-23-SE          1
#> 13 2012-003632-23-SE          1
#> 14 2012-003632-23-SE          2
#> 15 2012-003632-23-SE          6
#> 16 2012-003632-23-SE        6.1
#> 17 2012-003632-23-SE        6.2
#> 18 2012-003632-23-SE        6.3
#> 19 2012-003632-23-SE        6.4
#> 20 2012-003632-23-SE        6.5
#> 21 2012-003632-23-SE          8
#> 22 2012-003632-23-SE          8
#>                                                                                                                        name
#> 1                                                                                                   endPoints.endPoint.unit
#> 2  endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 3                                                                                                   endPoints.endPoint.unit
#> 4                                                                                                   endPoints.endPoint.unit
#> 5  endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 6  endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 7  endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 8  endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 9  endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 10                                                                                                  endPoints.endPoint.unit
#> 11 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 12                                                                                                  endPoints.endPoint.unit
#> 13 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 14                                                                                                  endPoints.endPoint.unit
#> 15                                                                                                  endPoints.endPoint.unit
#> 16 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 17 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 18 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 19 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 20 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#> 21                                                                                                  endPoints.endPoint.unit
#> 22 endPoints.endPoint.subjectAnalysisSetReportingGroups.subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value
#>                           value                           where
#> 1                          Days        Time to Overall Response
#> 2                           7.0        Time to Overall Response
#> 3  At least 1 response (number)    Durability of First Response
#> 4         Overall Response Rate                Overall Response
#> 5                          0.63                Overall Response
#> 6                             0                Overall Response
#> 7                          0.65                Overall Response
#> 8                          0.59                Overall Response
#> 9                          0.60                Overall Response
#> 10 Percentage of treatment time Cumulative Duration of Response
#> 11                         78.6 Cumulative Duration of Response
#> 12                         Days        Time to Overall Response
#> 13                          7.0        Time to Overall Response
#> 14 At least 1 response (number)    Durability of First Response
#> 15        Overall Response Rate                Overall Response
#> 16                         0.63                Overall Response
#> 17                            0                Overall Response
#> 18                         0.65                Overall Response
#> 19                         0.59                Overall Response
#> 20                         0.60                Overall Response
#> 21 Percentage of treatment time Cumulative Duration of Response
#> 22                         78.6 Cumulative Duration of Response
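
Because the result above mixes units and measurements, `value` is returned as character. The following is a minimal sketch of one way to keep only the measurements and convert them to numbers. The data frame `res` is a hand-built mock of four rows copied from the output above, not an actual return value; the type of `identifier` in it is an assumption, and the filtering approach is only one of several possible.

```r
# Full field name of the summary measure, as shown in the output above
tv_name <- paste0(
  "endPoints.endPoint.subjectAnalysisSetReportingGroups.",
  "subjectAnalysisSetReportingGroup.tendencyValues.tendencyValue.value"
)

# Mock of four result rows (values copied from the printed output;
# `identifier` is stored as character here, which is an assumption)
res <- data.frame(
  `_id` = rep("2012-003632-23-CZ", 4),
  identifier = c("1", "1", "6", "6.1"),
  name = c("endPoints.endPoint.unit", tv_name,
           "endPoints.endPoint.unit", tv_name),
  value = c("Days", "7.0", "Overall Response Rate", "0.63"),
  where = c("Time to Overall Response", "Time to Overall Response",
            "Overall Response", "Overall Response"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep only rows that hold a summary measure, then convert to numeric
is_measure <- grepl("tendencyValue\\.value$", res$name)
measures <- res[is_measure, ]
measures$value <- as.numeric(measures$value)
measures[, c("_id", "identifier", "where", "value")]
```

Alternatively, calling dfName2Value with a `valuename` that matches only the measurement fields should avoid mixing in the unit strings in the first place.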