J Patient Rep Outcomes. 2025 May 7;9(1):50. doi: 10.1186/s41687-025-00875-4.
ABSTRACT
BACKGROUND: Vasomotor symptoms (VMS; hot flashes) associated with menopause have a significant impact on health-related quality of life and are a leading reason that women seek medical attention. Patient-reported outcome (PRO) instruments are commonly used to assess treatment benefit in VMS clinical trials, and their measurement properties must be supported by evidence within the context of use. This study evaluated the measurement properties of scores from the Hot Flash Daily Diary (HFDD), PROMIS Sleep Disturbance Short Form 8b (PROMIS SD SF 8b) and Menopause-Specific Quality of Life (MENQOL) for measuring treatment efficacy in VMS clinical trials.
METHODS: Measurement properties of the HFDD, PROMIS SD SF 8b, and MENQOL scores were assessed using data (n = 400 participants) from a randomized, placebo-controlled, phase 3 study evaluating the efficacy and safety of elinzanetant for the treatment of VMS in postmenopausal women (OASIS 2). Analyses assessed distributional properties, reliability, validity, responsiveness, and thresholds for meaningful change.
RESULTS: Minimal floor and ceiling effects were found across the instruments at baseline. Inter-item correlations, together with confirmatory factor analysis for the MENQOL and item-response theory for the PROMIS SD SF 8b, supported dimensionality and scoring. Test-retest reliability between Weeks 8 and 12 was good to excellent for HFDD Frequency and Severity of moderate-to-severe hot flashes scores, PROMIS SD SF 8b T-score and MENQOL Total score (intra-class correlation coefficients 0.835-0.971). Convergent and divergent correlations with instruments assessing similar or distinct constructs were consistent with pre-specified hypotheses. Known-groups validity was supported by significant differences (p < 0.0001) between subgroups hypothesized a priori as being clinically distinct. Responsiveness was indicated by consistent and statistically significant differences (p < 0.0001) in mean changes from baseline to Weeks 4 and 12 between groups of participants classified as ‘improved’, ‘stable’ and ‘worsened’ (effect sizes for improvement 0.81-4.62). Triangulation of estimates from multiple anchor-based analyses derived meaningful within-individual change thresholds for the HFDD, PROMIS SD SF 8b and MENQOL scores that were likely to exceed measurement error.
CONCLUSIONS: Findings provide evidence that HFDD, PROMIS SD SF 8b, and MENQOL scores are valid, reliable and responsive to change, supporting their use for assessing key efficacy endpoints in VMS clinical trials.
PMID:40332718 | DOI:10.1186/s41687-025-00875-4