Statistical methods for primary and secondary outcomes {20a}

This analysis plan provides the initial framework for examining primary and secondary outcomes. A significance level of 0.05 will be applied unless otherwise specified, with corrections for multiple comparisons (e.g., Bonferroni) used when necessary. All analyses will follow the intention-to-treat principle, accounting for the nested structure within centers if appropriate. The analyses outlined here may be subject to minor modifications based on advancements in the field. Should any deviations from the original plan occur, these will be clearly documented and justified. Our reporting will adhere to the guidelines set by the CONSORT statement [98].

Study Sample Description

We will report the descriptive statistics of socio-demographic and clinical variables for the whole sample, per site, and per treatment arm. In specific analyses involving subsamples (e.g., melatonin assessments, EMA, and MRI), we will provide socio-demographic and clinical characteristics of the subsample and report any distinctions from the overall sample.

Primary Study Outcome – Clinical Improvement between Treatment Conditions

To compare the three different BLT administration strategies in terms of clinical improvement (measured as the difference in MADRS scores between baseline and post-treatment assessments), we will employ a multilevel mixed-effects modelling approach. MADRS scores will serve as the dependent variable, with a fixed effects for timepoint (pre- and post-treatment), and the interaction between timepoint and treatment arm (Light@Home, LightCafé, and LightCafé +). Random intercepts for participants and study centers will be included to account for individual differences and center-level variability, but only if they improve model fit, as determined by evaluation of the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Random slopes for time within participants will also be considered and included based on the same criteria. Socio-demographic (age, sex) will be evaluated for potential confounding effects, similar to the credibility and expectancy of the treatment arms as assessed with the CEQ. The primary analysis will focus on the interaction between timepoint and treatment arm to determine whether clinical improvement differs across the three BLT strategies. Furthermore, multilevel logistic regression will be used to test differences in response (yes/no) and remission (yes/no) between treatment arms using the same structure and procedure as above.

Secondary Study OutcomesSubjective Assessment of Depressive Symptoms, Relapse Rates and Functional Outcomes

Multilevel mixed-effects models will be constructed to assess depressive symptom improvement, measured weekly during therapy and at follow-up assessments using the QIDS-SR scores as the dependent variable. Fixed effects will include timepoint as an categorical variable (baseline, 1 week, 2 weeks, 3 weeks, 4 weeks, 10 weeks, and 4 months), and the interaction between timepoint and treatment condition (Light@Home, LightCafé, and LightCafé +). Random intercepts and slopes will be included for participants and study centers, as well as random slopes for time within participants, based on model fit criteria. Confounding variables (age, sex, clinical diagnosis, and credibility/expectancy of treatment) will also be evaluated and included as necessary. A variable will be added to account for treatment duration (1 week, 2 weeks, 3 weeks). Similar models will be considered for analyses of relapse rates, functional outcome, and Quality of Life with inclusion of the respective timepoints as fixed effects (see Table 1).

Time to remissionand time to response analysis

To assess differences in time to remission, we will utilize survival analysis techniques. Participants will be categorized based on the number of weeks until remission (1 week, 2 weeks, 3 weeks, or censored if not reached within the treatment duration). Kaplan–Meier survival analysis will be employed to visualize the probability of achieving remission over time between treatment conditions (Light@Home, LightCafé, and LightCafé +). Additionally, we will apply Cox proportional hazards modelling to evaluate the effects of treatment condition and covariates (age, gender, clinical diagnosis) on the time to remission (outcome). Results will be presented with survival curves and hazard ratios. Similar analyses will be conducted on time to response.

Treatment effects on sleep parameters

To analyze changes in sleep patterns over time and between treatment conditions, multilevel mixed-effects models will also be utilized. Sleep outcomes include the following outcomes from both sleep diaries and actigraphy: sleep duration, sleep onset latency, wake after sleep onset, midsleep timing, sleep onset, and sleep offset. Additionally, sleep quality will be calculated from sleep diary and sleep fragmentation from actigraphy. The fixed effects structure will encompass time (days since treatment start), and the interaction between time and treatment condition (Light@Home, LightCafé, and LightCafé +), allowing us to explore how sleep patterns evolve throughout the intervention and whether these changes differ across conditions. The inclusion of random intercepts for participants and study centers, along with random slopes for time within participants, will be evaluated based on model fit criteria. Additionally, the model will incorporate a factor indicating whether a day is a weekend or weekday to account for social jet lag experienced by participants.

Furthermore, we will investigate the relationship between subjective sleep measures from sleep diaries and objective data obtained from actigraphy through correlation analyses. Next, we will assess the total PSQI and ISI scores over time and between conditions using a linear mixed model approach as described above, focusing on the timepoints of Baseline, 4 weeks post-treatment, 10 weeks post-treatment, and 4 months post-treatment instead of days since treatment start.

Circadian effects of BLT and their relation to the antidepressant outcomes

To analyze changes in circadian parameters, including DLMO, Phase Angle Difference, inter-daily stability, and intra-daily variability, multilevel mixed-effects models will be employed. The fixed effect’s structure will incorporate time (Baseline vs. Post-treatment), and the interaction between time and treatment condition (Light@Home, LightCafé, and LightCafé +). Inclusion of random intercepts for participants and study site and random slopes for time within participants will be evaluated based on model fit criteria, including AIC and BIC. Circadian effects will be compared between timepoints by evaluating the fixed effect of time in the model, while the effect of the treatment condition will be interpreted by assessment of the interaction term between time and treatment condition.

Following the assessment of changes in circadian parameters (e.g., DLMO, Phase Angle Difference, inter-daily stability, and intra-daily variability), multilevel mediation analyses will be conducted to determine whether these changes mediate the relationship between timepoint and depressive symptoms (QIDS-SR). Each mediator will be tested individually using an approach based on Bauer et al. [99] which includes decomposing within- and between-person components to yield valid indirect effect estimates in nested data. We will apply bootstrapping (5000 samples) to estimate the indirect path (ab) and construct 95% confidence intervals for each mediator. In addition, moderation analyses will examine whether baseline depression severity or sleep quality interact with treatment conditions to affect depressive outcomes. Results will report direct effects (c′), indirect effects (ab), and any interaction effects (a × b), along with the corresponding confidence intervals and p-values.

Dynamics of sleep, affect, and energy levels throughout therapy

The primary objective of the EMA is to investigate the dynamic interplay between sleep, affect, and energy levels during BLT. We aim to uncover whether the antidepressant effects of BLT primarily result from improvements in mood, sleep, daytime energy levels, or a combination of these factors. For statistical analysis, we will first calculate daily averages for all momentary measures. If fewer than three measures are available for a given day, the daily average will be set to not applicable (NA). To isolate within-person variability, all measures will subsequently be person-mean-centered. Additionally, baseline measures—such as depressive symptoms and sleep quality—will be grand-mean centered.

To analyze the data, we will specify a Random Intercept Crossed-Lagged Panel Model (RI-CLPM) that focuses on the relationships between sleep parameters, affect, and energy levels over time. This model will incorporate dynamic variables, specifically the person-mean centered scores for sleep parameters, affect, and energy levels measured at the day level, alongside trait variables represented by grand-mean centered scores from baseline assessments, including severity of depressive symptoms and sleep quality. The model structure will include random intercepts for each participant, autoregressive paths to assess the stability of each variable, and cross-lagged paths to evaluate how one variable at a previous time point influences another variable at a subsequent time point. To estimate the RI-CLPM, we will utilize the lavaan package in R, employing maximum likelihood estimation for model fitting.

Prior to model construction, we will conduct an exploratory analysis to identify the most relevant parameters related to sleep, affect, and energy levels for inclusion in the model. This step is crucial to ensure that we accurately measure the constructs involved and capture the dynamics effectively. The exploratory analysis will involve visualizations, such as time series plots, lag plots, and scatter plots, to assess relationships and patterns in the data. Additionally, we will investigate how the variables change over time, seeking patterns in daily fluctuations and overall trends throughout the therapy period. Correlations between different variables at various time scales will also be examined to identify initial relationships that could inform model selection.

Based on the findings from the exploratory analysis, we will construct a model that best represents the dynamics of sleep, affect, and energy levels. We may incorporate covariates and trait-level variables, such as demographic factors, baseline depressive symptoms, and sleep quality, which could influence the relationships being studied. In our reporting, we will present the overall model fit indices (e.g., comparative fit index, Tucker–Lewis index, root mean square error of approximation) to assess the adequacy of the RI-CLPM. We will detail the autoregressive paths indicating the stability of each variable over time, as well as the cross-lagged paths, which reveal how changes in one variable at a previous time point influence changes in another variable at a subsequent time point. This systematic examination aims to enhance our understanding of the therapeutic mechanisms underlying BLT and how improvements in sleep, affect, and energy levels interact throughout the treatment process.

Following the data analysis, we will conduct a post hoc power analysis to evaluate the power of the RI-CLPM. If this analysis reveals insufficient power, we have the option to revert to a mixed model approach, which necessitates fewer participants to achieve similar power levels.

Identification of predictors of response and remission

Another aim of the study is to identify combinations of patient characteristics and behaviors that predict treatment response and remission. To achieve this, we will conduct AIC penalized bootstrapped stepwise logistic regression in both forward and backward directions, examining predictors of remission and treatment response in two separate models. The examined predictors will include:

  • Demographic Factors: Age, Sex, Educational Level

  • Clinical Characteristics: Clinical Diagnosis, Severity of Depressive Symptoms, Diurnal Variation in Affect

  • Sleep-Related Factors: Subjective Sleep Quality, Sleep Efficiency, Sleep Duration, Sleep Onset Latency, Midsleep Timing, Wake after Sleep Onset, Severity of Insomnia Symptoms, Energy Levels

  • Circadian Factors: Circadian Phase, Intradaily Stability, Intradaily Variability, Phase Angle Difference between DLMO and Sleep Onset, chronotype.

  • Light-related parameters: Light Sensitivity

  • Expectations: Expectations of Treatment

All predictors will be measured either at intake or during the baseline week. Prior to the stepwise regression analysis, we will assess multicollinearity by calculating the Pearson correlation coefficient and Variance Inflation Factor (VIF). Variables exhibiting a Pearson correlation coefficient greater than 0.8 or a VIF score exceeding 10 will be removed to mitigate multicollinearity and examined separately. A final competitive regression model will be run with all significant predictors.

To ensure the robustness of our models, we will run 1000 bootstrap samples and retain predictors that are significant in at least 60% of the bootstrapped samples. We will also calculate partial R-squared values for all predictors in the regression models to quantify the proportion of variance in the dependent variable explained by each specific independent variable while controlling for the influence of others. Finally, we will conduct variance partitioning to investigate the unique and shared variance explained by the predictors. By comparing the explained variance from regression models of individual predictors against those with combinations of predictors, we will compute both unique and shared variance. The vegan package in R will be used for variance partitioning, and results will be visualized using Euler plots from the Euler package.

Neural Correlates of BLTPreprocessing and data cleaning

Neuroimaging data (functional and structural) will be visually inspected and subjects with artifacts in the images (e.g., based on movement) will be discarded from the analysis. All brain data will be transformed into the brain imaging data structure (BIDS), in order to be shareable and easily accessible for further steps [100]. Data will then be pre-processed and analyzed using existing standardized neuroimaging pipelines depending on the data types (structural or functional). Functional data will be pre-processed using fMRIPrep [101, 102].

Neuroimaging data types and software

We will gather three types of data based on the outputs from the four scanning sequences employed.

The first type is structural imaging data, which includes measures of gray matter (GM), white matter (WM), and LGI. This data will be pre-processed using the standardized FreeSurfer analysis pipeline (http://surfer.nmr.mgh.harvard.edu/). The preprocessing steps are automated and briefly include motion correction, intensity normalization, skull striping, segmentation in GM and WM, and inflation of these surfaces. Afterwards the extraction of specific GM and WM metrics is performed, this encompasses, among others, THK, VOL, SA, and LGI.

The second type is WM tract data obtained from DTI sequence. We will focus on extracting properties of the main WM tracts, specifically FA and MD. For this, we will use a probabilistic tracking analysis approach implemented as a standardized analysis pipeline of Freesurfer, namely the TRActs Constrained by UnderLying Anatomy (TRACULA) [103, 104].

The third type is functional connectivity data based on the BOLD signal of the resting state functional scan; it will allow to assess the functional connectivity properties of certain brain networks under resting state conditions using standardized pipelines from the CONN toolbox [104].

Analysis of brain function and structure before vs. after BLT

Brain function and structure will be analyzed between pre- vs. post-BLT intervention in the whole sample of 60 participants at the Leiden study site.

For the structural parameters of GM, the longitudinal two-stage analysis framework of general linear modelling (GLM) will be used. This is implemented as an automated analysis pipeline in the longitudinal data analysis framework of FreeSurfer. In brief, this method first extracts the regional GM parameters (VOL, SA, THK, and LGI) at the level of each subject and each brain point (vertex) for each hemisphere separately. In a second step, a group level analysis will be performed by using a repeated measures ANOVA. Hereby, we will compare the structural parameters THK, VOL, SA, and LGI between pre- and post- BLT while including age, sex, and treatment arm (Light@Home, LightCafé, LightCafé +) as covariates and to correct for their potential confounding effects. To account for multiple testing across the whole brain, we will perform Monte Carlo simulations with 10,000 iterations to identify significant contiguous clusters of vertex-wise group differences (p 

For the analysis of the structural WM properties of main WM tracts, the probabilistic tractography pipeline TRACULA will be used. This automatic pipeline is first pre-processing the DTI images (correcting for image distortions, eddy currents, and B0 field inhomogeneities). Afterwards, head motion measures are extracted which can be used in further statistical analysis as covariates, co-registration of DTI and anatomical images is performed, the registration of individual images to a common space occurs, then the probabilistic distributions of 42 major WM fiber tracts are reconstructed from which the main properties of WM bundles can be extracted (FA and MD). Then the pipeline enters the longitudinal analysis stream which involves as a main statistical approach, the earlier mentioned GLM model. Hence, a repeated measures analysis of variance (ANOVA) will be performed on group level in which the comparison of FA and MD between pre- and post-BLT will be evaluated. Afterwards, clusterwise correction for multiple comparisons using Monte Carlo simulations with 10,000 iterations will be employed to account for multiple testing effects.

For the analysis of brain function, the connectivity maps of main functional resting state networks will be analyzed using the CONN toolbox [104] from the software Statistical Parametric and Mapping (SPM). For this, we will replicate the study by [59] [59]. Hence, we will both investigate inter-network connectivity (in DMN, FPN, SMN, SN) but also at the intra-network connectivity as a post hoc test using a graph analysis. In the first level analysis, the network properties of each network in each subject will be established. In the second level analysis, a group-level analysis will be performed to analyze the differences in the brain networks (DMN, FPN, SMN, SN) between pre- and post-BLT. A significance threshold of 0.05 will be set and a false discovery rate (FDR) correction method will be applied to correct for multiple comparisons.

Analysis of functional and structural brain correlates and clinical symptoms

To assess whether the encountered brain changes are linked to the clinical symptom improvement, multiple regression analysis will be performed. Hence, brain measures (functional or structural) which showed changes between the pre- vs post-BLT intervention will be correlated with the clinical symptom change scores of depression (QIDS-SR, MADRS). More specifically, the brain measures involved are functional connectivity measures of specific brain networks (DMN, FPN, SMN, SN) which showed changes between pre and post BLT, the structural measures—GM properties (THK, VOL, SA) but also WM tracts properties (FA, MD) which showed changes between pre-post BLT.

Analysis of melatonin and BLT linked brain changes.

We will evaluate if a change in DLMO is also linked to the encountered brain changes on functional and structural level. Hence a multiple regression analysis will be performed between the encountered brain changes as predictors (in function and structure) and the DLMO changes as outcome measure in the analysis.

To evaluate the potential link between circadian light sensitivity and brain alterations in depression, a multiple regression analysis will be performed. Hence the melatonin suppression will be used as a predictor and functional and structural brain changes and as outcomes in the model.

Interim analyses {21b}

N/a—we are not conducting any interim analyses.

Methods for additional analyses (e.g., subgroup analyses) {20b}

Exploratory subgroup analyses will be conducted to examine variations in the effects of different administration strategies on depressive symptoms, sleep, and circadian rhythms across various depression diagnoses. Additionally, the whole brain structure and functional parameters will be compared between the three treatment arms. As a post hoc explorative analysis the functional and structural brain parameters will be compared between remitters vs non-remitters. Multiple comparison corrections for repeated measures will be done.

Furthermore, we will explore the relationship between treatment adherence (as measured by self-report and light sensor data) and treatment effectiveness. Adherence will be operationalized as the percentage of completed light therapy sessions (based on sensor data, the timing of light exposure relative to the prescribed schedule, and daily self-reported compliance). These adherence indicators will be added as covariates in regression models predicting changes in depressive symptoms, sleep, and circadian phase markers. Primary outcome models will also include medication type and dose as time-varying covariates to test for interaction with BLT. Exploratory subgroup comparisons (e.g., on versus off antidepressant) may further probe whether concurrent treatments alter BLT’s therapeutic impact.

Methods in analysis to handle protocol non-adherence and any statistical methods to handle missing data {20c}

The data analysis is an intent-to-treat analysis with participants analyzed as randomized, regardless of adherence to the treatment protocol. Additionally, completer analyses will be conducted, including only those participants who have both pre-treatment and post-treatment data. Furthermore, per protocol analyses will focus on participants who completed at least 80% of the treatment (i.e., attending 4 out of 5 days of BLT per week). Any substantial differences between these analytical approaches will be evaluated and reported.

We will describe the extent and patterns of missing MADRS scores and covariates overall and by treatment arm. To evaluate whether missingness is completely at random (MCAR), we will apply Little’s MCAR test and examine associations between missingness and key variables (e.g., age, sex, diagnosis, depression severity, treatment expectancy). Under a plausible missing-at-random (MAR) assumption, we will perform multiple imputation by chained Eqs. (10 imputations) as a sensitivity analysis, refitting the primary multilevel mixed-effects models in each imputed dataset and pooling estimates via Rubin’s rules. We will then compare results from the complete-case analysis and the MAR sensitivity analysis to our main intent-to-treat (observed-data) findings. Any notable discrepancies in effect estimates, model fit, or inferential conclusions across these approaches will be reported.

Plans to give access to the full protocol, participant-leveldata, and statistical code {31c}

Participant-level data (from consenting participants) and the corresponding statistical code will be made available upon request. Parts of the data, i.e., data published in peer-reviewed journals will be stored in the public repository. The aim of this step is that other researchers can have access to include these datasets in meta-analyses, reviews, or similar or enable replication of our study. This data is pseudonymized and will not contain any personal information. The complete data set will be stored on the local research institutional research servers in a pseudonymized fashion. We will adhere to the requirements of the DataverseNL platform for data archiving.