Key Points

This study generated a value set for the EQ-5D-Y-3L instrument using a standardised protocol, which enables the use of utility index scores in economic evaluations in paediatric populations in the Netherlands.

Both composite time trade-off data and discrete choice experiment data were collected, and modelled independently. The discrete choice experiment data were used for the final value set, with the composite time trade-off data being used to anchor the discrete choice experiment values onto the quality-adjusted life-year scale.

Pain and feeling worried, sad or unhappy were identified as the most important health dimensions, with looking after yourself receiving the smallest weight.

1 Introduction

The EQ-5D-Y-3L instrument is a generic health-related quality-of-life measure aimed at paediatric populations, such as children and adolescents between ages 8 and 15 years [1]. It was developed by adapting the adult version of the EQ-5D, to make the labels and descriptors relevant for children and adolescents [2]. Several studies have tested the psychometric properties of the EQ-5D-Y, generally demonstrating the instrument is valid and responsive [3, 4]. Instruments such as the EQ-5D-Y-3L are suitable for both population studies and for economic evaluations. For use in economic evaluations, a value set, i.e. scoring based on preferences, is commonly employed.

In the Netherlands, the National Health Care Institute, Zorginstituut Nederland, recommends that new technologies and interventions should be assessed using a cost-utility analysis, making quality-adjusted life-years (QALYs) the outcome of reference [5]. The EQ-5D-5L, with its accompanying Dutch value set, is the preferred instrument for the computation of QALYs [6]. Although value sets for Dutch versions of the EQ-5D family of instruments are available, there is currently no EQ-5D-Y-3L value set for the Netherlands. This is problematic for three reasons. First, preferences for health states might differ for child and adult beneficiaries [7, 8]. Second, the descriptive system of the EQ-5D-Y-3L is different from the adult EQ-5D-3L, resulting de facto in a different classification system. Third, although EQ-5D-Y-3L value sets exist for other countries, it is not recommended to use these in the Netherlands. Earlier work with value sets for adult versions of the EQ-5D instruments has suggested that using another country’s value set may misrepresent the preferences of the Dutch population [9]. The same is likely to be true for child versions; thus, a Dutch value set is needed.

Preliminary work has been conducted on the valuation of health states for the EQ-5D-Y-3L instrument, leading to the development of a standardised protocol for conducting EQ-5D-Y-3L valuation studies [7, 10]. Four national value sets have already been published using this protocol, in Slovenia, Germany, Spain and Japan [11,12,13,14]. In this protocol, valuation data are collected in two samples: (1) a representative sample of respondents each completing paired comparisons in a discrete choice experiment (DCE) and (2) a smaller sample of respondents that each complete composite time trade-off (cTTO) tasks [15]. Discrete choice experiment data are used for the estimation of the value set, using logit models. However, as these models produce values on a latent scale, the cTTO data are used to anchor the DCE values on the full health (1) and dead (0) scale, necessary for QALY computations, using a mapping approach or by rescaling on a single cTTO mean value. In both the cTTO and DCE, respondents are asked to imagine a 10-year-old child, who is experiencing the health states described in the choice tasks.

The aim of this study is to develop a value set for the EQ-5D-Y-3L instrument for the Netherlands. The availability of such a value set would enable a cost-utility analysis for paediatric interventions using values specifically generated for children and adolescents aged between 8 and 15 years.

2 Methods

2.1 General Approach

We largely follow the protocol outlined by Ramos-Goñi et al., that is, we recruited the sample of respondents required to answer the DCE and cTTO questions, as well as using the standard proxy perspective of the protocol [10]. The latter entails asking adults of the general public to imagine a 10-year-old child experiencing health problems. We depart from the standard protocol by including a different and larger set of health states in the cTTO, allowing us to model the cTTO data independently. This enables comparisons between DCE-based and cTTO-based value sets for this instrument. Furthermore, the cTTO data were collected through videoconferencing interviews, rather than the recommended face-to-face interviews, owing to the coronavirus disease 2019 pandemic.

2.2 Instrument

The EQ-5D-Y-3L instrument has five dimensions: mobility (MO), looking after myself (SC), usual activities (UA), pain/discomfort (PD) and feeling worried, sad or unhappy (AD). Mobility is defined as walking about, while looking after myself is defined as washing or dressing. For UA, examples relevant to the paediatric population are provided: for example, going to school, hobbies, sports, playing or doing things with friends.

On each of these five dimensions, EQ-5D-Y-3L differentiates between three levels of severity of problems: (1) no problems; (2) some problems and (3) a lot of problems/very worried, sad or unhappy in the last dimension. With these five dimensions with three levels of severity each, the instrument can distinguish between 243 different unique health states for individuals aged between 8 and 15 years. Five-digit numerical codes are assigned to each health state, with the place of a digit in this code indicating the dimension order as MO, SC, UA, PD and AD. The numbers themselves are then either 1, 2 or 3, referring to no problems, some problems and a lot of problems, respectively. An example of this is state “21323”, which is defined as having some problems with walking about, no problems with washing or dressing, a lot of problems doing usual activities, some pain or discomfort and very worried, sad or unhappy.
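The coding scheme described above can be illustrated with a short Python sketch. The function and variable names are illustrative only and are not part of the instrument or of any EuroQol software.

```python
from itertools import product

# Dimension order and level labels as described in the text above.
DIMENSIONS = ["MO", "SC", "UA", "PD", "AD"]
LEVELS = {1: "no problems", 2: "some problems", 3: "a lot of problems"}


def decode_state(code: str) -> dict:
    """Split a five-digit EQ-5D-Y-3L code into one level per dimension."""
    if len(code) != 5 or any(ch not in "123" for ch in code):
        raise ValueError(f"not a valid EQ-5D-Y-3L state: {code!r}")
    return {dim: int(ch) for dim, ch in zip(DIMENSIONS, code)}


def all_states() -> list:
    """Enumerate every state the descriptive system can distinguish."""
    return ["".join(levels) for levels in product("123", repeat=5)]


if __name__ == "__main__":
    for dim, level in decode_state("21323").items():
        print(dim, level, LEVELS[level])
    print(len(all_states()), "unique health states")
```

For the last dimension, level 3 is worded "very worried, sad or unhappy" in the instrument; the generic label is used here only to keep the sketch short.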

2.3 Valuation Methods

Two valuation methods are used in this study: cTTO and DCE. Composite time trade-off uses an indifference procedure to elicit the preferences of respondents, who are asked to choose between two lives for a hypothetical 10-year-old child, A and B, or state that they are indifferent. Life A always equals a number between 0 and 10 of life-years in full health, while life B equals 10 years in some health state X. The number of years in full health in life A is varied until indifference is reached. If a respondent prefers immediate death to living life B, he/she enters the lead-time TTO task, in which more life-years can subsequently be traded. The cTTO task is more elaborately described in Janssen et al. and Stolk et al. [15, 16]. The cTTO produces values on a scale range between 1 and −1. Examples of the choice tasks are presented in the Electronic Supplementary Material (ESM).
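As an illustration of how an indifference point translates into a value on the 1 to −1 scale, the following Python sketch applies the scoring conventionally used for cTTO tasks. The formulas are not stated in the text above and are an assumption here: in the conventional task the value is taken as x/10, and in the lead-time task (assumed to offer 10 years in full health followed by 10 years in state X) as (x − 10)/10, where x is the number of years in full health in life A at indifference.

```python
def ctto_value(years_full_health: float, lead_time: bool) -> float:
    """Convert an indifference point into a cTTO value.

    Assumed scoring (not taken from the text above):
      conventional task: value = x / 10
      lead-time task:    value = (x - 10) / 10
    where x is the number of years in full health in life A.
    """
    if not 0 <= years_full_health <= 10:
        raise ValueError("years in full health must lie between 0 and 10")
    if lead_time:
        return (years_full_health - 10) / 10
    return years_full_health / 10


if __name__ == "__main__":
    print(ctto_value(7.5, lead_time=False))  # state considered better than dead
    print(ctto_value(4.0, lead_time=True))   # state considered worse than dead
```

Under these assumptions the conventional task covers values from 0 to 1 and the lead-time task covers values from −1 to 0, which matches the scale range reported above.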

In the DCE, respondents are presented with two EQ-5D-Y-3L health states and are asked to indicate which of the two health states they prefer for the hypothetical 10-year-old child. No specification of the duration of living in such a health state is included. Ordinal information on preferences is collected, and latent values are inferred using a logit model. Examples of the choice tasks are presented in the ESM.

2.4 Study Design, Sampling and Survey Procedure

In the cTTO study, an orthogonal health state design comprising 18 states is used [17, 18]. Five mild health states (21111, 12111, 11211, 11121 and 11112), the worst health state 33333 and four additional intermediate health states were added to the orthogonal array. These were selected to maintain near-orthogonality after adding the mild and worst health states to the design. The 28 states were assigned to three blocks of ten health states each, with all blocks containing health state 33333 and at least one mild health state. The DCE survey uses the efficient design suggested in the protocol, with a level overlap, meaning two identical levels, on two of the five dimensions in each choice situation [10]. The full design comprises 150 pairs of health states, blocked into ten blocks of 15 choice tasks per respondent.

Two samples of data were collected: a small target sample of 200 respondents completing ten cTTO tasks each and a larger target sample of 1000 respondents, each completing the 15 DCE choice situations. Ramos-Goñi et al. provide a detailed justification for using these sample sizes [10]. The respondents for the DCE survey and cTTO interviews were recruited by two different market research companies. For the DCE survey, respondents were recruited from a large online panel. Respondents received an invitation by e-mail to participate in the study with a link to the survey. For the cTTO sample, respondents were recruited from a different, smaller panel. Respondents were invited by the panel company to an online-administered interview. Hard quota sampling was applied for age and sex for the DCE study, with soft quotas for education because of the difficulty of recruiting respondents with a lower education. For the cTTO study, soft quotas were set for age, sex and education. To reach certain demographic groups, recruitment via social media was used as an additional recruitment strategy for the cTTO sample.

The cTTO data were collected by four trained interviewers through videoconferencing interviews. Interviewers connected with the respondents through videoconferencing software and shared their screens to show the respondents the questions and tasks to be completed. Videoconferencing interviews have been used in several valuation studies and methodological studies, and have so far been shown to be feasible [19,20,21]. The standard EQ-VT 2.1 software was used [16, 22]. First, each respondent completed the EQ-5D-5L questionnaire, followed by a short demographic survey and the EQ-5D-Y-3L, to get acquainted with the instrument. Subsequently, the respondents entered the cTTO task, valuing “being in a wheelchair”, followed by a state either “much better than being in a wheelchair” or “much worse than being in a wheelchair”, depending on their answer to the first question. Next, respondents were presented with three practice questions using EQ-5D-Y-3L health states (states 21121, 23332 and 13211), followed by ten more health states from one of the three blocks of the study design, presented in a random order. This was followed by the feedback module, debriefing questions and additional demographic questions. The feedback module is a procedure in which respondents are presented with the rank order of their answers in the cTTO, allowing them to flag any mistakes [23].

The DCE data were collected through an online survey. Each respondent completed the self-description of the EQ-5D-5L, followed by several demographic questions and the self-description of the EQ-5D-Y-3L. Subsequently, respondents were presented with the DCE survey, followed by additional sociodemographic questions. Each respondent completed 18 DCE paired comparisons: one block of 15 questions from the efficient design, plus three dominant choice pairs used as a quality check. The first paired comparison was always a dominant pair, while the other two dominant pairs were randomly placed between the other 15 choice pairs.

2.5 Data Quality

The cTTO interviews were conducted in sets of ten interviews per interviewer, after which the quality of the data was assessed using EuroQol’s quality-control process [21]. This process checked whether a reasonable amount of time was spent on explaining and completing the task. Furthermore, it was assessed whether the lead-time TTO was explained to each respondent and whether there were any inconsistent responses. After these assessments, any necessary feedback was shared with the interviewers. For a more detailed explanation of the quality-control process, please refer to Ramos-Goñi et al. [24].

In the DCE survey, respondents were only included in the final sample if they did not fail any of the three dominance tests and completed the DCE survey in a reasonable amount of time, set at a minimum of 150 seconds. Failing dominance tests may indicate that the respondent is not paying sufficient attention, and 150 seconds was deemed the minimum time needed to read all the descriptors and make a choice. This cut-off point was also used in the EQ-5D-Y-3L valuation study in Slovenia [11].
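The two inclusion criteria for the DCE sample can be expressed as a simple filter. The Python sketch below is illustrative only: the record layout and field names are hypothetical, the example records are made up, and respondents at exactly 150 seconds are assumed to pass.

```python
from dataclasses import dataclass

MIN_SECONDS = 150  # minimum completion time stated in the text


@dataclass
class Respondent:
    # Hypothetical record layout, not the study's actual data format.
    respondent_id: str
    dominance_failures: int  # number of failed dominance tests (0 to 3)
    seconds_on_dce: float    # time spent on the DCE tasks


def passes_quality_checks(r: Respondent) -> bool:
    """Keep a respondent only if no dominance test was failed and the
    DCE was not completed faster than the minimum time."""
    return r.dominance_failures == 0 and r.seconds_on_dce >= MIN_SECONDS


if __name__ == "__main__":
    # Made-up records for illustration.
    sample = [
        Respondent("r1", 0, 412.0),
        Respondent("r2", 1, 380.0),  # failed a dominance test
        Respondent("r3", 0, 95.0),   # too fast
        Respondent("r4", 2, 60.0),   # both
    ]
    final_sample = [r for r in sample if passes_quality_checks(r)]
    print([r.respondent_id for r in final_sample])
```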

2.6 Analyses

The cTTO data and the DCE data are modelled independently in Stata 14.2. The DCE data are used to estimate a value set, while the cTTO data are used to anchor the DCE data on the QALY scale. Modelling the cTTO data separately allows us to compare value sets generated from the cTTO data with value sets generated from the DCE data.

2.7 Modelling Valuation Data

Several ten-parameter models are estimated for the cTTO data, each of the form of Equation 1.

$${U}_{j}={\beta }_{0}+{\beta }_{1}{MO2}_{j}+{\beta }_{2}{MO3}_{j}+{\beta }_{3}{SC2}_{j}+{\beta }_{4}{SC3}_{j}+{\beta }_{5}{UA2}_{j}+{\beta }_{6}{UA3}_{j}+{\beta }_{7}{PD2}_{j}+{\beta }_{8}{PD3}_{j}+{\beta }_{9}{AD2}_{j}+{\beta }_{10}{AD3}_{j}.$$
(1)

Here, the utility assigned to health state \(j\) is defined as \({U}_{j}\). \({\beta }_{0}\) represents the intercept, which can be interpreted as the utility of being without health problems, i.e. in health state 11111. \({\beta }_{1}{MO2}_{j}\) to \({\beta }_{10}{AD3}_{j}\) represent the utility decrements assigned to the level-dimension combinations of EQ-5D-Y-3L health states. For example, \({\beta }_{1}{MO2}_{j}\) represents the utility decrement (\({\beta }_{1}\)) assigned to level 2 problems in mobility \({MO2}_{j}\). As each respondent completes ten cTTO tasks, we assume that responses may be correlated within respondents, for which we account using a random intercept model. Furthermore, models for health state valuations using cTTO often suffer from heteroskedasticity, as the variance in responses for milder health states is much smaller than that for more severe health states. We first test, using a Breusch–Pagan test, whether heteroskedasticity is present, and then account for this by defining an explicit variance function, where the variance \({\sigma }_{j}\) of the error term \({\varepsilon }_{j}\) is dependent on the levels of the dimensions of the EQ-5D-Y-3L of the health state, as in Equation 2:

$${\sigma }_{j}=\mathrm{exp}\left({\gamma }_{0}+{\gamma }_{1}{MO2}_{j}+{\gamma }_{2}{MO3}_{j}+{\gamma }_{3}{SC2}_{j}+{\gamma }_{4}{SC3}_{j}+{\gamma }_{5}{UA2}_{j}+{\gamma }_{6}{UA3}_{j}+{\gamma }_{7}{PD2}_{j}+{\gamma }_{8}{PD3}_{j}+{\gamma }_{9}{AD2}_{j}+{\gamma }_{10}{AD3}_{j}\right).$$
(2)
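Equations 1 and 2 share the same ten dummy variables. The Python sketch below shows how a health state could be dummy-coded and how the linear predictor of Equation 1 and the variance function of Equation 2 could be evaluated for a given set of coefficients. It is not the Stata code used in the study, and the function names are illustrative; coefficient values are supplied by the caller.

```python
import math

DIMS = ("MO", "SC", "UA", "PD", "AD")
DUMMIES = [f"{dim}{level}" for dim in DIMS for level in (2, 3)]


def dummy_code(state: str) -> dict:
    """Return the ten 0/1 indicators (MO2, MO3, ..., AD3) for a state."""
    row = {name: 0 for name in DUMMIES}
    for dim, ch in zip(DIMS, state):
        if ch != "1":
            row[f"{dim}{ch}"] = 1
    return row


def linear_predictor(state: str, intercept: float, coefs: dict) -> float:
    """Right-hand side of Equation 1 (or the exponent in Equation 2)."""
    row = dummy_code(state)
    return intercept + sum(coefs[name] * row[name] for name in DUMMIES)


def sigma(state: str, gamma0: float, gammas: dict) -> float:
    """Variance function of Equation 2: exp of a linear predictor."""
    return math.exp(linear_predictor(state, gamma0, gammas))


if __name__ == "__main__":
    print(dummy_code("21323"))
    zero = {name: 0.0 for name in DUMMIES}
    print(sigma("33333", 0.0, zero))  # all gammas zero: constant variance
```

With all \(\gamma\) coefficients other than \({\gamma }_{0}\) set to zero, the variance function reduces to a constant, which corresponds to the homoskedastic case.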

The DCE data are modelled using conditional logit models and mixed-logit models. These take the same form as Equation 1, yet use a logit link function to model the binary data collected in the DCE. We test the model performance of the DCE models by the Akaike Information Criterion (AIC). Furthermore, we perform sensitivity analyses including respondents that were excluded from the final sample because of suspicious response patterns (different cut-offs for time spent on the DCEs, and allowing for failing one or more dominance tests).
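For a paired comparison, the conditional logit model implies that the probability of choosing state A over state B depends only on the difference in their latent values. The Python sketch below illustrates this for a single choice task. The coefficients are hypothetical placeholders on the latent scale, not the estimates reported in Table 4, and the sketch does not cover the mixed-logit model, in which coefficients vary across respondents.

```python
import math

DIMS = ("MO", "SC", "UA", "PD", "AD")

# Hypothetical latent-scale coefficients, for illustration only.
HYPOTHETICAL_COEFS = {
    "MO2": -0.6, "MO3": -3.1,
    "SC2": -0.5, "SC3": -2.3,
    "UA2": -1.0, "UA3": -3.5,
    "PD2": -1.8, "PD3": -6.0,
    "AD2": -1.6, "AD3": -5.2,
}


def latent_value(state: str, coefs: dict) -> float:
    """Sum of the coefficients for all level 2 and level 3 problems."""
    return sum(coefs.get(f"{dim}{ch}", 0.0) for dim, ch in zip(DIMS, state))


def prob_choose_a(state_a: str, state_b: str, coefs: dict) -> float:
    """Logit probability that state A is preferred to state B."""
    diff = latent_value(state_a, coefs) - latent_value(state_b, coefs)
    return 1.0 / (1.0 + math.exp(-diff))


if __name__ == "__main__":
    print(prob_choose_a("21123", "12321", HYPOTHETICAL_COEFS))
```

In a dominant choice pair, one state is at least as good on every dimension, so under any logically consistent set of coefficients the predicted probability of choosing it exceeds 0.5.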

2.8 Anchoring the DCE Data on to the QALY Scale

We explored two methods to anchor the DCE data onto the QALY scale. First, we anchored the DCE data directly on the mean observed cTTO value of state 33333, by dividing the mean cTTO value for state 33333 by the predicted DCE value for state 33333. Second, we mapped the mean observed cTTO values for the 28 health states included in the design onto the predicted DCE values, using ordinary least-squares regression, with and without specifying an intercept, using a linear equation of the form \({U}_{TTO}=\alpha *{U}_{DCE}+\beta\). Here, \({U}_{TTO}\) is the mean observed cTTO value for a certain health state, \({U}_{DCE}\) is the predicted DCE value, and \(\alpha\) and \(\beta\) (\(\beta\) equals 0 if no intercept is used in the mapping) are the rescaling parameters identified through the mapping. The final rescaling approach was selected based on prediction accuracy criteria such as R-squared, mean absolute error (MAE) and root mean square error (RMSE) and on how well values for the most commonly observed health states in the general population, mild health states, are predicted.
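The anchoring strategies can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code of the study: the regressions are written out in closed form, both series are expressed as disutilities (1 minus the value) so that a mapping without intercept leaves full health at 1, and all data points are hypothetical except the disutility of 1.372 for state 33333, which follows from the mean cTTO value of −0.372 reported in the Results.

```python
import math


def fit_through_origin(x, y):
    """Least-squares slope for y = alpha * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def fit_with_intercept(x, y):
    """Least-squares slope and intercept for y = alpha * x + beta."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    alpha = sxy / sxx
    return alpha, mean_y - alpha * mean_x


def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


if __name__ == "__main__":
    # Hypothetical latent DCE disutilities; the last entry plays the role of 33333.
    latent_dce = [0.5, 4.0, 9.0, 14.0, 20.0]
    ctto_disutility = [0.03, 0.25, 0.55, 0.85, 1.372]

    alpha0 = fit_through_origin(latent_dce, ctto_disutility)
    alpha1, beta1 = fit_with_intercept(latent_dce, ctto_disutility)
    direct = ctto_disutility[-1] / latent_dce[-1]  # rescale on state 33333 only

    strategies = {
        "mapping, no intercept": [alpha0 * x for x in latent_dce],
        "mapping, with intercept": [alpha1 * x + beta1 for x in latent_dce],
        "direct rescaling on 33333": [direct * x for x in latent_dce],
    }
    for name, pred in strategies.items():
        print(name, round(mae(pred, ctto_disutility), 4),
              round(rmse(pred, ctto_disutility), 4))
```

Comparing MAE and RMSE across the three sets of predictions mirrors the selection criteria described above; restricting the comparison to the mild states would show how each strategy performs where most observed health states lie.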

3 Results

The data were collected between early November 2020 and April 2021. One hundred and ninety-seven respondents completed ten cTTO tasks. Fewer than 5% of cTTO interviews were flagged to be of suspicious quality, and all were included in the final sample. Nine hundred and fifty-nine respondents completed the online DCE survey without failing the quality-control criteria (dominance tests or completing the task in less than 150 seconds). An additional 341 respondents either failed at least one dominance task (n = 72) or completed the task in less than 150 seconds (n = 100) or both (n = 169), leading us to exclude these 341 out of 1300 respondents to reach our sample of 959. Table 1 reports on the demographics of both samples and compares them with those of the adult general population [25]. In the cTTO, there was a slight over-sampling of female subjects, as 113 out of 197 respondents were female. Furthermore, although the mean age of the sample (41.9) was similar to that of the Dutch general population (42.3), fewer elderly were sampled [26]. The DCE sample was representative in terms of age and sex, with an over-sampling of the lower-middle and upper-middle educated (9.7 and 12.0 percentage points higher, respectively, compared with the Dutch population). Table 2 reports the responses of both samples to the EQ-5D-5L questionnaire.

Table 1 Sample characteristics of the cTTO and DCE samples
Table 2 Responses to the EQ-5D-5L questionnaire for the cTTO and DCE samples

3.1 cTTO Data

Figure 1 reports the distribution of cTTO responses over the ranges of possible values. Of all responses, 16.3% were negative, indicating worse than dead preferences. The mean value assigned to state 33333 equalled −0.372. A negative mean was reported for only one other health state (13133, −0.038). The other 26 health states had means between 0.968 (12111) and 0.054 (22333). The means and standard deviations of the other health states are reported in the ESM. Table 3 reports the results from the modelling of the cTTO data. The random intercept model (first column) produces logically inconsistent and insignificant parameter estimates for level 2 problems on multiple dimensions. The Breusch–Pagan test showed that heteroskedasticity was present. The other two models, both correcting for heteroskedasticity, performed better and produced logically consistent results, with the constrained intercept model also producing significant parameter estimates. Both models showed a better fit in terms of the Akaike Information Criterion compared with the random intercept model. In all models, AD receives the largest weight, followed by PD. Looking after myself receives the smallest weight.

Fig. 1 Distribution of cTTO responses

Table 3 Modelling results of cTTO data

3.2 DCE Data

Table 4 reports the results from the modelling of the DCE data, using conditional logit (column 1) and mixed-logit (column 2) models. Columns 3 and 4 report rescaled coefficients, after anchoring these on the QALY scale (see next section). In each of these models, PD receives the largest weight, followed by AD. Looking after myself is assigned the smallest weight. The mixed-logit model outperformed the conditional logit model in terms of model fit, as assessed by the AIC, and was therefore selected as the model to be used for the final value set.

Table 4 Modelling results of DCE data

We assessed the effect of including the respondents that failed the DCE response quality criteria along with the final sample in both a conditional logit model and a mixed-logit model. For both models, the relative change in AIC was much larger than the relative change in sample size (see ESM for more details). This disproportionate change in AIC shows that the responses from the excluded respondents are much noisier than the responses included in the final sample. Using less stringent quality criteria (e.g. respondents can fail one or two dominance tests) produced similar results, with the criteria used in the current sampling producing the best AIC to sample size ratio.

3.3 Anchoring and Final Value Set

Table 5 reports the results of the anchoring of mean observed cTTO values onto the predicted DCE values. Three anchoring strategies were explored: linear mapping (column 2), linear mapping while constraining the constant at 1 (column 3) and directly rescaling on the mean observed value for state 33333 (column 4). All anchorings were done on the mixed logit model. Rescaling on the mean observed value for state 33333 resulted in MAEs and RMSEs that are consistently larger than those of the other two strategies. This is visualised in Fig. 2. The linear mapping without a constant produced the highest R-squared and showed smaller MAEs and RMSEs for the empirically more prevalent milder health states, while reporting only slightly higher MAEs and RMSEs for the mean values of the observed 28 health states. Therefore, the mapping without a constant was selected as the final model, to be used for the value set. All mixed-logit coefficients were therefore multiplied by 0.0606, the rescaling factor identified by the mapping without a constant. The rescaled DCE models are reported in the last two columns of Table 4. The final model used for the value set, the rescaled mixed-logit model 2 (Table 4, column 4), is represented mathematically as:

Table 5 Results anchoring analyses
Fig. 2 Scatter plots for the predicted, rescaled DCE values and the mean observed cTTO values for the 28 health states included in the health state design, for all three anchoring strategies.

$$U\left(HS\right)=1-0.036*MO2-0.191*MO3-0.028*SC2-0.139*SC3-0.058*UA2-0.211*UA3-0.111*PD2-0.363*PD3-0.096*AD2-0.314*AD3.$$

Using this value set, health state 13213 would for example be assigned the value \(U=1-0.139-0.058-0.314=0.489\).
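The scoring rule can be applied programmatically. The Python sketch below uses the rounded coefficients printed in the equation above; the function name is illustrative. Because the published coefficients are rounded to three decimals, results for states close to zero may differ slightly from values computed with unrounded estimates.

```python
# Utility decrements per dimension for levels 1, 2 and 3,
# taken from the rounded coefficients in the equation above.
DECREMENTS = {
    "MO": (0.0, 0.036, 0.191),
    "SC": (0.0, 0.028, 0.139),
    "UA": (0.0, 0.058, 0.211),
    "PD": (0.0, 0.111, 0.363),
    "AD": (0.0, 0.096, 0.314),
}
DIMS = ("MO", "SC", "UA", "PD", "AD")


def utility(state: str) -> float:
    """Return the value of a five-digit EQ-5D-Y-3L state, rounded to 3 decimals."""
    if len(state) != 5 or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid EQ-5D-Y-3L state: {state!r}")
    total = sum(DECREMENTS[dim][int(ch) - 1] for dim, ch in zip(DIMS, state))
    return round(1.0 - total, 3)


if __name__ == "__main__":
    for state in ("11111", "13213", "33333"):
        print(state, utility(state))
```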

Of the 243 health states, 8 (3.3%) had a negative predicted value. Figure 3 plots the values generated by the best fitting cTTO model (Table 3, column 3) against those of the rescaled mixed-logit model (Table 4, column 4), to assess whether the cTTO-only and DCE-rescaled value sets produce the same values. Figure 3 shows that the value assigned to a given health state can differ between the cTTO and DCE methods. In the most extreme case, this difference is more than 0.3.

Fig. 3 Values for all 243 health states, generated through either DCE (mixed logit model rescaled on the mean cTTO value for state 33333) or cTTO responses (heteroskedastic model without constant)

4 Discussion

4.1 Main Findings

This study aimed to estimate a value set for the EQ-5D-Y-3L instrument for the Netherlands, and by doing so investigated the preferences of the Dutch population for health states for children. Respondents assigned the largest weight to PD, followed by AD, UA, MO and SC. Although the ordering of the dimensions was roughly similar between the cTTO and DCE models, there were substantial differences between the health state values generated by the two methods, as is illustrated in Fig. 3.

4.2 Interpretation and Comparison with Existing Value Sets

The ordering of the weights assigned to the levels of the five dimensions in the Netherlands was similar to that in Spain, Germany, Japan and Slovenia, suggesting that the same health dimensions are relatively important in all five countries [11,12,13,14]. The Dutch weights assigned to health problems were much larger than those found in Japan, smaller than those found in Slovenia and Spain, and relatively similar to those found in Germany. In all countries, PD was the most important dimension, and SC the least important, except for Germany, where MO was the least important. However, the ratio of the largest level 3 weight to the smallest level 3 weight was 2.62 in the Netherlands, 2 in Slovenia, 3.86 in Japan, 2.37 in Spain and 3.56 in Germany. This shows that Dutch respondents placed relatively more emphasis on a single health dimension for children as compared with Slovenian respondents, but relatively less than Japanese and German respondents. Generally, the Dutch value set was most similar to the German value set, with roughly the same scale length, but slightly higher values for milder health states and lower values for more severe states.

In the Dutch value set, 3.3% of health states had a negative value, indicating that these states are considered worse than dead. The lowest score predicted in the current study was −0.218, for state 33333. In Germany, Spain, Japan and Slovenia, the lowest values were −0.283, −0.539, 0.288 and −0.691, respectively. In Japan, no negative values were predicted in the final model, compared to 6.6% in Germany, 16.1% in Spain and 20.6% in Slovenia. This shows that Dutch respondents found only a small set of health states to be worse than dead for children. These findings are similar to findings for the EQ-5D-5L instrument, with Japan showing higher values and Germany lower values [12, 13]. An exception is Spain, showing similar values for the worst health state in the EQ-5D-5L instrument, but lower values for the EQ-5D-Y-3L instrument [14]. For Slovenia, no EQ-5D-5L value set is currently available [11].

The relative importance of the health dimensions is roughly similar between EQ-5D-Y-3L and EQ-5D-5L in the Netherlands [6]. For EQ-5D-5L, PD and AD receive the largest weight, followed by MO, UA and SC, while for EQ-5D-Y-3L, PD receives the largest weight, followed by AD, UA, MO and SC. The values assigned to the worst health states (−0.446 for 55555 in the EQ-5D-5L, and −0.218 for 33333 in the EQ-5D-Y-3L) differ substantially, with the percentage of negative values being higher in the EQ-5D-5L value set at 15.5%, compared with 3.3% for the EQ-5D-Y-3L value set. This may mean that Dutch respondents consider the same health dimensions important for both adults and children, but differ in the degree to which they are willing to give up life-years to avoid these health problems, which may reflect a different willingness to trade between adults and children. These differences may be partially caused by a different level structure of the instruments, different descriptors (e.g. anxiety/depression in the EQ-5D-5L vs worried, sad or unhappy in the EQ-5D-Y-3L) or different measurement strategies (the EQ-5D-5L value set is cTTO based, while the EQ-5D-Y-3L value set is DCE based). Figure 3 shows a substantial disagreement between cTTO-based and DCE-based value sets for the EQ-5D-Y-3L, which may also reflect the differences in valuation protocols between the EQ-5D-Y-3L and EQ-5D-5L value sets. There is an ongoing discussion on how to interpret differences between youth and adult value sets. One study provides qualitative evidence that respondents differ in the rate at which they are willing to trade life-years to relieve health problems for adults or children [27]. Furthermore, there appears to be more variance in the values assigned to health states for children [8]. One reason for this could be that respondents assign values to health states for other, hypothetical persons in the valuation of health states for children. Incorporating adolescent preferences into the valuations may overcome this issue, and has been shown to be feasible in the valuation of other instruments [28]. A review by Rowen et al. identifies future directions for research that may help us better understand the difference between youth and adult health state values [29].

4.3 Limitations and Strengths

Composite time trade-off data were collected through videoconferencing interviews rather than the recommended face-to-face interviews. Collecting cTTO data via face-to-face interviews was not feasible, owing to the coronavirus disease 2019 pandemic, with the Netherlands experiencing lockdown measures from October 2020 until June 2021. Therefore, it was decided to interview respondents using videoconferencing software. Preliminary studies have shown the feasibility of this mode of administration [19, 20]. A more recent study shows that the mode of administration is unlikely to lead to different values when employing videoconferencing interviews. However, sample selection, i.e. differences in responders due to the recruitment strategy, may affect the collected values [21]. Another limitation of this study is that for the cTTO data, soft quota sampling was used, leading to an over-sampling of the highly educated, while obtaining a reasonable balance of sex and age in the cTTO sample. Furthermore, in the DCE survey, there was an under-sampling of those with the lowest attained education, compensated by those with lower-middle-level and upper-middle-level education. Over-sampling highly educated respondents is a problem observed more frequently in valuation studies for various instruments, regardless of the mode of administration [30,31,32,33,34]. The sampling issue may be partially related to the choice of recruitment strategy, namely employing panel companies to recruit respondents for the cTTO and DCE samples. Last, a limitation of the current study is that the performance of the health state design used for the cTTO cannot be compared directly with the one recommended by Ramos-Goñi et al., as the Ramos-Goñi et al. design is not a subset of the currently used design [10]. Although anchoring the DCE values onto the cTTO data seems to work well with a larger set of health states, we make no claim of superiority, as no direct comparison can be made.

A strength of this study is that, regardless of restrictions related to the coronavirus disease 2019 pandemic, it was still possible to collect cTTO data using computer-assisted personal interviews. Furthermore, the data were collected following a strict protocol as outlined by Ramos-Goñi et al., with quality-control procedures being implemented in both the cTTO and DCE data collection [10]. No issues with data quality were identified during the collection of the cTTO data, as defined by EuroQol’s quality-control process [24].

5 Conclusions

This study yields a value set for the EQ-5D-Y-3L instrument, enabling a cost-utility analysis for paediatric interventions in the Netherlands. Pain/discomfort was identified as the most important dimension, followed by feeling worried, sad or unhappy, usual activities, mobility and looking after myself. Of all health states, 3.3% had a negative value, and the lowest value assigned to any health state was −0.218. The relative importance of the health dimensions seems to be the same between the adult EQ-5D-5L instrument and the paediatric EQ-5D-Y-3L instrument. However, other properties, such as the scale length and the percentage of negative values, seem to differ between the value sets.