An Item Response Theory Analysis of the FTND

(2004) Psychological Assessment.

An Item Response Theory Analysis of the Fagerstrom Test for Nicotine Dependence
G. Scott Acton
Rochester Institute of Technology
Janice Y. Tsoh
University of California, San Francisco
Hiroyuki Yamada
University of California, Berkeley
Author Note

The Fagerstrom Test for Nicotine Dependence (FTND) assesses physical nicotine dependence. Item response theory analyses performed on English and Chinese versions of the FTND in a community sample of 409 American smokers indicated differential item functioning based on language but not on gender or major depression. In both English- (n = 241) and Chinese-speaking (n = 168) samples, ROC curves showed that the FTND accurately predicted major depression, and linear regressions showed that the FTND was associated with daily cigarettes, years of smoking, and age began smoking. Applications of the dimension/category framework (De Boeck, Wilson, & Acton, 2003) argued against viewing DSM-IV nicotine dependence as a discrete category, instead suggesting that at the latent level DSM-IV nicotine dependence is thoroughly dimension-like. Assessment implications for a stepped-care model of treatment are discussed.

What is nicotine dependence? In the literature on this elusive concept, two major meanings have arisen. The first meaning is physical dependence, which is usually measured by the Fagerstrom Test for Nicotine Dependence (FTND; Heatherton, Kozlowski, Frecker, & Fagerstrom, 1991) or one of its derivatives (e.g., the Heaviness of Smoking Index comprises two of the six items on the earlier FTND, which comprises six of the eight items on the earlier Fagerstrom Tolerance Questionnaire). The second meaning is psychological dependence, which is usually measured by an interview (e.g., the Diagnostic Interview Schedule [DIS] or the Composite International Diagnostic Interview [CIDI]) assessing symptoms of the nicotine dependence diagnosis in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; American Psychiatric Association, 1994) (e.g., Breslau & Johnson, 2000; Moolchan et al., 2002).
Because two different meanings were at issue, it has widely been assumed that the different measurement methodologies used to assess these different meanings were also very different (e.g., Dijkstra & Tromp, 2002). One way in which the measurement of nicotine dependence has differed is in terms of questionnaire versus interview assessment techniques. A second way is in terms of continuous (or graded) versus categorical (or discrete) measurement; for example, the FTND is continuous, whereas a DSM-IV diagnosis is categorical. Yet previous investigations of these differences have taken an exclusively indirect approach to testing them—specifically, examining varying correlates of the two techniques (e.g., Breslau & Johnson, 2000; Moolchan et al., 2002)—which we will call a correspondence approach. A more direct approach would be to test the relations between the two techniques directly, which we will call a coherence approach. In Embretson’s (1983) terminology for construct validation, coherence is analogous to construct representation, and correspondence is analogous to nomothetic span. In Van de Vijver and Leung’s (2001) terminology for cross-cultural equivalence, coherence is analogous to structural equivalence, and correspondence is analogous to functional equivalence; thus, it is possible to identify someone either by reference to the person’s personal qualities (structural equivalence) or by reference to the person’s relatives (functional equivalence).
In the present article, we sought to fill this gap in the literature on nicotine dependence by testing whether the manifest category of DSM-IV psychological dependence differed qualitatively or quantitatively from its absence (no psychological dependence) in terms of the latent dimension of FTND physical dependence. We also described the latent dimension of FTND physical dependence psychometrically. Finally, we sought to add to what is known about the correlates of FTND physical dependence by describing its relations to other measures such as number of cigarettes per day and major depressive episode [MDE], given the documented association between smoking and depression (e.g., Acton, Prochaska, Kaplan, Small, & Hall, 2001; Covey, Glassman, & Stetner, 1998; Tsoh et al., 2000).

FTND Nicotine Dependence Versus DSM-IV Nicotine Dependence
Previous research has shown the FTND to have acceptable internal-consistency reliability for a brief measure (alphas range from .61 to .70; Etter, Vu Duc, & Perneger, 1999; Heatherton et al., 1991), to show acceptable test-retest reliability (rtt = .85; Etter et al., 1999), and to be unidimensional (Etter et al., 1999; Heatherton et al., 1991). The FTND was also related to biochemical measures of heaviness of smoking (Heatherton et al., 1991). Carbon monoxide explained 32% (Heatherton et al., 1991), 8% (Payne et al., 1994), and 7% (Kozlowski et al., 1994) of the variance in FTND scores, and cotinine explained about 25% (Heatherton et al., 1991) and 15% (Pomerleau, Carton, Lutzke, Flessland, & Pomerleau, 1994) of the variance in FTND scores. The FTND predicted failure to quit smoking after 7 months (Etter et al., 1999). The FTND was associated with self-efficacy toward quitting smoking, with smokers’ expectations that they would be irritable if they tried to quit, with smokers’ expectations that smoking would provide relief from withdrawal symptoms, and with smokers’ embarrassment over having to smoke (Etter et al., 1999). Although the FTND is considered a measure of physical dependence, it also showed positive correlations with measures of psychological dependence such as loss of function, expected withdrawal symptoms, and self-efficacy, with shared variance ranging from 19% to 23%, results that somewhat challenge the sharp distinction between physical and psychological nicotine dependence (Dijkstra & Tromp, 2002).
Previous analyses have suggested that different ways of assessing nicotine dependence have different external correlates. On the one hand, nicotine dependence assessed using the FTND appears to measure physiological symptoms of dependence. On the other hand, nicotine dependence assessed using criteria from the DSM-III-R or DSM-IV in interviews such as the DIS appears to measure adverse consequences, desire to cut down, and mood changes during withdrawal (Moolchan et al., 2002). A content analysis showed that the FTND contains items with content similar to some (but not all) DSM-IV diagnostic criteria for substance dependence and nicotine withdrawal (Etter et al., 1999). The DSM-III-R definition was far less able than the FTND definition to predict who would quit and who would not (Breslau & Johnson, 2000).
Although the assessment of external correlates is one way to determine the relative importance of these two meanings of nicotine dependence, a more direct approach to assessing their similarities and differences is possible. Advances in the psychometric technique of item response theory (De Boeck, Wilson, & Acton, 2003) permit a direct assessment of whether DSM-IV nicotine dependence is dimension-like at the latent level. The hypothesis of this article is that at the latent level DSM-IV nicotine dependence is thoroughly dimension-like.

Item Response Theory
Item response theory (IRT) assumes that the probability of endorsing any given item can be described by an S-shaped (logistic) function of an underlying latent dimension (e.g., Embretson, 1996). This function is called an item characteristic curve (ICC). In the one-parameter (Rasch) model (Rasch, 1960/1980), the only parameter that varies among items is their location on the latent dimension, defined as the point at which the probability of endorsing the item is 0.5. The slope of the ICC is called its discrimination. Because the Rasch model holds discrimination constant across items, it is simpler than other IRT models and has other desirable measurement properties (e.g., Acton, 2003; Perline, Wright, & Wainer, 1979; Wright, 1977). One practical advantage of the Rasch model is that it yields stable parameter estimates without resort to the large sample sizes required by other IRT models.
Estimates of the location of each item are a unique feature of IRT-based measurement. Similarly, IRT provides estimates of the location of each person on the latent dimension. Because item location and person location are on the same metric, they are directly comparable; indeed, probability of endorsing an item is an additive function of the two (cf. Perline et al., 1979).
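The additive relation between person location and item location can be illustrated with a minimal Python sketch of the Rasch item characteristic curve. The names `theta` (person location), `beta` (item location), `rasch_probability`, and `logit` are illustrative choices, not taken from the article or from any IRT software.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Probability of endorsing an item under the Rasch model.

    theta: person location in logits (illustrative name)
    beta:  item location in logits (illustrative name)
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def logit(p: float) -> float:
    """Log-odds of a probability: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# When person and item locations coincide, the endorsement probability is 0.5,
# which is how item location is defined in the text.
p_at_location = rasch_probability(theta=1.3, beta=1.3)

# The log-odds of endorsement recover the simple difference theta - beta,
# which is the sense in which the model is additive.
recovered_difference = logit(rasch_probability(theta=0.7, beta=-0.2))

print(p_at_location, recovered_difference)
```

Because only the difference between the two locations enters the function, shifting both by the same constant leaves every endorsement probability unchanged.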
Sometimes one group of persons will respond to an item differently from another group of persons. Such person-by-item interactions contradict the assumption of additive measurement inherent in an IRT model and signal the non-comparability of scores from one group to another. Initially conceived as a test for item bias, this situation, called differential item functioning (DIF), was assessed with respect to the physical dependence of several groups in this study.

The Dimension/Category Framework (Dimcat)
The concept of DIF provides an avenue for testing the non-comparability—the qualitative distinctness—of any two groups of persons. In this study, we tested whether the group of persons diagnosed with DSM-IV nicotine dependence differed qualitatively from the group of persons not diagnosed with DSM-IV nicotine dependence in terms of their locations on the latent dimension of FTND nicotine dependence. An absence of DIF would indicate that these two groups differ from one another only in degree rather than in kind (De Boeck et al., 2003).
Quantitative (as opposed to qualitative) differences are a prerequisite to assessing bimodality; the latter is often thought to be a primary indicator that two groups are "categorical." That is to say, only if DSM-IV nicotine dependence differs from its absence as a matter of degree could it still be "categorical" in terms of bimodality. Under these conditions, a bimodal joint distribution of person locations in the two groups would indicate what we call abrupt quantitative differences, whereas a unimodal joint distribution would indicate what we call smooth quantitative differences.
Within-group distributions can be either heterogeneous, as in a normal distribution, or homogeneous, as in a point distribution. The former are considered more dimension-like and the latter more category-like. A finding of quantitative differences, an absence of bimodality, and within-group heterogeneity would indicate that the groups were thoroughly dimension-like (De Boeck et al., 2003).

Implications
The dimension/category question has important implications for both theory and assessment. With respect to theory, the debate over whether nicotine dependence is category-like or dimension-like is a subset of the broader debate within psychopathology over whether mental disorders are category-like or dimension-like, with most psychologists arguing for the dimension-likeness of common mental disorders (e.g., Krueger & Finger, 2001; Krueger & Piasecki, 2002).
With respect to assessment, if persons diagnosed as nicotine dependent are heterogeneous, then matching them to appropriate treatment requires valid assessment techniques. For example, smokers with minimal nicotine dependence could receive a less intensive intervention to begin with and then could receive progressively more intensive interventions until they succeed in quitting smoking—this stepped care model of treatment could allow the maximum number of smokers in the population to receive effective treatment with limited resources (Kassel & Yates, 2002). For example, smokers low on nicotine dependence may benefit from a stop-smoking pamphlet or telephone quitline, whereas smokers high on nicotine dependence may require antidepressants or intensive cognitive-behavioral treatment.

Method
Analyses
Using IRT, we estimated the locations of the FTND items on a latent dimension of nicotine dependence, assessed the dimensionality of these items, assessed whether different subgroups (including Chinese-speakers, who completed a translated Chinese version of the FTND, versus English-speakers, who completed the English version) performed differently on these items, and assessed reliability and validity. Using receiver operating characteristic (ROC) curves and linear regressions, we assessed the correspondence of FTND nicotine dependence with other measures. The following analyses were performed.

Dimensionality: One-dimensional and two-dimensional models of the FTND were compared using confirmatory IRT methods.

The model: The coherence of the manifest category of DSM-IV nicotine dependence was assessed by dividing the sample into subgroups based on DSM-IV nicotine dependence and testing the category-like versus dimension-like nature of these subgroups based on Dimcat.

Reliability: Internal consistency and measurement information of the FTND were assessed.

Validity: Item fit, concurrent validity, and cross-validity in different subsamples (based on DIF analyses) of the FTND were assessed.

Prediction: The correspondence between FTND nicotine dependence and (a) measures of smoking behavior and (b) presence of current MDE was assessed using ROC curves and linear regressions.

Participants
The present study represents analyses of a subset of data from a larger study examining smoking and depression in Chinese and non-Chinese American smokers. A convenience sample of 409 smokers who resided in Northern California was recruited by flyers and advertisements via radio, newspapers, television, and the internet. In order to be eligible for the study, participants must have smoked at least five cigarettes in the past 7 days and must have been able to read English or Chinese. Table 1 describes sample characteristics. The fact that this was a community sample makes it more representative of smokers in the general population than would be a treatment-seeking sample, as are most samples in the literature.
Procedure
After a brief telephone screening, eligible participants completed the assessments by telephone and mail, or in-person. Participants were assured of confidentiality, study procedures were explained, and informed consent was obtained. The interview and self-report measures were administered. After completing baseline assessments, all participants received $30 and a self-help manual on smoking cessation. Baseline data collection was completed from July 2000 to September 2001.
Measures
Although most of the measures used in this study (see below) are widely used and standardized, few have been used with a Chinese population. For those that have been used with this population, limited psychometric data have been reported. Therefore, the FTND was translated into Chinese, and existing translations of other measures were used, so they could be administered in either Chinese or English. All measures were translated and back-translated from English to Chinese and vice versa several times until both the translated and back-translated versions were consistent with one another. No translator was involved in more than one sequence of the translation process, so that a translator who back-translated the measures had no prior knowledge of the English version. After the translation process was complete, three focus groups with four to six participants in each group were conducted to review the English and Chinese versions of each measure. Questionnaires were evaluated item-by-item to determine the accuracy and understandability of the translations. Focus group members were either former or current smokers, were bilingual, and were paid $40 for participation.
Demographic questionnaire. General sociodemographic information was obtained.
Fagerstrom Test for Nicotine Dependence (FTND). Nicotine dependence was measured using this six-item questionnaire (Heatherton et al., 1991). The FTND was modified from the most commonly used nicotine dependence measure, the Fagerstrom Tolerance Questionnaire (FTQ; Fagerstrom, 1978), with little difference between measures other than improved psychometric properties for the FTND (Payne et al., 1994). The FTND was translated into Chinese for this study. One FTND item, "How many cigarettes do you smoke per day?", was not administered directly, because participants were asked to provide the total number of cigarettes smoked in the past 7 days. Participants’ number of cigarettes smoked per day was calculated from the self-reported number of cigarettes smoked in the past 7 days and was categorized into one of the four FTND response categories, corresponding to the score range from 0 to 3. The FTND has been used in the Chinese population in China, but limited information is available regarding the psychometric properties of the measure (Niu et al., 2000). The Chinese and English versions of the FTND used in the current study are included in the Appendix.
Composite International Diagnostic Interview (CIDI). The CIDI was administered in English or Chinese, depending on participants’ preferences, to assess past and current diagnosis of MDE and nicotine dependence (WHO, 1997). The computerized version, CIDI-Auto, version 2.1, which addresses both DSM-IV and ICD-10 criteria and can be self-administered or administered by an interviewer, was used (WHO, 1997). Comparisons between the computerized and paper-and-pencil CIDI were good to excellent (Peters & Andrews, 1995). Test-retest Kappa reliabilities were acceptable to excellent for most items, and acceptable validity of CIDI-Auto has been reported with comparison to consensual diagnoses of experts (Andrews & Peters, 1998; Wittchen, 1994). The Chinese version used in these previous studies was used.
Center for Epidemiological Studies--Depression Scale (CES-D). This standardized measure is commonly used to measure depression symptoms in community samples (Radloff, 1977). The CES-D has high internal consistency and adequate discriminability. A score of 16 on the CES-D has been used to indicate current depression. The instrument has been translated into Chinese and validated in Taiwan (Ying & Liese, 1990; Ying & Liese, 1991), the People’s Republic of China (Ying & Zhang, 1995), and the U.S. (Ying, 1988). The Chinese version used in these previous studies was used.

Results
FTND items 1 and 4 were originally polytomous items, with response options 0, 1, 2, and 3. Because of the sparseness of data in some of these response options, we dichotomized these items, assigning a score of 0 to 0 and a score of 1 to 1, 2, and 3. As a result, six dichotomous FTND items were analyzed. All IRT models were implemented in the IRT software Conquest (Wu, Adams, & Wilson, 1998).
Differential Item Functioning on Language
We estimated a differential item functioning (DIF) model to determine whether item location estimates for the six FTND items varied across two languages, English (n = 241) and Chinese (n = 168). Results revealed DIF based on language. The correlation between the English and Chinese location estimates was r = .77. We further inspected differences in those estimates for each item. Statistical significance of differences was judged against a criterion of two standard errors. This inspection revealed that the location estimates of FTND item 2 (1.15) and FTND item 5 (1.17) for English were significantly larger than those (0.48 and 0.40, respectively) for Chinese, whereas the location estimates of FTND item 4 (-0.34) and FTND item 6 (-0.06) for English were significantly smaller than those (0.21 and 0.81, respectively) for Chinese. These results suggested that FTND items 2 and 5 were relatively more difficult in the English version than in the Chinese version, whereas FTND items 4 and 6 were relatively easier in the English version than in the Chinese version. Given these results, the remaining analyses were performed separately in smokers who had taken an English or Chinese version of each measure.
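The item-by-item comparison can be sketched in Python as follows. The item locations are those reported above, but the standard errors are not given in the text, so `HYPOTHETICAL_SE` is a placeholder value chosen only for illustration. The specific form of the criterion (a difference larger than twice the standard error of the difference) is likewise an assumption about how "two standard errors" might be applied, not a description of the analysis actually run.

```python
import math

# Item locations (logits) reported in the text for the four items showing DIF.
english = {2: 1.15, 4: -0.34, 5: 1.17, 6: -0.06}
chinese = {2: 0.48, 4: 0.21, 5: 0.40, 6: 0.81}

# Placeholder standard error for every estimate: NOT a value from the article.
HYPOTHETICAL_SE = 0.15


def exceeds_two_se(b_first: float, b_second: float,
                   se_first: float, se_second: float) -> bool:
    """Flag a location difference larger than two standard errors of the
    difference (assumed form of the criterion)."""
    se_difference = math.sqrt(se_first ** 2 + se_second ** 2)
    return abs(b_first - b_second) > 2.0 * se_difference


# Positive differences mean the item is harder (located higher) in English.
differences = {item: english[item] - chinese[item] for item in english}

for item, difference in sorted(differences.items()):
    flagged = exceeds_two_se(english[item], chinese[item],
                             HYPOTHETICAL_SE, HYPOTHETICAL_SE)
    print(f"FTND item {item}: difference = {difference:+.2f}, flagged = {flagged}")
```

The signs of the differences reproduce the pattern described in the text: items 2 and 5 are located higher in English, and items 4 and 6 are located higher in Chinese.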
Dimensionality
We assessed the unidimensionality of the FTND. This assessment was conducted by comparing a one-dimensional model with a two-dimensional orthogonal model. Based on previous research (Radzius, Moolchan, Henningfield, Heishman, & Gallo, 2001), the two-dimensional model was formulated such that FTND items 1, 2, 4, and 6 loaded exclusively on one latent dimension and FTND items 3 and 5 loaded exclusively on the other latent dimension. Because the one- and two-dimensional models were not hierarchically related, the Akaike information criterion (AIC; Akaike, 1973) and the Bayesian information criterion (BIC; Schwarz, 1978) were employed to determine which model showed a better fit to the observed data. Smaller AIC and BIC values would indicate better fit. In the English version, AIC and BIC values of the one-dimensional model (1706.72 and 1738.81, respectively) were smaller than those of the two-dimensional model (1738.90 and 1779.01, respectively), indicating that the one-dimensional model showed a better fit. In the Chinese version, AIC and BIC values of the one-dimensional model (1158.43 and 1190.52, respectively) were nearly equivalent to those of the two-dimensional model (1148.45 and 1188.57, respectively), but the one-dimensional model was more parsimonious. Thus, in further analyses, we treated the FTND as unidimensional.
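The model comparison can be laid out in a short Python sketch using the AIC and BIC values reported above. The dictionary layout and the helper name `preferred_model` are illustrative only.

```python
# (AIC, BIC) values reported in the text for each sample and model.
fit = {
    "English": {"1D": (1706.72, 1738.81), "2D": (1738.90, 1779.01)},
    "Chinese": {"1D": (1158.43, 1190.52), "2D": (1148.45, 1188.57)},
}

AIC, BIC = 0, 1  # positions within each tuple


def preferred_model(models: dict, criterion: int) -> str:
    """Return the model name with the smaller value of the chosen criterion."""
    return min(models, key=lambda name: models[name][criterion])


for sample, models in fit.items():
    aic_gap = models["2D"][AIC] - models["1D"][AIC]
    bic_gap = models["2D"][BIC] - models["1D"][BIC]
    print(
        f"{sample}: AIC prefers {preferred_model(models, AIC)} "
        f"(2D - 1D = {aic_gap:+.2f}), "
        f"BIC prefers {preferred_model(models, BIC)} "
        f"(2D - 1D = {bic_gap:+.2f})"
    )
```

In the English sample both criteria favor the one-dimensional model by a wide margin. In the Chinese sample the two-dimensional values are slightly smaller, so the choice of the one-dimensional model there rests on the parsimony argument given in the text rather than on the criteria alone.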
The Model
Item locations. Item locations for the six FTND items were estimated by applying the Rasch model (Rasch, 1960/1980) to the English sample (n = 241) and Chinese sample (n = 168). This calibration resulted in item locations that can be displayed in an item analysis (Table 2) and in item and person locations that can be displayed in location maps (Figures 1 and 2). Column 1 of the location maps contains FTND raw scores. Column 2 contains nicotine dependence scaled scores as measured in logits (or log-odds), that is, the natural logarithm of the estimated probability of endorsing the item divided by the estimated probability of not endorsing the item, or $\ln(p / q) = \ln(p / [1 - p])$, where $q = 1 - p$ and $p = e^{\theta - \beta} / [1 + e^{\theta - \beta}]$. Column 3 contains the distribution of person locations ($\theta$). Column 4 contains the distribution of item locations ($\beta$).
The average of the person distribution was set to 0 logits, allowing the item locations to vary freely. In the English version, the person distribution ranged from nearly -4.0 to 4.0 logits (corresponding to probabilities of endorsement from 0.02 to 0.98), whereas the item distribution ranged from -1.2 to 1.4 logits (corresponding to probabilities of endorsement from 0.23 to 0.80 for a person at 0.0 logits). In the Chinese version, the person distribution also ranged from nearly -4.0 to 4.0 logits, whereas the item distribution ranged from -0.6 to 1.6 logits (corresponding to probabilities of endorsement from 0.35 to 0.83 for a person at 0.0 logits). The score equivalence index (Table 3) allows one to estimate person locations (i.e., scaled scores) based on raw scores without recalibrating the instrument.
Dimcat. We assessed whether DSM-IV nicotine dependence (DSM-IV+) was qualitatively distinct from the absence of DSM-IV nicotine dependence (DSM-IV-) by examining location equivalence between the two groups. After dividing the English-speaking sample into DSM-IV+ (n = 152) and DSM-IV- (n = 89) groups, we estimated item locations for each group based on the Rasch model. Figure 3 shows a plot of the location parameter estimates of the six FTND items in DSM-IV+ versus DSM-IV- groups. This plot demonstrates a linear relation in locations between the two groups, which was further indicated by its large correlation (r = .99). Based on this finding, we concluded that DSM-IV nicotine dependence differed only quantitatively, not qualitatively, from its absence in the English-speaking sample.
Similarly, we divided the Chinese-speaking sample into DSM-IV+ (n = 80) and DSM-IV- (n = 87) groups and estimated item locations for each group based on the Rasch model. The linear relation in locations between the two groups was suggested by the plot (Figure 3) and by the large correlation (r = .93). Based on this finding, we concluded that DSM-IV nicotine dependence differed only quantitatively from its absence in the Chinese-speaking sample.
We further examined whether this quantitative difference was smooth or abrupt. A difference between the means of the two groups that was larger than 2.20 standard deviations would yield bimodality in the joint distribution, indicating abrupt latent differences (De Boeck et al., 2003). Using weighted likelihood estimates (Warm, 1989) as person location estimates, we obtained the standard deviation in the English-speaking sample (SD = 1.56) and the means of DSM-IV+ (M = 0.29) and DSM-IV- (M = -0.45) groups. Computed from these values, the difference between the means of the two groups was 0.74, which was smaller than 2.20. Similarly, we obtained the standard deviation in the Chinese-speaking sample (SD = 1.53) and the means of DSM-IV+ (M = 0.11) and DSM-IV- (M = 0.07) groups. Based on these values, the difference between the means of the two groups was 0.03, which was smaller than 2.20. Therefore, we concluded that the latent differences on the FTND in both samples were not only quantitative but also smooth.
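The smooth-versus-abrupt check can be reproduced from the reported means and standard deviations with a short Python sketch. Expressing the mean difference in units of the sample standard deviation is an assumption about how the 2.20 criterion is applied; the raw differences in logits are shown alongside. The names below are illustrative, and because the inputs are the rounded values printed in the text, the Chinese difference comes out as 0.04 rather than the reported 0.03.

```python
# Criterion from the text: a mean difference above 2.20 standard deviations
# would imply a bimodal joint distribution (abrupt differences).
BIMODALITY_THRESHOLD_SD = 2.20

# (mean of DSM-IV+ group, mean of DSM-IV- group, sample SD), as reported.
samples = {
    "English": (0.29, -0.45, 1.56),
    "Chinese": (0.11, 0.07, 1.53),
}


def standardized_difference(mean_pos: float, mean_neg: float, sd: float) -> float:
    """Difference between group means expressed in standard deviation units."""
    return (mean_pos - mean_neg) / sd


results = {}
for sample, (mean_pos, mean_neg, sd) in samples.items():
    raw = mean_pos - mean_neg
    standardized = standardized_difference(mean_pos, mean_neg, sd)
    abrupt = standardized > BIMODALITY_THRESHOLD_SD
    results[sample] = (raw, standardized, abrupt)
    print(f"{sample}: raw = {raw:.2f} logits, "
          f"standardized = {standardized:.2f} SD, abrupt = {abrupt}")
```

Under either reading (raw logits or standard deviation units), both samples fall well below 2.20, which is consistent with the conclusion that the differences are smooth.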
We also obtained the variance of persons for each group as an indicator of within-group homogeneity. The variance of English-speaking persons for the DSM-IV+ group was 2.42, and that for DSM-IV- group was 2.17. The variance of Chinese-speaking persons for the DSM-IV+ group was 2.42, and that for DSM-IV- group was 2.26. Acceptably high variances indicate within-group heterogeneity, whereas very low variances indicate within-group homogeneity. Values such as those obtained can be regarded as acceptably high. Therefore, we concluded that the DSM-IV+ and DSM-IV- groups in both samples were heterogeneous with respect to person locations on the FTND. Consequently, the manifest category of DSM-IV nicotine dependence in both English-speaking and Chinese-speaking Americans can be regarded as thoroughly dimension-like.
Reliability
Internal consistency. The internal-consistency reliabilities of items were calculated for the entire English-speaking and the entire Chinese-speaking samples using separation reliability, an internal-consistency metric that is interpreted in the same way as Cronbach’s alpha. The separation reliability of items in the entire English-speaking sample was .96. The separation reliability of items in the entire Chinese-speaking sample was .90. For a brief measure such as the FTND, these reliabilities were excellent.
Measurement information. Under the Rasch model, standard error of measurement describes expected score fluctuations in estimated person location due to error. Computed from weighted likelihood estimates, average standard errors of measurement for the English-speaking smokers (M = .02, SD = 1.56) and for the Chinese-speaking smokers (M = .09, SD = 1.53) were very low, indicating that the model fit these smokers well. (Because standard error of measurement cannot be lower than zero, the large standard deviation values indicated skewed distributions.)
Validity
Item fit. The item fit statistic is an indicator of how well the item fits the model. Calculated as the sum of squared residuals over persons for any one item (Wright & Masters, 1982), the index of item fit that we used was the weighted infit meansquare (Wu et al., 1998), which has an expected value of 1.0. It is possible for responses to an item to contradict the model either by being too orderly (thus denying the probabilistic nature of the model), as indicated by a weighted infit meansquare lower than 0.75, or by being too random, as indicated by a weighted infit meansquare greater than 1.33 (Wright & Masters, 1982). All of the FTND items fit quite well, ranging from 0.96 to 1.19 in the English-speaking sample and from 0.85 to 1.14 in the Chinese-speaking sample.
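For a dichotomous item, the weighted infit meansquare is commonly computed as the sum of squared residuals divided by the sum of the model variances. The Python sketch below follows that definition. The responses and person locations are invented purely for illustration, and the function names are not those of Conquest or any other package.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Rasch probability of endorsing an item located at beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def infit_meansquare(responses, thetas, beta: float) -> float:
    """Information-weighted meansquare for one dichotomous item.

    responses: 0/1 answers to the item, one per person
    thetas:    person locations in logits, in the same order
    """
    squared_residuals = 0.0
    model_variance = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, beta)
        squared_residuals += (x - p) ** 2
        model_variance += p * (1.0 - p)
    return squared_residuals / model_variance


def within_fit_bounds(value: float, lower: float = 0.75, upper: float = 1.33) -> bool:
    """Apply the 0.75 to 1.33 acceptance range cited in the text."""
    return lower <= value <= upper


# Invented example data (not from the study).
example_responses = [0, 0, 1, 0, 1, 1, 1, 0]
example_thetas = [-2.0, -1.0, -0.5, 0.0, 0.3, 1.0, 2.0, 0.8]

value = infit_meansquare(example_responses, example_thetas, beta=0.2)
print(round(value, 2), within_fit_bounds(value))
```

Values well below 1.0 arise when responses follow the model more deterministically than expected, and values well above 1.0 arise when responses are noisier than expected, matching the "too orderly" and "too random" interpretations in the text.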
Concurrent validation. We assessed relations of FTND person locations with two diagnoses: DSM-IV lifetime nicotine dependence and DSM-IV current MDE. For this assessment, we generated receiver operating characteristic (ROC) curves (Metz, 1978; Swets, 1988) using SAS. The ROC curve is a graphical method for showing the relation between the true-positive fraction (sensitivity) and the true-negative fraction (specificity). The larger the area under the curve, the more accurate the diagnostic test. A value of 0.50 for the area under the ROC curve would indicate that the FTND had no power to discriminate positive from negative cases, whereas a value of 1.00 would indicate that the FTND was a perfect discriminator.
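The area under the ROC curve equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes it that way. It is not the SAS procedure used in the study, and the scores are invented for illustration.

```python
def area_under_roc(positive_scores, negative_scores) -> float:
    """Area under the ROC curve via pairwise comparison of cases.

    Each (positive, negative) pair contributes 1 if the positive case has
    the higher score, 0.5 if the scores tie, and 0 otherwise.
    """
    concordant = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                concordant += 1.0
            elif pos == neg:
                concordant += 0.5
    return concordant / (len(positive_scores) * len(negative_scores))


# Invented person locations (logits) for diagnosed and undiagnosed cases.
example_positive = [0.8, 0.1, 1.5, -0.3]
example_negative = [-0.5, 0.3, -1.2, 0.0]

print(area_under_roc(example_positive, example_negative))
```

Identical score distributions in the two groups give 0.50 and complete separation gives 1.00, the two reference points described in the text.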
FTND person locations were defined as weighted likelihood estimates, as above. Figure 4 shows the ROC curve for FTND person locations ($\theta$) in assessing diagnosed DSM-IV nicotine dependence. The area under the curve was 0.64 in the English-speaking sample and 0.52 for the Chinese-speaking sample. Each sample (English- and Chinese-speaking) was further broken down into three groups: smokers who had never been depressed, smokers who had been depressed before but not in the past year, and smokers who had been depressed in the past year. We refer to the latter as having current MDE and to the former two as not having current MDE. Figure 5 shows the ROC curve for FTND person locations ($\theta$) in assessing diagnosed DSM-IV current MDE. The area under the curve was 0.65 for the English-speaking sample and 0.61 for the Chinese-speaking sample. The former value is somewhat higher than the value of 0.52 found by Breslau and Johnson (2000), who used FTND raw scores rather than IRT-estimated scaled scores to predict lifetime MDE rather than current MDE. Our results indicated that the FTND had some discriminatory power in the English-speaking sample for assessing DSM-IV nicotine dependence and in both samples for assessing DSM-IV current MDE.
Cross-validation. We assessed DIF on the FTND. Specifically, we inspected whether item location estimates for the six FTND items were variant across (a) males versus females and (b) MDE presence versus absence. For MDE, we conducted three DIF analyses: one for all three groups, another for never versus remitted and current depressed groups, and one for never and remitted versus current depressed groups. No DIF was found across any of these groups in either sample.
Prediction
We assessed the ability of the FTND to predict current smoking behavior and depression. We used weighted likelihood estimates as latent FTND scaled scores. Current smoking behavior consisted of (a) average cigarettes per day, (b) years of smoking, and (c) age when started smoking regularly. Depression was measured by CES-D raw scores. We performed four univariate regression analyses for each language using these outcome variables, with all values statistically significant at p < .01 unless otherwise noted (cf. Table 3).
The regression analyses in the English-speaking sample revealed that there were linear relations between the FTND and daily cigarettes (F[1,239] = 125.41), years of smoking (F[1,239] = 32.63), age of smoking (F[1,239] = 10.34), and depression (F[1,239] = 16.04). As the regression lines showed, an increase of one logit in FTND scaled score was associated with an increase of 0.59 cigarettes per day, with an increase of 0.35 years of smoking, with a decrease of 0.20 years in age of beginning smoking, and with an increase of 0.25 in CES-D raw score. FTND scaled scores explained 34% of the variance in daily cigarettes, 12% of the variance in years of smoking, 4% of the variance in age of smoking, and 6% of the variance in depressive symptoms.
Similarly, the regression analyses for the Chinese-speaking sample revealed linear relations between the FTND and daily cigarettes (F[1,165] = 139.46), years of smoking (F[1,165] = 20.80), and age of smoking (F[1,165] = 8.43), but no linear relation between the FTND and depressive symptoms (F[1,165] = 0.27, ns). As the regression lines showed, an increase of one logit in FTND scaled score was associated with an increase of 0.68 cigarettes per day, with an increase of 0.36 years of smoking, and with a decrease of 0.22 years in age of beginning smoking. FTND scaled scores explained 46% of the total variance of daily cigarettes, 11% of that of years of smoking, and 5% of that of age of smoking.
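For a regression with a single predictor, the proportion of variance explained follows directly from the F statistic and its error degrees of freedom: R² = F / (F + df). The Python sketch below applies this identity to the F values reported above as a consistency check. The dictionary layout and function name are illustrative.

```python
def r_squared_from_f(f_value: float, df_error: int) -> float:
    """Variance explained in a one-predictor regression, from F(1, df_error)."""
    return f_value / (f_value + df_error)


# F statistics reported in the text, with their error degrees of freedom.
reported = {
    "English": (239, {
        "daily cigarettes": 125.41,
        "years of smoking": 32.63,
        "age of smoking": 10.34,
        "depression (CES-D)": 16.04,
    }),
    "Chinese": (165, {
        "daily cigarettes": 139.46,
        "years of smoking": 20.80,
        "age of smoking": 8.43,
    }),
}

for sample, (df_error, f_values) in reported.items():
    for outcome, f_value in f_values.items():
        r2 = r_squared_from_f(f_value, df_error)
        print(f"{sample}, {outcome}: R^2 = {r2:.2f}")
```

The recovered values agree with the percentages of variance stated in the text for both samples (for example, 34% and 46% for daily cigarettes).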

Discussion
This study contributes to an understanding of the relation between DSM-IV nicotine dependence and FTND nicotine dependence. It can now be more confidently stated that persons with diagnosed nicotine dependence differ quantitatively rather than qualitatively from persons without diagnosed nicotine dependence in terms of their locations on the latent dimension of physical dependence underlying the FTND. This finding contributes to the growing literature in the classification of psychopathology arguing that dimensional approaches to the conceptualization and diagnosis of mental disorders are superior to categorical approaches. It appears from these results that diagnosed nicotine dependence should be conceptualized and assessed using a dimensional approach. Also noteworthy are the cross-validation results showing that the FTND is robust to differences in gender and diagnosed major depression.
The latent dimension defined by the FTND was related to DSM-IV nicotine dependence, to smoking behaviors, and to major depression in the expected ways, and in some cases (e.g., for daily cigarettes) the relation was strong. Some have argued (e.g., Moolchan et al., 2002) that DSM-IV nicotine dependence has a stronger association with DSM-IV major depression, partly due to the way DSM-IV defines nicotine dependence by "counting" symptoms. From our findings, however, it seems that the FTND also has a strong association with DSM-IV major depression.
This study reports psychometric analyses of a new Chinese translation of the FTND and relations between Chinese-speaking and English-speaking Americans on the FTND and other measures. Despite the finding of differential item functioning based on language, results based on the English and Chinese measures were remarkably similar. One difference between the English and Chinese measures was that the English FTND was associated with depressive symptoms as measured by the CES-D, and the Chinese FTND was not. Based on ethnic Chinese participants in the current sample (including both English- and Chinese-speaking Chinese Americans), neither FTND score nor diagnosed nicotine dependence was found to be associated with CES-D scores (Tsoh, Lam, Delucchi, & Hall, in press). The difference between the English and Chinese versions of the FTND in their associations with CES-D scores could be due to differences between Chinese and non-Chinese smokers or due to the choice of language. This study, however, is unable to examine the reasons for such a difference, because a majority of the Chinese participants completed the Chinese FTND, whereas only a small proportion of the Chinese participants completed the English FTND. Nevertheless, the correlates of the two measures and the dimension-likeness of DSM-IV nicotine dependence were otherwise in good agreement. Thus, it appears that across languages the FTND demonstrates functional equivalence but not structural equivalence (Van de Vijver & Leung, 2001).
Limitations of this study include the fact that only a streamlined version of Dimcat was applied. Specifically, a one-parameter IRT model with location as the only item parameter was assessed. The second item parameter, discrimination, was held constant because a one-parameter model is simpler and has other desirable measurement properties (e.g., Acton, 2003; Perline et al., 1979; Wright, 1977). Nevertheless, it may be desirable for future studies to explore the effect of varying item discrimination on the FTND, both in the full sample and in subgroups who are versus are not diagnosed with nicotine dependence. In addition, smooth versus abrupt differences were assessed using IRT, but other methods could also have been used, such as taxometric methods (e.g., Meehl, 1995), and future studies of diagnosed nicotine dependence might make use of these methods. In addition, findings apply only to current smokers; the study did not include never smokers or former smokers.
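The distinction drawn above between a one-parameter model and a model with varying discrimination can be made concrete with a small sketch. It uses the dichotomous logistic form for simplicity (the FTND itself includes polytomous items), and the function name `irt_prob` and the parameter values are hypothetical choices for illustration, not estimates from this study.

```python
import math

def irt_prob(theta, location, discrimination=1.0):
    """Probability of endorsing a dichotomous item.

    With discrimination fixed at 1.0 this is the one-parameter (Rasch) model;
    letting discrimination vary by item gives the two-parameter model.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - location)))

# Hypothetical item location of 0.5 logits.
location = 0.5

# At theta equal to the item location, both models give a probability of 0.5.
p_rasch_at_location = irt_prob(0.5, location)
p_2pl_at_location = irt_prob(0.5, location, discrimination=2.0)

# One logit above the location, the more discriminating item rises faster.
p_rasch_above = irt_prob(1.5, location)
p_2pl_above = irt_prob(1.5, location, discrimination=2.0)
```

Holding discrimination constant means that item response curves differ only by a horizontal shift and never cross, which is the property that makes locations directly comparable across items.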
Strengths of this study include the following. First, this is among the first studies to have applied the IRT-based conceptual and methodological approach called Dimcat (De Boeck et al., 2003). Second, this is the first study of which we are aware to investigate whether diagnosed nicotine dependence is category-like or dimension-like. Third, this is the first study of which we are aware to have applied IRT to the FTND. Fourth, most studies include samples of treatment-seeking smokers, limiting generalizations to smokers who are trying to quit, but this study’s use of a community sample of smokers makes it easier to generalize to the entire population of smokers.
Assessment of nicotine dependence should play a vital role in smoking cessation treatment, as Kassel and Yates (2002) have persuasively argued. Our results are consistent with those of other investigators in finding that persons with diagnosed nicotine dependence are a heterogeneous lot. Therefore, to match them with appropriately informed and tailored treatment requires valid assessment techniques. Our results lend significant evidence to the contention that the FTND is a reliable, valid, and important technique for assessing nicotine dependence. Given that cigarette smoking remains the leading preventable cause of death in the United States, the assessment and treatment of cigarette smoking should remain a vital pragmatic concern for treatment providers in the years to come.

References
Acton, G. S. (2003). What is good about Rasch measurement? Rasch Measurement Transactions, 16, 902-903.
Acton, G. S., Prochaska, J. J., Kaplan, A. S., Small, T., & Hall, S. M. (2001). Depression and stages of change for smoking in psychiatric outpatients. Addictive Behaviors, 26, 621-631.
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second international symposium on information theory (pp. 267-281). Budapest, Hungary: Akademiai Kiado.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Andrews, G., & Peters, L. (1998). The CIDI-Auto: A computerised diagnostic interview for psychiatry [Web page]. URL http://www.unsw.edu.au/clients/crufad/cidi/discuss.httm [Accessed: 1998, December 21].
Breslau, N., & Johnson, E. O. (2000). Predicting smoking cessation and major depression in nicotine-dependent smokers. American Journal of Public Health, 90, 1122-1127.
Covey, L. S., Glassman, A. H., & Stetner, F. (1998). Cigarette smoking and major depression. Journal of Addictive Diseases, 17, 35-46.
De Boeck, P., Wilson, M., & Acton, G. S. (2003). A conceptual and psychometric framework for distinguishing categories and dimensions. Revised and resubmitted. Psychological Review.
Dijkstra, A., & Tromp, D. (2002). Is the FTND a measure of physical as well as psychological tobacco dependence? Journal of Substance Abuse Treatment, 23, 367-374.
Embretson, S. E. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93, 179-197.
Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8, 341-349.
Etter, J. F., Vu Duc, T., & Perneger, T. V. (1999). Validity of the Fagerstrom Test for Nicotine Dependence and of the Heaviness of Smoking Index among relatively light smokers. Addiction, 94, 269-281.
Fagerstrom, K. O. (1978). Measuring degree of physical dependence to tobacco smoking with reference to individualization of treatment. Addictive Behaviors, 3, 235-241.
Heatherton, T. F., Kozlowski, L. T., Frecker, R. C., & Fagerstrom, K. O. (1991). The Fagerstrom Test for Nicotine Dependence: A revision of the Fagerstrom Tolerance Questionnaire. British Journal of Addiction, 86, 1119-1127.
Kassel, J. D., & Yates, M. (2002). Is there a role for assessment in smoking cessation treatment? Behaviour Research and Therapy, 40, 1457-1470.
Kozlowski, L. T., Porter, C. Q., Pope, M. A., & Heatherton, T. (1994). Predicting smoking cessation with self-reported measures of nicotine dependence: FTQ, FTND, and HSI. Drug and Alcohol Dependence, 34, 211-216.
Krueger, R. F., & Finger, M. S. (2001). Using item response theory to understand comorbidity among anxiety and unipolar mood disorders. Psychological Assessment, 13, 140-151.
Krueger, R. F., & Piasecki, T. M. (2002). Toward a dimensional and psychometrically informed approach to conceptualizing psychopathology. Behaviour Research and Therapy, 40, 485-499.
Meehl, P. E. (1995). Bootstraps taxometrics: Solving the classification problem in psychopathology. American Psychologist, 50, 266-275.
Metz, C. E. (1978). Basic principles of ROC analysis. Seminars in Nuclear Medicine, 8, 283-298.
Moolchan, E. T., Radzius, A., Epstein, D. H., Uhl, G., Gorelick, D. A., Cadet, J. L., & Henningfield, J. E. (2002). The Fagerstrom Test for Nicotine Dependence and the Diagnostic Interview Schedule: Do they diagnose the same smokers? Addictive Behaviors, 27, 101-113.
Niu, T., Chen, C., Ni, J., Wang, B., Fang, Z., Shao, H., & Xu, X. (2000). Nicotine dependence and its familial aggregation in Chinese. International Journal of Epidemiology, 29, 248-252.
Payne, T., Smith, P., McCracken, L., McSherry, W., & Antony, M. (1994). Assessing nicotine dependence: A comparison of the Fagerstrom Tolerance Questionnaire (FTQ) with the Fagerstrom Test for Nicotine Dependence (FTND) in a clinical sample. Addictive Behaviors, 19, 307-317.
Perline, R., Wright, B. D., & Wainer, H. (1979). The Rasch model as additive conjoint measurement. Applied Psychological Measurement, 3, 237-255.
Perng, R., Hsieh, W., Chen, Y., Lu, C., & Chiang, S. (1998). Randomized, double-blind, placebo-controlled study of transdermal nicotine patch for smoking cessation. Journal of the Formosan Medical Association, 97, 547-551.
Peters, L., & Andrews, G. (1995). A procedural validity study of the computerised version of the Composite International Diagnostic Interview. Psychological Medicine, 25, 1269-1280.
Pomerleau, C. S., Carton, S. M., Lutzke, M. L., Flessland, K. A., & Pomerleau, O. F. (1994). Reliability of the Fagerstrom Tolerance Questionnaire and the Fagerstrom Test for Nicotine Dependence. Addictive Behaviors, 19, 33-39.
Radloff, L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385-401.
Radzius, A., Moolchan, E. T., Henningfield, J. E., Heishman, S. J., & Gallo, J. J. (2001). A factor analysis of the Fagerstrom Tolerance Questionnaire. Addictive Behaviors, 26, 303-310.
Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests (expanded ed.). Chicago: The University of Chicago Press. (Original work published 1960).
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.
Swets, J. A. (1988). Measuring the accuracy of diagnostic systems. Science, 240, 1285-1293.
Takeuchi, D. T., Chung, R. C. Y., Lin, K. M., Shen, H. K., Kurasaki, K., Chun, C. A., & Sue, S. (1998). Lifetime and twelve-month prevalence rates of major depressive episodes and dysthymia among Chinese Americans in Los Angeles. American Journal of Psychiatry, 155, 1407-1414.
Tsoh, J. Y., Humfleet, G. L., Muñoz, R. F., Reus, V. I., Hartz, D. T., & Hall, S. M. (2000). Development of major depression after treatment for smoking cessation. American Journal of Psychiatry, 157, 368-374.
Tsoh, J. Y., Lam, J. N., Delucchi, K., & Hall, S. M. (in press). Smoking and depression in Chinese Americans. The American Journal of the Medical Sciences.
Van de Vijver, F. J. R., & Leung, K. (2001). Personality in cultural context: Methodological issues. Journal of Personality, 69, 1007-1031.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450.
Wittchen, H. U. (1994). Reliability and validity studies of the WHO Composite International Diagnostic Interview (CIDI): A critical review. Journal of Psychiatric Research, 28, 57-84.
World Health Organization. (1997). CIDI-Auto Version 2.1: Administrator's guide and reference. Sydney, Australia: Training and Reference Centre for WHO CIDI.
Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14, 97-116.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis: Rasch measurement. Chicago: MESA Press.
Wu, M. L., Adams, R. J., & Wilson, M. (1998). ACER Conquest: Generalized item response modelling software [computer software]. Melbourne, Australia: Australian Council for Educational Research.
Ying, Y.-W. (1988). Depressive symptomatology among Chinese-Americans as measured by the CES-D. Journal of Clinical Psychology, 44, 739-746.
Ying, Y.-W., & Liese, L. H. (1990). Initial adaptation of Taiwan foreign students to the United States: The impact of prearrival variables. American Journal of Community Psychology, 18, 825-845.
Ying, Y.-W., & Liese, L. H. (1991). Emotional well-being of Taiwan students in the U.S.: An examination of pre- to post-arrival differential. International Journal of Intercultural Relations, 15, 345-366.
Ying, Y.-W., & Zhang, X. (1995). Mental health in rural and urban Chinese families: The role of intergenerational personality discrepancy and family solidarity. Journal of Comparative Family Studies, 26, 233-246.

Last modified September 2003