Smoking derivatives

Adult Lifelines participants were asked whether they (had) smoked any tobacco.
Using the raw self-reported data from 1A questionnaire 1, researchers from the UMCG Department of Epidemiology calculated various derivative variables for smoking behavior (sections: lifestyle & environment and secondary & linked variables).
These derivative variables can be requested from Lifelines (data@lifelines.nl) or through the Lifelines catalogue.
Please note: an updated version of the smoking derivatives is available, which includes derivatives for more time points and an updated version of the baseline derivatives (see updated smoking derivatives).

The following derivative variables were calculated for ~152,300 adult participants at baseline:

  • Pack years
  • Pack years w/o cigars
  • Duration of smoking (in years)
  • Recent quitter
  • Recent starter
  • Smoking habits
  • Age at start of smoking
  • Age at end of smoking
  • Current smoker
  • Ever smoker
  • Ex-smoker
  • Current number of cigarettes/day
  • Current number of cigarillos/day
  • Current number of cigars/day
  • Current grams of pipe tobacco/day

Data cleaning

Before calculating the derivative variables, the researchers checked the questionnaire smoking variables for missing and inconsistent answers. These answers were checked and corrected by comparing the original questionnaire with the answer in the database. Additionally, inconsistent answers were corrected using the validation rules listed below. The remaining missing items and inconsistencies were corrected according to the general principle that participants tend to underestimate their smoking habits: if a choice had to be made between two answers, the answer indicating the largest smoking history was chosen.

SMK1 'Have you ever smoked for as long as a year?'

| IF | THEN |
| --- | --- |
| the answer is missing and the other items show that the participant has never smoked or has smoked for less than one year | the answer is 'NO' |
| the answer is missing and the other items show that the participant has smoked for longer than one year | the answer is 'YES' |
| the answer is 'NO' and the other items show that the participant has smoked for longer than one year | the answer is 'YES' |
| the answer is 'YES' and the other items show that the participant has never smoked or has smoked for less than one year | the answer is 'NO' |

SMK2 'How old were you when you started smoking?'

| IF | THEN |
| --- | --- |
| the answer is missing | fill in the youngest age given in SMK7A1-SMK7E4 |
| the answer is older than the youngest age given in SMK7A1-SMK7E4 | fill in the youngest age given in item 7* |
| the answer is younger than the youngest age given in SMK7A1-SMK7E4 | do not change SMK2 |

* Exception: if the youngest age in item 7 is 16 then do not change SMK2

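
The SMK2 rules above can be sketched in Python as follows. This is a minimal illustration, not the researchers' actual code: the function name `correct_smk2`, its arguments, and the use of `None` for a missing answer are assumptions made for the example.

```python
from typing import Optional, Sequence

# Age printed in the worked example of item 7 (see the exception above).
EXAMPLE_AGE = 16


def correct_smk2(smk2: Optional[int], item7_start_ages: Sequence[int]) -> Optional[int]:
    """Return the corrected starting age (SMK2).

    item7_start_ages: hypothetical list of the 'from age' values in
    SMK7A1-SMK7E4 for lines where the amount smoked is greater than zero.
    """
    if not item7_start_ages:
        # Nothing in item 7 to compare against: leave SMK2 untouched.
        return smk2
    youngest = min(item7_start_ages)
    if smk2 is None:
        # Missing answer: fill in the youngest age from item 7.
        return youngest
    if smk2 > youngest and youngest != EXAMPLE_AGE:
        # SMK2 is older than item 7 says, and item 7 is not the copied example.
        return youngest
    # SMK2 is younger than (or equal to) item 7, or the exception applies.
    return smk2
```

For instance, `correct_smk2(20, [18, 25])` yields 18, whereas `correct_smk2(20, [16, 30])` leaves the answer at 20 because of the age-16 exception.
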
SMK3 'Do you smoke now or have you been smoking in the last month?'

| IF | THEN |
| --- | --- |
| the answer is missing and the other items show that the participant is a current smoker | the answer is 'YES' |
| the answer is missing and the other items show that the participant is not a current smoker | the answer is 'NO' |
| the answer is 'NO' and the other items show that the participant is a current smoker | the answer is 'YES' |
| the answer is 'YES' and the other items show that the participant is not a current smoker and has not stopped recently (SMK6 not equal to current age) | the answer is 'NO' |
| the answer is 'YES' and the other items show that the participant is not a current smoker but has stopped recently (SMK6 is equal to current age) | the answer is 'YES' |

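
As a rough sketch, the SMK3 rules can be written as a single decision function. How "the other items show" current smoking was judged is not specified on this page, so it is passed in here as a precomputed boolean; the function name, argument names, and the encoding (`True`/`False`/`None` for 'YES'/'NO'/missing) are all assumptions for illustration.

```python
from typing import Optional


def correct_smk3(
    smk3: Optional[bool],
    current_by_other_items: bool,
    smk6: Optional[int],
    current_age: int,
) -> bool:
    """Return the corrected SMK3 answer (True = 'YES', False = 'NO')."""
    if current_by_other_items:
        # Missing, 'NO' or 'YES': the other items win, the answer is 'YES'.
        return True
    if smk3 is None or smk3 is False:
        # Missing or 'NO', and the other items agree: the answer is 'NO'.
        return False
    # 'YES' without current smoking: keep it only for recent quitters,
    # i.e. when the quitting age equals the current age.
    return smk6 is not None and smk6 == current_age
```
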
SMK4A-4D 'How many cigarettes/cigarillos/cigars/pipe tobacco do you smoke now on average per day?'

| IF | THEN |
| --- | --- |
| the participant is female and in SMK4A-4D she indicated that she is smoking cigars and in SMK7A1-7E4 she indicated that she is smoking cigarettes and the number of cigarettes in SMK7A1-7E equals the number of cigars in SMK4C | SMK4C should be 0 and SMK4A should be the number that was originally at SMK4C |

SMK5 'Did you quit smoking?'

| IF | THEN |
| --- | --- |
| the answer is missing and the other items show that the participant quit smoking and continued quitting smoking | the answer is 'YES' |
| the answer is missing and the other items show that the participant is a current smoker | the answer is 'NO' |
| the answer is 'NO' and the other items show that the participant quit smoking and continued quitting smoking | the answer is 'YES' |
| the answer is 'YES' and the other items show that the participant is a current smoker | the answer is 'NO' |

NB. Participants who quit but restarted smoking very often answered 'YES' to this question. These answers were recoded to 'NO' since this variable was intended to identify the ex-smokers. Quitting and restarting can be identified using the answers to SMK7A1-7E.

SMK6 'How old were you when you quit smoking?'

| IF | THEN |
| --- | --- |
| the answer is missing and the participant quit smoking and continued quitting smoking (i.e. SMK5='YES') | fill in the oldest age given in SMK7A1-7E |
| the answer is not missing, but the participant is a current smoker (i.e. SMK5='NO') | the answer should be deleted |
| the answer is younger than the oldest age given in SMK7A1-7E and the participant quit smoking and continued quitting smoking (i.e. SMK5='YES') | fill in the oldest age given in SMK7A1-7E |
| the answer is older than the oldest age given in SMK7A1-7E and the participant quit smoking and continued quitting smoking (i.e. SMK5='YES') | do not change SMK6 |

NB. The oldest age given in SMK7A1-7E only applies to the ages at which the number of cigarettes etcetera is greater than zero.

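
The SMK6 rules lend themselves to a similar sketch. It assumes that SMK5 has already been corrected and is passed as a boolean, that a missing age is `None`, and that the caller has already restricted the item 7 ages to lines with an amount greater than zero (as the NB requires). All names are hypothetical.

```python
from typing import Optional, Sequence


def correct_smk6(
    smk6: Optional[int],
    smk5_yes: bool,
    item7_end_ages: Sequence[int],
) -> Optional[int]:
    """Return the corrected quitting age (SMK6), or None if it should be empty."""
    if not smk5_yes:
        # SMK5='NO' (current smoker): any quitting age is deleted.
        return None
    if not item7_end_ages:
        # No usable ages in item 7: leave SMK6 as answered.
        return smk6
    oldest = max(item7_end_ages)
    if smk6 is None or smk6 < oldest:
        # Missing, or younger than item 7 indicates: take the oldest age.
        return oldest
    # SMK6 is older than (or equal to) the oldest age in item 7: keep it.
    return smk6
```

Taking the larger of the two ages is consistent with the general principle stated under "Data cleaning" that the answer indicating the largest smoking history is preferred.
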
SMK7A1-7E 'How much did you smoke up till now? From age .. till age.. I smoked … cigarettes/cigarillos/cigars/pipe tobacco per day.'

| IF | THEN |
| --- | --- |
| a participant filled in more than one kind of tobacco product in one line including cigarettes | the answer should be cigarettes |
| a participant filled in more than one kind of tobacco product in one line excluding cigarettes but including cigars | the answer should not be cigars |
| the participant did not fill in any kind of tobacco product | the answer should be cigarettes |
| the participant has clearly misinterpreted the question and filled in very odd values (e.g. the value 10 in all places) | delete all answers in SMK7A1-7E4 |
| the youngest age is less than 5 years older than the age given in SMK2 | replace the youngest age in SMK7A1-7E4 with the age given in SMK2 |
| the youngest age in SMK7A1-7E4 is 16 | replace the youngest age in SMK7A1-7E4 with the age given in SMK2 |
| the youngest age is more than 4 years different from the age given in SMK2 | do not change the youngest age in SMK7A1-7E4 |
| the participant quit smoking and continued quitting smoking and the oldest age is less than 5 years younger than the age given in SMK6 | replace the oldest age in SMK7A1-7E4 with the age given in SMK6 |
| the participant quit smoking and continued quitting smoking and the oldest age is more than 4 years different from the age given in SMK6 | do not change the oldest age in SMK7A1-7E4 |
| the participant only completed the first age in every line | fill in the second age on every line with the first age in the next line (if the participant did not complete the questions in chronological order then first determine the chronological order and fill the items based on this) |
| the participant is a current smoker but did not complete the last line in SMK7A1-7E4 (i.e. the second age is missing) | fill in the current age |
| the participant is a current smoker and the oldest age is less than 5 years different from the current age | replace the oldest age in SMK7A1-7E4 with the current age |
| the participant is a current smoker and the oldest age is more than 4 years different from the current age | do not change the oldest age in SMK7A1-7E4 |

NB. The youngest and oldest ages given in item 7 apply only to the ages at which the number of cigarettes etc. is greater than 0.
NB2. Some participants followed the example at item 7 and started item 7 at age 16, even if they started smoking earlier or later than 16 years of age. In this case the age given at SMK2 was regarded as the correct starting age.
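
Two of the item 7 rules are sketched below: aligning the youngest age with SMK2, and filling in missing "till" ages. This is an illustration under stated assumptions rather than the original cleaning code. Ages are taken to be whole years, "less than 5 years older" is read as a difference of 0 to 4 years, and each line of item 7 is reduced to a hypothetical `(from_age, till_age)` pair with `None` for a missing second age.

```python
from typing import List, Optional, Tuple

EXAMPLE_AGE = 16    # starting age printed in the example at item 7
MAX_GAP_YEARS = 5   # differences below this are treated as the same event

Period = Tuple[int, Optional[int]]  # (from age, till age) of one line


def reconcile_start_age(youngest_item7: int, smk2: Optional[int]) -> int:
    """Return the youngest age to use in item 7, given the answer to SMK2."""
    if smk2 is None:
        return youngest_item7
    if youngest_item7 == EXAMPLE_AGE:
        # Probably copied from the example: trust SMK2 instead.
        return smk2
    if 0 <= youngest_item7 - smk2 < MAX_GAP_YEARS:
        return smk2
    # Larger differences are left unchanged.
    return youngest_item7


def fill_missing_end_ages(
    periods: List[Period], current_age: int, is_current_smoker: bool
) -> List[Period]:
    """Put lines in chronological order and fill missing second ages."""
    ordered = sorted(periods, key=lambda p: p[0])
    filled: List[Period] = []
    for i, (start, end) in enumerate(ordered):
        if end is None:
            if i + 1 < len(ordered):
                # Use the first age of the next line.
                end = ordered[i + 1][0]
            elif is_current_smoker:
                # Last line of a current smoker: use the current age.
                end = current_age
        filled.append((start, end))
    return filled
```

A last line without a second age is left open for participants who are not current smokers, since the rules above do not say how to fill it.
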

Calculation of derivative variables

To enable calculation of the derivative variables, three additional variables were added:

  • pyok: This variable indicates whether the data given in SMK7A1-7E4 are complete, i.e. they cover the complete period from the starting age (SMK2) until the stopping age (SMK6) or, if SMK3=1, the current age, and the number of cigarettes etc. is never missing. In this case packyears can be calculated (0=no, 1=yes).
  • smokduurok: This variable indicates whether the ages given in SMK7A1-7E4 are complete and do not overlap, so that the duration of smoking can be calculated by simply summing the durations given in item 7 (0=no, 1=yes). Overlap in ages occurs if a participant smoked 2 or more kinds of tobacco products (e.g. 10 cigarettes/day and 1 cigar/day) during the same period. In that case packyears can be calculated, but the smoking duration cannot be obtained by simply summing the durations. In the present data release the smoking duration is not calculated in these cases.
  • netgestopt: This variable indicates if the participant has recently quit smoking. This is the case if SMK3 is 'YES', SMK4A-D are zero or missing, SMK5 is 'YES' and SMK6 is equal to the current age (1=yes).
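
The overlap check behind smokduurok and the recent-quitter flag netgestopt can be sketched as follows. The function names, the `(from_age, till_age)` representation of an item 7 line, and the treatment of an empty item 7 as "not calculable" are assumptions for this example; periods that merely touch (one ends at the age the next begins) are treated as non-overlapping.

```python
from typing import Optional, Sequence, Tuple

Period = Tuple[Optional[int], Optional[int]]  # (from age, till age)


def durations_summable(periods: Sequence[Period]) -> bool:
    """Sketch of smokduurok: all ages present and no overlapping periods."""
    if not periods or any(s is None or e is None for s, e in periods):
        return False
    ordered = sorted(periods)
    # Each period must start at or after the end of the previous one.
    return all(nxt[0] >= cur[1] for cur, nxt in zip(ordered, ordered[1:]))


def smoking_duration(periods: Sequence[Period]) -> Optional[int]:
    """Sum the period lengths; None when simple summing is not valid."""
    if not durations_summable(periods):
        return None
    return sum(end - start for start, end in periods)


def recently_quit(
    smk3_yes: bool,
    smk4_amounts: Sequence[Optional[float]],
    smk5_yes: bool,
    smk6: Optional[int],
    current_age: int,
) -> bool:
    """Sketch of netgestopt, following the four conditions listed above."""
    no_current_use = all(a is None or a == 0 for a in smk4_amounts)
    return bool(smk3_yes and no_current_use and smk5_yes and smk6 == current_age)
```

With two products smoked over overlapping ages, for example `[(18, 30), (25, 40)]`, `smoking_duration` returns `None`, mirroring the statement that the duration is not calculated in those cases.
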

Derivative variables

| Code | Variable name | Variable label | Additional information |
| --- | --- | --- | --- |
| packyears_calculable_adu_c_1 | pyok | pack years can be calculated | - |
| packyears_cumulative_adu_c_1 | py | packyears | cumulative smoking history: 1 packyear = 20 cigarettes per day for 1 year (or 10 cigarettes for 2 years, or 1 cigarette for 20 years). Cigars are regarded as 3 cigarettes. |
| packyears_nocigars_adu_c_1 | pywocig | packyears without cigars | Since cigars may add very much to the number of packyears, the total number of packyears without cigars is also calculated. |
| smoking_calculable_adu_c_1 | smokdurok | smoking duration can be calculated | - |
| smoking_duration_adu_c_1 | smokduration | duration of smoking in years | Total number of years smoking. |
| smoking_habit_adu_c_1 | smoking | smoking | Categorical variable with three categories: 0=never smoker, 1=ex smoker, 2=current smoker |
| smoking_startage_adu_c_1 | smkstart | age at start of smoking | - |
| smoking_endage_adu_c_1 | smkstop | age at end of smoking | - |
| current_smoker_adu_c_1 | currsmoker | current smoker | 0=no, 1=yes |
| ever_smoker_adu_c_1 | eversmoker | ever smoker | 0=no, 1=yes. Both current and ex smokers are regarded as ever smokers. |
| ex_smoker_adu_c_1 | exsmoker | ex smoker | 0=no, 1=yes |
| cigarettes_frequency_adu_c_1 | currentcigarettes | current number of cigarettes per day | - |
| cigarillos_frequency_adu_c_1 | currentcigarillos | current number of cigarillos per day | - |
| cigars_frequency_adu_c_1 | currentcigars | current number of cigars | - |
| pipetobacco_frequency_adu_c_1 | currentpipe | current grams of pipe tobacco per day | - |
| recent_quitter_adu_c_1 | recent_quit | recent quitter | Variable indicating if the participant has recently quit smoking. This is the case if SMK3 is 'YES', SMK4A-D are '0' or missing, SMK5='YES' and SMK6 is equal to the current age. (1=yes) |
| recent_starter_adu_c_1 | recent_start | recent starter | - |
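
To make the pack-year definition concrete, here is a minimal sketch of the arithmetic. It uses only the two conversions stated above (20 cigarettes per day for one year is 1 packyear; one cigar counts as 3 cigarettes). How cigarillos and pipe tobacco are converted is not given on this page, so they are left out. The function names and the `(from_age, till_age, cigarettes_per_day, cigars_per_day)` tuple layout are assumptions for the example.

```python
from typing import Sequence, Tuple

CIGARETTES_PER_PACK = 20   # 1 packyear = 20 cigarettes/day for 1 year
CIGARETTES_PER_CIGAR = 3   # cigars are regarded as 3 cigarettes

# (from age, till age, cigarettes per day, cigars per day)
SmokingPeriod = Tuple[int, int, float, float]


def pack_years(periods: Sequence[SmokingPeriod], include_cigars: bool = True) -> float:
    """Cumulative pack years; include_cigars=False mimics 'without cigars'."""
    total = 0.0
    for start, end, cigarettes, cigars in periods:
        per_day = cigarettes
        if include_cigars:
            per_day += CIGARETTES_PER_CIGAR * cigars
        total += per_day * (end - start) / CIGARETTES_PER_PACK
    return total


def smoking_habit(ever_smoker: bool, current_smoker: bool) -> int:
    """Three-level category: 0=never smoker, 1=ex smoker, 2=current smoker."""
    if current_smoker:
        return 2
    return 1 if ever_smoker else 0
```

For example, 10 cigarettes per day from age 18 to 38 gives `pack_years([(18, 38, 10, 0)])`, which is 10 × 20 / 20 = 10 packyears, matching the "10 cigarettes for 2 years = 1 packyear" equivalence in the table.
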

Papers using Lifelines smoking derivative data

  • Dijkstra, A.E., de Jong, K., Boezen, H.M., Kromhout, H., Vermeulen, R., Groen, H.J., Postma, D.S. and Vonk, J.M., 2014. Risk factors for chronic mucus hypersecretion in individuals with and without COPD: influence of smoking and job exposure on CMH. Occup Environ Med, 71(5), pp.346-352.
  • Vonk, J.M., Scholtens, S., Postma, D.S., Moffatt, M.F., Jarvis, D., Ramasamy, A., Wjst, M., Omenaas, E.R., Bouzigon, E., Demenais, F. and Nadif, R., 2017. Adult onset asthma and interaction between genes and active tobacco smoking: the GABRIEL consortium. PLoS One, 12(3).
  • de Vries, M., van der Plaat, D.A., Vonk, J.M. and Boezen, H.M., 2018. No association between DNA methylation and COPD in never and current smokers. BMJ open respiratory research, 5(1), p.e000282.
  • de Vries, M., Nedeljkovic, I., van der Plaat, D.A., Zhernakova, A., Lahousse, L., Brusselle, G.G., Amin, N., van Duijn, C.M., Vonk, J.M., Boezen, H.M. and BIOS Consortium, 2019. DNA methylation is associated with lung function in never smokers. Respiratory Research, 20(1), p.268.
smoking_derivatives.txt · Last modified: 2022/03/11 15:46 by laura