ISSN: 2378-315X BBIJ
Biometrics & Biostatistics International Journal
Research Article
Volume 3 Issue 3  2016
Cancer Risk from Exposure to Low to Moderate Level of Arsenic Using Meta-Analysis of Flexible Regression Models
Munni Begum^{1}*, John B Horowitz^{2} and Naim Al Mahi^{3}
^{1}Department of Mathematical Sciences, Ball State University, USA
^{2}Department of Economics, Ball State University, USA
^{3}Department of Environmental Health, University of Cincinnati, USA
Received: January 11, 2016  Published: February 23, 2016
*Corresponding author: Munni Begum, Department of Mathematical Sciences, Ball State University, Muncie, Indiana, USA, 47306, Email:
Citation: Begum M, Horowitz JB, Mahi NAl (2016) Cancer Risk from Exposure to Low to Moderate Level of Arsenic Using Meta-Analysis of Flexible Regression Models. Biom Biostat Int J 3(3): 00067. DOI: 10.15406/bbij.2016.03.00067
Abstract
Flexible regression models, such as fractional polynomial models and spline regression models, offer a rich class of models for linear and nonlinear dose-response relationships in epidemiology and clinical trials. In this paper, we consider first- and second-order fractional polynomials and spline regression models to estimate the combined trend coefficients from dose-response models for cancer risk and exposure to low to moderate dose arsenic. The combined relative risks of bladder cancer and lung cancer are predicted for a sequence of low to moderate dose levels of arsenic from each model. Best-fit fractional polynomial models generate nonsignificant relative risks of both bladder cancer and lung cancer for low to moderate arsenic doses in the range of 3 to 100 micrograms per liter. The synthesis of results suggests that there is no or minimal risk of both bladder and lung cancer from low-to-moderate dose arsenic exposure.
Keywords: Arsenic; Flexible regression models; Spline regression models; Cancer; Carcinogenicity
Introduction
Epidemiological studies on arsenic exposure through drinking water conducted in arsenic-endemic regions of the world provide clear evidence of cancer risks at high-dose levels. Fortunately, very few humans are exposed to high-dose levels. Much more common is exposure to low to moderate dose levels, where evidence of carcinogenicity is mostly inconclusive. For example, a meta-analysis conducted by Chu & Crawford-Brown [1] found a small but measurable increase in the risk of bladder cancer from arsenic exposure through drinking water at 10 ppb. These results are 10 times lower than those extrapolated by the NRC [2]. However, Brown [1] argues that problems with their methodology and analysis mean that their results may not be reliable. Mink et al. [3] replicated Chu & Crawford-Brown’s [1] results, finding a generally weak and statistically insignificant relationship between low-dose exposure to arsenic and bladder cancer. Likewise, Begum et al. [4] found a generally weak relationship between bladder and lung cancer and exposure to low-dose arsenic via drinking water. Other than the NRC [2], the combined results from these studies find no statistically significant dose-response relationship under the assumption of linear models for the logarithm of relative risks and levels of exposure to arsenic.
These meta-analysis studies on arsenic exposure and disease risk assume linear exposure-response models. The linearity assumption for the logarithm of relative risks and levels of exposure to arsenic is overly simplified and is not adequate to capture the local structure accurately. This article applies fractional polynomial and cubic spline regression models in order to capture the shapes of the exposure-response relationships between both bladder and lung cancer risk and exposure to low to moderate dose arsenic. We consider low to moderate dose levels with concentrations from near 0 to 300 µg/l. We also consider more recent studies on low to moderate dose exposure to arsenic and the risk of bladder and lung cancer. These flexible models are used to identify a combined general exposure-response relationship for the logarithm of relative risk and the levels of exposure to arsenic. The primary objective of this study is to predict overall risks of bladder and lung cancers by combining findings from systematically selected studies on these cancers under both linear and nonlinear modeling assumptions. We predict overall risks of bladder and lung cancers for a series of exposure levels from the best-fitting models.
Though flexible regression models have been used to combine results in other epidemiological studies, such as the study of alcohol consumption and all-cause mortality by Bagnardi et al. [5], this is the first application to low to moderate dose arsenic consumption and the risk of internal cancers. This article is organized as follows: section 2 considers the systematic review of the bladder and lung cancer studies, section 3 discusses the fractional polynomials and the spline regression models, section 4 explains the results, and section 5 contains our conclusion and discussion.
Background
Systematic reviews were carried out to select studies of bladder and lung cancer and exposure to low to moderate levels of arsenic through ingestion. Flowcharts of the step-by-step study selection procedure are presented in Figures 1 & 2.
We searched the Medline database with four arsenic search terms (arsenic, arsenite, arsenate, arsenicals) and eight bladder cancer search terms: bladder cancer, transitional cell carcinoma of the bladder, urothelial cancer, urinary tract cancer, bladder neoplasm, urinary bladder neoplasm or urinary bladder cancer. Using these search terms, we identified 273 studies published before November 24, 2014 (Figure 1). We screened titles/abstracts of 222 studies. Of these, we reviewed the full text for 68 studies that met our selection criteria (Figure 1). Of these, 12 bladder cancer studies [6-17] met all the inclusion criteria.
Figure 1: PRISMA Flow Diagram for Bladder Cancer Study Selection.
Inclusion criteria were set as checkpoints, which include:
 English language human study,
 Bladder cancer as the health outcome,
 Long-term exposure to arsenic through drinking water not above 300 µg/l,
 Prospective cohort or retrospective case-control studies conducted at low exposure levels,
 Population based study, and
 Relative risk estimate such as risk ratios or odds ratios with measures of variability or data that allowed for such calculations and available covariate information.
For lung cancer, we searched the Medline database using four arsenic keywords (arsenic, arsenite, arsenate, arsenicals) and seven lung cancer search terms: lung cancer, lung neoplasm, small cell lung carcinoma, non-small cell lung carcinoma, bronchioloalveolar carcinoma, bronchiectasis, and bronchorrhea. Using these terms, we identified 461 studies published before November 24, 2014 (Figure 2). We screened titles/abstracts of 342 studies. Of these, we reviewed full text articles for 32 studies. Of these, 11 lung cancer studies [17-27] met all the inclusion criteria.
Figure 2: PRISMA Flow Diagram for Lung Cancer Study Selection.
Of the studies that fit the inclusion criteria, several included the same study populations. For example, Ferreccio et al. [25] and Ferreccio et al. [22] are from the same case-control study in northern Chile. To avoid double counting, we only included Ferreccio et al. [22]. Smith et al. [26] use the Ferreccio et al. [22] data in their analysis. Therefore, we did not include Smith et al. [26].
Likewise, Steinmaus et al. [17] and Steinmaus et al. [27] analyze data from the same case-control study. To avoid double counting, we only included Steinmaus et al. [17] because it includes a broader range of exposure levels. Dropping these three studies leaves eight lung cancer studies in the meta-analysis.
Summary of the included bladder and lung cancer studies: Table 1 summarizes the twelve bladder cancer studies and Table 2 summarizes the eight lung cancer studies. These tables list the authors of each study, publication year, study design, outcome measure, exposure measure, and whether the analysis was adjusted for covariates. The outcome measure RR refers to relative risk or risk ratio, OR refers to odds ratio, and HR refers to hazard ratio. Two separate meta-analyses are conducted to generate combined dose-response relationships for the bladder and lung cancer studies.
Bladder Cancer Studies Description

| Study (Publication Yr) | Type of Study | Study Population | Outcome Measure | Exposure Measure | Analysis Adjusted for Covariates? |
| --- | --- | --- | --- | --- | --- |
| Bates et al. [6] | Case-control | 117 cases and 266 controls were considered. | OR | Two arsenic exposure indices (total cumulative exposure) and intake concentration were used as exposure measures. | Statistical analysis was adjusted for smoking. |
| Bates et al. [7] | Case-control | A total of 114 case-control pairs were considered. | OR | Exposure to arsenic was estimated from water samples collected from subjects’ current residence. | Statistical analysis was adjusted for covariates. |
| Chen et al. [8] | Case-control | 49 patients with newly diagnosed bladder cancer and 224 controls. | OR | Average exposure estimated from the village they lived in 30 years before and the average arsenic level in well water in that village in 1974 and 1976. | Statistical analysis was adjusted for smoking and other covariates. |
| Chen et al. [9] | Prospective (cohort) study | A cohort of 8086 subjects. | RR | Water samples from wells, collected from households. | Adjusted for smoking and other relevant covariates. |
| Chiou et al. [10] | Prospective (cohort) study | A cohort of 8102 subjects was considered. | RR | Well water samples were assayed to estimate arsenic concentrations to which study subjects were exposed. | Multivariate analysis was adjusted for smoking and other covariates. |
| Karagas et al. [11] | Case-control | 459 bladder cancer cases and 665 controls were considered. | OR | Exposure to arsenic was determined by analyzing toenail clipping samples using instrumental neutron activation analysis. | Adjusted for smoking and other relevant covariates. |
| Kurttio et al. [12] | Case-cohort | 61 bladder cancer cases, 49 kidney cancer cases and 275 subjects in the reference cohort were considered. | RR | Arsenic exposure was estimated for short and long latency periods and daily dose of arsenic was calculated from reported consumption of drinking water from wells. | Statistical analysis was adjusted for smoking and other covariates. |
| Kwong et al. [13] | Case-control | 832 cases of bladder cancer diagnosed from a population-based case-control study. | HR | Both toenail arsenic concentration and concentration from the drinking water were collected. | Adjusted for smoking and other relevant covariates. |
| Meliker et al. [14] | Case-control | 411 bladder cancer cases and 566 controls were considered. | OR | Lifetime exposure to arsenic was predicted using geostatistical modeling. | Statistical analysis was adjusted for smoking and other relevant covariates. |
| Michaud et al. [15] | Case-control | 331 bladder cancer cases and the same number of controls were considered. | OR | Individual exposure to arsenic was determined using toenail concentrations that served as a biomarker of arsenic concentration. | Adjusted for smoking and other relevant covariates. |
| Steinmaus et al. [16] | Case-control | 181 bladder cancer cases and 328 controls were considered. | OR | The highest single-year cumulative arsenic concentrations to which the subjects were exposed were estimated. | Statistical analysis was adjusted for smoking and duration of exposure to arsenic. |
| Steinmaus et al. [17] | Case-control | 232 bladder and 306 lung cancer cases and 640 controls were considered. | OR | Arsenic exposure was based on water quality measurements for the individual’s location. | Statistical analysis was adjusted for smoking and duration of exposure to arsenic. |

Table 1: Summary of Twelve Bladder Cancer Studies Selected for Meta-Analysis.
Lung Cancer Studies Description

| Study (Publication Yr) | Type of Study | Study Population | Outcome Measure | Exposure Measure | Analysis Adjusted for Covariates? |
| --- | --- | --- | --- | --- | --- |
| Chen et al. [18] | Follow-up study | A total of 2503 residents and 8088 residents in two arseniasis-endemic areas in Taiwan. | RR | Arsenic exposure was estimated as lifetime cumulative exposure. | Statistical analysis was adjusted for smoking and other covariates. |
| Chen et al. [19] | Follow-up study | 8086 subjects were followed for 11 years, out of which 6888 were included in the final analysis. | RR | Arsenic concentration was estimated using water samples collected from the wells used by the subjects. | Statistical analysis was adjusted for smoking and other covariates. |
| Dauphiné et al. [20] | Case-control | 196 lung cancer cases and 359 controls. | OR | Arsenic concentrations from records for community-supplied drinking water and for private wells. | Statistical analysis was adjusted for smoking and other covariates. |
| Garcia et al. [21] | Follow-up study | 3,932 American Indians who participated in the Strong Heart Study from 1989 to 1991 and were followed through 2008. | HR | Arsenic exposure measured as the sum of inorganic and methylated species in urine. | Statistical analysis was adjusted for smoking and other covariates. |
| Ferreccio et al. [22] | Case-control | 152 lung cancer subjects and 419 controls. | OR | Water quality records of municipal water companies. | Statistical analysis was adjusted for smoking and other covariates. |
| Heck et al. [23] | Case-control | A total of 223 lung cancer cases and 238 controls were considered. | OR | Arsenic exposure measures were estimated from toenail concentrations. | Relationship of smoking in addition to arsenic ingestion was investigated. |
| Mostafa et al. [24] | Case-referent | 3223 cases and 1588 unmatched case-referents. | OR | Arsenic exposure estimated by average concentrations for 64 districts. | Relationship of smoking in addition to arsenic ingestion was investigated. |
| Steinmaus et al. [17] | Case-control | 232 bladder and 306 lung cancer cases and 640 controls were considered. | OR | Arsenic exposure was based on water quality measurements for the individual’s location. | Statistical analysis was adjusted for smoking and duration of exposure to arsenic. |

Table 2: Summary of Eight Lung Cancer Studies Selected for Meta-Analysis.
Ingestion of arsenic through drinking water was considered as the exposure route for both bladder and lung cancer outcomes. The studies included in the meta-analysis reported exposure levels in various ranges and metrics. To address the multiple exposure metrics reported by some studies, such as cumulative exposure, average exposure, and highest known exposure, the exposure measure in each study, including toenail concentration, is converted to micrograms per liter (µg/l), which is the most homogeneous metric across the studies.
We consider low to moderate exposure levels as 0-300 µg/l for both bladder cancer and lung cancer studies. In some studies the lower limit, the upper limit, or both are left open. For an open-ended lower limit, we assume that the lower limit is zero. The exposure midpoint is calculated by taking the average of the lower and upper limits of each range, except for ranges with an open-ended upper limit. For an open-ended upper limit, the midpoint is calculated as 1.2 times the lower bound of the open-ended category. The reference midpoint is subtracted from these midpoints, and the differences are used as the doses in subsequent regression models.
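The midpoint rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the authors' code: the function names and the example exposure categories are hypothetical, and the first category is assumed to be the reference.

```python
def exposure_midpoint(lower, upper):
    """Midpoint of an exposure category in µg/l.

    lower=None marks an open-ended lower limit (treated as 0);
    upper=None marks an open-ended upper limit (midpoint = 1.2 * lower bound).
    """
    if lower is None:
        lower = 0.0
    if upper is None:
        return 1.2 * lower
    return (lower + upper) / 2.0


def doses_relative_to_reference(categories, reference_index=0):
    """Subtract the reference-category midpoint from every midpoint."""
    midpoints = [exposure_midpoint(lo, hi) for lo, hi in categories]
    reference = midpoints[reference_index]
    return [m - reference for m in midpoints]


# Hypothetical categories (not taken from any included study):
# <10, 10-50, 50-100 and >100 µg/l
example_categories = [(None, 10.0), (10.0, 50.0), (50.0, 100.0), (100.0, None)]
example_doses = doses_relative_to_reference(example_categories)
```

For these hypothetical categories the midpoints are 5, 30, 75 and 120 µg/l, so the doses entering the regression are 0, 25, 70 and 115 µg/l.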
Methods
A meta-analysis for combining exposure-response relationships from observational studies is in general a difficult problem because the assumption of a common exposure-response relationship across studies is not realistic. Although the studies are systematically selected to ensure uniformity, the assumption of homogeneity seldom holds for observational studies in environmental epidemiology, public health, or other related fields. Even the studies selected under preset criteria are likely to have numerous differences, including study populations, exposure metrics, and outcome measures. Since ‘fixed-effects’ models assume homogeneity across studies, they are not suitable for combining exposure-response relationships from observational studies. ‘Random-effects’ models are more appropriate for combining exposure-response relationships when the relationships are similar even though their shape and magnitude vary across studies.
Methods for summarizing observational exposure-response studies quantitatively are well established in the literature [28,29]. A simple exposure-response model to estimate the trend effect assumes that the adjusted odds ratios are uncorrelated. Since the adjusted odds ratios are all calculated relative to the same reference category, this assumption is violated and the trend estimate becomes inefficient. An approximate variance-covariance matrix is therefore estimated from the fitted table of the exposure-response relationship [29]. This approximate variance-covariance matrix is then used in the weighted least squares estimation of the trend parameter. Trend parameter estimates obtained by this improved method are both consistent and efficient.
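As a rough illustration of the weighted least squares step, the sketch below estimates a linear trend through the origin from a vector of log relative risks and a covariance matrix that is assumed to have been approximated already. It does not implement the covariance approximation of [29]; the function name `gls_trend` and its arguments are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def gls_trend(doses, log_rr, cov):
    """Generalized least squares slope for log RR = beta * dose (no intercept).

    doses  : dose values of the non-reference categories
    log_rr : adjusted log relative risks for those categories
    cov    : approximate covariance matrix of log_rr (assumed given)
    Returns (beta_hat, standard_error).
    """
    x = np.asarray(doses, dtype=float)
    y = np.asarray(log_rr, dtype=float)
    weights = np.linalg.inv(np.asarray(cov, dtype=float))
    var_beta = 1.0 / (x @ weights @ x)        # variance of the slope
    beta_hat = var_beta * (x @ weights @ y)   # weighted slope estimate
    return float(beta_hat), float(np.sqrt(var_beta))
```

When the covariance matrix is diagonal this reduces to ordinary inverse-variance weighting; the off-diagonal terms are what account for the shared reference category.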
The efficient estimation of the trend effect in an exposure-response relationship also depends on the model under consideration. A simple linear exposure-response model is limited because it oversimplifies the exposure-response relation. Also, the exposure-response relationships across many studies addressing the same question may have different nonlinear shapes. Linear exposure-response models are not able to quantify the true relationship between exposure and response in these nonlinear cases. Thus, to encompass a wide range of exposure-response relationships, flexible models such as fractional polynomials (FP) and spline regression (SR) are preferable to linear models, as they provide a large family of models that can accommodate various shapes of exposure-response relations [5]. FP models are a family of models defined by power transformations of a continuous exposure variable. The values of the power are selected from a small number of predefined integers and non-integers [30]. A conventional linear model is a special case of FP models. SR models can come very close to nonparametric regression models, as the splines belong to a family of smooth functions.
A combined trend estimate of the exposure-response relationship is obtained by first estimating a study-specific functional form. In the study-specific analysis, flexible FP models and SR models are used to estimate such a relationship. The study-specific estimates obtained from the first-stage FP models or SR models are then combined through multivariate meta-analysis. FP and SR models provide a rich class of regression models for exposure-response relationships in epidemiology. However, these models are not implemented as widely as linear exposure-response models in epidemiology and other related fields. Bagnardi et al. [5] implemented FP and SR models for combining exposure-response results from alcohol consumption and all-cause mortality studies. In the following sections, we discuss the methodology for combining exposure-response relationships across observational studies using FP and SR models.
Combining exposure-response relationships using fractional polynomials
The log relative risk for study i is modeled using first- and second-order FPs in the study-specific analysis. Relative risk is a generic term that represents the risk ratio for cumulative incidence data in prospective cohort studies, and the odds ratio for case-control data in retrospective studies. The first- and second-order FP models for study i are presented as follows:
$\log\left(RR_{i} \mid X_{i}\right) = \begin{cases} \beta_{i} X_{i}^{p}, & \text{if } p \ne 0, \\ \beta_{i} \log X_{i}, & \text{if } p = 0, \end{cases} \quad i = 1, 2, \ldots, m.$
(1)
$\log\left(RR_{i} \mid X_{i}\right) = \begin{cases} \beta_{1i} X_{i}^{p_{1}} + \beta_{2i} X_{i}^{p_{2}}, & \text{if } p_{1} \ne 0,\ p_{2} \ne 0, \\ \left(\beta_{1i} + \beta_{2i}\right) \log X_{i}, & \text{if } p_{1} = p_{2} = 0, \end{cases} \quad i = 1, 2, \ldots, m.$
(2)
Here m = 12 for bladder cancer studies, m = 8 for lung cancer studies, and the powers p, p1, and p2 take values from a prespecified vector c = (−2, −1, −0.5, 0, 0.5, 1, 2, 3) as considered by Bagnardi et al. [5]. Such a power specification is flexible enough to encompass a wide variety of exposure-response shapes. With the prespecified power set c, one can fit eight first-order models and thirty-six second-order models with all possible combinations of exponents for p1 and p2, including repeated powers. When p1 = p2 ≠ 0, the second term is multiplied by log X, as in the model with p1 = p2 = 3 reported in the Results. The best-fit model is selected as the one that provides the highest likelihood for the data under that model. Other criteria for model selection are the deviance and the Akaike Information Criterion (AIC). For both of these criteria, smaller values indicate a better fit to the data. Both deviance and AIC are considered in selecting the best first-order and the best second-order fractional polynomial models.
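The model counts quoted above follow directly from the size of the power set, as the short Python sketch below shows. It is an illustration only: the names `POWERS`, `fp_term` and `aic` are hypothetical helpers, and the AIC is written in its generic form (twice the number of parameters minus twice the log-likelihood).

```python
import math
from itertools import combinations_with_replacement

# Power set c used for the fractional polynomial models
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """Fractional polynomial transform: x**p, with p = 0 meaning log(x)."""
    if x <= 0:
        raise ValueError("fractional polynomial terms require x > 0")
    return math.log(x) if p == 0 else x ** p


# One candidate model per power, and one per unordered pair (p1 <= p2),
# repeated powers included.
first_order = [(p,) for p in POWERS]
second_order = list(combinations_with_replacement(POWERS, 2))


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; smaller values indicate a better fit."""
    return 2 * n_params - 2 * log_likelihood
```

Here `len(first_order)` is 8 and `len(second_order)` is 36, since there are 8 × 9 / 2 unordered pairs of powers once repeats are allowed.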
The best-fit models are then applied to estimate the exposure-response relationship for each study included in the analysis. In order to efficiently estimate trends in dose-response relationships for each study, the correlation among the log relative risks is taken into account. Estimated trends in the dose-response relationship from each study are then combined according to the principles of multivariate random-effects meta-analysis to obtain a pooled functional relation [31]. The R package dosresmeta [32] was used to fit the fractional polynomial models to both the bladder and lung cancer studies.
Combining exposure-response relationships using spline regression models
Spline regression (SR) models for fitting exposure-response relationships are smoothly joined piecewise polynomials of order q. Each join point is known as a ‘spline knot’. It is crucial to select the spline knot positions properly. Usually, knot positions are selected based on how well the spline model with the selected knots fits the data. The shape of the exposure-response relationship also plays an important role in the knot selection process and may be used to select the number of knots effectively. A B-spline regression model with degree 2 and four knot positions, usually at the quantiles of the exposure level x, has 7 degrees of freedom. The B-spline regression model for the log relative risk $\log RR_{i}$ for the i^{th} study can be written as
$\log RR_{i} = \beta_{0i} + \beta_{1i} X_{i} + \beta_{2i} X_{i}^{2} + \beta_{3i} \left(X_{i} - k_{1}\right)_{+}^{2} + \beta_{4i} \left(X_{i} - k_{2}\right)_{+}^{2} + \beta_{5i} \left(X_{i} - k_{3}\right)_{+}^{2} + \beta_{6i} \left(X_{i} - k_{4}\right)_{+}^{2} + \epsilon_{i},$
where the truncated power basis function $\left(X_{i} - k\right)_{+}^{2}$ is defined as
$\left(X_{i} - k\right)_{+}^{2} = \begin{cases} \left(X_{i} - k\right)^{2}, & \text{if } X_{i} > k, \\ 0, & \text{otherwise.} \end{cases}$
For degree = 3, the cubic spline regression model becomes
$\log RR_{i} = \beta_{0i} + \beta_{1i} X_{i} + \beta_{2i} X_{i}^{2} + \beta_{3i} X_{i}^{3} + \beta_{4i} \left(X_{i} - k_{1}\right)_{+}^{3} + \beta_{5i} \left(X_{i} - k_{2}\right)_{+}^{3} + \beta_{6i} \left(X_{i} - k_{3}\right)_{+}^{3} + \beta_{7i} \left(X_{i} - k_{4}\right)_{+}^{3} + \epsilon_{i},$
where the truncated power basis function $\left(X_{i} - k\right)_{+}^{3}$ is defined as
$\left(X_{i} - k\right)_{+}^{3} = \begin{cases} \left(X_{i} - k\right)^{3}, & \text{if } X_{i} > k, \\ 0, & \text{otherwise.} \end{cases}$
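The truncated power basis can be made concrete with a small Python sketch that builds one row of the design matrix for a given dose. The function names are hypothetical and the knot values in any call are placeholders, not knots used in the paper.

```python
def truncated_power(x, knot, degree):
    """(x - knot)_+^degree: zero at or below the knot."""
    return (x - knot) ** degree if x > knot else 0.0


def spline_design_row(x, knots, degree):
    """Row [1, x, ..., x^degree, (x - k1)_+^degree, ...] of the design matrix."""
    polynomial = [x ** d for d in range(degree + 1)]
    truncated = [truncated_power(x, k, degree) for k in knots]
    return polynomial + truncated
```

With four knots, a quadratic spline row has 3 + 4 = 7 entries and a cubic spline row has 4 + 4 = 8, matching the number of coefficients in the two models above. For example, `spline_design_row(3.0, [1.0, 2.0, 4.0, 5.0], 2)` gives `[1.0, 3.0, 9.0, 4.0, 1.0, 0.0, 0.0]`.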
Although SR models are promising for fitting study-specific flexible exposure-response relationships, all 12 bladder cancer and 8 lung cancer studies are extremely sparse, with only three to five data points each (Figures 8 & 9). With only one knot position, at the 50^{th} percentile, we were able to estimate the regression parameters but not their variance-covariance matrix. Thus it was not possible to combine the study-specific regression coefficients from the study-specific spline models. As a result, we do not include study-specific spline models in the multivariate meta-analysis for the bladder cancer studies or lung cancer studies. This means that only estimates of the coefficients from the fractional polynomial models are combined using the multivariate meta-analysis.
Multivariate meta-analysis to combine results from FP and SR models
To conduct the multivariate meta-analysis, we obtain for each study a $v$-dimensional vector of regression coefficient estimates $\widehat{\theta}_{i}$ and the associated $v \times v$ estimated variance-covariance matrix $S_{i}$. A random-effects multivariate meta-analysis model (Gasparrini [31]) can be written as follows:
$\widehat{\theta}_{i} \sim N_{v}\left(\theta, \Sigma_{i}\right)$
(3)
where $\Sigma_{i} = S_{i} + \psi$. The model in equation (3) is obtained from two independent components, within-study and between-study. In the within-study component, $\widehat{\theta}_{i} \sim N_{v}\left(\theta_{i}, S_{i}\right)$, a $v$-dimensional multivariate normal distribution centered at a vector of true unknown outcome parameters $\theta_{i}$ for study i. In the between-study component, $\theta_{i} \sim N_{v}\left(\theta, \psi\right)$, where $\psi$ represents the unknown between-study variance-covariance matrix. The unknown parameter vector $\theta$ represents the population-average parameters of the exposure-response relationship. Estimation of the parameter vector $\theta$ and the unknown variance-covariance matrix $\psi$ completes the multivariate meta-analysis with a random-effects model. The R package dosresmeta [32] is used to carry out the multivariate meta-analysis using first- and second-order fractional polynomial models for both the bladder and lung cancer studies. The combined exposure-response models are then used to predict the risk for bladder and lung cancer in the low to moderate exposure ranges of 0-100 µg/l and 0-300 µg/l.
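To make the pooling step in equation (3) concrete, the following Python sketch computes the generalized least squares estimate of $\theta$ for a given between-study matrix $\psi$. It is a simplified illustration, not the estimation routine of dosresmeta: $\psi$ is assumed known here, whereas in practice it is estimated from the data, and the function name `pooled_estimate` is hypothetical. NumPy is assumed to be available.

```python
import numpy as np


def pooled_estimate(theta_hats, s_matrices, psi):
    """GLS estimate of theta for a *given* between-study covariance psi.

    theta_hats : list of v-vectors of study-specific coefficient estimates
    s_matrices : list of v x v within-study covariance matrices S_i
    psi        : v x v between-study covariance matrix (assumed known here)
    Returns (theta, covariance matrix of theta).
    """
    psi = np.atleast_2d(np.asarray(psi, dtype=float))
    v = psi.shape[0]
    precision_sum = np.zeros((v, v))
    weighted_sum = np.zeros(v)
    for theta_i, s_i in zip(theta_hats, s_matrices):
        # Marginal covariance of study i: Sigma_i = S_i + psi
        sigma_i = np.atleast_2d(np.asarray(s_i, dtype=float)) + psi
        w_i = np.linalg.inv(sigma_i)
        precision_sum += w_i
        weighted_sum += w_i @ np.atleast_1d(np.asarray(theta_i, dtype=float))
    cov_theta = np.linalg.inv(precision_sum)
    return cov_theta @ weighted_sum, cov_theta
```

Setting $\psi$ to a zero matrix recovers the fixed-effects (inverse-variance) pooled estimate, which shows how the between-study component widens each study's covariance and flattens the weights.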
Results
From the twelve bladder cancer studies and the eight lung cancer studies, we separately fit the dose-response data to eight first-order and thirty-six second-order fractional polynomial models. As discussed in section 3.1, the number of first-order models and the number of second-order models follow from the choice of powers for the FP models.
To select the best models from each group, several goodness-of-fit statistics, including deviance and the Akaike Information Criterion (AIC), are calculated. Specifically, Figure 3 presents the AIC values for both first- and second-order fractional polynomial models for the data from the bladder cancer studies. Among the eight first-order fractional polynomial models, Figure 3 shows that the model $\log\left(RR \mid X\right) = \beta X^{3}$, where $p = 3$, has the lowest AIC. We refer to this model as Model 1. Among the second-order fractional polynomial models, Figure 3 shows that the models with the lowest AIC are $\log\left(RR \mid X\right) = \beta_{1} X^{2} + \beta_{2} X^{3}$, where $p_{1} = 2$ and $p_{2} = 3$, which we refer to as Model 2, and $\log\left(RR \mid X\right) = \beta_{1} X^{3} + \beta_{2} \left(X^{3} \log\left(X\right)\right)$, where $p_{1} = p_{2} = 3$, which we refer to as Model 3. We estimate the combined relative risks from Model 1, Model 2, and Model 3.
For lung cancer studies, we implement the same set of first- and second-order models, as these appear to be the best-fitted models. According to the goodness-of-fit statistics, deviance and AIC, the best-fitted models for the lung cancer studies are Model 1, Model 2 and Model 3, the same as for the bladder cancer studies. Estimated study-specific coefficients from these fractional polynomial models for the bladder cancer studies are combined using multivariate meta-analysis. Combined predicted relative risks at dose levels 0 to 100 µg/l and 0 to 300 µg/l from Model 1, Model 2, and Model 3 are presented in Figure 4 and Figure 5, respectively. Similarly, for lung cancer studies, Figure 6 and Figure 7 present combined predicted relative risks at dose levels 0 to 100 µg/l and 0 to 300 µg/l, respectively.
Figure 3: Goodness of fit statistics (AIC) for FP models for bladder cancer studies. Top: AIC plots for first-order fractional polynomial models with p = −2, −1, −0.5, 0, 0.5, 1, 2, 3; Bottom: AIC plots for second-order fractional polynomial models with p1, p2 = −2, −1, −0.5, 0, 0.5, 1, 2, 3.
Bladder cancer
As discussed in Section 3.3, the regression coefficients of Model 1, Model 2 and Model 3 are combined through multivariate meta-analysis methods. The combined estimated coefficients are used to estimate the relative risks for doses from 0 to 100 µg/l and to compute the corresponding 95% confidence intervals. These results are shown in Figure 4.
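For a first-order model such as Model 1, the step from a combined coefficient to a predicted relative risk and its 95% confidence interval can be sketched as below. This is a minimal illustration under stated assumptions: the function name `predicted_rr` is hypothetical, a Wald-type interval on the log scale is assumed, and any coefficient and standard error passed in are placeholders rather than estimates from this study.

```python
import math


def predicted_rr(dose, beta, se, power=3, z=1.96):
    """Relative risk and 95% CI for log RR = beta * dose**power (power != 0).

    beta, se : combined coefficient and its standard error (placeholders)
    Returns (rr, lower, upper).
    """
    term = dose ** power
    log_rr = beta * term
    half_width = z * se * abs(term)  # CI half-width on the log scale
    return (math.exp(log_rr),
            math.exp(log_rr - half_width),
            math.exp(log_rr + half_width))
```

Because the half-width scales with the transformed dose, the interval is degenerate at dose 0 and widens as the dose increases, which is the pattern of decreasing reliability at higher doses described for the fitted models.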
Figure 4: Bladder cancer studies: predicted relative risk for doses from 0 to 100 µg/l and 95% confidence intervals. Top Left: First-order fitted fractional polynomial model with p = 3 (Model 1); Top Right: Second-order fitted fractional polynomial model with p1 = 2, p2 = 3 (Model 2); Bottom Left: Second-order fitted fractional polynomial model with p1 = p2 = 3 (Model 3).
In each figure, the solid black line shows the predicted relative risk and the dashed lines show the corresponding 95% confidence intervals. The top left graph shows the results from Model 1, the top right graph shows the results from Model 2, and the bottom left graph shows the results from Model 3. For doses between 0 and 100 µg/l, Model 1 predicts relative risk close to 1 or lower. This implies that at dose levels between 0 and 100 µg/l, there is no or minimal risk of bladder cancer. Model 2 produces relative risk estimates that have a slight upward trend. However, since these relative risk estimates never exceed 1.05, the results indicate no or minimal risk of bladder cancer at doses between 0 and 100 µg/l. Model 3 finds similar low or no risk of bladder cancer for dose levels between 0 and 100 µg/l. For each model, as shown by the confidence intervals, the predicted relative risk estimates become less reliable as the dose level increases.
In Figure 5, we plot relative risk estimates from Model 1, Model 2 and Model 3 for doses between 0 and 300 µg/l. Relative risk estimates from Model 1 show no risk up to dose level 150 µg/l and suggest that exposure in this range may even be associated with a slightly reduced risk of bladder cancer. Model 2 shows lower risk at low-dose levels and slightly higher risk at dose levels of 150 µg/l or more. Model 3 predicts a higher relative risk at dose levels of 250 µg/l and above. However, none of these results are statistically significant. In addition, at higher dose levels each model predicts less reliable relative risk estimates.
Figure 5: Bladder cancer studies: predicted relative risk for doses from 0 to 300 µg/l and 95% confidence intervals. Top Left: First-order fitted fractional polynomial model with p = 3 (Model 1); Top Right: Second-order fitted fractional polynomial model with p1 = 2, p2 = 3 (Model 2); Bottom Left: Second-order fitted fractional polynomial model with p1 = p2 = 3 (Model 3).
Lung cancer
The combined predicted relative risks for lung cancer studies are presented in Figure 6 and Figure 7. Since the combined predicted relative risks from Model 1 and Model 2 are close to one, these models find no evidence of lung cancer risk at doses from 0 to 100 µg/l (Figure 6). The combined predicted relative risks from Model 3 show an upward trend, which implies some evidence of risk beginning at approximately 40 µg/l. However, the relative risk only increases to 1.1, which implies a relatively minor risk.
Figure 6: Lung cancer studies: predicted relative risk for doses from 0 to 100 µg/l. Top Left: First-order fitted fractional polynomial model with p = 3 (Model 1); Top Right: Second-order fitted fractional polynomial model with p1 = 2, p2 = 3 (Model 2); Bottom Left: Second-order fitted fractional polynomial model with p1 = p2 = 3 (Model 3).
Figure 7 shows the predicted doseresponse models for dose levels 0 to 300 µg/l for the same three models as in Figure 6. Below 150µg/l, Model 1 shows no indication of risk of lung cancer. After 150 µg/l, there is an increase in predicted relative risk. However, the results in all of the models are not statistically significant. Model 2 indicates no risk up to 300 µg/l. Model 3 shows an increasing risk after dose level 100 µg/l, which declines approximately after 230µg/l.
Figure 7: Lung cancer studies: predicted relative risk for doses from 0 to 300 µg/l. Top Left: First-order fitted fractional polynomial model with p = 3 (Model 1); Top Right: Second-order fitted fractional polynomial model with p1 = 2, p2 = 3 (Model 2); Bottom Left: Second-order fitted fractional polynomial model with p1 = p2 = 3 (Model 3).
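To make the link between fitted coefficients and the curves in Figures 5 to 7 concrete, the following is a minimal Python sketch of how a predicted relative risk and its 95% confidence interval could be computed from a fractional polynomial model. It assumes the log relative risk is modeled as b1*x^p1 + b2*x^p2, with the usual convention that a repeated power (p1 = p2 = p) uses the terms x^p and x^p*log(x), and that the reference dose is 0. The function names, the rescaling of dose, and the coefficient and covariance values are hypothetical illustrations, not the estimates reported in this article.

```python
import numpy as np

def fp_basis(dose, powers):
    """Fractional polynomial basis for positive powers.

    A power repeated from the previous term is multiplied by log(dose),
    so powers (3, 3) give the columns x^3 and x^3 * log(x).
    """
    dose = np.asarray(dose, dtype=float)
    # Use log(1) = 0 at dose 0 so that x^p * log(x) takes its limit value 0.
    safe_log = np.log(np.where(dose > 0, dose, 1.0))
    cols = []
    for j, p in enumerate(powers):
        col = dose ** p
        if j > 0 and p == powers[j - 1]:
            col = col * safe_log
        cols.append(col)
    return np.column_stack(cols)

def predict_rr(dose, powers, beta, cov, z=1.96):
    """Predicted relative risk with a Wald-type confidence interval."""
    X = fp_basis(dose, powers)
    log_rr = X @ beta
    # Standard error of each prediction: sqrt(x_i' Cov x_i).
    se = np.sqrt(np.einsum("ij,jk,ik->i", X, cov, X))
    return np.exp(log_rr), np.exp(log_rr - z * se), np.exp(log_rr + z * se)

# Hypothetical pooled estimates for a model with p1 = p2 = 3 (assumed values).
beta = np.array([0.10, -0.05])
cov = np.diag([0.01, 0.04])
doses = np.array([0.0, 50.0, 100.0, 300.0]) / 100.0  # rescaled dose (assumption)

rr, lower, upper = predict_rr(doses, (3, 3), beta, cov)
for d, r, lo_, hi_ in zip(doses * 100, rr, lower, upper):
    print(f"dose {d:5.0f} µg/l: RR = {r:.3f} ({lo_:.3f}, {hi_:.3f})")
```

A confidence interval that contains 1 at a given dose corresponds to a relative risk that is not statistically significant at that dose, which is the criterion used when reading the figures above.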
Discussion and Conclusion
This article applies fractional polynomial and spline regression models to determine the shapes of the dose-response relationships between bladder and lung cancer risk and exposure to low to moderate doses of arsenic. Our results are similar to those of Mink et al. [3], who found a generally weak and statistically insignificant relationship between low-dose exposure to arsenic and bladder cancer, and Begum et al. [4], who found a generally weak relationship between bladder and lung cancer and exposure to low-dose arsenic. We estimate overall risks of bladder and lung cancers by combining findings from systematically selected studies on these cancers under both linear and nonlinear modeling assumptions. We consider fractional polynomial models, which include the linear model as a special case, and spline regression models. Fractional polynomial models do not yield any statistically significant relative risks of bladder and lung cancer at low to moderate dose levels of arsenic exposure. These models predict no or minimal risk of bladder and lung cancer at low to moderate dose levels (0 to 150 µg/l). At higher dose levels, each model predicts less reliable relative risk estimates for bladder and lung cancer. Overall, we found a weak and statistically insignificant relationship between both bladder and lung cancer and low to moderate exposure to arsenic.
However, it is important to observe that both the bladder and lung cancer studies have only a few data points in each exposure-response data set (Figure 8 and Figure 9). Since sample size affects statistical significance, further investigation with a larger number of exposure-response data points is required to draw firm conclusions.
Figure 8: Scatter plots of exposure levels and log relative risks for twelve bladder cancer studies.
Figure 9: Scatter plots of exposure levels and log relative risks for eight lung cancer studies.
Spline regression models are promising for fitting study-specific flexible exposure-response relationships. Figure 8 and Figure 9 present the study-specific exposure-response relationships for the bladder and lung cancer studies, respectively. As these figures show, each study contributes only 3 to 5 data points, and the studies lack homogeneity in both the exposure metrics and the shape of the exposure-response relationship. Because of this sparseness, the study-specific spline models could not be fully computed. With a single knot at the 50^{th} percentile, we were able to estimate the regression parameters but not their variance-covariance matrix. It was therefore not possible to combine the regression coefficients from the study-specific spline models, and we do not include these models in the multivariate meta-analysis for either the bladder cancer or the lung cancer studies. Only the coefficient estimates from the fractional polynomial models are combined using the multivariate meta-analysis.
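The dependence of the pooling step on each study's variance-covariance matrix can be illustrated with a minimal Python sketch. It shows a fixed-effect multivariate combination, in which each study i supplies a coefficient vector b_i and a covariance matrix V_i, and the pooled estimate is (sum of V_i^-1)^-1 times the sum of V_i^-1 b_i. This is a simplification under stated assumptions: multivariate meta-analysis methods such as those of Gasparrini et al. can also include a between-study covariance component, which is omitted here, and the function name and input values are hypothetical.

```python
import numpy as np

def pool_fixed_effect(betas, covs):
    """Inverse-variance weighted pooling of study-specific coefficient vectors.

    betas: list of length-k coefficient vectors, one per study.
    covs:  list of k x k variance-covariance matrices, one per study.
    Returns the pooled coefficient vector and its covariance matrix.
    """
    k = len(betas[0])
    precision_sum = np.zeros((k, k))
    weighted_sum = np.zeros(k)
    for b, V in zip(betas, covs):
        W = np.linalg.inv(np.asarray(V, dtype=float))  # study weight = inverse covariance
        precision_sum += W
        weighted_sum += W @ np.asarray(b, dtype=float)
    pooled_cov = np.linalg.inv(precision_sum)
    return pooled_cov @ weighted_sum, pooled_cov

# Hypothetical coefficients from two studies fitted with the same two-term model.
betas = [np.array([0.02, -0.01]), np.array([0.06, 0.01])]
covs = [np.diag([0.010, 0.020]), np.diag([0.030, 0.060])]

pooled_beta, pooled_cov = pool_fixed_effect(betas, covs)
print("pooled coefficients:", pooled_beta)
print("pooled standard errors:", np.sqrt(np.diag(pooled_cov)))
```

Because every study enters only through the inverse of its covariance matrix, a study whose covariance matrix cannot be estimated has no defined weight, which is consistent with the exclusion of the study-specific spline models described above.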
Future studies investigating the association between exposure to low to moderate levels of arsenic and internal cancers can extend this work by including additional covariate information. For instance, smoking status could be included to determine its effect on the dose-response relationships. This work can also be extended by obtaining additional data on low and especially moderate dose arsenic exposure levels and internal cancers for finer analyses. Due to the computational limitations of spline regression models with sparse data, our results were limited to fractional polynomial models. This limitation could be overcome with additional data or with the development of methods for modeling sparse data.
References
 Chu HA, Crawford-Brown DJ (2006) Inorganic arsenic in drinking water and bladder cancer: A meta-analysis for dose-response assessment. Int J Environ Res Public Health 3(4): 316-322.
 Subcommittee to Update the 2001 Arsenic in Drinking Water Report Committee on Toxicology. Arsenic in Drinking Water 2001 Update. National Research Council, National Academy Press.
 Mink PJ, Alexander DD, Barraj LM, Kelsh MA, Tsuji JS (2008) Low-level arsenic exposure in drinking water and bladder cancer: A review and meta-analysis. Regul Toxicol Pharmacol 52(3): 299-310.
 Begum M, Horowitz J, Hossain MI (2012) Low-dose risk assessment for arsenic: a meta-analysis approach. Asia Pac J Public Health 27(2): 1-16.
 Bagnardi V, Zambon A, Quatto P, Corrao G (2004) Flexible meta-regression functions for modeling aggregate dose-response data, with an application to alcohol and mortality. Am J Epidemiol 159(11): 1077-1086.
 Bates MN, Smith AH, Cantor KP (1995) Case-control study of bladder cancer and arsenic in drinking water. Am J Epidemiol 141(6): 523-530.
 Bates MN, Rey OA, Biggs ML, Hopenhayn C, Moore LE, et al. (2004) Case-control study of bladder cancer and exposure to arsenic in Argentina. Am J Epidemiol 159: 381-389.
 Chen YC, Su HJ, Guo YL, Hsueh YM, Smith TJ, et al. (2003) Arsenic methylation and bladder cancer risk in Taiwan. Cancer Causes Control 14(4): 303-310.
 Chen CL, Chiou HY, Hsu LI, Hsueh YM, Wu MM, et al. (2010) Arsenic in drinking water and risk of urinary tract cancer: a follow-up study from northeastern Taiwan. Cancer Epidemiol Biomarkers Prev 19(1): 101-110.
 Chiou HY, Chiou ST, Hsu YH, Chou YL, Tseng CH, et al. (2001) Incidence of transitional cell carcinoma and arsenic in drinking water: A follow-up study of 8,102 residents in an arseniasis-endemic area in northeastern Taiwan. Am J Epidemiol 153: 411-418.
 Karagas MR, Tosteson TD, Morris JS, Demidenko E, Mott LA, et al. (2004) Incidence of transitional cell carcinoma of the bladder and arsenic exposure in New Hampshire. Cancer Causes Control 15: 465-472.
 Kurttio P, Pukkala E, Kahelin H, Auvinen A, Pekkanen J (1999) Arsenic concentrations in well water and risk of bladder and kidney cancer in Finland. Environ Health Perspect 107: 705-710.
 Kwong RC, Karagas MR, Kelsey KT, Mason RA, Tanyos SA, et al. (2010) Arsenic exposure predicts bladder cancer survival in a US population. World J Urol 28(4): 487-492.
 Meliker JR, Slotnick MJ, AvRuskin GA, Schottenfeld D, Jacquez GM, et al. (2010) Lifetime exposure to arsenic in drinking water and bladder cancer: a population-based case-control study in Michigan, USA. Cancer Causes Control 21(5): 745-757.
 Michaud DS, Wright ME, Cantor KP, Taylor PR, Virtamo J, et al. (2004) Arsenic concentrations in prediagnostic toenails and the risk of bladder cancer in a cohort study of male smokers. Am J Epidemiol 160(9): 853-859.
 Steinmaus C, Yuan Y, Bates MN, Smith AH (2003) Case-control study of bladder cancer and drinking water arsenic in the western United States. Am J Epidemiol 158(12): 1193-1201.
 Steinmaus CM, Ferreccio C, Romo JA, Yuan Y, Cortes S, et al. (2013) Drinking water arsenic in northern Chile: high cancer risks 40 years after exposure cessation. Cancer Epidemiol Biomarkers Prev 22(4): 623-630.
 Chen CL, Hsu LI, Chiou HY, Hsueh YM, Chen SY, et al. (2004) Ingested arsenic, cigarette smoking, and lung cancer risk: A follow-up study in arseniasis-endemic areas in Taiwan. JAMA 292(24): 2984-2990.
 Chen CL, Chiou HY, Hsu LI, Hsueh YM, Wu MM, et al. (2010) Ingested arsenic, characteristics of well water consumption and risk of different histological types of lung cancer in northeastern Taiwan. Environ Res 110(5): 455-462.
 Dauphiné DC, Smith AH, Yuan Y, Balmes JR, Bates MN, et al. (2013) Case-control study of arsenic in drinking water and lung cancer in California and Nevada. Int J Environ Res Public Health 10(8): 3310-3324.
 García-Esquinas E, Pollán M, Umans JG, Francesconi KA, Goessler W, et al. (2013) Arsenic exposure and cancer mortality in a US-based prospective cohort: the Strong Heart Study. Cancer Epidemiol Biomarkers Prev 22(11): 1944-1953.
 Ferreccio C, González C, Milosavjlevic V, Marshall G, Sancha AM, et al. (2000) Lung cancer and arsenic concentrations in drinking water in Chile. Epidemiology 11(6): 673-679.
 Heck JE, Andrew AS, Onega T, Rigas JR, Jackson BP, et al. (2009) Lung cancer in a U.S. population with low to moderate arsenic exposure. Environ Health Perspect 117(11): 1718-1723.
 Mostafa MG, McDonald JC, Cherry NM (2008) Lung cancer and exposure to arsenic in rural Bangladesh. Occup Environ Med 65(11): 765-768.
 Ferreccio C, González C, Milosavjlevic V, Marshall G, Sancha AM (1998) Lung cancer and arsenic exposure in drinking water: a case-control study in northern Chile. Cad Saude Publica 14(3): 193-198.
 Smith AH, Ercumen A, Yuan Y, Steinmaus CM (2009) Increased lung cancer risks are similar whether arsenic is ingested or inhaled. J Expo Sci Environ Epidemiol 19(4): 343-348.
 Steinmaus C, Ferreccio C, Yuan Y, Acevedo J, González F, et al. (2014) Elevated lung cancer in younger adults and low concentrations of arsenic in water. Am J Epidemiol 180(11): 1082-1087.
 Berlin JA, Longnecker MP, Greenland S (1993) Meta-analysis of epidemiologic dose-response data. Epidemiology 4(3): 218-228.
 Greenland S, Longnecker MP (1992) Methods for trend estimation from summarized dose-response data, with applications to meta-analysis. Am J Epidemiol 135(11): 1301-1309.
 Royston P, Ambler G, Sauerbrei W (1999) The use of fractional polynomials to model continuous risk variables in epidemiology. Int J Epidemiol 28(5): 964-974.
 Gasparrini A, Armstrong B, Kenward MG (2012) Multivariate meta-analysis for non-linear and other multi-parameter associations. Stat Med 31(29): 3821-3839.
 Crippa A (2013) R package ‘dosresmeta’.

