Personally I became interested in the topic of PM and health while a graduate student at the Harvard School of Public Health. My doctoral dissertation in biostatistics covered this topic back in 1971. Since then I have been actively engaged in environmental health and statistics issues. I have co-authored a book and written over 40 papers that have been published in the peer-reviewed scientific literature. I have obtained many significant recognitions from my peers. I have served on and chaired subcommittees of the National Research Council, National Academy of Sciences. I have served on or chaired several EPA Science Advisory Board Committees. I have also been appointed a Fellow of the American Statistical Association. The comments that I present today reflect my personal views and judgments as a scientist who has worked in this area for over twenty-five years. These comments should not be construed to be the official opinion of my employer or of any associate.
Below I cite several studies and documents. In an effort to achieve brevity and avoid technical details, I do not include data or attach papers. At times, to be more understandable, I try not to use statistical jargon. I have the back-up technical material, which I would be happy to share with you if you desire.
Several statistical studies suggest an association between particulate matter and health. I have authored some of these; EPRI has funded many more. Does this association mean that a reduction in particulate matter air pollution will lead to public health benefits? I believe the correct answer to this question is that no one knows. We have positive studies, but these results are tempered by the following issues, which are discussed in more detail below:
1.) Study results are not consistent. There are several studies which fail to find any significant positive association between health and particulate matter.
2.) Re-analyses of the data from existing studies do not support an unambiguous particulate matter-health association. The re-analyses by independent investigators do not agree with the original investigators' conclusions of a significant association between particulate matter and health. This is true whether the re-analyses have been funded by public (e.g., US EPA) or private entities.
3.) There is no one correct way to analyze data to determine the relationship between health and particulate matter. The results of these analyses differ according to the methods used. Hence flexibility in choice of analysis can influence the results in a way that invalidates commonly used statistical tests.
4.) It has not been possible in current studies to disentangle the effects of particulate matter air pollution from those of other pollutants and weather.
5.) There is an inconsistent relationship between the levels of particulate matter that people actually breathe and a) the measures of particulate levels used in air pollution health studies and b) the levels of particulate matter that would be regulated.
6.) There is no accepted biological explanation for the results of the statistical models.
7.) If there is an association between particulate air pollution and health, there is no extant health information that suggests greater health effects associated with PM-2.5 (particulate matter less than 2.5 microns in diameter) than with PM-10 (particulate matter less than 10 microns in diameter).
8.) To understand any health effects and to manage PM-2.5 concentrations, we need a more accurate definition of PM-2.5 than that to be measured by the proposed federal reference method.
Specific Scientific Issues
The studies are not consistent. The evidence for particulate air pollution health effects comes from epidemiology studies (studies of people in the real world), not laboratory or animal studies. Several epidemiology studies report no significant relationship between particulate matter and public health. There are negative studies, for example, studies by Styer et al. of Salt Lake City; Morris of Manchester, England; Roth of Prague, the Czech Republic; Burnett et al. of ten Canadian cities; and Abbey et al. of California. In addition, the recent APHEA study of the European Commission found no significant relationship between PM and mortality in several Eastern European cities, where pollution levels were high. Why are these studies inconsistent with other studies that report a significant relationship between PM and mortality? The methods used in the positive and negative studies appear to be reasonable and suggest no obvious error. Is it chance or is there some explanation why positive results are found in some locations, but not in others?
Re-analyses of existing studies do not support an unambiguous particulate matter-health association. Several studies that have reported significant associations between particulate matter and health have been re-analyzed. By and large these re-analyses do not reach the same conclusions as the original studies. For example, under contract to US EPA, Davis et al., at the National Institute of Statistical Sciences, re-examined daily mortality and particulate matter relationships in Birmingham, Alabama. The Davis et al. analysis tries to ensure that the effect of hot and humid days is considered in any model trying to assess the influence of particulate matter on mortality. They conclude: "When we use the same variables as included by Schwartz, we obtain similar results to his. But when we use alternative models we obtain different conclusions. In particular, when humidity is included among the meteorological variables (it is excluded in the analysis by Schwartz), we find that the PM-10 effect is not statistically significant." Roth and Li in a study supported by EPRI similarly examined Birmingham mortality data, as well as hospital admissions data; they also could find no effect of particulate matter on health. The Health Effects Institute, in a project supported by the US EPA and the automobile industry, verified the numerical correctness of the results of Dockery and Schwartz in Philadelphia, but they also tried alternative models in their research project and found it impossible to definitively link particulate pollution with increased Philadelphia mortality. After application of several models, they conclude: "We caution against using the model coefficients directly to estimate the potential consequences of lowering concentrations of the individual pollutants through regulatory measures; the pollutant concentrations are correlated and the estimates of their effects depend on modeling assumptions."
In a paper published in the journal Epidemiology, Moolgavkar and Luebeck presented an independent analysis of the relationship between daily air pollution and mortality in Philadelphia. They conclude: "[I]n Philadelphia, each component of air pollution, when considered alone, is an important predictor of mortality in at least one season....When all pollutants are entered simultaneously into the model, however, nitrogen dioxide appears to emerge as the most important pollutant." In a second study Moolgavkar and his colleagues also "failed to replicate findings" of the study of the relationship between daily deaths and particulate air pollution in Steubenville, Ohio.
EPRI has recently sponsored a re-analysis of the relationship between daily respiratory hospital admissions and air pollution in Detroit. Joel Schwartz of Harvard had previously analyzed these data and found a statistically significant association between hospital admissions and particulate air pollution. He was kind enough to send his data to a group of statisticians at Stanford University. When they applied the same model as Schwartz, they obtained similar results. When they incorporated the potential influences of day of week into the model, particulate matter was no longer a significant predictor of hospital admissions. This is potentially important because hospital admissions vary by day of week. If they go down on weekends and pollution is lower on weekends, and if an investigator did not consider "day of week" in the model, then the investigator could wrongly attribute the effects of weekend behavior to pollution.
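The day-of-week problem can be illustrated with a small simulation. The sketch below is purely hypothetical: it does not use the Detroit data, and the function names, pollution levels, and admission counts are invented for illustration. It generates days on which particulate levels and admissions are both lower on weekends but are otherwise unrelated, then compares the pooled correlation with the correlations computed separately for weekdays and weekends.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_admissions(n_days=700, seed=0):
    """Hypothetical daily data with no pollution effect on admissions.

    Both series are lower on weekends (assumed values), but within a
    day type they are drawn independently of each other.
    """
    rng = random.Random(seed)
    days = []
    for t in range(n_days):
        weekend = t % 7 in (5, 6)
        pm = rng.gauss(35.0 if weekend else 50.0, 5.0)
        admissions = rng.gauss(80.0 if weekend else 100.0, 5.0)
        days.append((weekend, pm, admissions))
    return days


def correlations(days):
    """Return (pooled, weekday-only, weekend-only) correlations."""
    def corr(rows):
        return pearson([r[1] for r in rows], [r[2] for r in rows])

    weekday = [d for d in days if not d[0]]
    weekend = [d for d in days if d[0]]
    return corr(days), corr(weekday), corr(weekend)


if __name__ == "__main__":
    pooled, r_weekday, r_weekend = correlations(simulate_admissions())
    print(f"pooled: {pooled:.2f}")
    print(f"weekdays only: {r_weekday:.2f}, weekends only: {r_weekend:.2f}")
```

In this artificial setting the pooled correlation is strongly positive even though, once day type is held fixed, the two series are essentially unrelated. Stratifying by day type is only the simplest form of adjustment; a regression model with day-of-week terms serves the same purpose.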
Can we say which analysis is the correct one for each of these data sets? By and large there is no best way to analyze the data. Each individual may have his or her favorite method, but in reality we are addressing a complex statistical issue for which there is no one correct way to analyze the data. Our problem is that different methods give different results. We cannot know which result to believe, but it is important to know that these differences occur.
Could the methods chosen to analyze the data influence the results? It is clear that the different models can give different results. None of the models fits the data well; hence it is not possible to decide which model is best. There is no one correct way to analyze a data set. Hence an investigator has considerable freedom in the choice of his/her model. There are different ways to address the seasonal nature of the data; i.e., the patterns due to the fact that there are more health effects, such as deaths, in winter than in summer. An investigator can choose weather factors and the other pollutants he or she may place in a model. The investigator can decide whether to relate today's pollution with today's mortality, or yesterday's pollution with today's mortality or the average of the last five days' pollution with mortality. All of the above and more have been considered. Then there is the model construct itself; it could be linear, log-linear, or Poisson. Often an investigator may choose one method over another in order to ensure that all public health considerations are unearthed so that the public health can be protected at all costs. This has been a rationale for considering only one pollutant when others are equally likely to influence a health response; in their papers authors indicate that a specific variable was chosen because it maximizes the association between an air pollution variable and a health response. This may be "conservative", but it does not provide an accurate estimate of effect or association.
Usually we use a 5% level of significance in empirical research; that means that we accept a one-in-twenty chance of reporting a positive result when in fact there is no effect. However, the additional flexibility in model choice noted above alters the level of significance in statistical tests, making it more likely that the investigator will estimate a positive effect when none in fact exists.
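A short simulation shows how this inflation can arise. The sketch below is an illustration under stated assumptions, not a reconstruction of any published analysis: the pollution and mortality series are independent random noise (so there is no true effect), the only modeling choice is the lag between pollution and mortality (0 to 4 days), and significance is judged with a two-sided large-sample approximation for the correlation coefficient. The function and parameter names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def false_positive_rates(n_datasets=1000, n_days=100, max_lag=4, seed=1):
    """Estimate false-positive rates when no true association exists.

    Returns (rate for one lag fixed in advance,
             rate when any of the lags 0..max_lag may be reported).
    """
    rng = random.Random(seed)
    # Approximate two-sided 5% critical value for a correlation (assumption).
    critical = 1.96 / math.sqrt(n_days)
    single_hits = 0
    any_hits = 0
    for _ in range(n_datasets):
        pollution = [rng.gauss(0.0, 1.0) for _ in range(n_days + max_lag)]
        deaths = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
        significant = []
        for lag in range(max_lag + 1):
            # Pair each day's deaths with pollution measured `lag` days earlier.
            start = max_lag - lag
            lagged = pollution[start:start + n_days]
            significant.append(abs(pearson(lagged, deaths)) > critical)
        single_hits += significant[0]   # same-day lag chosen beforehand
        any_hits += any(significant)    # best lag chosen after looking
    return single_hits / n_datasets, any_hits / n_datasets


if __name__ == "__main__":
    fixed, flexible = false_positive_rates()
    print(f"lag fixed in advance: {fixed:.3f}")
    print(f"best of five lags:    {flexible:.3f}")
```

With a single lag fixed in advance, the false-positive rate should be close to the nominal 5%. If the five lagged tests were fully independent, the chance that at least one is significant would be 1 - 0.95^5, or roughly 23%. Real analyses offer many more choices than lag alone, so this sketch understates the available flexibility.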
There may be other factors associated with the complicated data sets with which we work. The data are complex time series data, and we have little detailed understanding of these data sets. The models we apply are relatively simple models. We assume they are adequate for our data. In an effort to test this, Lipfert & Wyzga undertook some initial analyses to determine how these models performed with unrelated data sets; e.g., pollution variables for one city and unrelated health data for a distant city.
Our results are preliminary, but indicate some surprising significant relationships such as a statistically significant relationship between air pollution in one city and health impacts in a distant city. These provocative findings need to be resolved. It may be premature to suggest that they impact our evaluation of the current science, but it would be irresponsible not to investigate these findings further. I hope to clarify these results within the next three months.
Then there is the issue of pressures to emphasize positive studies. This is best described by the following quote from an article in the July 14, 1995 issue of Science entitled "Epidemiology Faces Its Limits." The article states: "Authors and investigators are worried that there's a bias against negative studies," and that they will not be able to get them published in the better journals, if at all, says [Marcia] Angell [Executive Editor] of the NEJM [New England Journal of Medicine]. "And so they'll try very hard to convert what is essentially a negative study into a positive study by hanging on to very, very small risks and seizing on one positive aspect of a study that is by and large negative." Or, as one National Institute of Environmental Health Sciences researcher puts it, asking for anonymity, "Investigators who find an effect get support, and investigators who don't find an effect don't get support."
Could PM be serving as an index for other pollution? If we regulated PM, would we achieve the health improvements we want? In other words, can we disentangle the health effects of particulate matter from those of other pollutants and weather? To a limited extent we can, but we are hampered by two issues. First, all pollutants may be associated with health effects. Secondly, most urban pollutants are present at the same time. In addition, weather conditions can cause many pollutants to concentrate in an area at the same time, including those pollutants that are not measured. In fact weather conditions are often correlated with pollution as well.
Hence it is difficult to disentangle any effects of various pollutants and to indicate which pollutant may be associated with a given health effect. The Health Effects Institute and the Moolgavkar and Luebeck quotes above address this problem. In addition, in a paper Lipfert and Wyzga published in the Journal of the Air and Waste Management Association, we examined many published studies that had looked at the relationship between daily mortality and various pollutants. We found that if a study had chosen to focus upon sulfur dioxide or nitrogen dioxide instead of particulate matter, that study found similar effects on daily mortality as did those studies that focused upon particulate matter. A focus upon carbon monoxide indicated somewhat larger effects than particulate matter, and a focus upon ozone gave somewhat smaller effects. Given the high correlation between the various air pollutants, an obvious conclusion is that if an investigator had elected to study another pollutant instead of particulate air pollution, he/she might well have concluded that the other pollutant was the pollutant of concern.
Disentangling the various air pollutants is complicated because of statistical considerations. Lipfert and Wyzga have shown, and it is now widely accepted, that until one has an understanding of how well the measured pollution data represent the levels to which people are actually exposed, it is not possible to separate out the effects associated with various pollutants. The EPA Criteria document states, "Measurement error in pollutants or other covariates may also bias the results,.... and the most poorly measured exposure covariate is usually the one that is driven towards no effect." Measurement error is caused by the fact that the amount of pollution measured by the monitor is not the same as that to which a person is exposed. This could be caused by inaccuracies in the instrument. It could be due to the fact that the monitor is not located in the same area of a city as an impacted individual, or it could be due to the fact that people spend most of their time indoors where pollution exposures may be very different from what is measured at a monitor.
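The bias toward no effect described in the Criteria Document quote can be sketched with a simple simulation. The example below is a deliberately simplified, single-pollutant illustration with invented numbers and hypothetical names: a response depends on true personal exposure with a slope of 1.0, but the analyst only sees a monitor reading equal to the true exposure plus independent error of the same size. Under these assumptions the slope estimated from the monitor readings is expected to be about half the true slope.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on a single predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def attenuation_demo(n=5000, true_slope=1.0, exposure_sd=1.0,
                     error_sd=1.0, seed=0):
    """Compare slopes using true exposure versus an error-prone monitor.

    All values are hypothetical. The monitor reading is the true
    personal exposure plus independent measurement error.
    """
    rng = random.Random(seed)
    personal = [rng.gauss(0.0, exposure_sd) for _ in range(n)]
    monitor = [p + rng.gauss(0.0, error_sd) for p in personal]
    response = [true_slope * p + rng.gauss(0.0, 1.0) for p in personal]
    return ols_slope(personal, response), ols_slope(monitor, response)


if __name__ == "__main__":
    slope_true, slope_monitor = attenuation_demo()
    print(f"slope using true exposure:   {slope_true:.2f}")
    print(f"slope using monitor reading: {slope_monitor:.2f}")
```

When several correlated pollutants are measured with different amounts of error, the situation is more complicated than this one-variable case, which is why comparisons among pollutants depend on knowing how well each is measured.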
Is there any relationship between the concentration of particulate matter in the air people actually breathe and the levels used in air pollution health studies? We know very little about what pollution levels people are actually exposed to; we know even less about how the actual exposures of people to particulate matter relate to the levels measured at outdoor monitors, which can be very distant from people's homes. Available data show no consistent relationship between the levels of particulate matter people actually breathe and the levels measured at outdoor monitors. This is also true when we ask whether personal exposure data track outdoor levels over time. In addressing this issue in its Criteria Document, EPA depends upon a study of seven elderly Japanese living in non-smoking, non-carpeted, "typical", Japanese homes in Japan. For this group, there was a good relationship between personal exposures and ambient measures of PM-10. The relevancy of this data set to Americans is, however, unclear. Other studies do not demonstrate as good a relationship between the air actually breathed and that measured at the monitors. A study in Phillipsburg, NJ looked at the relationship between personal (actual) exposures to PM-10 and outdoor measures for 14 individuals. For the group as a whole, the personal exposures tended to increase with outdoor levels, but the results are not consistent across individuals. For some people a reduction in outdoor levels of PM-10 would have no effect on the PM-10 levels where they breathe. A study in Azusa, California compared the actual exposures of ten people to PM-2.5 and PM-10 with outdoor levels for periods of seven days. The results were not consistent. For half of these people, the actual exposures to PM-2.5 decreased when outdoor levels of PM-2.5 increased. No individual showed a striking positive relationship between personal exposures and outdoor levels.
Some studies have looked at people who might be more susceptible to air pollution. A group at the Gage Research Institute at the University of Toronto studied 21 asthmatics for both winter and summer periods for a total of about twenty days each. A correlation coefficient of 1.0 would mean perfect concordance; a correlation coefficient of 0.0 would indicate absolutely no association between the two. The average correlation coefficient across subjects between actual exposures to PM-2.1 and measured outdoor levels was 0.11 (i.e., since the square of the correlation coefficient gives the share of variation explained, outdoor measures would explain about 1% of the variation in personal exposures). This result suggests that changes in the outdoor levels of fine particulate matter (PM-2.1) would have negligible impact on the asthmatics' actual exposures. EPRI sponsored a study of asthmatics in Uniontown, PA. Our contractors from the Harvard School of Public Health measured the personal exposures and ambient levels of sulfates, a component of particulate matter, and found good agreement between personal exposure and outdoor levels. Unfortunately this study did not consider particulate matter as a whole.
We are currently supporting a study at Harvard School of Public Health of people with chronic obstructive pulmonary disease (COPD). We hope to understand the relationship between actual exposures of particulate matter (both PM-10 and PM-2.5) and outdoor levels. The first part of the study was undertaken in Nashville and showed absolutely no relationship between these two measures for the ten people studied. We are currently repeating that study in Boston.
Why do actual exposures differ from levels measured at monitors? First of all, people spend little time in the vicinity of the monitor, which may not be in a very representative location. If an individual lives in the suburbs, and the monitor is in the center of the city, the monitor may not provide a good indication of the pollution level to which the individual is exposed. Secondly, people, especially susceptible individuals, spend considerable time indoors, where some particulate matter may not be able to penetrate and where other sources besides outdoor air pollution can have considerable influence on a person's actual exposure to particulate matter. Sources of indoor particulate matter include passive cigarette smoke, vacuuming, dusting, pet dander, fireplaces and woodstoves, hairsprays, etc. For this reason personal exposures to particulate matter are often higher than outdoor levels; in addition, when we move, we generate a cloud of fine particulate matter around us, not unlike that generated by the PigPen character in the Peanuts comic strip.
Is there any biological explanation for the results we see from these models? In its proposed decision to promulgate particulate matter standards, EPA states, "it is generally recognized that an understanding of biological mechanisms that could explain the reported associations has not yet emerged."
Is there any reason to believe that PM-2.5 is of greater health concern than PM-10? There is no health evidence that PM-2.5 presents a greater concern than PM-10. In a recently published paper Lipfert and Wyzga reviewed 30 published papers (all that they could find as of that date) that examined the association between mortality and particulate air pollution. Some of these studies used PM-10 as a measure of particulate air pollution; others used PM-2.5. We compared the results of the studies that used PM-10 to those that used PM-2.5. The differences in estimated effects between the two particulate measures were small; if anything, the literature suggested greater effects were associated with PM-10.
There are a few studies that consider both PM-10 and PM-2.5. If we consider the estimated effects of PM-10 and PM-2.5 in a head-to-head comparison in these studies, we find little difference between the two indices. Where there is a difference, the estimated health effects of PM-10 appear to be greater. For example, in Table 13-5 of EPA's Criteria document, EPA summarizes the results of the Harvard 6-City Study. The table presents the estimated changes in relative risk for increased chronic mortality in adults. The higher the number, the greater the estimate of increased risk. For PM-10 or PM-15, it is 1.42; for PM-2.5, it is 1.31. Lipfert and Wyzga compared the PM-10 and PM-2.5 results for all existing studies and found the estimated health effects of PM-2.5 to be less than or equal to the estimated effects of PM-10.
Why then does the EPA say that there is need for a PM-2.5 standard to protect public health? The basis for this argument is a paper based upon the Harvard 6-city study. That paper estimates the association between daily mortality and PM-2.5, PM-10, and the difference between PM-10 and PM-2.5. EPA refers to the latter as the "coarse fraction". That study finds the strongest association between daily mortality and PM-2.5 and the weakest association between daily mortality and the "coarse fraction". In our opinion the analysis is flawed, however, because of the measurement error issue. In a paper recently accepted by the Journal of the Air and Waste Management Association, Lipfert and Wyzga show that the comparisons between PM-2.5 and the "coarse fraction" are inappropriate without any correction for the difference in measurement error. There are at least two types of measurement error present in the data collected. First of all, an earlier paper by the Harvard investigators noted that the device used created inaccuracies averaging 43% for the "coarse fraction". Secondly, since PM-2.5 is more spatially uniform than is the "coarse fraction", the readings from a single monitor in a large geographic region (up to nine counties) will be much more representative for PM-2.5 than for the "coarse fraction." These two factors will bias downward the estimated impact of the coarse fraction. Our conclusion in this paper is: "In the specific study (Schwartz et al., 1996) that we considered in detail, which employed a single monitor in each of six large metropolitan areas, we conclude that virtually nothing can be inferred about the true causal nature of daily mortality, the actual responses to these agents, or the shapes of the true response functions.
Given the strong bias in favor of PM-2.5 resulting from lower instrument errors, less spatial variability, the treatment of missing data, and the near significance of coarse particles in spite of these handicaps, the most prudent conclusion from this study would have been that there is no apparent significant difference in mortality associations by particle size."
How accurately can we measure PM-2.5? A new standard, such as PM-2.5, requires defining the substance, PM-2.5, to be controlled. This is defined by the levels measured through an official reference method. For PM-2.5 this method is prescribed along with the proposed standard. Field testing of the reference method began late last fall. We too have been testing a suite of methods at the sites selected by EPA and/or state agencies. Our suite includes samplers that work on the same principle as the reference method. Test results are being analyzed by several groups; neither EPA's results nor ours have been published yet.
Our initial results indicate that in some cities the reference method would not capture a substantial portion of the fine particulate constituents, especially those that are likely to evaporate during the measurement process. These results are consistent with theoretical expectation. Therefore, we predict that the reference method is likely to provide incorrect and incomplete information from the standpoint of characterizing and managing any potentially harmful constituents of particles. Instead, promising newer technologies ought to be considered.
Scientific Issues for the Annual Standard
How strong is the evidence for that standard? The studies and issues I have discussed previously relate both to the daily and annual standards. Evidence to support changes in the annual standards comes from a different type of study. Differences in the health of various communities are compared with differences in the air quality for these communities. Attempts are made to adjust for other factors (e.g., demographic or socio-economic factors) that may explain the health differences among communities. It is important that all potentially relevant factors be included in the analyses. This is particularly true for factors which vary regionally as does air pollution. These factors include regional differences in diet, lifestyle, and climate. Omitting such a factor from statistical analysis can shift the blame onto air pollution. These factors are referred to as confounding factors.
These studies have improved recently with the study of defined cohorts for which we have some individual characteristics, such as smoking history. The ability of these studies to control for non-air pollution factors is, however, limited by the information collected. EPA cites two of these studies as providing support for their proposed annual standard for PM-2.5.
The first study was based on differences in cohort survival rates among the cities studied in the Harvard Six Cities study. EPRI was a funder of this study. This study is confounded by the failure to account for known lifestyle differences across the six cities. For example, EPRI-supported research has shown that differences in the fraction of the elderly population with sedentary lifestyles in each area can account for most of the differences in mortality among the study cities; moreover, the predicted effect of this lifestyle factor matches almost perfectly the relationship estimated in an independent California study by Breslow and Enstrom.
The second study also did not consider lifestyle factors; in addition, this study only evaluated mortality relationships with respect to sulfate and fine particles and did not test whether similar results might have been found for any other pollutants.
Neither of these studies considered the fact that the evaluation of chronic health effects must consider the typically long latency periods of such diseases and the air pollution histories of the cities being studied. Typically, the dirtiest cities have already improved greatly; hence chronic health effects could be due to the past dirty air, which initiated the process, the end result of which we see today.
Conclusions
Would the changes in the proposed particulate standards lead to improvements in public health? No one knows. The results of positive statistical studies must be balanced by the following issues:
1.) not all studies find a significant positive association between particulate matter and health;
2.) re-analyses of existing studies, by a wide range of scientists, do not support the conclusions of the original studies that there is a significant association between particulate matter and health;
3.) the choice of method to analyze data can influence the results;
4.) it is very difficult and often impossible to determine which specific pollutant may be related to health consequences, that is, whether it is particulate air pollution or some other factor;
5.) there is an inconsistent relationship between the particulate levels in the air we actually breathe and that measured at monitors;
6.) we have no biological explanation for the results of the statistical models;
7.) if there is an association between particulate air pollution and health, there is no health information available that suggests greater health effects associated with PM-2.5 than with PM-10; and
8.) to understand any health effects and to manage PM-2.5 concentrations, we need a more accurate definition of PM-2.5 than that of the proposed federal reference method.
It is clear that we are dealing with a very complicated situation; the findings to date raise the specter of an important public health issue, yet there remain many unanswered questions before we can confidently conclude that these effects are real and that we know how to improve the public health. We clearly need to work together if we are to resolve these questions. We need to pool our resources and share our knowledge and data. It is fortunate that we as a society have already committed to reducing air pollution and that pollution levels are in decline.
Let's continue to work together.