
Our guest-blogger today is Maartje Sevenster, Sevenster Environmental, who has followed and analysed the process leading to the recently published weighting method for the EU Product Environmental Footprint (PEF). Here she shares her serious reservations about the process and the results.

A weighting set for the EU Product Environmental Footprint (PEF) was published last month. The weighting factors have been developed by the Joint Research Centre via an elaborate approach that has attempted to separate value-based weighting into objective factors. Nevertheless, the result is a poor, semi-qualitative approximation that mixes characterization, distance-to-target weighting, panel weighting, and uncertainty. All in all, the approach comes across as a black box of flawed mathematical operations.

A PEF consists of a characterization result for each of 16 impact categories, a corresponding set of weighted, normalised results, and a single overall score. The use of this weighting method and the resulting single score will be a requirement in all PEF studies and is meant to facilitate interpretation.
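
To make the mechanics concrete: the single score is simply a weighted sum of normalised characterization results. A minimal Python sketch, using made-up numbers and only two of the 16 categories, shows the arithmetic:

```python
# Minimal sketch of how a PEF single score is assembled.
# All numbers are illustrative, not the official JRC values.
characterized = {"climate change": 120.0, "acidification": 0.8}    # kg CO2-eq, mol H+-eq
normalisation = {"climate change": 8100.0, "acidification": 55.6}  # per-person reference values
weights = {"climate change": 0.2106, "acidification": 0.0620}      # fractions summing to 1 over all 16

# Normalise each characterization result, weight it, and sum into one score.
single_score = sum(
    weights[cat] * characterized[cat] / normalisation[cat]
    for cat in characterized
)
print(f"Single score: {single_score:.4f}")
```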

The final weighting set is an average of three independently derived sets, with the average multiplied by a robustness factor. Two of the three sets are derived via traditional panel weighting; the third is based on a hybrid ‘evidence-based’ approach.
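
In rough code form the construction looks as follows; all numbers are invented, and the renormalisation of the final weights to 100% is my assumption of how the multiplication is closed off:

```python
# Sketch of "average of three sets times a robustness factor".
# Two impact categories shown; values are invented for illustration.
panel_a  = {"climate change": 22.0, "freshwater ecotoxicity": 8.0}
panel_b  = {"climate change": 20.0, "freshwater ecotoxicity": 7.0}
evidence = {"climate change": 24.0, "freshwater ecotoxicity": 6.0}
robustness = {"climate change": 0.87, "freshwater ecotoxicity": 0.17}  # assumed 0..1 scale

raw = {
    cat: robustness[cat] * (panel_a[cat] + panel_b[cat] + evidence[cat]) / 3
    for cat in panel_a
}
# Assumed final step: renormalise so the weights again sum to 100 %.
total = sum(raw.values())
final = {cat: 100 * w / total for cat, w in raw.items()}
print(final)
```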

The term evidence-based has a feel of objectivity about it, but in its first part the approach applies weighting to issues that would be better investigated by natural science, and in its second part, which must necessarily be based on subjective preferences, the chosen approach violates basic requirements for good valuation practice.

In the first part of the JRC approach, an expert panel was asked to score seven characteristics of impacts, among them time span, reversibility, and the level compared to the planetary boundary, on a scale from 1 to 100.

It is immediately obvious that many of these factors, such as time span, are already covered by the natural-science-based characterization models commonly used in LCA, even if they are not always made explicit in the end results. In fact, Annex 13 of the JRC weighting report includes a similar criticism by Mark Goedkoop: “using a panel to link mid to endpoint is really weird. This means we replace science by the verdict of panellist. I am quite aware about some of the uncertainties in the mid to end point factors, but I always thought we prefer science over the laymen’s view. Uncertain science is always better than no science at all.” Another example: the use of GWP100 for characterizing global climate change impacts implies a science-based assessment of the timing of impacts when comparing greenhouse gas emissions. Is it then valid to subsequently allow an expert panel to assign time span a weight of zero, as was theoretically possible for the experts in this approach?
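
To illustrate how characterization already embeds timing: GWP100 factors integrate radiative forcing over a fixed 100-year horizon, so the comparison of gases with very different atmospheric lifetimes is settled by science before any weighting takes place. A small sketch with the IPCC AR5 factors (emission amounts invented):

```python
# Characterization of a greenhouse gas inventory with GWP100 factors.
# Factors from IPCC AR5 (without climate-carbon feedbacks); the
# emission amounts are made up for illustration.
gwp100 = {"CO2": 1, "CH4": 28, "N2O": 265}          # kg CO2-eq per kg gas
emissions = {"CO2": 100.0, "CH4": 2.0, "N2O": 0.1}  # kg emitted

co2_eq = sum(gwp100[gas] * emissions[gas] for gas in emissions)
print(f"{co2_eq:.1f} kg CO2-eq")  # 182.5 kg CO2-eq
```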

Only two of the seven factors are not part of LCA characterization models, namely reversibility and the level compared to the planetary boundary. Reversibility is also the only factor that is intrinsically categorical and therefore an excellent illustration of the artificiality of the approach. Is it valid to say that an irreversible impact is (only) 100 times worse than an impact that can be reversed by natural processes within one year? It would certainly be useful to discuss these factors prior to determining a multi-criteria-type panel weighting per impact category.

Factors such as reversibility and time span may well play a role in expert judgements of the severity of one impact as compared to another. However, the JRC approach first introduces a categorical scaling for each of those factors, turning them into artificial ordinal variables; time span, for instance, is reduced to a small number of ordered classes whose scores span a factor of 100.

Even though we know precisely that some impacts are instantaneous while others, such as those of ionising radiation, may be spread out over hundreds of thousands of years, the difference is here reduced to an arbitrary factor of 100. The bottom line is that most of the seven factors can be evaluated by natural science, albeit with considerable uncertainty, and do not need expert weighting. The scaling wipes out all scientific evidence and, along with it, any understanding of what a resulting indicator might really mean.
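
A small sketch makes the information loss tangible. The class boundaries below are my own assumptions for illustration; the JRC report defines the actual classes:

```python
# Collapsing a continuous, measurable time span into an ordinal score.
# Boundaries and scores are assumed for illustration only.
def time_span_score(years: float) -> int:
    if years < 1:
        return 1    # short-term
    elif years < 100:
        return 10   # medium-term
    else:
        return 100  # long-term

# An hour and a decade end up two classes apart, while a century and
# a hundred millennia become indistinguishable:
print(time_span_score(0.0001))   # 1
print(time_span_score(10))       # 10
print(time_span_score(100))      # 100
print(time_span_score(100_000))  # 100
```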

The second part of the JRC expert weighting procedure is a more traditional expert panel judgement of the relative importance of the seven factors. This leads us to another troubling aspect: the seven factors are not completely independent, as is required for a proper evaluation of (compensatory) weights. In particular, “level compared to planetary boundary” overlaps with all the other factors to at least some extent. Moreover, averaging categorical variables is mathematically meaningless, even when the categories appear to be “numerical”.
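
A trivial example shows the problem (the numeric labels are assumed for illustration):

```python
# Ordinal labels look numeric but are only ranks.
reversible, irreversible = 1, 100  # assumed category scores

# The arithmetic mean lands between categories: 50.5 corresponds to
# no defined class and has no physical meaning.
print((reversible + irreversible) / 2)  # 50.5
```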

Finally, this weighting set from the expert panel is averaged with two other weighting sets derived via a different approach. This seriously undermines the transparency of the weighting, which should at all times be straightforward to interpret, not just a set of numbers used to arrive at a single score. The problem is further aggravated by the use of a robustness factor to assess what is, in essence, uncertainty; again, this factor involves three arbitrarily scaled ordinal variables that are averaged. The report shows some inconsistencies regarding the final choice of this robustness factor, which JRC itself apparently does not consider very robust: toxicity impacts have been excluded from the benchmark calculations, in spite of already having a very low weighting due to their low estimated robustness. The semi-numerical approach gives a false sense of objectivity to this “uncertainty assessment”.

To summarize, the final weighting set is the result of so many mathematically questionable averaging, scaling, and multiplication steps that it is hard to take seriously. To allow for proper interpretation of results, weighting sets should be based on clear and transparent principles. It is preferable to use a single-step conversion with a limited but unambiguous perspective, such as weighting based on damage costs.
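
In code, such a single-step conversion amounts to no more than the following; the monetary factors are placeholders, not published damage costs:

```python
# Single-step weighting via damage costs: one transparent multiplication
# per impact category. All numbers are placeholders for illustration.
damage_cost = {"climate change": 0.05, "acidification": 4.0}   # EUR per kg CO2-eq / mol H+-eq
characterized = {"climate change": 120.0, "acidification": 0.8}

total_damage = sum(damage_cost[cat] * characterized[cat] for cat in characterized)
print(f"Total damage cost: EUR {total_damage:.2f}")  # EUR 9.20
```

Every number in the result traces back to one damage factor and one characterization result, which is exactly the kind of transparency the JRC construction lacks.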

Previous blog-posts on PEF:

The clock is ticking for PEF
Harnessing the End‑of‑Life Formula

Last week I attended a meeting on the EU Commission’s Product Environmental Footprint (PEF) arranged by the Danish Environmental Protection Agency. The PEF initiative was presented as providing a level playing field for competition on environmental claims, countering the current situation in which many environmental claims (allegedly 95%) are misleading.

The PEF pilot phase is now coming to an end, and the Commission representative presented the result as a success story (“We now have a machine that works!”), although it was acknowledged that the data to drive the system is still insufficient. The PEF “machine” was likened to a watch that shows the time in simple terms, while the (LCA) clockwork inside is intricate and complex.

In spite of the attempt to present the PEF as a success story, there was much criticism from the more than 100 representatives from Danish industries and NGOs who were present. Some were concerned about the increased costs, pointing out that the money would be better spent on making improvements than on documenting the status quo to consumers. Others questioned whether LCA is at all relevant for consumer information, and it was suggested that new technology is making LCA information for this purpose outdated (see also my blog-post on distributed ledger technology).

The PEF system relies on the development of a “Product Category Rule” (PCR) for each product group. These PCRs are developed on a voluntary basis, paid for by the majority of the industries that produce the products in each product group, providing a golden business for consultants. During the pilot phase it has turned out that this results in different rules for different products, so that only products within the same product group can be compared (as with the existing eco-labels).

An example was raised from the audience about the gifts that a speaker often receives as thanks for giving an otherwise unpaid talk: would a bottle of French wine or a box of locally produced chocolate (with cocoa beans from Africa) be the better choice? Since these two products belong to different product groups, they will have different rules for their PEFs and should therefore not be compared.

This simple example illustrates one of the largest problems with the idea of Product Category Rules, namely that they do not further improvements in environmental performance across product categories. NGOs and scientific advisors have pointed out that it is questionable whether the PEF LCA calculation rules further environmental improvements even within product categories. We have lobbied for constraining the industry consensus on the PCRs to the original Commission Guideline (see also my interpretation guide to this) and to the requirements of the ISO LCA standards, which focus on environmental improvements, but we are now mocked for not understanding that compromising on scientific validity is necessary to reach industry consensus.

Personally, I concluded that if the concern is unfair competition, PEF is not the answer but rather part of the problem. A simpler solution would be to enforce the existing legislation on misleading claims, a solution that has so far worked well in Denmark.

In my view, there is not much point in having an expensive watch – even with the simplest user interface – if it shows the wrong time.
