The Fox Island Energy Crisis: A Natural Experiment in Voluntary Energy Conservation

THE FOX ISLAND ENERGY CRISIS:
A NATURAL EXPERIMENT IN VOLUNTARY ENERGY CONSERVATION

By Josiah M. Narog

A Thesis
Submitted in partial fulfillment
of the requirements for the degree
Master of Environmental Studies
The Evergreen State College
June, 2014

©2014 by Josiah M. Narog. All rights reserved.

This Thesis for the Master of Environmental Studies Degree
by
Josiah M. Narog

has been approved for
The Evergreen State College
by
_______________________________
Ralph Murphy, Ph. D.
Member of the Faculty

_______________________________
Date

ABSTRACT
The Fox Island Energy Crisis: A Natural Experiment in Voluntary Energy Conservation
Josiah M. Narog

This research examines a natural experiment in voluntary energy conservation that occurred during
a short-term energy supply crisis in the winter of 2010 on Fox Island, Washington. Using utility
billing records, NOAA weather data and survey data, a variety of statistical techniques, including
regression modeling, are applied in an effort to determine whether customers on Fox Island reduced
their energy consumption in response to utility requests for electricity conservation, and whether
their responses are predicted by household characteristics. Support was not found for the research
hypothesis that residents of Fox Island, as a group, consumed less energy than usual during the
outreach period. Residents who credited their energy saving efforts to a desire to conserve resources for
future generations were found to consume somewhat more energy during the crisis. Residents who
had positive opinions of their utility’s efforts to address the energy crisis used less energy as a
group during the crisis. Voluntary conservation outreach was not shown to be effective at reducing
overall levels of energy consumption in this case, and more research in the areas of attitudes,
behaviors and beliefs is needed to understand the specific conditions under which households can
be relied upon to conserve energy when asked.

Table of Contents
Table of Contents
Table of Figures
Table of Tables
Thesis
  Thesis Statement
  Fox Island Washington Power Cable Failure
  Research Effort
Literature Review
  Evaluation, Measurement and Verification (EM&V)
  Behavior and Energy Consumption
  Historical Overview
Methods
  Data – Sources and Preparation for Analysis
    Data Sources
    Initial Data Assessment and Cleaning
  Test 1 – Differences in Differences Analysis vs. Control Group
  Test 2 – Multivariate Regression Modeling
    Model Selection
    Model Validation
    Energy Consumption Model Test for Energy Conservation
  Test 3 – Household Regression Modeling & Household Survey
    Creation of Household-level Regression Models
    Gathering of Survey Data on Household Attitudes, Beliefs and Characteristics
    Comparison of Survey Results with Individual Regression Model Results
Results
  Test 1 Results – Difference in Differences Comparison with Off-island Control Group
  Test 2 Results – Energy Model Evaluation of Treatment Period Energy Consumption
  Test 3 Results – Survey Response and Individual Regression Model Analysis
Discussion
  Summary of Attempts to Detect Aggregate Level Energy Conservation
  Summary of Attempts to Detect Patterns in Individual Level Conservation
  Urgent Telephone Conservation Appeal, November 22, 2010
  Possible Reasons for Lack of Conservation Finding
  Implications for Future Conservation Program Development
Works Cited
Appendices
  Appendix A – High/Low Usage Day Pairs
  Appendix B – Reading Days with High Levels of Missing Meter Data
  Appendix C – Data Discrepancies, Meter Usage versus Substation Measured Usage
  Appendix D – Script of Telephone Outreach to Fox Island Residents, Nov. 2010
  Appendix E – Regression Model Iterations
    Multivariate Regression Quadratic Model Version A
    Multivariate Regression Quadratic Model Version B
    Multivariate Regression Quadratic Model Version C
    Multivariate Regression Quartic Model A
    Multivariate Regression Quartic Model B
    Multivariate Regression Quartic Model Final
  Appendix F – Discussion of Akaike’s Information Criterion and Model Selection

List of Figures
Figure 1 - Fox Island, WA and surrounding area.
Figure 2 - Oil Prices ($/barrel) by year. Data courtesy Energy Information Administration, via Wikipedia Commons.
Figure 4 - Daily average temperatures in degrees Fahrenheit, Tacoma Narrows Airport, 2008 - 2012.
Figure 5 - Raw summed daily meter reads, all meters, Fox Island, WA, 2008 - 2012.
Figure 6 - Summed daily meter reads, all meters, Fox Island WA, June 1, 2008 - June 30, 2008.
Figure 7 - Scatter of uncleaned summed daily meter reads by daily average temperature Fahrenheit, Fox Island, 2008 - 2012.
Figure 8 - 22 anomalous high/low summed meter reading pairs identified by formula, Fox Island dataset 2008 - 2012.
Figure 9 - Closeup of summed daily meter reads by average daily temperature, showing a bisection of the data into two groups.
Figure 10 - Summed daily meter reads, Fox Island WA, September 2008 - April 2009. Data shows high/low reading pairs, as well as a large area of conspicuously low meter readings in November and December of 2008.
Figure 11 - Percentage of Fox Island expected meter readings missing, by day, 2008 - 2012.
Figure 12 - Summed meter reads by daily average temperature Fahrenheit, Fox Island WA, after removal of high/low meter reading pairs and days with high levels of missing meter readings.
Figure 13 - Summed household meter readings vs. Substation metered usage, Fox Island WA, 2008 - 2012.
Figure 14 - Summed meter readings, Fox Island Treatment Group (red) vs. Off-island Cromwell control group (blue), 2008 - 2012.
Figure 16 - Differences in kWh by average external temperature, Fox Island vs. Cromwell control group. This figure shows that there is a consistent relationship between the respective groups' temperature response curves.
Figure 17 - Differences in kWh, Fox Island vs. Cromwell control group by external temperature, with a fitted quartic regression line.
Figure 18 - Non-treatment predicted differences versus observed differences between Fox Island and Cromwell control groups, normalized by quartic regression.
Figure 19 - Frequency distribution of average daily temperatures observed at the Tacoma Narrows Airport, 2008 - 2012.
Figure 20 - Summed daily meter reads by average daily temperature, with quadratic fit line.
Figure 21 - Summed daily meter reads by average daily temperature, with quartic fit line.
Figure 22 - Results of 5-fold crossfold validation holdout exercise, showing each fold's model predictions vs. that fold's observed values.
Figure 23 - Fox Island daily energy use minus Cromwell control group daily energy use, 2008 - 2012. The pre-crisis period is shown in blue, and the treatment period with utility outreach is shown in red.
Figure 24 - Results of Differences in Differences test for the winter 2010 treatment period, using the Cromwell off-island control group, normalized via quartic regression. The test shows that the observed differences are higher than the predictions.
Figure 1-B - Uncleaned summed daily meter reads by average daily temperature, showing a bifurcation of the data into two distinct groups.

List of Tables
Table 1 - Sample sizes, correlation coefficient and associated p-values for each survey element and those households' observed conservation responses during the treatment period, as measured by mean standard residuals of their regression models.
Table 1-A - 22 high/low pairs of summed electric meter reads from Fox Island, WA.
Table 1-B - List of dates excluded from training dataset due to high ratios of missing meter reads, along with the associated level of missing readings.
Table 1-C - Days excluded from training dataset due to discrepancies between household meter data and substation metered data, showing the summed consumption in kWh from each source.

Acknowledgements
I would like to thank Professor Ralph Murphy for his unflagging enthusiasm and support of this
project. His insight, advice and encouragement made this research possible.
Critical support for this project, including financial support, access to data, and staff expertise,
was provided by the Peninsula Light Company. I would like to specifically thank Mike Simpson,
Jonathan White, Kelly Keenan and Amy Grice for their efforts on behalf of this project. Gina
Ricci, with the National Rural Electric Cooperative Market Research team, lent her significant
expertise in survey design and administration, without which this project could not have
proceeded.
My colleagues at the Washington State University Extension Energy Program supported my
research every step of the way. Thanks especially to Dr. Lee Link, Jennifer Carter, Todd Currier
and David Shepherd-Gaw for the many ways they supported me personally and professionally.
Significant financial support for the project was provided by the Western Area Power
Administration, for which I am deeply grateful.
Laura Dedon-Oxford, my editor, deserves tremendous credit for her willingness to lend her
professional editing skills to my research. These words are poor payment for your hours of work
on behalf of a friend. Thank you.
Finally, I am so grateful to my wife, Dasha, and children Emily, Kara and Byron, for their patience
and understanding. I will do my best to make up for the countless late nights, missed dinners,
and working weekends.

Thesis
Thesis Statement
This research examines a natural experiment in voluntary energy conservation that occurred
during the winter of 2010 on Fox Island, Washington. Using utility billing records, NOAA
weather data and other data, a variety of statistical techniques, including regression modeling, are
applied in an effort to determine whether customers on Fox Island reduced their energy
consumption in response to utility requests for electricity conservation.
Understanding the conditions under which customers may be relied upon to reduce their energy
consumption, particularly when not provided with a financial incentive to do so, is critical in
determining whether voluntary demand side management can be a reliable resource in
tomorrow’s energy system.

Fox Island Washington Power Cable Failure
Fox Island is a small island located directly across the Tacoma Narrows from Tacoma,
Washington. As of the 2010 census, Fox Island had a population of 3,633 residents. Residents of
Fox Island are generally well educated and affluent; an estimated 34.1% of the population holds a
bachelor’s degree, and 19.1% holds a graduate degree, compared to the respective Washington
State averages of 20.1% and 11.3% [1]. At $98,420, the median household income is well above
the Washington State average of $58,890 [19].

Figure 1 - Fox Island, WA and surrounding area (left) and Pierce County, Washington (right).

Fox Island is supplied with electric power via two cables. One cable runs across the Fox Island
bridge, while the other crosses under the channel. The original cross channel cable was installed
in 1931, and was replaced in 1952 and again in 1970. The 1970 cable carried three-phase power
through three sub-cables and also included a fourth, “ground” sub-cable.
In July of 2010, one of the three phase sub-cables failed, leaving the utility in a precarious
position and without enough time to replace the cable before the upcoming winter heating season.
The Peninsula Light Company responded quickly, deploying hundreds of automated hot water
heater controls capable of reducing peak loads by turning off residents’ water heaters. They also
reached out to residents in an attempt to encourage voluntary energy conservation. The utility
also re-wired the underwater cable to bypass the damaged sub-cable, but the amount of current
the cable could now carry was much less than before, and was further limited by a desire to avoid
damaging the remaining sub-cables.
That winter passed without major incident, and Peninsula Light Company only attempted urgent
telephone outreach to customers during one particularly bad winter storm in late November. In
March of 2011 the utility began the cable replacement effort, and in June 2011 the new cable was
energized, thus resolving the crisis. The loss of the cable, conservation outreach and subsequent
replacement of the cable form the basis of a natural experiment in voluntary energy conservation.
This research examines this unplanned experiment on Fox Island, Washington. The utility call for
voluntary energy conservation is a form of demand side management. The utility hoped that,
through education and outreach, its customers would reduce their energy consumption and
prevent the need for rolling blackouts. This research intends primarily to answer two questions.
Did the residents of Fox Island, taken as a group, respond to the outreach appeals from their
utility by conserving electricity during the winter of 2010? Secondly, were there patterns in the
individual responses to the conservation appeal suggesting that residents with particular socioeconomic or behavioral traits were more likely to respond to conservation outreach appeals?
Understanding the conditions under which customers may be relied upon to reduce their energy
consumption, particularly when not provided with a financial incentive to do so, is critical in
determining whether voluntary demand side management can be a reliable resource in
tomorrow’s energy system. The answers to these questions would provide useful insight into the
creation and implementation of future demand side management programs, moving us closer to
an end-to-end energy management paradigm.

Research Effort
The power cable failure on Fox Island provides a rare natural experiment – a small
community which was subjected to an ostensibly severe and short-term disruption to their energy
supply, and was asked to perform voluntary, non-compensated energy conservation by their
member-owned electric cooperative. Adding to the conditions that combined to turn this natural
event into a feasible experiment was the existence of an extensive, daily interval meter data set
for each individual home on the island. Finally, the absence of a natural gas supply to the island
meant that, for most residents, electricity would be the primary form of heating, and so it was
reasonable to believe that temperature driven modeling could be successfully applied for the
majority of residents.
After learning that one of the two power cables which supplied Fox Island with electricity had
failed, the electric cooperative, Peninsula Light Company, engaged in a multi-pronged effort to
weather the coming winter heating season without resorting to rolling blackouts or other extreme
measures. These efforts included the deployment of hundreds of automated hot water heater load
controllers designed to be operated by the utility for purposes of moderating peak demand. Pen
Light coordinated with state emergency management officials and performed risk analyses
designed to determine at what temperature they were likely to experience a supply shortfall
emergency on the island. Additionally, Pen Light engaged in an outreach effort aimed at
encouraging Fox Island residents to cut back on their non-essential electric consumption.
No financial incentives were offered to encourage conservation. Rather, appeals were made to the
residents’ sense of community and to their self-interests – conserving energy might mean keeping
the lights on for everyone, and their friends and neighbors were likewise cutting back.
Following the resolution of the crisis and the replacement of the failed cable, Peninsula
Light Company was hailed by many organizations for its extremely rapid deployment of
automated load controllers and for its use of customer appeals to encourage conservation.
Articles written about this incident claimed that voluntary conservation measures helped Pen
Light to "keep the lights on" during the crisis.
This research seeks to address the following research hypotheses related to the Fox Island energy
crisis:
Hypothesis 1: As a group, customers on Fox Island consumed significantly less energy than
would be expected during the treatment period (winter 2010-2011).
Hypothesis 2: Household level conservation is significantly predicted by household
demographics, attitudes and beliefs.
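As a rough illustration of how Hypothesis 1 might be examined against a control group, the sketch below computes a simple difference-in-differences estimate. The function name diff_in_diff and all readings are hypothetical and invented for illustration; this is not the analysis code used in this research, and it omits the temperature normalization a real comparison would need.

```python
import numpy as np


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group's mean minus change in the control group's mean."""
    treated_change = np.mean(treated_post) - np.mean(treated_pre)
    control_change = np.mean(control_post) - np.mean(control_pre)
    return treated_change - control_change


# Invented daily kWh readings (assumption: not data from this study).
treated_pre = [100.0, 110.0, 90.0]    # treated group, before outreach
treated_post = [120.0, 130.0, 110.0]  # treated group, during outreach
control_pre = [50.0, 55.0, 45.0]      # control group, before outreach
control_post = [75.0, 80.0, 70.0]     # control group, during outreach

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"Difference-in-differences estimate: {effect:.1f} kWh/day")
```

A negative estimate would indicate that the treated group's consumption rose less (or fell more) than the control group's over the same period, which is the pattern Hypothesis 1 predicts.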

Literature Review
Evaluation, Measurement and Verification (EM&V)
In order to determine whether conservation has occurred, one must first attempt to determine
what the energy consumption would have been in the absence of the intervention. This predicted
consumption, often referred to as the “baseline,” can be estimated by several different means. For
a single piece of equipment with a consistent purpose – say, an electric motor driving a conveyor
belt in a factory – it is relatively simple to determine a baseline by simple measurement.
Buildings, however, are significantly more complex and present special challenges.
Generally speaking, the energy consumption of a building is driven by three physical parameters:
the building’s spatial layout, its insulation’s efficacy, and the weather the building is subjected to
on a day-to-day basis [3]. To these physical parameters must be added occupant behavior, as the
decisions made by homeowners greatly affect the amount of energy consumed by their
residences. A homeowner’s decisions to switch off a light, turn down a thermostat, or turn off a
computer are all short-term behaviors that affect the home’s electricity consumption.
Thankfully, behaviors are also somewhat predictable, particularly as a function of time of day,
day of week, or month of year. Generally speaking, residents follow similar patterns each day:
getting up, preparing a meal, going to work, and so on.
Having identified the factors that strongly determine energy consumption, it is possible to use
statistical modeling techniques to predict, as a function of weather, volume, insulation and time,
the energy consumption of a building. Conversely, if an individual is not in possession of details
regarding a building’s physical characteristics but does have detailed weather data as well as
energy consumption observations, that individual can estimate the energy related physical
characteristics of a home that affect its energy consumption. Once the precise relationship
between weather, a building’s physical characteristics, and energy consumption is determined,
the remaining variable of interest - in this case behavior - can be studied.
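The baseline idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the temperature-response curve, the noise level, and the 5% reduction built into the treatment period are all invented, and the helper names fit_baseline and baseline_residuals are hypothetical. It fits a polynomial to baseline-period data and then compares treatment-period observations against the model's predictions.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_baseline(temps_f, daily_kwh, degree=2):
    """Fit a polynomial temperature-response curve to baseline-period data."""
    return Polynomial.fit(temps_f, daily_kwh, deg=degree)


def baseline_residuals(model, temps_f, daily_kwh):
    """Observed minus predicted use; negative values suggest conservation."""
    return np.asarray(daily_kwh) - model(np.asarray(temps_f))


def true_curve(temp_f):
    # Invented relationship (assumption): colder days need more electric heat.
    return 60000.0 - 900.0 * temp_f + 5.0 * temp_f**2


rng = np.random.default_rng(0)

# Synthetic baseline period: daily average temperature and summed kWh.
base_temps = rng.uniform(25.0, 75.0, size=500)
base_kwh = true_curve(base_temps) + rng.normal(0.0, 1500.0, size=500)
model = fit_baseline(base_temps, base_kwh)

# Synthetic treatment period with a 5% reduction deliberately built in.
treat_temps = rng.uniform(30.0, 50.0, size=90)
treat_kwh = 0.95 * true_curve(treat_temps) + rng.normal(0.0, 1500.0, size=90)

mean_residual = baseline_residuals(model, treat_temps, treat_kwh).mean()
print(f"Mean treatment-period residual: {mean_residual:.0f} kWh/day")
```

Because the weather effect is absorbed by the fitted curve, whatever remains in the residuals is attributed to other factors, behavior among them.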
Evaluation, Measurement and Verification (EM&V or M&V) is primarily focused on first
estimating and then eliminating the confounding variables related to energy consumption that are
not impacted by the energy conservation program in question, in an effort to measure and verify
the impact of the program itself. Energy conservation programs are often expensive; their costs
must be recovered from the electric ratepayers. These costs can be justified after EM&V
demonstrates that their impacts, in terms of load shifting or reduction, have been confirmed using
robust statistical techniques.
The standard for EM&V is the International Performance Measurement and Verification Protocol
(IPMVP). This protocol, originally created by the U.S. Department of Energy, is supported by an
international governing body and is used extensively throughout the world [4]. IPMVP describes
multiple strategies for verification, listed as options A-D. Option C covers whole building
monitoring of energy consumption, and includes multiple sub-options including multivariate
regression modeling. IPMVP provides basic guidelines and references for the use of multivariate
regression modeling as an EM&V strategy. The methods used in this research are largely based
upon techniques described in Option C of the IPMVP.
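A whole-building multivariate regression of the general kind referred to above might look like the following sketch. The regressors (temperature, temperature squared, and a weekend indicator), the coefficient values, and the helper names design_matrix and fit_ols are assumptions chosen for illustration; they are not taken from the IPMVP or from the models developed in this research.

```python
import numpy as np


def design_matrix(temps_f, is_weekend):
    """Build columns: intercept, temperature, temperature squared, weekend flag."""
    t = np.asarray(temps_f, dtype=float)
    w = np.asarray(is_weekend, dtype=float)
    return np.column_stack([np.ones_like(t), t, t**2, w])


def fit_ols(X, y):
    """Ordinary least squares coefficients via numpy's least-squares solver."""
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef


rng = np.random.default_rng(1)
n_days = 1000
temps = rng.uniform(25.0, 75.0, size=n_days)
weekend = (np.arange(n_days) % 7) >= 5  # two of every seven days

# Invented "true" coefficients used only to generate synthetic kWh data.
true_coef = np.array([80.0, -1.5, 0.01, 3.0])
X = design_matrix(temps, weekend)
kwh = X @ true_coef + rng.normal(0.0, 1.0, size=n_days)

coef = fit_ols(X, kwh)
predicted = X @ coef  # adjusted baseline for the same conditions
print("Estimated coefficients:", np.round(coef, 3))
```

Adding a calendar term alongside temperature reflects the point made earlier that occupant behavior is partly predictable from the day of the week.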
Behavior and Energy Consumption
Early energy policy was created with the view that consumer demand for electricity was
relatively inelastic, since electricity is an essential good necessary to the basic conduct of modern
life [5]. Most demand side strategies, therefore, have focused on hardware-based savings which
can include both efficiency measures and direct load control measures. These hardware measures
might include replacing an electric motor, or an air conditioner, with more energy-efficient
models.
Encouraging voluntary, behavior-based conservation was seen as an unattractive option by
rate-setters and policy makers,
especially following the perceived failure of such conservation appeals by President Jimmy
Carter in 1978 [5]. Many observers believe that President Carter’s electoral defeat in the 1980
election was, at least in part, due to his famous appeals to Americans that they “turn down the
thermostat.” The image of President Carter appealing to the public while wearing a thick sweater
is a famous, and much lampooned, part of the energy conservation legacy. Policy makers came to
believe that any attempts to get Americans to give up their basic, energy-derived comforts would
result in significant backlash [5].
Price policies were used, but these policies were primarily targeted at consumers’ long-term
energy decisions. Higher electric prices might encourage a homeowner to invest in insulation, it
was believed, but consumers could not be relied upon to make daily, habitual choices to reduce
their consumption.
During the 1980s, efficiency continued to be the primary focus of policymakers, and even in this
arena, demand-side efficiency was largely relegated to the sidelines by a focus on supply side
efficiencies in generation and transmission. Throughout the history of the aggregated electric
supply model, electric utilities had been able to make significant gains in supply efficiency and
thus profitability, simply by investing in ever larger and better designed generating plants. These
significant gains in efficiency continued into the 1980s and discouraged planners from looking
for efficiency or conservation measures elsewhere. Additionally, efficiency gains in supply were
extremely easy to measure and verify, given that a precise accounting of both the fuel
consumption and the resulting electricity supply was readily available for every power plant.

Supply side efficiency was seen as more reliable than even investments in demand side efficiency
equipment such as energy efficient air conditioners or refrigerators; it was thought that the
efficiency gains of the equipment would often be subverted by consumers’ misuse of the
equipment.
The California Energy Crisis of 2001 broke the intellectual logjam which surrounded the
consumer behavior paradigm. First, customers were seen to conserve energy in the face of price
increases, demonstrating that electric demand is at least somewhat elastic and responds to price
signals. Second, after price increases were eliminated by legislative fiat, consumers were seen to
respond to public conservation appeals, showing that customers can also be responsive to
information and social appeals. Demand-side management had been proved effective on a
massive scale and for a significant period of time.
Increasingly, academics, regulatory bodies and utilities are now investigating a variety of demand
side management techniques, including both financial and non-financial measures. It is hoped that
by more fully understanding the techniques (financial, informational, or other) which can
successfully prompt conservation efforts on the part of customers, DSM can play an important
role in the future health and stability of the electric system.
Historical Overview
Following is a brief overview of historical events that have had major impacts on the
development of demand side efficiency and conservation programs. Each of these events
prompted shifts away from the widely accepted belief that demand side management could never
play an important role in the electric system.
1970s Energy Crises
Until the 1970s, the electric power sector had relied upon efficiencies of scale and ever increasing
demand for their product to provide power at decreasing cost. This fundamental model began to
unravel in the late 1960s, and by the mid-1980s it was clear that the old way of supplying
electricity would have to change.
In 1967, the Arab Oil Embargo resulted from the Six-Day War between Israel and the
surrounding Arab states. Oil exports were ended to countries perceived as being aggressors in the
conflict, including the United States and the United Kingdom. The Yom Kippur War in 1973
brought a repeat of this strategy and its accompanying oil shock. Finally, in 1979, the Iranian
Revolution was preceded by a massive strike of Iranian oil workers, which resulted in a dramatic
reduction of Iranian exports. In the midst of political turmoil in the oil producing regions, the
United States’ oil production peaked, placing additional pressure on global oil markets.

Figure 2 - Oil Prices ($/barrel) by year. Data courtesy Energy Information Administration, via Wikipedia Commons.

The 1970s crises had a dramatic and lasting impact on energy prices, as shown in Figure 2
above. In response to this series of events, the industrialized nations created a system designed to
prevent such events from disrupting the global economy in the future [6]. The United States
Department of Energy was founded in 1977 under the Carter administration, and state energy
offices were established throughout the United States [7]. These bodies’ roles included the
promotion of “least cost” planning techniques that sought to encourage efficiency programs by
utilities.

Environmental Impacts
In 1962, Rachel Carson published the book Silent Spring, documenting the severe detrimental
impact of the widespread use of pesticides. Many consider this book to be the beginning of the
popular environmentalist movement. During the 1960s and 1970s, the public became increasingly
aware of the impacts of industry and consumerism on the natural world. Incidents such as the
Torrey Canyon oil tanker spill in 1967, the 1969 Cuyahoga River Fire, pervasive Los Angeles
smog, and the Three Mile Island nuclear accident in 1979 are all notable examples of
environmental incidents which helped to raise awareness of the environmental costs of economic
prosperity and abundant energy. These factors all contributed to changes in the way that energy
was produced and consumed.
The Era of Efficiency
Beginning in the 1980s, demand side management began in earnest, and utilities around the
country began implementing significant efficiency programs. Amory Lovins coined the term
“negawatt” to describe a watt of power saved rather than generated. Utilities began least cost
planning processes, and found that efficiency programs often provided a more cost
effective means of meeting load than the construction of additional generating assets.
2001 California Energy Crisis
In 1999, California underwent a substantial reorganization of its utility regulatory apparatus.
Electric rates, traditionally set by the state’s regulatory body, instead were indexed to the average
price of the wholesale electric market [8]. The intent of this measure was to allow competitive
pressures to ultimately drive retail prices below levels seen under the previous regulatory
structure.

In 2000 and 2001, a series of factors, not least among which was intentional manipulation of the
electric market by unscrupulous corporations, combined to produce record high electric prices in
California and the entire Western Interconnection.
Initially, California customers were subject to the full increases in electric prices, which in some
instances were dramatic and rapid. In the San Diego Gas and Electric service territory, pre-reform
prices of $.10 / kWh were the norm. At the height of the crisis, prices to residential customers
were over $.23 / kWh.
Customers voluntarily responded to these price increases with a rapid reduction in energy
consumption. Researchers estimated a 13% reduction in consumption after normalizing for
weather differences. While the public was reducing their consumption in response to the price
spikes, there was widespread outrage about the situation. In September of 2000, the California
State Legislature imposed a price cap of approximately $.135 / kWh. Following the imposition of
the price cap, energy consumption rebounded by approximately 8%.
After enacting the price cap, California was faced with a crisis of a different sort. Prices were
now stable, and public outrage had been quelled, but system operators were now faced with
supply shortfalls and the potential need for electricity rationing via rolling blackouts. In an
attempt to prevent this outcome, California state agencies and utilities undertook a massive public
campaign aimed at promoting voluntary energy conservation without requiring massive price
increases. Initially this campaign was met with skepticism. Many doubted that consumers would
respond to such non-financial appeals. Ultimately, however, the public appeals proved to be
effective, and energy consumption again began to decline. The following figure shows the
normalized consumption as a response first to the price spikes, then to the public conservation
appeals.

Figure 3 - Average Within-household consumption changes during the 2000 California Energy Crisis price spike and
subsequent price cap [8].

2008 Juneau Alaska Transmission Line Loss
In April 2008, an avalanche destroyed the single transmission line connecting Juneau, Alaska to
its hydro-electric facility. Immediately, diesel generators came online to fill the shortfall in
electric supply caused by the loss of the hydro facility. Initial estimates indicated that repairs
would take a full three months, and customers were informed that they would soon be facing
increased bills as a result of the costs of diesel fuel. Electric prices rose by 500%, reaching
$.52 / kWh during the crisis [9]. Ultimately, repairs proceeded faster than anticipated, and the
supply crisis lasted only 45 days. During that period, residents of Juneau responded to the 500%
increase in prices by reducing their electric consumption by approximately 25% [9]. Residents
reported an average of 10 conservation behaviors per household, with a mix of behavioral
strategies such as thermostat changes or light management, and technical improvements such as
light-bulb replacement, appliance changes, or added insulation. Researchers also found a
persistent reduction of energy usage following the end of the crisis, with an average 8% energy
savings as compared to pre-crisis consumption. The Juneau case demonstrates that under certain
extreme circumstances, household conservation can reach very high levels, with residential
customers combining a variety of behavioral and technological strategies to great effect. What is
less clear, however, is whether such conservation efforts can be achieved through non-price
related means, or in less dire circumstances. The Fox Island energy crisis, examined in this thesis,
is one such scenario where an urgent conservation request was made by the utility, but was not
accompanied by financial incentives or price signals. This research will attempt to determine
whether residents responded to this request.

Methods
Data Sources
The main source of data for this evaluation consisted of 4,374,945 recorded daily interval meter
reads from the Fox Island population served by two feeders in the Peninsula Light Company
service territory (Artondale Feeder 2 and Artondale Feeder 6, hereafter AR2 and AR6). Each
record in this data set consisted of a location ID (associated with the specific meter), a customer
ID (associated with the customer account), a reading date, the daily usage in kilowatt-hours
(kWh), and the reading type (“Actual” or “Estimated”). This data set covered the period from
January 1, 2008 through October 18, 2012. Figure 5 below shows the summed usage by date
from the original data set. Two days fall outside the vertical axis limits of that figure: a single
day where the summed usage is negative (-179,882 kWh on December 12, 2009), and a single day with
an extremely high reading (1,905,752 kWh on November 16, 2010).

Weather data were gathered from the National Oceanic and Atmospheric Administration (NOAA)
Quality Controlled Climate dataset [10]. This dataset included hourly readings of temperature (in
degrees Fahrenheit), humidity, wind speed, and cloud cover, and covered the period from January 1,
2008 through October 18, 2012. All weather data was collected at the Tacoma Narrows Airport
station (NOAA ID 94274, Lat. 47.267, Long. -122.576, Elev. 292 ft. above sea level), located on
the peninsula directly across the channel northeast of Fox Island.
The following figure shows daily average temperature (degrees F) gathered from the Tacoma
Narrows Airport weather station, for the study period. Seasonal temperature variations are clearly
visible in the data.

Figure 4 - Daily average temperatures in degrees Fahrenheit, Tacoma Narrows Airport, 2008 - 2012.

The following figure shows the raw summed kWh for all meters in the meter data set, for the
study period from January 2008 through October 2012. The summed energy usage for the island
moves roughly inversely to the daily average temperature shown in Figure 4, indicating that low
temperatures are a strong driver of increased energy consumption on the island. Personal
communications with PenLight utility staff revealed that natural gas is not available on Fox
Island, so most homes are electrically heated in some form. An examination of Figure 5 reveals
time periods with extreme variations in usage from day to day, as well as periods where usage is
unusually low for an extended period of time. This suggested underlying problems with
the dataset that would need to be identified before modeling efforts could proceed.

Figure 5 - Raw summed daily meter reads, all meters, Fox Island, WA, 2008 - 2012.

Initial Data Assessment and Cleaning
Figure 6 shows daily summed meter data from June, 2008. Clearly visible in the data is a pattern
of abnormally low usage days followed immediately by abnormally high usage days.

Figure 6 - Summed daily meter reads, all meters, Fox Island WA, June 1, 2008 - June 30, 2008.

These paired high and low usage days can also be seen in Figure 7, showing summed kWh usage
as a function of daily average temperature. A complementary set of high and low data points are
visible arrayed around the central trend line.

Figure 7 - Scatter of uncleaned summed daily meter reads by daily average temperature Fahrenheit, Fox Island, 2008 - 2012.

In an effort to locate all of these anomalous paired high/low data points, an algorithm was
employed to select days wherein the summed usage for that day was less than half of the previous
day’s summed usage, as well as less than one fifth of the following day’s summed usage. This
simple method selected 22 low/high day pairs for a total of 44 days, as summarized in Figure 8.
Details of the selected days are provided in Appendix A.
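
The selection rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the code used for the thesis: the function name and the example totals are hypothetical, and only the two thresholds (one half and one fifth) come from the text.

```python
def find_low_high_pairs(daily_kwh):
    """Return the indices of 'low' days that start a low/high pair.

    daily_kwh is a list of summed daily usage values in date order.
    A day is flagged when its total is less than half of the previous
    day's total and less than one fifth of the following day's total.
    """
    flagged = []
    for i in range(1, len(daily_kwh) - 1):
        previous_day = daily_kwh[i - 1]
        current_day = daily_kwh[i]
        following_day = daily_kwh[i + 1]
        if current_day < 0.5 * previous_day and current_day < 0.2 * following_day:
            flagged.append(i)
    return flagged


# Hypothetical summed totals (kWh): index 2 is under-reported, index 3 catches up.
example = [100000, 98000, 12000, 185000, 101000]
low_days = find_low_high_pairs(example)
pairs = [(i, i + 1) for i in low_days]  # each low day is paired with the day after it
print(pairs)
```

Each flagged low day is paired with the day that follows it, so both members of the pair can be excluded together.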

Figure 8 - 22 anomalous high/low summed meter reading pairs identified by formula, Fox Island dataset 2008-2012.

When questioned about the observed high/low pair phenomenon, PenLight staff provided a
plausible explanation involving the design of the electric meters and the means by which they
transmit data to the utility. These meters, deployed in 2005, are designed to transmit usage data to
the utility once per day, and do so via powerline communication, transmitting their data signal
along the same circuits used to transmit power to the home. This allows digital meters to be
deployed in territories where insufficient cellular network coverage exists for wirelessly
transmitted data solutions, and avoids the cost of the utility setting up their own “mesh” network
of radio repeaters. The downside of power line communications, however, is that interruptions to
the power line network, such as a rerouting of power through a different circuit, or a loss of
power, can result in a failure of the meters to transmit their data. In such an event, the meter will
transmit at the next available opportunity, usually the next day. PenLight’s digital meters operate
much the same way as traditional electro-mechanical meters, in that what they report is an
“absolute” number of kWh used as of the time of the reading. It is in observing the differences
between the absolute reading from one day and the absolute reading from the next that the
relative, or daily, usage can be determined. Because of this, when meters missed a daily reading,
the next day’s reading was the sum of both days’ usage. This led to the particular pattern
observed in Figure 8 above.
In addition to these extreme low/high pairs, the plotting of the summed consumption as a function
of temperature revealed a pattern wherein two distinct clusters of points can be seen in the data. A
close-up view of the plot is shown in Figure 9, and the full data are shown below.

Figure 9 - Close-up of summed daily meter reads by average daily temperature, showing a bisection
of the data into two groups.

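
The cumulative-register explanation for the low/high pairs can be illustrated with a short Python sketch. It assumes that a missed transmission is stored as zero usage for that day, which the thesis does not state explicitly; the function name and register values are hypothetical.

```python
def daily_usage_from_registers(readings):
    """Convert cumulative register readings into daily usage.

    readings holds one cumulative kWh value per day, with None where the
    meter failed to transmit. The first reading must not be None.
    Assumption: a missed day is recorded as 0 kWh, and the next successful
    read is differenced against the last reading actually received, so it
    carries the usage of both days.
    """
    usage = []
    last_seen = readings[0]
    for value in readings[1:]:
        if value is None:
            usage.append(0.0)
        else:
            usage.append(value - last_seen)
            last_seen = value
    return usage


# Hypothetical register values: the third day's transmission is missed.
registers = [5000.0, 5040.0, None, 5122.0, 5161.0]
print(daily_usage_from_registers(registers))
```

In this toy series the missed day shows no usage and the following day shows roughly double the usual amount, which is the low/high signature described above.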
A suspicious “dip” in the first shoulder of the 2008 heating season suggests that the summed
usage is lower than would be expected for this part of the year. Figure 10 provides a closer look at
the raw summed kWh for the 2008-2009 heating season.

Figure 10 - Summed daily meter reads, Fox Island WA, September 2008 - April 2009. Data shows high/low reading pairs, as well as a
large area of conspicuously low meter readings in November and December of 2008.

Examination of the meter data revealed missing meter readings. Conversations with PenLight
staff indicated that these readings were also missing from the data warehouse, and could not be
retrieved. Figure 11 below shows the percentage of the total number of meter reads which are
missing from the original dataset, as a function of time.

Figure 11 - Percentage of Fox Island expected meter readings missing, by day, 2008 - 2012.

For days with few missing readings, a “derate” factor was incorporated to adjust the summed usage
for the missing meter reads, while days missing more than
15% of their expected readings were omitted from the analysis dataset. Days eliminated due
to insufficient readings are detailed in Appendix B.
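
A minimal sketch of this adjustment follows. The 15% cutoff comes from the text; the form of the derate is an assumption, since the thesis does not give the formula. Here the observed total is scaled by the ratio of expected to received reads, which treats missing meters as if they used the average of the reporting meters. All names are hypothetical.

```python
MAX_MISSING_SHARE = 0.15  # days missing more than 15% of reads are dropped


def adjust_for_missing(summed_kwh, reads_received, reads_expected):
    """Return the adjusted daily total, or None if the day should be omitted.

    Assumed derate: scale the observed total by expected / received reads.
    """
    missing_share = (reads_expected - reads_received) / reads_expected
    if missing_share > MAX_MISSING_SHARE:
        return None
    return summed_kwh * reads_expected / reads_received


# Hypothetical days: 10% missing is scaled up, 20% missing is dropped.
print(adjust_for_missing(90000.0, 90, 100))
print(adjust_for_missing(50000.0, 80, 100))
```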

Figure 12 - Summed meter reads by daily average temperature Fahrenheit, Fox Island WA, after removal of high/low
meter reading pairs and days with high levels of missing meter readings.

Comparison of Meter Data to Artondale Substation Data
A second source of summed energy consumption data was obtained from PenLight, consisting of
5-minute interval volt and amperage readings from the Artondale substation. The Artondale
Substation provides electric service to approximately 2500 individual meters, including meters
located off Fox Island across the channel. The population served by the Artondale substation is
the same as the population included in the meter data set.

The Artondale substation consumption data was compared to the metered summed energy
consumption, adjusted for the missing meter reads, from the meter data set as an additional means
of identifying potentially erroneous days. When plotted as a time-series, the differences between
the substation and the metered estimates are seen concentrated in the same period of 2008 where
most of the missing meter data is concentrated, as shown in Figure 13 below.

Figure 13 - Summed household meter readings vs. Substation metered usage, Fox Island WA, 2008 - 2012.

This figure shows both the general consistency of these readings and the existence
of relatively rare instances where the two measurements diverge, sometimes sharply. In addition
to meter data which, during parts of 2008 and 2009, is sometimes erratic and highly variable,
suggesting data errors, there is a period of time when the substation data appears to “dip” sharply
below the meter data (beginning around February of 2009).
An examination of the Substation source data shows that during this “dip” in 2009, the substation
data from Artondale Feeder 6 shows ‘0’ in the amp columns for the entire period reflected by the
dip, whereas the Artondale Feeder 2 data shows normal amp readings. This suggests that the
Substation data, too, is not without flaws and gaps.
Given that errors exist in both datasets, but with the knowledge that the datasets are independent
of one another, days where both datasets agree can be relied upon with a good degree of
confidence. A simple formula identified all days where the Substation data varied from the Meter
data by more than 10%, and a list of days that exceeded this tolerance was created and removed
from the analysis dataset. A full list of the discrepant days, as well as the associated summed
meter read and substation readings, can be found in Appendix C.
This method flagged 327 days with deviant data, and 1,134 days with congruent data. Notably, all
of the previously identified data issues (low/high pairs, high missing days) were also identified as
deviant data using this cross dataset comparison method. This provided additional assurance that
including only those data points in the final analysis dataset where the meter data and the
substation data agreed would ensure that only robust data would be used for analysis. The final
analysis dataset consisted of the individual meter readings taken from the 1,134 days where the
summed consumption of the individual meters was congruent with the summed energy
consumption observed at the Artondale Substation.
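
The cross-dataset check can be sketched as below. The 10% tolerance is taken from the text; measuring the deviation relative to the meter total (rather than the substation total) is an assumption, and the function name and example values are hypothetical.

```python
TOLERANCE = 0.10  # maximum relative deviation between the two sources


def split_congruent_days(meter_kwh, substation_kwh):
    """Split days into congruent and deviant lists.

    Both arguments are dicts mapping a date string to a summed daily kWh
    value. Only days present in both sources are compared. Assumption: the
    deviation is expressed as a fraction of the meter total.
    """
    congruent, deviant = [], []
    for day in sorted(meter_kwh):
        if day not in substation_kwh:
            continue
        meter_total = meter_kwh[day]
        if meter_total <= 0:
            deviant.append(day)  # a non-positive total cannot be trusted
            continue
        deviation = abs(substation_kwh[day] - meter_total) / meter_total
        if deviation > TOLERANCE:
            deviant.append(day)
        else:
            congruent.append(day)
    return congruent, deviant


# Hypothetical daily totals for three days.
meter = {"2009-01-05": 200000.0, "2009-01-06": 150000.0, "2009-02-10": 180000.0}
substation = {"2009-01-05": 204000.0, "2009-01-06": 190000.0, "2009-02-10": 95000.0}
congruent_days, deviant_days = split_congruent_days(meter, substation)
print(congruent_days, deviant_days)
```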

Identification of Off-Island Meters
PenLight did not have an explicit list of those residents within the overall dataset who resided on
Fox Island versus those who resided in the area directly across the channel. Both populations are
served by the Artondale substation. In order to determine which customers were located on the
island, latitude and longitude coordinates associated with each meter/account were obtained from
PenLight, and a Geographic Information Systems (GIS) analysis was performed.

This analysis determined which meters were located on the island and which were located on the
mainland. For privacy purposes, the precise locations of the meters will not be included here, but
858 of 2,476 meters were located off the island, and 1,618 of 2,476 meters were located on the
island. Using the account numbers of on- and off-island meters, new summed totals were
calculated for each day for both the on- and off-island populations. In addition, for each day a
“percentage missing” was calculated for each population, and the summed amount was adjusted
by the missing percentage for each individual population. These adjusted summed amounts are
shown in Figure 14 below for the entirety of the study period, showing that the two populations
follow similar seasonal patterns.

Figure 14 - Summed meter readings, Fox Island Treatment Group (red) vs. Off-island Cromwell control group (blue),
2008 - 2012.

Test 1 – Differences in Differences Analysis vs. Control Group
In order to assess whether the Fox Island population had restricted their overall energy
consumption during the treatment period, the off-island population previously identified using
GIS analysis was used as a control group. PenLight staff verified that only residents of Fox Island
were subject to energy conservation messaging and outreach related to the cable failure, so a
direct comparison of energy consumption between the two populations for both the non-treatment
and treatment periods was selected as a method of determining whether conservation had
occurred.
Since the size of the populations differed, some method of normalization was required in order to
directly compare the two groups. Several options exist, including sub-sampling or averaging.
Averaging was identified as a simple and effective means of directly comparing the two
populations. Figure 15 shows the on and off island averages, normalized for base load.

Figure 15 - Showing the average Fox Island energy usage and the average Cromwell energy usage, normalized to each
groups' respective baseload. This figure shows that the two groups respond in very similar ways to external
temperatures.

From this figure, it appears that the on-island population responds more strongly to low
temperature events than the off-island population. A probable explanation for this phenomenon is
the availability of natural gas to the off-island population, while no natural gas service is available
to residents on Fox Island. As a result, it is expected that Fox Island residents would be more
reliant on electric heating either in the form of resistance electric heaters or heat pumps,
compared to their off-island neighbors.
This difference between populations implies that a direct, unadjusted comparison using the
off-island population as a control group would not be appropriate, as the differences between the
two populations appear to be exacerbated by temperature extremes. Figure 16 shows the difference in
average energy consumption between the on and off island populations, as a function of
temperature. Appropriately, the shape of the response is very similar to the overall
energy/temperature relationship.

[Chart: Differences in kWh by temperature, Fox Island vs. Cromwell Control Group. x-axis:
Temperature, degrees F (0 to 90); y-axis: Difference in kWh, On Island minus Off Island (0 to 140,000)]

Figure 16 - Differences in kWh by average external temperature, Fox Island vs. Cromwell control group. This figure
shows that there is a consistent relationship between the respective groups' temperature response curves.

In order to better understand the relationship between the two populations’ energy use as a
function of temperature, a quartic regression line was fit to the mean difference in energy
consumption between the two populations, as a function of temperature. With this fit, 92% of the
variation in the difference in kWh between populations is explained by variations in temperature
(R² = .92). The following figure shows the differences and
the line of best fit.

Figure 17 - Differences in kWh, Fox Island vs. Cromwell control group by external temperature, with a fitted quartic
regression line.

The formula for this regression was used to predict the differences in energy usage between the
on-island and off-island populations during the non-treatment period from 2008 through
December 2011, but excluding the defined treatment period of November 2010 through February
2011. Figure 18 compares the predicted differences against the observed differences during the
non-treatment period, and shows that the regression equation accurately predicts the differences
between the two populations.

Figure 18 - Non-treatment predicted differences versus observed differences between Fox Island and Cromwell control
groups, normalized by quartic regression.

Once the predicted differences and the observed differences were calculated for the non-treatment
period, the same prediction equation was applied to the treatment period. The predicted
differences were compared to the observed differences to determine whether, throughout the
treatment period or for specific sub-sections of the treatment period, the observed differences
were less than the predicted differences, as would be expected if the on-island population were
engaged in conservation behavior. The findings of this analysis are described in the results
section under “Results of Difference in Differences Analysis (Pg. 41).”
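
The comparison of predicted and observed differences can be sketched as follows. The coefficients of the fitted difference curve are not reported in this section, so the values below are placeholders chosen only to make the arithmetic easy to follow; the function names and the example days are likewise hypothetical.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial with Horner's method (highest power first)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def mean_shortfall(days, coeffs):
    """Average of (predicted difference - observed difference).

    days is a list of (avg_temp_f, on_island_kwh, off_island_kwh) tuples.
    A positive result means the on-island group used less, relative to the
    control group, than the temperature-based prediction implies.
    """
    shortfalls = []
    for temp_f, on_kwh, off_kwh in days:
        predicted = poly_eval(coeffs, temp_f)
        observed = on_kwh - off_kwh
        shortfalls.append(predicted - observed)
    return sum(shortfalls) / len(shortfalls)


# Placeholder coefficients for x^4, x^3, x^2, x, and the constant term.
HYPOTHETICAL_COEFFS = [0.0, 0.0, 10.0, -1500.0, 80000.0]

# Hypothetical treatment-period days: (temperature F, on-island kWh, off-island kWh).
treatment_days = [(40.0, 90000.0, 56000.0), (30.0, 110000.0, 67000.0)]
print(mean_shortfall(treatment_days, HYPOTHETICAL_COEFFS))
```

A shortfall near zero, or negative, would be consistent with no measurable conservation relative to the control group.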

Test 2 – Multivariate Regression Modeling
Electric demand varies over time, responding to millions of actions by individual consumers,
businesses and manufacturers going about their daily business. Every time yesterday’s leftover
meal is warmed up in a microwave, or a thermostat activates an air conditioner, or an industrial
lathe comes up to speed in preparation for cutting a piece of metal, some electric generator
attached to the grid must respond by increasing production slightly. Hundreds of such generators
respond to tiny incremental changes in load via automated “governors” which speed up
or slow down the generators in order to maintain the delicate balance of supply and demand on
the electric system. Oversupply of electricity leads to voltage surges, blown circuits and
dangerous fire hazards. Undersupply causes voltage drops, failing equipment or brownouts.
These automated governors, responding to moment-to-moment changes, are sufficient for
small-scale changes. For larger changes in load, natural gas turbines must be throttled up or down,
and entire generators must be brought online or taken offline with the changing grid conditions.
When looking at the aggregated loads of millions of households and thousands of businesses,
electric load follows certain fairly predictable patterns. Since space heating and cooling is such a
dominant end-use for electricity, the major sources of month-to-month variation in electric
demand are driven by seasonal changes, which in turn are driven by local climatological
conditions. The Pacific Northwest’s space heating demands far outstrip its space cooling
demands, thus most PNW utilities are “winter peaking,” meaning that their highest loads are seen
during the coldest winter months. Below is a chart showing several years of daily average
temperatures for the Tacoma Narrows Airport. Shown on this graph is a line at 72 degrees
Fahrenheit.

Figure 19 - Frequency distribution of average daily temperatures observed at the Tacoma Narrows Airport, 2008 - 2012.

The average temperature does not adequately capture hourly variability, as some days with an
average temperature below 72 degrees Fahrenheit might have peak temperatures in the 80s or
90s. However it is clear that for the vast majority of the time, the Puget Sound region experiences
cool temperatures and many spaces require frequent heating.
Model Selection
At the summed level, energy usage responds strongly to temperature in a curvilinear fashion,
increasing sharply as temperatures drop, decreasing as temperatures approach comfort levels
around 70 degrees Fahrenheit, then increasing again, but at a lower rate, under high temperature
conditions. A quadratic fit line explains nearly 90% of the variations in energy consumption
(R² = .893). The fit line and equation are shown in the figure below.

[Chart: Estimated Total Consumption by Daily Temperature, Quadratic Fit. x-axis: Average Daily
Temperature, Fahrenheit (0 to 90); y-axis: Estimated Total kWh (0 to 300,000).
Fit line: $y = 111.57x^2 - 14979x + 578267$, $R^2 = 0.8926$]

Figure 20 - Summed daily meter reads by average daily temperature, with quadratic fit line.

A quadratic equation, however, may not be the best option for describing system temperature
response. At the extreme high and low ends of the temperature spectrum, it is not expected that
energy consumption would increase indefinitely. Instead, the curve will eventually bend down as
individual Heating, Ventilating and Air Conditioning (HVAC) units reach their maximum
capacities under the extreme temperature conditions. Thus one would expect to see energy
consumption plateaus at both ends of the temperature spectrum, with the plateau on the high
temperature side being relatively lower than on the low temperature side, reflecting the lower
total proportion of homes with electric cooling capabilities than those with electric heating
capabilities. A fourth order polynomial, or a quartic polynomial, would provide such a shape.
When a quartic polynomial is fitted to the data, it successfully explains just over 90% of the
variation in energy consumption (R² = .901). The following figure shows a quartic line fitted to the
consumption by temperature data.

[Chart: Estimated Total Consumption by Daily Temperature, Quartic Fit. x-axis: Average Daily
Temperature, Fahrenheit (0 to 90); y-axis: Estimated Total kWh (0 to 300,000).
Fit line: $y = -0.0707x^4 + 16.339x^3 - 1255.6x^2 + 34037x - 56411$, $R^2 = 0.9014$]

Figure 21 - Summed daily meter reads by average daily temperature, with quartic fit line.
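
As a rough check on how the two fit lines behave, the printed equations from Figures 20 and 21 can be evaluated directly. The sketch below uses the coefficients exactly as printed on the charts; because those coefficients are rounded, the outputs are approximations of the fitted curves rather than exact model predictions. The function and variable names are illustrative.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial with Horner's method (highest power first)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# Coefficients as printed on the fit lines in Figures 20 and 21.
QUADRATIC = [111.57, -14979.0, 578267.0]
QUARTIC = [-0.0707, 16.339, -1255.6, 34037.0, -56411.0]

# Temperature at which the quadratic curve reaches its minimum: -b / (2a).
quadratic_minimum_f = 14979.0 / (2 * 111.57)
print("Quadratic minimum near", round(quadratic_minimum_f, 1), "degrees F")

# Compare the two curves at a few daily average temperatures.
for temp_f in (30, 50, 70):
    quad_kwh = poly_eval(QUADRATIC, temp_f)
    quart_kwh = poly_eval(QUARTIC, temp_f)
    print(temp_f, round(quad_kwh), round(quart_kwh))
```

The quadratic reaches its minimum at roughly 67 degrees Fahrenheit, in line with the comfort range near 70 degrees described above.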

After exploratory analysis was performed on the total dataset, above, the same evaluation was
performed on a “training dataset,” a subset of the total dataset that excluded
both non-congruent meter/substation days and all days from November 1,
2010 through February 28, 2011, the period during which PenLight performed conservation
outreach to its Fox Island residents. The resulting training dataset consisted of 881 days of
temperature and energy consumption data. Quadratic and quartic regressions were performed on
this dataset, and the results are presented below.

Quadratic Regression:
lm(formula = EstimatedTotalConsumption ~ SelectDryBulbF + I(SelectDryBulbF^2),
    data = Training_Data, na.action = na.exclude)

Residuals:
   Min     1Q Median     3Q    Max
-33233  -6807  -1016   5715  35442

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)          592784.30    9058.79    65.4   <2e-16 ***
SelectDryBulbF       -15641.35     355.29   -44.0   <2e-16 ***
I(SelectDryBulbF^2)     118.47       3.42    34.6   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10400 on 877 degrees of freedom
Multiple R-squared: 0.91, Adjusted R-squared: 0.91
F-statistic: 4.43e+03 on 2 and 877 DF, p-value: <2e-16

Quartic Regression:
lm(formula = EstimatedTotalConsumption ~ SelectDryBulbF + I(SelectDryBulbF^2) +
    I(SelectDryBulbF^3) + I(SelectDryBulbF^4), data = Training_Data,
    na.action = na.exclude)

Residuals:
   Min     1Q Median     3Q    Max
-35557  -5844   -330   5214  34890

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)           7.66e+04   9.16e+04    0.84  0.40313
SelectDryBulbF        2.33e+04   7.58e+03    3.07  0.00219 **
I(SelectDryBulbF^2)  -9.42e+02   2.30e+02   -4.09  4.6e-05 ***
I(SelectDryBulbF^3)   1.24e+01   3.03e+00    4.07  5.1e-05 ***
I(SelectDryBulbF^4)  -5.20e-02   1.47e-02   -3.54  0.00042 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10000 on 875 degrees of freedom
Multiple R-squared: 0.917, Adjusted R-squared: 0.916
F-statistic: 2.4e+03 on 4 and 875 DF, p-value: <2e-16

The majority of variation is explained well by variations in average daily temperature, with a
slight increase in goodness of fit shown by the quartic model.
Additional variability in the energy consumption might be explained by other measurable factors
such as wind speed, cloud cover, length of daylight, day of the week, month of the year, or
holidays. Data for wind speed and cloud cover were taken from the NOAA weather dataset from the
Tacoma Narrows station. The variable “AvgWind” is the daily average wind speed from the
dataset. Cloud cover in the NOAA dataset can consist of a variety of different cloud cover
categories, as well as a “Clear” category. To simplify the analysis, all non-clear sky observations
were grouped together as “Not Clear” observations. The PcntClr variable is the ratio of “Clear” to
“Not Clear” readings for a given day, with a high value corresponding with cloudless
conditions for the majority of the day. Sunlight duration is another potentially impactful variable
on energy consumption. The length of daylight hours varies considerably in the northern
latitudes. The shortest day lasts for approximately 500 minutes between sunrise and sunset, and
the longest lasts approximately 950 minutes. The length of daylight may affect the amount of
lighting energy used for businesses and at homes, and daylight is also a source of passive heat
gain which, in combination with cloud cover, may affect HVAC energy consumption. Energy
consumption may also vary in predictable ways based upon the day of the week and the month of
the year, or on holidays.
In order to examine the appropriateness of each of these variables, stepwise regression was
performed in an effort to minimize Akaike’s Information Criterion (AIC) and select an
optimal model. A brief overview of AIC and its applications in model selection is provided in
Appendix F.
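
To give a feel for how AIC trades fit against complexity, the sketch below computes an approximate least-squares AIC from the residual standard errors and degrees of freedom reported in the regression summaries in this section. It uses the common form n ln(RSS/n) + 2k, which omits additive constants and so will not match the values a statistics package reports; the inputs are rounded figures from the printed output, so the results are indicative only. The function name is hypothetical.

```python
import math


def aic_from_fit(residual_std_error, residual_df, n_coefficients):
    """Approximate least-squares AIC, up to an additive constant.

    The residual sum of squares is recovered as sigma^2 * residual_df, and
    the sample size as residual_df + n_coefficients.
    """
    n = residual_df + n_coefficients
    rss = residual_std_error ** 2 * residual_df
    return n * math.log(rss / n) + 2 * n_coefficients


# Residual standard error, degrees of freedom, and coefficient count
# (including the intercept) as reported for each model in this section.
quadratic_aic = aic_from_fit(10400, 877, 3)
quartic_aic = aic_from_fit(10000, 875, 5)
final_aic = aic_from_fit(5670, 855, 25)
print(round(quadratic_aic), round(quartic_aic), round(final_aic))
```

Lower values are preferred. Under these approximations the multivariate model scores lowest despite its 25 coefficients, because the reduction in residual error outweighs the complexity penalty.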
A backwards stepwise regression was performed on the training dataset to determine whether the
removal of the non-temperature variables served to decrease the overall AIC for the model. The
details of each iteration of the model are included as Appendix E. The details of the final selected
model are shown below.
lm(formula = EstimatedTotalConsumption ~ I(SelectDryBulbF^2) +
    I(SelectDryBulbF^3) + I(SelectDryBulbF^4) + AvgWind + PcntClr +
    SunlightDur + WeekDayFactor + MonthFactor + HolidayFactor,
    data = Training_Data, na.action = na.exclude)

Residuals:
   Min     1Q Median     3Q    Max
-22549  -3617   -374   3359  22976

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)           4.11e+05   8.11e+03   50.69  < 2e-16 ***
I(SelectDryBulbF^2)  -2.66e+02   1.40e+01  -18.96  < 2e-16 ***
I(SelectDryBulbF^3)   4.67e+00   3.53e-01   13.24  < 2e-16 ***
I(SelectDryBulbF^4)  -2.20e-02   2.41e-03   -9.12  < 2e-16 ***
AvgWind               5.66e+02   6.06e+01    9.35  < 2e-16 ***
PcntClr               2.96e+03   8.76e+02    3.38  0.00076 ***
SunlightDur          -1.05e+02   8.38e+00  -12.55  < 2e-16 ***
WeekDayFactor2       -4.25e+03   7.66e+02   -5.54  4.0e-08 ***
WeekDayFactor3       -5.70e+03   7.33e+02   -7.78  2.0e-14 ***
WeekDayFactor4       -5.26e+03   7.28e+02   -7.23  1.1e-12 ***
WeekDayFactor5       -4.72e+03   7.33e+02   -6.44  2.0e-10 ***
WeekDayFactor6       -5.37e+03   7.29e+02   -7.36  4.4e-13 ***
WeekDayFactor7       -2.57e+03   7.27e+02   -3.53  0.00043 ***
MonthFactor2         -1.12e+02   1.27e+03   -0.09  0.92922
MonthFactor3          2.79e+03   1.78e+03    1.57  0.11675
MonthFactor4          3.62e+03   2.51e+03    1.44  0.14908
MonthFactor5          5.06e+03   3.16e+03    1.60  0.10987
MonthFactor6          6.53e+03   3.52e+03    1.85  0.06398 .
MonthFactor7          4.29e+03   3.33e+03    1.29  0.19798
MonthFactor8         -2.60e+03   2.79e+03   -0.93  0.35126
MonthFactor9         -1.23e+04   2.11e+03   -5.82  8.2e-09 ***
MonthFactor10        -1.26e+04   1.40e+03   -8.95  < 2e-16 ***
MonthFactor11        -8.55e+03   1.07e+03   -8.03  3.3e-15 ***
MonthFactor12        -8.28e+02   1.11e+03   -0.75  0.45386
HolidayFactorWORKDAY -3.42e+03   1.31e+03   -2.61  0.00920 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5670 on 855 degrees of freedom
Multiple R-squared: 0.974, Adjusted R-squared: 0.973
F-statistic: 1.33e+03 on 24 and 855 DF, p-value: <2e-16
Model Validation

As shown above, the quartic model provides the best fit for the data, as well as minimizing the
AIC, despite its additional model complexity. In both cases, the Daylight Savings variable is
rejected. The adjusted R2 from the final quartic model is .973, but this number is likely subject to
training optimism. In order to determine a more reasonable standard error and R2, a cross fold
validation was performed.
Cross fold validation attempts to eliminate training optimism in the estimation of total model
error[11]. The training dataset is randomly sorted into an arbitrary number of groups, in this case
five. For the first “fold” a single section of the training data is held back, and the model is created
using the data from the other four sections. This model is then used to predict the data values
from the fifth section that was held out while the model was generated. This constitutes the first
“fold,” and the process is repeated four more times. For each iteration, a new section of data is
first held out, and then predicted using the model generated from the other four data sections. At
the end of this process, the residual standard errors and R2 found from each fold are averaged, and
this is used as a good approximation of the true model predictive error for out of sample
predictions. Figure 22 below shows the results of each of the five cross folds for the final
regression model.

Figure 22 - Results of 5-fold cross fold validation holdout exercise, showing each fold's model predictions vs. that fold's
observed values.
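
The hold-out procedure described above can be sketched in a few lines. This is a minimal illustration only: it assumes a single-predictor least-squares line instead of the thesis's multi-variable quartic model, the function names `fit_line` and `k_fold_r2` are hypothetical, the toy data are invented, and the R2 is pooled across all held-out predictions rather than averaged per fold.

```python
import random

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def k_fold_r2(xs, ys, k=5, seed=0):
    """Pooled out-of-sample R^2 from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # random assignment to folds
    folds = [idx[i::k] for i in range(k)]     # k roughly equal groups
    ss_res = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the other k-1 folds, then predict the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        ss_res += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Invented toy data: a straight line with alternating +/-1 noise.
xs = [float(i) for i in range(50)]
ys = [3.0 + 2.0 * x + (1.0 if i % 2 else -1.0) for i, x in enumerate(xs)]
cv_r2 = k_fold_r2(xs, ys)
```

Because every observation is predicted exactly once by a model that never saw it, the resulting R2 is not inflated by training optimism.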

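After folding and recording five times, the total out of sample R2 was then calculated from the
results of the cross fold, using the following formula:

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

where SSres is the sum of squared differences between the observed values and the out-of-sample
predictions, and SStot is the total sum of squares of the observed values about their mean.
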
The results of the cross fold analysis for the quartic model are:

SStot       SSres       R2
1.05E+12    2.95E+10    0.972
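
As a quick check of the arithmetic, assuming the standard definition of R2 as one minus the ratio of the residual to the total sum of squares, the reported sums of squares reproduce the reported value:

$$R^2 = 1 - \frac{2.95\times10^{10}}{1.05\times10^{12}} \approx 1 - 0.0281 \approx 0.972$$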

The post cross fold R2 of .972 is only slightly less than the original estimate of .973. This result
provides strong confidence that the model will successfully predict the majority of variation in
out-of-sample energy consumption. The ANOVA table for the final model follows:
Analysis of Variance Table
Response: EstimatedTotalConsumption
                      Df    Sum Sq   Mean Sq   F value   Pr(>F)
SelectDryBulbF         1  8.27e+11  8.27e+11  25746.84  < 2e-16 ***
I(SelectDryBulbF^2)    1  1.29e+11  1.29e+11   4028.58  < 2e-16 ***
I(SelectDryBulbF^3)    1  5.62e+09  5.62e+09    174.87  < 2e-16 ***
I(SelectDryBulbF^4)    1  1.26e+09  1.26e+09     39.08  6.4e-10 ***
AvgWind                1  1.32e+09  1.32e+09     41.20  2.3e-10 ***
PcntClr                1  1.44e+09  1.44e+09     44.67  4.2e-11 ***
SunlightDur            1  4.01e+10  4.01e+10   1246.98  < 2e-16 ***
WeekDayFactor          6  3.42e+09  5.70e+08     17.73  < 2e-16 ***
MonthFactor           11  1.38e+10  1.26e+09     39.17  < 2e-16 ***
HolidayFactor          1  2.16e+08  2.16e+08      6.72   0.0097 **
Residuals            854  2.74e+10  3.21e+07
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Energy Consumption Model Test for Energy Conservation

Using the previously validated Energy Consumption Model, which had been generated using only
non-treatment period data, predictions were generated for the ‘out of sample’ treatment period of
November 2010 through February 2011. These predictions were compared to observed metered
energy consumption for the on-island population, in order to detect a difference between the
predictions and the observations that would indicate energy conservation behavior. Energy
conservation behavior would be detectable as energy consumption observations during the
treatment period that were significantly lower than the model predicted consumption. The
findings from this analysis are described in the results section under “Energy Model Evaluation of
Treatment Period Energy Consumption (Pg. 44).”
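
The comparison described above amounts to inspecting the treatment-period residuals (observed minus predicted). A minimal sketch follows, assuming invented daily values; the function name `treatment_summary` and the two-standard-error threshold are illustrative choices, not the thesis's procedure. The residual standard error of 5670 is the value reported for the final model.

```python
def treatment_summary(observed, predicted, resid_se):
    """Summarize observed-minus-predicted consumption for the treatment period."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_resid = sum(residuals) / len(residuals)
    # Days on which consumption fell well below the prediction
    # (more than two residual standard errors under it).
    days_well_below = sum(1 for r in residuals if r < -2 * resid_se)
    return {
        "mean_residual": mean_resid,
        "mean_in_se_units": mean_resid / resid_se,
        "days_well_below": days_well_below,
    }

# Hypothetical daily totals, for illustration only.
observed = [412000.0, 398500.0, 405250.0, 377000.0]
predicted = [405000.0, 401000.0, 399000.0, 390000.0]
summary = treatment_summary(observed, predicted, resid_se=5670.0)
```

A conservation signal would appear as a clearly negative mean residual or as a run of days well below the prediction.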

Test 3 – Household Regression Modeling & Household Survey
Creation of Household-level Regression Models
In order to determine whether households responded to conservation appeals, individual
regression models were created for each of the metered homes on the island using training data
from the non-treatment period. A single model formula was used for all of the homes,
but the regression coefficients and the treatment period model predictions were calculated for each
of the homes individually. In order to normalize for the native variability in home energy
consumption, the differences between predicted and observed consumption were expressed in units of
the standard deviation of the residuals from that home's training model. In this
way, the predictive power of the base model is accounted for in assessing how extreme the
difference between the predictions and the observed results is, and treatment residuals become
comparable across homes.
First, the Summed Energy Consumption Model was applied to each of the 1618 on-island meters,
and the adjusted R2 values of the models using both the Quadratic and Quartic regression
formulas were compared. The following figure shows the frequency distribution of adjusted R
squared for each model.

Figure 23 - Frequency distribution of R-squared values of 300 household level regression models. Figure shows
distributions for models using both quadratic and quartic temperature variables.

The improvements in overall predictive power for the quartic fit support the previously validated
quartic model as the best choice for the household-level analysis. Having decided on the quartic
model, the predictions for each of the 300 individually generated multiple regression models were
generated and the residuals from the predicted consumption and the observed consumption during
the treatment period were calculated. In order to adjust for the various sizes among the homes, as
well as for the varying predictive power found in the models, each residual was divided by the
standard deviation of the residuals from that home’s pre-treatment model to generate out of
sample standard residuals. This allows for each treatment residual to be compared to each other
treatment residual from other homes’ models, while at the same time accounting for the baseline
variability of that home’s model.
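
The normalization described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the inputs are invented, and the sample standard deviation is used as the scale, which the text does not specify.

```python
from statistics import mean, stdev

def standard_residuals(train_obs, train_pred, treat_obs, treat_pred):
    """Treatment-period residuals in units of the home's pre-treatment residual SD."""
    train_resid = [o - p for o, p in zip(train_obs, train_pred)]
    scale = stdev(train_resid)  # baseline variability of this home's model
    return [(o - p) / scale for o, p in zip(treat_obs, treat_pred)]

def mean_standard_residual(train_obs, train_pred, treat_obs, treat_pred):
    """Average standard residual for one home over the treatment period."""
    return mean(standard_residuals(train_obs, train_pred, treat_obs, treat_pred))

# Hypothetical home: training residuals are -1, 1, -1, 1 (SD about 1.155).
home = mean_standard_residual(
    train_obs=[10.0, 12.0, 9.0, 11.0],
    train_pred=[11.0, 11.0, 10.0, 10.0],
    treat_obs=[8.0, 9.0],
    treat_pred=[10.0, 10.0],
)
```

A negative value indicates that the home used less energy than its own model predicted, scaled so that a large home with a noisy model and a small home with a tight model can be compared directly.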

Gathering of Survey Data on Household Attitudes, Beliefs and Characteristics
In addition to interest in the efficacy of the PenLight outreach program on the general population
of Fox Island, this research hoped to identify specific factors that predicted higher or lower levels
of conservation response on a household level. To this end, a survey was designed and
administered to a random sample of 300 residential electric customers on Fox Island. This survey,
in conjunction with data gathered from a third-party vendor, was designed to assess the key
household characteristics that were believed to be predictive of customers’ willingness to
conserve energy when asked to do so by the utility company. The survey questions fell into four
general categories:
-Power Sharing Program: Knowledge and attitudes about the “Power Sharing” remote hot water
heater load controller program;
-Cable Failure: Knowledge and attitudes about the partial failure of the underwater power cable,
as well as knowledge of and self-reported response to the utility’s voluntary energy conservation
requests;
-Energy Conservation Motivations: Including self-reported conservation efforts, beliefs about
neighbors’ conservation efforts, and willingness or intent to conserve energy in the future;
-Demographic and Segmentation Data: Including household size, income, ages of occupants,
and highest levels of academic achievement.
Demographics data was purchased from Acxiom Corporation, and the sampling protocol included
segmentation by age in order to provide an age representative sample. Calls were attempted to
604 households, with 300 completed responses, 250 declined, and 54 disconnected or wrong
numbers. The sample results have a margin of error of +/- 5% (CI=95%).
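
The stated margin of error can be approximated with the usual formula for a proportion. The sketch below is a worked check only: it assumes the worst-case proportion p = 0.5, a 95% z-value of 1.96, and, for the finite population correction, that the sampling frame is roughly the 1618 on-island meters mentioned earlier, which may not match the actual residential customer list. The function name `margin_of_error` is hypothetical.

```python
from math import sqrt

def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Margin of error for a sample proportion, with optional finite population correction."""
    moe = z * sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: sampling a large share of a small
        # population narrows the margin.
        moe *= sqrt((population - n) / (population - 1))
    return moe

simple = margin_of_error(300)                      # about 0.057
corrected = margin_of_error(300, population=1618)  # about 0.051
```

Under these assumptions the corrected figure rounds to the reported +/- 5%.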
One of the basic research hypotheses is that responses to voluntary conservation outreach efforts
will be significantly predicted by one or more of these variables: attitude, knowledge, behavior,
belief or demographics. This is of interest even if, at the summed level, Fox Island residents did
not significantly reduce their energy consumption during the treatment period. Even if the
majority of residents did not conserve energy, it is possible that some residents did, and if those
responsive residents are predictable from their survey results, this has important implications in
the design of future conservation outreach efforts.
Comparison of Survey Results with Individual Regression Model Results

After generating standard residuals for each of the 300 sampled homes, the average standard
residual for each home during the treatment period was calculated. For a given home, a low
average standard residual suggests that home had lower than expected energy consumption, while
accounting for the inherent error of that home’s model. Conversely, a high average standard
residual indicates that a home used more energy than expected during the treatment period, while
accounting for the model error.
Pearson’s Correlation Coefficients were calculated, comparing survey responses against the
average standard residuals to detect consistent trends. Pearson’s coefficient is suitable for both
continuous data and binary data, and so was applied for each of the survey questions. For each
pairing of survey question and average standard residual, a coefficient and p-value were calculated.
The results of this correlation analysis are described in the section “Test 3 Results – Survey
Response and Individual Regression Model Analysis” (Pg. 46).
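
The correlation step can be sketched with the standard library alone. The function names are hypothetical and the data are invented. The p-value here uses a normal approximation to the t distribution, which is an assumption made to avoid external dependencies; a statistics package would use the exact t distribution with n - 2 degrees of freedom and return slightly larger p-values.

```python
from math import sqrt
from statistics import NormalDist

def pearson_r(xs, ys):
    """Pearson's correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic for testing r = 0 with n paired observations."""
    return r * sqrt((n - 2) / (1 - r * r))

def approx_two_sided_p(r, n):
    """Two-sided p-value using a normal approximation (reasonable for large n)."""
    return 2 * (1 - NormalDist().cdf(abs(t_from_r(r, n))))

# Invented example: survey rating vs. average standard residual.
r_toy = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])

# Reported values for the "future generations" item: r = 0.20, n = 110.
t_future = t_from_r(0.20, 110)
p_future = approx_two_sided_p(0.20, 110)
```

For r = 0.20 and n = 110 this gives a t statistic of about 2.12 and an approximate p-value between 0.03 and 0.04, in line with the 0.04 reported in Table 1.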

Results
Test 1 Results – Difference in Differences Comparison with Off-island Control Group
To complete the Difference in Differences (DiD) analysis, the on-island/off-island energy
consumption differences were regressed against the average daily temperature in degrees F, and
the following prediction equation was calculated:

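$$\mathrm{EnergyUse}_{\mathrm{OnIsland}} - \mathrm{EnergyUse}_{\mathrm{OffIsland}} = -0.028\,\mathrm{TempF}^4 + 6.41\,\mathrm{TempF}^3 - 427.9\,\mathrm{TempF}^2 + 10840\,\mathrm{TempF} + 58519$$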
The predicted differences were subtracted from the observed differences in order to determine
whether, during the treatment period, the on-island population consumed less energy than
predicted by the off-island control group’s consumption.
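
The subtraction described above can be sketched as follows. The coefficients are taken from the prediction equation, on the assumption that the 10840 term is the linear temperature term of a quartic polynomial; the function names and the example inputs are hypothetical.

```python
# Coefficients ordered from the constant term up to TempF^4.
# Assumption: 10840 multiplies TempF to the first power.
COEFFS = (58519.0, 10840.0, -427.9, 6.41, -0.028)

def predicted_difference(temp_f):
    """Predicted on-island minus off-island consumption at a given average temperature."""
    return sum(c * temp_f ** k for k, c in enumerate(COEFFS))

def did_residuals(on_island, off_island, temps_f):
    """Observed difference minus the temperature-predicted difference, per day."""
    return [
        (on - off) - predicted_difference(t)
        for on, off, t in zip(on_island, off_island, temps_f)
    ]

# Hypothetical single day at 50 degrees F.
resid = did_residuals([200000.0], [40000.0], [50.0])
```

A negative residual means the island used less energy, relative to the control group, than the temperature-based relationship predicts; a positive residual means it used more.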
Figure 24 shows a line plot of the full set of differences between on-island and off-island
consumption for the pre-treatment and treatment periods, with the treatment period shown in red.
This plot shows that, contrary to the research hypothesis, the DiD for the treatment period was not
significantly lower than the DiD from the non-treatment period. Instead, the DiD
suggests higher on-island energy consumption, as compared to the off-island control, during the
treatment period than during any other period.

Figure 24 - Fox Island daily energy use minus Cromwell control group daily energy use, 2008 - 2012. The pre-crisis
period is shown in blue, and the treatment period with utility outreach is shown in red.

The average differences during the treatment period are higher than those for the same period in
other years, so a test for a statistically significant reduction in energy consumption during the
treatment period was unnecessary. The DiD analysis failed to reject the Null Hypothesis HO1 that
there was no significant reduction in energy usage during the treatment period.
In order to determine whether there had been any short-term reductions in energy consumption
within the treatment period, the predicted differences calculated by the above prediction equation
and the observed differences were compared on a daily basis for the treatment period only. The
following plot shows the predicted differences between on-island and off-island populations, as
well as the observed differences.

Figure 25 - Results of Differences in Differences test for the winter 2010 treatment period, using the Cromwell off-island control group, normalized via quartic regression. The test shows that the observed differences are higher than
the predictions.

Figure 25 shows that, despite the utility’s outreach efforts, at no point did island residents consume
less energy relative to the off-island group than was predicted by the differences regression
model. In fact, beginning in early December, the on-island residents appeared to consume more
energy than would be expected as compared to the off-island control group.
For both the overall treatment period, as well as within the treatment period, comparison of the
on-island treatment group to the off-island control group does not reveal any reductions in energy
consumption during the treatment period.1

1 Though see discussion of loss of power on November 22nd, 2010, the evening of the sole live telephone
outreach to Fox Island residents.

Test 2 Results – Energy Model Evaluation of Treatment Period Energy Consumption
In addition to the Difference in Differences approach, a regression modeling approach was
used in order to test whether the on-island population had significantly reduced energy
consumption during the treatment period. Figure 26 shows the regression model “predicted” as
well as the meter dataset “observed” on-island kWh consumption, for both the non-treatment and
treatment periods. Energy conservation efforts would be detected as energy consumption
observations that are significantly less than those predicted by the regression model, during the
treatment period.

Figure 26 - Observed and modeled daily energy consumption for the Fox Island population before, during and
after the winter 2010 treatment period.

Figure 26 shows that, during the non-treatment period, observed consumption generally aligns
well with predicted consumption. For the non-treatment period used to generate the model, the
Energy Consumption model successfully predicts over 97% of the variation in observed energy
consumption (F(25,854)=1259, p<.0001, Adj.R2 = .9728). The out-of-sample predictions and
observations are shown in Figure 26 in green and purple, respectively.
As shown in Figure 27 (Observations minus Energy Consumption Model Predictions), generally
the observed consumption is greater than that predicted by the Energy Consumption model,
suggesting that energy consumption was higher (or at least not significantly lower) during the
treatment period.

Figure 27 - Daily residuals between modeled energy consumption and observed energy consumption for Fox Island
population before, during and after treatment period.

The Multivariate Regression Energy Consumption Model Analysis failed to reject the Null
Hypothesis that the on-island population did not use significantly less energy during the treatment
period. Further, at no period within the treatment was any large reduction in energy consumption
observed.2

Test 3 Results – Survey Response and Individual Regression Model Analysis
Survey questions were designed to evoke responses that might be predictive of a household’s
willingness to conserve energy when asked to do so by their utility. Some answers to survey
questions served to group respondents into binary sections (for instance, Participants in the
Utility’s Power Sharing Program versus non-participants). Other questions were designed to elicit
a response on a scale of 1-5 (for example, a question asking respondents to rate how the Peninsula
Light Company handled the power cable outage). Finally, some demographics items grouped
households into age or income categories that might have as many as a dozen discrete factor
levels. Table 1 below shows the individual questions, the correlation of survey responses to that
question with observed household energy conservation levels, and the level of significance
associated with that correlation. Note that in many cases the sample size is far less than the 300
households surveyed. Following protocols established by the Northwest Energy Efficiency
Alliance (NEEA), households with low performing regression models (R-squared <.50) were
omitted from the samples. Also, households which declined to respond to the particular survey
question are omitted from the samples for that question. As shown below, weak positive and
negative associations between survey items and energy conservation are observed for several
items; however, significant correlations are observed for only two items. Respondents who
indicated that their primary reason for conserving energy was “for future generations” used
higher levels of energy than their peers. On the other hand, residents of Fox Island who expressed
a strongly favorable opinion of the Peninsula Light Company’s handling of the cable crisis
consumed significantly less energy than their peers.
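
The two exclusion rules described above (low-performing models and declined questions) explain why the per-question sample sizes fall below 300. A minimal sketch of that filtering, assuming a hypothetical record layout and an invented set of households; only the 0.50 R-squared cutoff comes from the text:

```python
def usable_sample(households, question, min_r2=0.50):
    """Households kept for one survey question's correlation."""
    return [
        h for h in households
        if h["r2"] >= min_r2                          # drop models with R-squared < .50
        and h["answers"].get(question) is not None    # drop non-responses to this item
    ]

# Invented records, for illustration only.
households = [
    {"id": "A", "r2": 0.82, "answers": {"CableHandlingQ7": 5}},
    {"id": "B", "r2": 0.41, "answers": {"CableHandlingQ7": 3}},
    {"id": "C", "r2": 0.67, "answers": {"CableHandlingQ7": None}},
    {"id": "D", "r2": 0.50, "answers": {"CableHandlingQ7": 2}},
]
kept_ids = [h["id"] for h in usable_sample(households, "CableHandlingQ7")]
```

Because the second rule is applied per question, the sample size differs from item to item, as the SampleSize column of Table 1 shows.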

2 Though see discussion of loss of power on November 22nd, 2010, the evening of the sole live telephone
outreach to Fox Island residents.

Table 1 - Sample sizes, correlation coefficient and associated p-values for each survey element and those households'
observed conservation responses during the treatment period, as measured by mean standard residuals of their
regression models.

Short Question Text | Category | Variable | SampleSize | Correlation | pValue
Education | Demographics | acx_educ | 186 | 0.07 | 0.36
Income | Demographics | acx_income | 186 | -0.05 | 0.47
Marital status | Demographics | acx_marital | 186 | 0.05 | 0.54
Length of residence | Demographics | acx_resten | 186 | 0.06 | 0.40
Home square footage | Demographics | acx_sqft | 142 | 0.13 | 0.11
Home year built | Demographics | acx_yrbuilt | 142 | -0.06 | 0.50
Age of Interviewee | Demographics | AgeQ17 | 183 | -0.01 | 0.87
Average PreTreatment Usage | Demographics | AveragePreTreatUse | 186 | 0.04 | 0.63
Gender of Interviewee | Demographics | GenderQ18 | 186 | 0.00 | 0.97
Age of Oldest in Home | Demographics | OldestAge | 186 | 0.11 | 0.15
Age of Youngest in Home | Demographics | YoungestAge | 186 | -0.02 | 0.81
Plans to reduce consumption in future | Future Plans | PlanToReduceQ14 | 175 | -0.05 | 0.51
Adjusted Thermostat | Actions Taken | AdjustThermostat | 186 | 0.03 | 0.73
Turned off appliances and computers | Actions Taken | AppliancesAndComputers | 186 | -0.03 | 0.66
Changed water sprinklers | Actions Taken | ChangedSprinklers | 186 | 0.02 | 0.77
Delayed running of appliances | Actions Taken | DelayAppliances | 186 | 0.07 | 0.37
Turned off lights and fans | Actions Taken | LightsAndFans | 186 | 0.01 | 0.91
Reduced electric loads | Actions Taken | LoadControl | 186 | 0.07 | 0.33
Number of actions taken | Actions Taken | MeasuresTotal | 186 | 0.05 | 0.52
Purchased a new heat pump | Actions Taken | NewHeatPump | 186 | -0.04 | 0.62
Purchased a new water heater | Actions Taken | NewWaterTank | 186 | 0.05 | 0.51
Didn't use electric blanket | Actions Taken | NoElectricBlanket | 186 | -0.06 | 0.42
Reduced hot water consumption | Actions Taken | ReduceHotWater | 186 | 0.05 | 0.52
Purchased energy efficient appliances | Actions Taken | ReplaceAppliances | 186 | 0.05 | 0.51
To keep my bills low | Motivations | BillsLowQ15c | 110 | -0.07 | 0.47
To help the Fox Island community | Motivations | CommunityQ15d | 106 | 0.10 | 0.29
To keep community bills low | Motivations | ComPricesLowQ15b | 111 | 0.03 | 0.74
My friends and neighbors are conserving | Motivations | FriendsAndNeighbQ15f | 90 | 0.01 | 0.96
For future generations | Motivations | FutureGenerQ15e | 110 | 0.20 | 0.04
To protect the environment | Motivations | ProtectEnviroQ15a | 110 | 0.01 | 0.92
Opinion of PenLight's handling of cable incident | Opinion | CableHandlingQ7 | 92 | -0.29 | 0.00
Aware of the loss of the cable | Participation | CableAwareQ6 | 184 | 0.07 | 0.33
Aware that they were asked to conserve | Participation | CutBackAwareQ8 | 166 | 0.07 | 0.40
Self-estimated conservation achievement | Participation | EstimatedConservedQ11 | 72 | -0.08 | 0.48
Took steps to permanently reduce consumption | Participation | PermanentReduceQ13 | 183 | -0.09 | 0.23
Participated in the Power Sharing program | Participation | PowerShareQ1 | 186 | -0.04 | 0.57
Tried to conserve in response to the cable incident | Participation | TriedToConserveQ10 | 101 | -0.03 | 0.74

Discussion
Summary of Attempts to Detect Aggregate Level Energy Conservation
To answer the first question, “did residents conserve?” two methods were used in an effort to
detect conservation. The first method took advantage of the existence of a population living
directly across the channel from Fox Island in the Cromwell area. This off-island population,
which was demographically and geographically comparable to Fox Island’s and was not
subjected to conservation appeals, was used as a control group. A Difference in Differences
approach was used in an attempt to identify a conservation signal on the part of the Fox Island
population. Before a Difference in Differences analysis could be performed, a regression relating
the pre-treatment average energy consumption of the Fox Island population and the Cromwell
population was performed, and a consistent relationship was discovered. This relationship
allowed for the two populations to be approximately normalized against each other, and then
compared directly. The Difference in Differences approach showed that, contrary to the research
hypothesis, the Fox Island population consumed on average more electricity during the treatment
period than would have been expected if they had followed historic consumption patterns, as
controlled for by the Cromwell population.
In addition to comparison against the control group, a multivariate regression model was
constructed, using the pre-treatment dataset for Fox Island residents as a training dataset. This
model, once completed and refined, provided excellent predictive power for the
aggregated consumption of Fox Island residents, successfully predicting over 97% of the
observed variation in consumption. In an effort to validate the model, a five-way cross fold was
performed, sequentially withholding five different randomly selected subsets of the training data,
then predicting each “hold out” data set from a model specified from the remaining 4/5 of the data.
Using the cross fold validation method, the revised R-squared value remained substantively
unchanged, and the adjusted estimate of the R-squared value for out-of-sample predictions was
still slightly over .97.
Having validated a regression model for the prediction of summed energy usage, the predicted
energy consumption was compared to the observed consumption during the treatment period. A
conservation signal should present as observed values that are substantively less than the model
predictive values. Instead, regression modelling showed observed consumption in excess of the
model predicted values for nearly all of the treatment period.

Summary of Attempts to Detect Patterns in Individual Level Conservation
After identifying the 300 homes that responded to the telephone survey, the regression model,
which had previously been specified at the aggregate level, was applied to each home
individually. The models’ formulae were therefore identical from home to home, while the
variable coefficients and model errors differed. Energy consumption levels also varied
widely from home to home during the pre-treatment period. In order to normalize between homes
to allow for meaningful comparisons, standard residuals were calculated for the treatment period
for each home. The average standard residuals for each home were then correlated with the
responses to the telephone survey in an effort to detect survey responses that were predictive of
lower or higher standard residuals among the homes.
This method did not reveal statistically significant relationships between most of the survey
variables and the homes’ average standard residuals, with the exception of two variables. The first
question, in which respondents were asked to rate Peninsula Light’s efforts to address the cable
failure, showed a negative and statistically significant relationship with average standard
residuals. In other words, residents who responded with more positive feelings about Pen Light’s
handling of the cable failure crisis consumed, on average, less electricity than those with less
positive feelings. The second relationship emerged from a question asking respondents to rate
reasons why they would conserve electricity in the future. Those residents who indicated that
conserving resources for future generations was a strong motivator for future conservation
generally used more electricity during the treatment period than their peers who responded less
positively to this motivation for future conservation efforts.

Urgent Telephone Conservation Appeal, November 22, 2010
In addition to the general outreach performed by Pen Light to encourage conservation efforts, a
single automated “robo-dialer” telephone outreach was performed during the afternoon of
November 22nd, 2010, appealing to residents to reduce their energy consumption, especially
during peak hours, because extremely cold overnight temperatures placed the
system in jeopardy of exceeding the remaining cable capacity. Unfortunately, physical damage
associated with the storm event caused loss of power for the entire island, a loss that lasted
throughout the night and well into the next day. Because of this loss of power, any conservation
efforts that Fox Island residents might otherwise have undertaken that night were not possible. It is possible
that, had electric service been available throughout this winter storm, a short-term response to the
urgent telephone outreach would have been detectable at the aggregate or individual household
levels. Unfortunately, since telephone outreach was only performed once during the treatment
period, it is impossible to determine whether telephone outreach could have been effective.

Possible Reasons for Lack of Conservation Finding
While it may go without saying that the old adage “absence of evidence is not evidence of
absence” holds true, it seems appropriate to emphasize this point here. This research effort did not
reveal, generally, a sustained response to utility conservation appeals, but this does not mean that
such a response was not occurring. It may also be that residents of Fox Island responded to
Peninsula Light Company appeals in a more sophisticated manner than this author anticipated.
The script for the November 22nd telephone outreach shows that Pen Light clearly called for
reduction or delay of the consumption of electricity during peak usage hours. It is possible that,
during the course of town hall meetings and in outreach materials, residents took away
instructions not to conserve energy overall, but to specifically limit electricity consumption
during “peak hours.” This might have led not to a decrease in overall consumption, but merely a
shift of consumption from peak to off-peak hours. While it would be theoretically possible to
investigate the proposition that energy consumption during peak hours was reduced, doing this
through examination of sub-station records would be very difficult. Since Peninsula Light
Company was also actively using water heater load controls for load shifting during the treatment
period, this confounding factor would need to be extricated from any voluntary load shifting.
Answering that question is therefore beyond the scope of this research effort.

Implications for Future Conservation Program Development
While it is possible that individual households were conserving energy during the cable crisis,
community level conservation was not detectable either through comparison against an off-island
control group or through regression modeling. Also, while significant correlations were
discovered between two of the survey items and regression modeled household behavior, the lack
of connections across multiple survey items does reduce confidence in the ability to predict
individual conservation efforts from such questionnaires.
Alternatively, the lack of response could be due to insufficient or inconsistently provided
information by the utility to customers to encourage their conservation. Telephone outreach was
only performed once to residents, and due to a power outage it was impossible to determine
whether this outreach was effective. It is possible that more frequent or aggressive outreach
would have resulted in more conservation.
Ultimately, the utility may have appropriately judged the amount of effort that was required for
this situation. While it did not appear that voluntary conservation occurred in significant amounts,
the utility did not exceed the cable capacity or have to resort to rolling blackouts. Thus the
observation that the amount of outreach may have been insufficient to prompt conservation
should not be taken as a criticism, per se. The amount of outreach undoubtedly would have
increased if the utility found itself frequently approaching the limits of the damaged cable.
A major aspect of this research was the creation of an automated process for creating household
level regression models. This process was successfully applied to 300 individual homes’ meter
data, and could potentially be applied to a much larger population of homes. Regression models
allow for the estimation of relationships between a home and external weather conditions, and a
potential application of mass household modeling would be to identify homes which would be
likely candidates for home energy efficiency upgrades.
Peninsula Light Company and thousands of other electric utilities are deploying digital metering
devices which will result in a data influx of monumental scale. This data represents both a
challenge and an incredible opportunity to leverage machine learning and predictive modeling for
demand side management, demand prediction, and conservation project verification.

Works Cited
[1] W. Leighty and A. Meier, “Accelerated electricity conservation in Juneau, Alaska:
A study of household activities that reduced demand 25%,” Energy Policy, vol. 39,
no. 5, pp. 2299–2309, 2011.
[2] NREL, “Solar Power and the Electric Grid.” U.S. Department of Energy, National
Renewable Energy Laboratory, SunShot Initiative, 2010.
[3] U.S. DOE, “Benefits of Demand Response in Electricity Markets and
Recommendations for Achieving Them.” Lawrence Berkeley National Laboratory,
Feb-2006.
[4] DOE, “2010 Smart Grid System Report - Report to Congress, February 2012,”
United States Department of Energy, Feb. 2012.
[5] A. Faruqui and J. Palmer, “Dynamic Pricing and its Discontents,” Regulation,
vol. 34, no. 3, p. 16, Nov. 2011.
[6] R. T. A. Croson, “Theories of commitment, altruism and reciprocity: Evidence from
linear public goods games,” Econ. Inq., vol. 45, no. 2, pp. 199–216, 2007.
[7] H. Sarak and A. Satman, “The degree-day method to estimate the residential heating
natural gas consumption in Turkey: a case study,” Energy, vol. 28, no. 9, pp. 929–
939, 2003.
[8] EVO, “International Performance Measurement and Verification Protocol.”
Efficiency Valuation Organization, Jan-2012.
[9] CEC, “Public Interest Energy Strategies Report,” California Energy Commission,
100-03-012D, Aug. 2003.
[10] TomTheHand, “File:Oil Prices 1861 2007.svg - Wikipedia, the free encyclopedia.”
[Online]. Available: http://en.wikipedia.org/wiki/File:Oil_Prices_1861_2007.svg.
[Accessed: 25-Nov-2013].
[11] D. Yergin, “Ensuring Energy Security | Foreign Affairs.” [Online]. Available:
http://www.foreignaffairs.com/articles/61510/daniel-yergin/ensuring-energysecurity. [Accessed: 25-Nov-2013].
[12] DOE, “A BRIEF HISTORY OF THE DEPARTMENT OF ENERGY | Department
of Energy.” [Online]. Available: http://energy.gov/node/%20362173. [Accessed: 25Nov-2013].
[13] P. C. Reiss and M. W. White, “What changes energy consumption? Prices and
public pressures,” RAND J. Econ., vol. 39, no. 3, pp. 636–663, 2008.
[14] USCB, “American FactFinder - Results.” [Online]. Available:
http://factfinder2.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=
ACS_11_5YR_DP03. [Accessed: 25-Nov-2013].
[15] “fox island wa - Google Maps,” Fox Island Map. [Online]. Available:
https://maps.google.com/maps?oe=utf-8&client=firefoxa&q=fox+island+wa&ie=UTF8&hq=&hnear=0x5491abf3505897b9:0x7a531c28d0494abe,Fox+Island,+WA&gl=
us&ei=w66SUtK1GMqliQKHo4DACQ&ved=0CLcBELYD. [Accessed: 25-Nov2013].
[16] NOAA, “Quality Controlled Local Climatological Data (QCLCD) | National
Climatic Data Center (NCDC),” Quality Controlled Local Climatological Data.
53

[17]
[18]

[19]
[20]
[21]

[22]

[Online]. Available: http://www.ncdc.noaa.gov/data-access/land-based-stationdata/land-based-datasets/quality-controlled-local-climatological-data-qclcd.
[Accessed: 03-Nov-2013].
“Puget Sound Energy Service Area fact sheet - 1213_service_area_map.pdf.” .
D. Posada and T. R. Buckley, “Model selection and model averaging in
phylogenetics: advantages of Akaike information criterion and Bayesian approaches
over likelihood ratio tests,” Syst. Biol., vol. 53, no. 5, pp. 793–808, 2004.
S. Fortmann-Roe, “Understanding the Bias-Variance Tradeoff.” Jun-2012.
S. Fortmann-Roe, “Accurately Measuring Model Prediction Error.” May-2012.
H. Bozdogan, “Model selection and Akaike’s Information Criterion (AIC): The
general theory and its analytical extensions,” Psychometrika, vol. 52, no. 3, pp.
345–370, Sep. 1987.
LBNL, “Estimated U.S. Energy Use in 2012.” Lawrence Berkeley National
Laboratory, May-2013.

Appendices
Appendix A – High/Low Usage Day Pairs
Table 1-A - 22 high/low pairs of summed electric meter reads from Fox Island, WA.

Date      SummedUsage   Date      SummedUsage   Date      SummedUsage
20080601         6526   20080811       141405   20081108        12112
20080602       174687   20080817        15640   20081109       178351
20080608         6283   20080818       153208   20081116        17653
20080609       199927   20080824        16919   20081117       210922
20080615        14747   20080825       147958   20081206        19509
20080616       156590   20080831        16287   20081207       146000
20080622        14417   20080901       155738   20081214        28223
20080623       194644   20080907        13357   20081215       300357
20080629        15921   20080908       152033   20090103         7373
20080630       162363   20080914        17060   20090104       290981
20080706        13872   20080915       152133   20090207        13407
20080707       145887   20081011         4929   20090208       252196
20080803        14069   20081012       224042   20090221         1241
20080804       144645   20081019        16231   20090222       217522
20080810        14629   20081020       187866
The table lists 22 pairs of days, each consisting of one abnormally low usage day followed by one
abnormally high usage day. These pairs were selected via formula and removed from the analysis
data set. They are believed to be instances where usage for a substantial portion of the island’s
meters was not read on the first day, and on the following day the readings “caught up,” showing
higher than normal consumption.
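
The selection formula itself is not given. As a purely illustrative sketch, one rule that would pick out such pairs flags any day whose island-wide total falls well below a typical day and is immediately followed by a day well above it. The function name, the use of the median as the "typical" level, and the 0.5 and 1.5 cutoffs are all assumptions made for this example, not the rule used in the analysis. The two June 2008 totals come from Table 1-A; the surrounding days are hypothetical.

```python
from statistics import median


def find_catch_up_pairs(daily, low_frac=0.5, high_frac=1.5):
    """Return (low_day, high_day) pairs from chronologically ordered daily totals.

    daily: list of (date, summed_usage) tuples, one entry per consecutive day.
    low_frac / high_frac: hypothetical cutoffs relative to the median day.
    """
    typical = median(usage for _, usage in daily)
    pairs = []
    # Walk over each day together with the day that follows it.
    for (day1, use1), (day2, use2) in zip(daily, daily[1:]):
        if use1 < low_frac * typical and use2 > high_frac * typical:
            pairs.append((day1, day2))
    return pairs


daily = [
    ("20080530", 90000),   # hypothetical ordinary day
    ("20080531", 92000),   # hypothetical ordinary day
    ("20080601", 6526),    # Table 1-A: abnormally low
    ("20080602", 174687),  # Table 1-A: abnormally high
    ("20080603", 91000),   # hypothetical ordinary day
]
pairs = find_catch_up_pairs(daily)
print(pairs)  # [('20080601', '20080602')]
```

A rule of this shape removes both days of a pair, since the high day's total absorbs consumption that belongs to the low day.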

Appendix B – Reading Days with High Levels of Missing Meter Data
Table 1-B - List of dates excluded from training dataset due to high ratios of missing meter reads, along with the
associated level of missing readings.

Days Removed From Analysis for Missing Readings
Date      Missing   Date      Missing   Date      Missing   Date      Missing
20081020      17%   20090116      21%   20090613      16%   20090821      16%
20081021      20%   20090117      20%   20090614      16%   20090822      16%
20081022      22%   20090118      19%   20090615      16%   20090823      16%
20081023      23%   20090119      17%   20090616      16%   20090824      16%
20081024      23%   20090120      16%   20090617      16%   20090825      16%
20081025      23%   20090311      15%   20090618      16%   20090826      16%
20081026      23%   20090411      15%   20090619      16%   20090827      16%
20081027      23%   20090412      15%   20090620      16%   20090828      16%
20081028      23%   20090413      15%   20090621      16%   20090829      16%
20081029      23%   20090414      15%   20090622      16%   20090830      16%
20081030      22%   20090415      15%   20090623      16%   20090831      16%
20081031      22%   20090416      15%   20090624      16%   20090901      16%
20081101      22%   20090417      15%   20090625      16%   20090902      16%
20081102      22%   20090418      15%   20090626      16%   20090903      16%
20081103      22%   20090419      15%   20090627      16%   20090904      16%
20081104      21%   20090420      15%   20090628      16%   20090905      16%
20081105      19%   20090421      15%   20090629      16%   20090906      16%
20081106      17%   20090422      15%   20090630      16%   20090907      16%
20081119      27%   20090423      15%   20090701      16%   20090908      17%
20081120      38%   20090424      15%   20090702      16%   20090909      17%
20081121      45%   20090425      15%   20090703      16%   20090910      17%
20081122      44%   20090426      15%   20090704      16%   20090911      17%
20081123      52%   20090427      15%   20090705      16%   20090912      17%
20081124      52%   20090428      15%   20090706      16%   20090913      17%
20081125      51%   20090429      15%   20090707      16%   20090914      17%
20081126      51%   20090430      15%   20090708      16%   20090915      17%
20081127      51%   20090501      15%   20090709      16%   20090916      17%
20081128      51%   20090502      15%   20090710      16%   20090917      17%
20081129      51%   20090503      15%   20090711      17%   20090918      17%
20081130      51%   20090504      15%   20090712      16%   20090919      17%
20081201      52%   20090505      15%   20090713      16%   20090920      17%
20081202      52%   20090506      15%   20090714      16%   20090921      17%
20081203      51%   20090507      15%   20090715      16%   20090922      17%
20081204      46%   20090508      15%   20090716      16%   20090923      17%
20081205      44%   20090509      15%   20090717      16%   20090924      17%
20081206      43%   20090510      15%   20090718      16%   20090925      17%
20081207      42%   20090511      16%   20090719      16%   20090926      17%
20081208      40%   20090512      15%   20090720      16%   20090927      17%
20081209      32%   20090513      15%   20090721      16%   20090928      17%
20081210      27%   20090514      15%   20090722      16%   20090929      17%
20081211      22%   20090515      15%   20090723      16%   20090930      17%
20081212      19%   20090516      15%   20090724      16%   20091001      17%
20081213      19%   20090517      15%   20090725      16%   20091002      17%
20081214      19%   20090518      16%   20090726      16%   20091003      17%
20081215      18%   20090519      16%   20090727      16%   20091004      17%
20081216      18%   20090520      16%   20090728      16%   20091005      17%
20081217      18%   20090521      16%   20090729      16%   20091006      17%
20081218      18%   20090522      16%   20090730      16%   20091007      17%
20081219      17%   20090523      16%   20090731      16%   20091008      17%
20081220      17%   20090524      16%   20090801      16%   20091009      17%
20081221      17%   20090525      16%   20090802      16%   20091010      17%
20081222      16%   20090526      16%   20090803      16%   20091011      17%
20081223      15%   20090527      16%   20090804      16%   20091012      17%
20081224      15%   20090528      16%   20090805      16%   20091013      17%
20081225      15%   20090529      16%   20090806      16%   20091014      17%
20081226      15%   20090530      16%   20090807      16%   20091015      17%
20081227      15%   20090531      16%   20090808      16%   20091016      17%
20081228      15%   20090601      16%   20090809      16%   20091017      17%
20090105      18%   20090602      16%   20090810      16%   20091018      17%
20090106      22%   20090603      16%   20090811      17%   20091019      17%
20090107      24%   20090604      16%   20090812      16%   20091020      17%
20090108      24%   20090605      16%   20090813      16%   20091021      18%
20090109      23%   20090606      16%   20090814      16%   20091022      16%
20090110      23%   20090607      16%   20090815      16%   20091023      16%
20090111      24%   20090608      16%   20090816      16%   20091024      16%
20090112      23%   20090609      16%   20090817      16%   20091025      16%
20090113      23%   20090610      16%   20090818      16%   20100101      70%
20090114      22%   20090611      16%   20090819      16%   20101218     100%
20090115      22%   20090612      16%   20090820      16%   20101231      30%

Dates and percentage of expected meter reads missing from PenLight dataset.

Figure 1-B28 - Uncleaned summed daily meter reads by average daily temperature, showing a bifurcation of the data
into two distinct groups.

Appendix C – Data Discrepancies, Meter Usage versus Substation Measured Usage
Table 1-C - Days excluded from training dataset due to discrepancies between household meter data and substation
metered data, showing the summed consumption in kWh from each source.
Date       MeterkWh  Subst.kWh   Date       MeterkWh  Subst.kWh   Date       MeterkWh  Subst.kWh
20080128   124727.9   194451.8   20090216   140103.1    72679.1   20090711    72935.8    83876.3
20080129   269816.2   191705.4   20090217   127015.3    65626.6   20090712    74110.1    83377.1
20080201   136764.1   176791.5   20090218   132269.4    67924.6   20090714    70260.2    77961.2
20080202   230342.1   183995.9   20090219   130813.8    67572.8   20090715    72275.1    80174.7
20080301   107498.1   139903.9   20090220   138499.9    71144.1   20090718    75751.5    84027.6
20080302   185617.0   146928.4   20090221     1451.8    68564.5   20090719    76839.5    85143.5
20080527    85340.9    62030.6   20090222   254597.3    62388.8   20090720    75656.6    83406.0
20080529    82968.5    64285.3   20090223   116506.0    60172.0   20090722    72634.9    80788.6
20080530    89094.0    73007.8   20090224   121528.3    63173.8   20090723    69629.4    77693.3
20080601     6657.6    95165.9   20090225   137321.4    70404.5   20090724    72984.1    80288.4
20080602   178282.9    90683.8   20090226   153570.5    78260.1   20090725    80617.1    89301.2
20080608     6417.6   103410.6   20090227   135460.1    68871.8   20090804    72370.5    80122.5
20080609   204547.7   106507.0   20090228   134617.3    68770.6   20090805    70794.4    78574.8
20080615    15038.2    88872.3   20090301   124591.4    64660.3   20090806    70191.7    78282.9
20080616   159813.3    87147.0   20090302   109133.5    55716.6   20090807    70479.9    77857.6
20080622    14695.6    84707.5   20090303   114940.5    58856.9   20090808    72036.9    79302.2
20080623   198732.5    85190.9   20090304   122840.7    63664.5   20090809    73591.1    82414.7
20080624   136378.7    82140.6   20090305   130197.8    66586.5   20090810   131493.5    79059.9
20080629    16215.3    97427.5   20090306   140494.6    71222.9   20090811   127961.4    77306.0
20080630   165637.0    88661.1   20090307   136023.7    72071.4   20090812    68895.1    77103.3
20080706    14134.3    83649.5   20090308   155690.8    81389.9   20090814    70128.7    77158.7
20080707   148645.2    82449.5   20090309   152816.3    93364.0   20090815    71018.8    78585.0
20080716    77073.7    85647.0   20090310   147509.9   107365.8   20090816    73502.6    81755.0
20080717    77084.1    85647.0   20090311   149553.0   110472.4   20090823    74279.1    81854.9
20080729    93761.3    80796.4   20090312   136815.7   101019.0   20090825   130708.6    78872.7
20080803    14364.5    81753.2   20090315   258380.2    80918.6   20090828    71629.0    78957.0
20080804   147927.0    82286.6   20090316   146355.9    76633.2   20090829    72784.7    80591.0
20080810    14924.0    82819.8   20090317   134636.2    72442.7   20090830    73457.1    82896.6
20080811   144912.4    80713.6   20090318   122968.5    66245.5   20090831    71837.2    79823.9
20080817    15981.7    91187.2   20090319   115454.9    61229.4   20090901    73908.2    81932.4
20080818   156555.1    82412.2   20090320   112761.2    95617.0   20090902    72209.3    80637.8
20080824    17267.3    83365.9   20090321   136624.6   155411.9   20090903    69427.6    77227.1
20080825   151128.1    79358.8   20090322   128739.5   143721.8   20090904    68667.8    77052.4
20080829    78076.3    62870.7   20090323   138907.2   158179.5   20090905    71077.4    79556.1
20080831    16594.9    82938.0   20090324   127717.8   145266.9   20090906    76566.6    84799.0
20080901   158682.5    85783.0   20090325   126721.1   143796.2   20090907    78307.4    87805.5
20080907    13609.5    84737.4   20090326   127921.3   143508.4   20090908    72286.4    80572.5
20080908   154907.4    80984.4   20090327   120311.9   133121.8   20090909    69728.3    77412.5
20080914    17389.7    85109.4   20090328   138624.6   159029.8   20090910    68627.2    77259.4
20080915   155073.1    81618.8   20090329   126925.5   142006.5   20090911    69643.3    79147.5
20081011     5057.5   122180.8   20090330   128044.5   143181.6   20090912    73203.0    82140.2
20081012   228936.6   109369.8   20090331   113941.3   128705.9   20090913    75851.6    84393.5
20081019    18784.6   124110.9   20090401   141019.1   160580.7   20090914    70013.2    79654.6
20081020   226517.4   116369.4   20090402   120824.6   133100.3   20090915    69899.2    78758.7
20081108    13488.6    98414.8   20090405   100266.2   111996.0   20090916    68894.8    77976.3
20081109   197999.2   108667.1   20090407    84464.6    95084.9   20090917    68082.5    77720.9
20081116    18789.9   126445.3   20090408    96729.2   109161.1   20090918    69443.2    77762.4
20081117   224217.7   117783.9   20090409    99382.2   110474.7   20090919    70441.5    79159.1
20081120   123412.1   136585.4   20090410   107089.5   118496.0   20090920    76333.0    84567.7
20081121   120488.4   135847.8   20090411   109922.0   123851.4   20090921    72819.7    81239.8
20081122   121979.0   137889.2   20090412   112996.3   128514.7   20090922    71710.9    79657.4
20081123   123667.8   144057.2   20090413   123225.2   139340.7   20090923    69407.2    78609.6
20081124   130474.8   147053.5   20090414   117953.1   134267.9   20090924    69478.1    77671.9
20081125   124735.4   146408.5   20090415   110784.4   123784.1   20090925    71375.5    79687.0
20081126   124765.5   143610.9   20090416   101608.4   113128.6   20090926    74723.3    82281.4
20081127   136190.1   157227.5   20090419    86759.1    96884.4   20090927    77925.7    86961.8
20081128   113222.6   132366.3   20090420    76036.4    84846.4   20090929    83705.6    93079.1
20081206    34010.4   145731.4   20090421    72086.5    81951.8   20091004    91363.7   100993.9
20081207   253455.2   140439.1   20090426   101105.7   112232.7   20091008    87259.6    96426.6
20081209   204347.0   148518.4   20090427    89864.6    98971.9   20091112   304601.2   133623.9
20081213   162116.0   180901.6   20090428    93468.1   103947.6   20091213    20194.0   217212.1
20081214    34756.5   226255.4   20090429    90218.6   101210.5   20100613    80277.3    54111.8
20081215   368424.0   248484.4   20090430    88704.8    99792.6   20100713   179965.8    70124.4
20081216   198583.5   228615.2   20090501    78795.2    88406.9   20100721    76253.2    49448.7
20081217   197078.1   229737.8   20090502    82078.1    91522.8   20100722    81154.2    54128.5
20081218   198031.4   228261.4   20090503    86684.7    96749.7   20100906    90903.0    76807.0
20081219   211613.9   247484.7   20090504    91208.3   102291.6   20100922    88640.5    79184.0
20081220   223450.6   264286.7   20090505    93754.6   103221.5   20100930    82636.0    72253.4
20081224   189580.3   218003.0   20090506   100279.1   110940.4   20101001    81725.9    71078.7
20081225   177460.2   204334.6   20090507    89991.2   100480.4   20101002    81161.5    70201.1
20081226   177840.5   205551.8   20090512    95757.9   107222.5   20101003    89949.6    78337.1
20081227   157876.1   175495.3   20090513   108047.8   119328.3   20101004    92871.5    81206.2
20081229   157716.0   177191.5   20090514    97319.6   107478.4   20101006    90793.2    79507.3
20081230   160483.3   183005.0   20090517    74821.4    82352.4   20101007    89947.3    79123.7
20081231   159760.1   183656.8   20090519    82006.2    93133.6   20101008   180452.8    74359.1
20090101   155028.3   173441.0   20090524    75205.0    83419.3   20101009    87653.4    77204.9
20090102   162073.0   186223.5   20090526    71150.9    79008.5   20101010    92535.3    81308.2
20090103     8513.1   194992.2   20090527    72847.7    81460.4   20101011    95828.2    85020.0
20090104   340898.8   212423.5   20090528    71521.3    79412.4   20101012    96716.0    85337.9
20090106   132218.4   147888.7   20090529    71121.9    78449.5   20101013    97327.0    85840.6
20090107   120124.2   133757.9   20090530    72578.3    80906.9   20101014    97906.0    86325.7
20090114   145905.9   163139.1   20090531    75987.9    83678.9   20101015   102213.0    90810.7
20090115   147593.4   164632.0   20090601    73098.4    80813.5   20101016   121295.0   100755.7
20090116   155289.1   174878.8   20090602    73744.3    81825.0   20101017   119863.0   105983.3
20090117   159570.3   180773.3   20090603    78572.9    86690.8   20101018   112291.3    98877.2
20090120   169424.0   186390.5   20090606    71645.2    79624.2   20101021    95716.0    79135.3
20090121   168453.8   188115.8   20090607    74058.0    83605.6   20101114   108827.4   122606.6
20090122   158609.5   176853.5   20090608    71994.2    79382.5   20101115   108814.4    81190.1
20090123   162113.0   186616.5   20090609    70863.3    78863.4   20101116  1908061.1    88874.8
20090124   164311.6   182005.9   20090610    70208.3    78148.8   20101123   313754.2   203502.7
20090125   173388.2   202059.8   20090611    66874.5    77701.1   20101218    67641.3   175831.7
20090126   176562.2   200234.9   20090614    74882.2    83574.4   20110104   237742.7   211969.3
20090127   170945.5   196315.4   20090616    69236.1    76929.0   20110616   131598.8    85011.3
20090128   139600.3   154188.3   20090617    70260.9    77579.1   20110617   128741.7    82104.4
20090129   143075.2   158526.7   20090618    68816.0    76571.8   20110808    76775.7    68647.9
20090201   157132.1   177839.5   20090620    73663.4    81397.5   20110809    76933.2    66413.5
20090202   134088.4   148546.6   20090621    76486.0    84696.2   20110828    76446.9    85637.5
20090204   135864.7    71908.3   20090624    69450.1    77395.5   20110829    76405.0    31706.6
20090205   140867.3    73066.5   20090625    69516.8    76878.0   20110830    84961.9    45365.1
20090206   135077.8    69507.4   20090630   130604.0    78875.2   20110912    78172.1    86043.0
20090207    15487.4    79790.3   20090702    74429.6    83647.4   20110913    76001.6    83988.1
20090208   291465.7    78375.8   20090703    80439.3    89224.2   20110915    75614.6    83781.9
20090209   149911.8    76228.0   20090704    79841.9    89508.6   20110920    78253.7    86448.2
20090210   168440.1    87719.5   20090705    78177.2    87189.2   20110921    75917.1    88013.2
20090211   149725.1    76837.5   20090706    72334.4    80479.1   20110926    87628.3    96548.0
20090212   148023.7    75703.5   20090707    70943.5    79072.0   20110927    78948.2    88786.5
20090213   149024.5    75323.0   20090708    71602.6    80509.0   20110929    81822.1    90233.9
20090214   142689.4    71991.4   20090709    71910.1    80326.8   20110930    79047.0    87721.6
20090215   156435.9    79830.3   20090710    73118.6    80458.1   20111206   172766.5   146800.1

Dataset consisting of days where the energy consumption measured at the customer meters
differed by more than 10% from the substation measured energy consumption. These days were
removed from the analysis dataset.
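
The 10% screen described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script used for the analysis: the function names and the `baseline` switch are hypothetical. The switch is included because some listed days (for example 20090710) exceed 10% only when the gap is measured against the meter total rather than the substation total, so the exact denominator is an open question.

```python
def relative_gap(meter_kwh, substation_kwh, baseline="substation"):
    """Absolute difference between the two daily totals as a fraction of the baseline."""
    denominator = substation_kwh if baseline == "substation" else meter_kwh
    return abs(meter_kwh - substation_kwh) / denominator


def discrepant_days(rows, threshold=0.10, baseline="substation"):
    """Return the dates whose meter and substation totals differ by more than threshold."""
    return [date for date, meter, subst in rows
            if relative_gap(meter, subst, baseline) > threshold]


rows = [
    ("20090711", 72935.8, 83876.3),             # from Table 1-C
    ("20090710", 73118.6, 80458.1),             # from Table 1-C
    ("hypothetical-day", 100000.0, 104000.0),   # made-up day within tolerance
]

vs_substation = discrepant_days(rows)                  # gap relative to substation kWh
vs_meter = discrepant_days(rows, baseline="meter")     # gap relative to meter kWh
print(vs_substation)  # ['20090711']
print(vs_meter)       # ['20090711', '20090710']
```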

Appendix D – Script of Telephone Outreach to Fox Island Residents, Nov. 2010
Script of Fox Island Telephone Outreach:
“Hello, this is Peninsula Light Company. We are expecting extreme low temperatures in your
area for the next 12 to 24 hours. In order to prevent possible loss of power, we ask you to
minimize your electric usage between the hours of 5 o’clock and 10 o’clock a.m. and 4 o’clock
and 8 o’clock p.m. This may also require PenLight to activate the Power Sharing program to
further reduce power usage. PenLight may occasionally make this request during the Fox Island
Cable Replacement project. For more information regarding Power Sharing or for tips on how
to reduce your power usage, please visit w w w dot penlight dot org or call 253.857.5950. Thank
you for your patience.
We are testing the power sharing system tomorrow morning as well. This should not result in
“rolling blackouts” at this time but could in the future. We want people to be aware of the
situation and encouraged to participate in Power Sharing.
How to limit electric usage:
· Turn electric heating down a couple of degrees – wear sweaters, use blankets and utilize
alternate heat sources such as wood or gas.
· Turn off lights that are not in use.
· Turn off or unplug appliances or electronics not in use.

Every little bit helps.”

Appendix E – Regression Model Iterations
Multivariate Regression Quadratic Model Version A
lm(formula = EstimatedTotalConsumption ~ SelectDryBulbF + I(SelectDryBulbF^2) +
    AvgWind + PcntClr + SunlightDur + WeekDayFactor + MonthFactor +
    HolidayFactor + DaylightSav, data = Training_Data, na.action = na.exclude)

Residuals:
   Min     1Q Median     3Q    Max
-21999  -3694   -257   3476  22904

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)           556256.46    7968.01   69.81  < 2e-16 ***
SelectDryBulbF        -12014.61     248.92  -48.27  < 2e-16 ***
I(SelectDryBulbF^2)       94.58       2.43   38.95  < 2e-16 ***
AvgWind                  561.94      60.75    9.25  < 2e-16 ***
PcntClr                 2851.93     866.58    3.29  0.00104 **
SunlightDur             -104.86       8.97  -11.69  < 2e-16 ***
WeekDayFactor2         -4257.17     770.45   -5.53  4.4e-08 ***
WeekDayFactor3         -5706.69     737.61   -7.74  2.9e-14 ***
WeekDayFactor4         -5252.29     732.81   -7.17  1.7e-12 ***
WeekDayFactor5         -4748.15     737.68   -6.44  2.0e-10 ***
WeekDayFactor6         -5361.98     734.48   -7.30  6.6e-13 ***
WeekDayFactor7         -2560.61     732.10   -3.50  0.00049 ***
MonthFactor2            -242.54    1286.64   -0.19  0.85052
MonthFactor3            3085.66    1781.48    1.73  0.08362 .
MonthFactor4            3940.66    2518.79    1.56  0.11807
MonthFactor5            4918.20    3170.25    1.55  0.12119
MonthFactor6            6380.93    3540.62    1.80  0.07186 .
MonthFactor7            4442.01    3348.52    1.33  0.18501
MonthFactor8           -2415.15    2807.95   -0.86  0.38997
MonthFactor9          -12142.14    2168.93   -5.60  2.9e-08 ***
MonthFactor10         -12653.89    1652.13   -7.66  5.1e-14 ***
MonthFactor11          -8707.77    1071.60   -8.13  1.5e-15 ***
MonthFactor12          -1088.82    1106.78   -0.98  0.32551
HolidayFactorWORKDAY   -3503.89    1316.98   -2.66  0.00795 **
DaylightSavTRUE         -635.73    1185.98   -0.54  0.59207
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5700 on 855 degrees of freedom
Multiple R-squared: 0.974, Adjusted R-squared: 0.973
F-statistic: 1.31e+03 on 24 and 855 DF, p-value: <2e-16

EstimatedTotalConsumption ~ SelectDryBulbF + I(SelectDryBulbF^2) +
    AvgWind + PcntClr + SunlightDur + WeekDayFactor + MonthFactor +
    HolidayFactor + DaylightSav

                      Df  Sum of Sq        RSS   AIC
- DaylightSav          1   9.32e+06   2.78e+10 15243
<none>                                2.77e+10 15244
- HolidayFactor        1   2.30e+08   2.80e+10 15250
- PcntClr              1   3.51e+08   2.81e+10 15254
- WeekDayFactor        6   2.95e+09   3.07e+10 15321
- AvgWind              1   2.78e+09   3.05e+10 15326
- SunlightDur          1   4.44e+09   3.22e+10 15373
- MonthFactor         11   9.69e+09   3.74e+10 15486
- I(SelectDryBulbF^2)  1   4.92e+10   7.70e+10 16140
- SelectDryBulbF       1   7.56e+10   1.03e+11 16400

Step: AIC=15243
EstimatedTotalConsumption ~ SelectDryBulbF + I(SelectDryBulbF^2) +
    AvgWind + PcntClr + SunlightDur + WeekDayFactor + MonthFactor +
    HolidayFactor

                      Df  Sum of Sq        RSS   AIC
<none>                                2.78e+10 15243
- HolidayFactor        1   2.35e+08   2.80e+10 15248
- PcntClr              1   3.49e+08   2.81e+10 15252
- WeekDayFactor        6   2.94e+09   3.07e+10 15319
- AvgWind              1   2.77e+09   3.05e+10 15325
- SunlightDur          1   5.28e+09   3.30e+10 15394
- MonthFactor         11   1.54e+10   4.32e+10 15610
- I(SelectDryBulbF^2)  1   4.93e+10   7.70e+10 16139
- SelectDryBulbF       1   7.56e+10   1.03e+11 16398

Multivariate Regression Quadratic Model Version B
Start: AIC=15413.9
SummedUsage ~ SelectDryBulbF + I(SelectDryBulbF^2) + AvgWind +
    PcntClr + SunlightDur + WeekDayFactor + MonthFactor + HolidayFactor +
    DaylightSav

                      Df  Sum of Sq        RSS   AIC
- DaylightSav          1 1.2676e+05 3.3637e+10 15412
<none>                              3.3637e+10 15414
- PcntClr              1 1.8743e+08 3.3825e+10 15417
- HolidayFactor        1 2.2245e+08 3.3860e+10 15418
- AvgWind              1 2.5004e+09 3.6138e+10 15475
- WeekDayFactor        6 3.2074e+09 3.6845e+10 15482
- SunlightDur          1 4.3941e+09 3.8031e+10 15520
- MonthFactor         11 1.0678e+10 4.4315e+10 15634
- I(SelectDryBulbF^2)  1 4.6220e+10 7.9858e+10 16173
- SelectDryBulbF       1 7.0993e+10 1.0463e+11 16411

Step: AIC=15411.91
SummedUsage ~ SelectDryBulbF + I(SelectDryBulbF^2) + AvgWind +
    PcntClr + SunlightDur + WeekDayFactor + MonthFactor + HolidayFactor

                      Df  Sum of Sq        RSS   AIC
<none>                              3.3637e+10 15412
+ DaylightSav          1 1.2676e+05 3.3637e+10 15414
- PcntClr              1 1.8730e+08 3.3825e+10 15415
- HolidayFactor        1 2.2361e+08 3.3861e+10 15416
- AvgWind              1 2.5006e+09 3.6138e+10 15473
- WeekDayFactor        6 3.2161e+09 3.6854e+10 15480
- SunlightDur          1 5.0801e+09 3.8718e+10 15534
- MonthFactor         11 1.5326e+10 4.8963e+10 15720
- I(SelectDryBulbF^2)  1 4.6248e+10 7.9885e+10 16171
- SelectDryBulbF       1 7.1002e+10 1.0464e+11 16409
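
The AIC values in these stepwise listings can be reproduced from the RSS column. For a linear model, R's step() reports n·log(RSS/n) + 2k, where n is the number of observations and k the number of estimated coefficients (an additive constant is omitted). The sketch below applies that formula to the Version B figures. The function name is hypothetical, and n = 880 is an assumption: it is inferred from the 855 residual degrees of freedom and 25 coefficients of Version A, and taken to hold for Version B as well.

```python
import math


def step_aic(rss, n_obs, n_coef):
    """AIC on the scale printed by R's step() for a linear model."""
    return n_obs * math.log(rss / n_obs) + 2 * n_coef


N_OBS = 880  # assumption: 855 residual DF + 25 coefficients (Version A)

# Full Version B model: intercept plus 24 slope terms, RSS = 3.3637e+10.
full = step_aic(3.3637e10, N_OBS, 25)
# Same model with DaylightSav dropped: RSS is unchanged at five significant figures.
reduced = step_aic(3.3637e10, N_OBS, 24)

print(round(full, 1), round(reduced, 1))  # 15413.9 15411.9
```

Because dropping DaylightSav leaves the RSS essentially unchanged, the AIC falls by the full penalty of 2 for one fewer coefficient, which is why the stepwise search removes that term.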

Multivariate Regression Quadratic Model Version C

lm(formula = EstimatedTotalConsumption ~ SelectDryBulbF + I(SelectDryBulbF^2) +
    AvgWind + PcntClr + SunlightDur + WeekDayFactor + MonthFactor +
    HolidayFactor, data = Training_Data, na.action = na.exclude)

Residuals:
   Min     1Q Median     3Q    Max
-22061  -3699   -251   3502  22968

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)           557225.26    7757.10   71.83  < 2e-16 ***
SelectDryBulbF        -12015.94     248.80  -48.30  < 2e-16 ***
I(SelectDryBulbF^2)       94.61       2.43   38.99  < 2e-16 ***
AvgWind                  561.35      60.71    9.25  < 2e-16 ***
PcntClr                 2841.24     865.99    3.28  0.00108 **
SunlightDur             -106.61       8.35  -12.76  < 2e-16 ***
WeekDayFactor2         -4244.01     769.74   -5.51  4.7e-08 ***
WeekDayFactor3         -5686.66     736.35   -7.72  3.2e-14 ***
WeekDayFactor4         -5230.63     731.39   -7.15  1.8e-12 ***
WeekDayFactor5         -4726.10     736.22   -6.42  2.3e-10 ***
WeekDayFactor6         -5338.32     732.85   -7.28  7.3e-13 ***
WeekDayFactor7         -2538.81     730.67   -3.47  0.00054 ***
MonthFactor2            -107.01    1261.02   -0.08  0.93239
MonthFactor3            2975.65    1768.88    1.68  0.09289 .
MonthFactor4            3786.16    2501.20    1.51  0.13046
MonthFactor5            4906.63    3168.86    1.55  0.12190
MonthFactor6            6440.57    3537.40    1.82  0.06900 .
MonthFactor7            4450.58    3347.09    1.33  0.18398
MonthFactor8           -2542.87    2796.66   -0.91  0.36347
MonthFactor9          -12433.19    2098.99   -5.92  4.6e-09 ***
MonthFactor10         -13116.02    1408.77   -9.31  < 2e-16 ***
MonthFactor11          -8744.27    1068.98   -8.18  1.0e-15 ***
MonthFactor12          -1134.88    1102.98   -1.03  0.30380
HolidayFactorWORKDAY   -3540.61    1314.65   -2.69  0.00722 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5690 on 856 degrees of freedom
Multiple R-squared: 0.974, Adjusted R-squared: 0.973
F-statistic: 1.37e+03 on 23 and 856 DF, p-value: <2e-16
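
One way to read the quadratic temperature terms in Version C: with the other predictors held fixed, the fitted curve b1·T + b2·T² reaches its minimum at T = -b1 / (2·b2). The short sketch below does that arithmetic with the two coefficients reported above. The function name is hypothetical, and the result is only a property of the fitted curve, not a finding stated in the analysis.

```python
def quadratic_turning_point(b1, b2):
    """Temperature at which b1*T + b2*T**2 stops decreasing (requires b2 != 0)."""
    if b2 == 0:
        raise ValueError("no turning point when the squared term is zero")
    return -b1 / (2.0 * b2)


b1 = -12015.94  # SelectDryBulbF coefficient, Version C
b2 = 94.61      # I(SelectDryBulbF^2) coefficient, Version C

t_min = quadratic_turning_point(b1, b2)
print(round(t_min, 1))  # 63.5 (degrees F)
```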

Multivariate Regression Quartic Model A

lm(formula = EstimatedTotalConsumption ~ SelectDryBulbF + I(SelectDryBulbF^2) +
    I(SelectDryBulbF^3) + I(SelectDryBulbF^4) + AvgWind + PcntClr +
    SunlightDur + WeekDayFactor + MonthFactor + HolidayFactor +
    DaylightSav, data = Training_Data, na.action = na.exclude)

Residuals:
   Min     1Q Median     3Q    Max
-22555  -3623   -376   3315  22905

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)            3.85e+05   5.54e+04    6.95  7.3e-12 ***
SelectDryBulbF         2.06e+03   4.53e+03    0.45  0.64962
I(SelectDryBulbF^2)   -3.28e+02   1.38e+02   -2.38  0.01743 *
I(SelectDryBulbF^3)    5.47e+00   1.82e+00    3.00  0.00276 **
I(SelectDryBulbF^4)   -2.58e-02   8.84e-03   -2.92  0.00362 **
AvgWind                5.68e+02   6.07e+01    9.36  < 2e-16 ***
PcntClr                3.00e+03   8.79e+02    3.41  0.00067 ***
SunlightDur           -1.03e+02   9.03e+00  -11.41  < 2e-16 ***
WeekDayFactor2        -4.26e+03   7.68e+02   -5.55  3.8e-08 ***
WeekDayFactor3        -5.73e+03   7.34e+02   -7.80  1.8e-14 ***
WeekDayFactor4        -5.29e+03   7.30e+02   -7.25  9.2e-13 ***
WeekDayFactor5        -4.74e+03   7.35e+02   -6.45  1.9e-10 ***
WeekDayFactor6        -5.40e+03   7.31e+02   -7.38  3.8e-13 ***
WeekDayFactor7        -2.60e+03   7.29e+02   -3.56  0.00039 ***
MonthFactor2          -2.66e+02   1.29e+03   -0.21  0.83700
MonthFactor3           2.88e+03   1.79e+03    1.61  0.10765
MonthFactor4           3.77e+03   2.53e+03    1.49  0.13612
MonthFactor5           5.09e+03   3.16e+03    1.61  0.10782
MonthFactor6           6.47e+03   3.52e+03    1.84  0.06667 .
MonthFactor7           4.25e+03   3.34e+03    1.27  0.20273
MonthFactor8          -2.47e+03   2.80e+03   -0.88  0.37810
MonthFactor9          -1.19e+04   2.18e+03   -5.47  6.0e-08 ***
MonthFactor10         -1.20e+04   1.66e+03   -7.21  1.2e-12 ***
MonthFactor11         -8.48e+03   1.07e+03   -7.92  7.3e-15 ***
MonthFactor12         -7.22e+02   1.12e+03   -0.65  0.51798
HolidayFactorWORKDAY  -3.35e+03   1.31e+03   -2.56  0.01076 *
DaylightSavTRUE       -7.08e+02   1.18e+03   -0.60  0.54954
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5670 on 853 degrees of freedom
Multiple R-squared: 0.974, Adjusted R-squared: 0.973
F-statistic: 1.22e+03 on 26 and 853 DF, p-value: <2e-16

Multivariate Regression Quartic Model B
Step: AIC=15237
EstimatedTotalConsumption ~ I(SelectDryBulbF^2) + I(SelectDryBulbF^3) +
    I(SelectDryBulbF^4) + AvgWind + PcntClr + SunlightDur + WeekDayFactor +
    MonthFactor + HolidayFactor + DaylightSav

                      Df  Sum of Sq        RSS   AIC
- DaylightSav          1   1.13e+07   2.74e+10 15235
<none>                                2.74e+10 15237
- HolidayFactor        1   2.13e+08   2.77e+10 15241
- PcntClr              1   3.70e+08   2.78e+10 15246
- WeekDayFactor        6   2.96e+09   3.04e+10 15315
- I(SelectDryBulbF^4)  1   2.64e+09   3.01e+10 15316
- AvgWind              1   2.81e+09   3.02e+10 15320
- SunlightDur          1   4.21e+09   3.17e+10 15360
- I(SelectDryBulbF^3)  1   5.58e+09   3.30e+10 15398
- MonthFactor         11   8.74e+09   3.62e+10 15458
- I(SelectDryBulbF^2)  1   1.15e+10   3.89e+10 15542

Step: AIC=15235
EstimatedTotalConsumption ~ I(SelectDryBulbF^2) + I(SelectDryBulbF^3) +
    I(SelectDryBulbF^4) + AvgWind + PcntClr + SunlightDur + WeekDayFactor +
    MonthFactor + HolidayFactor

                      Df  Sum of Sq        RSS   AIC
<none>                                2.74e+10 15235
- HolidayFactor        1   2.19e+08   2.77e+10 15240
- PcntClr              1   3.66e+08   2.78e+10 15245
- WeekDayFactor        6   2.95e+09   3.04e+10 15313
- I(SelectDryBulbF^4)  1   2.67e+09   3.01e+10 15315
- AvgWind              1   2.80e+09   3.03e+10 15319
- SunlightDur          1   5.05e+09   3.25e+10 15382
- I(SelectDryBulbF^3)  1   5.62e+09   3.31e+10 15397
- I(SelectDryBulbF^2)  1   1.15e+10   3.90e+10 15542
- MonthFactor         11   1.42e+10   4.17e+10 15580

Multivariate Regression Quartic Model Final
lm(formula = EstimatedTotalConsumption ~ I(SelectDryBulbF^2) +
    I(SelectDryBulbF^3) + I(SelectDryBulbF^4) + AvgWind + PcntClr +
    SunlightDur + WeekDayFactor + MonthFactor + HolidayFactor,
    data = Training_Data, na.action = na.exclude)

Residuals:
   Min     1Q Median     3Q    Max
-22549  -3617   -374   3359  22976

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)            4.11e+05   8.11e+03   50.69  < 2e-16 ***
I(SelectDryBulbF^2)   -2.66e+02   1.40e+01  -18.96  < 2e-16 ***
I(SelectDryBulbF^3)    4.67e+00   3.53e-01   13.24  < 2e-16 ***
I(SelectDryBulbF^4)   -2.20e-02   2.41e-03   -9.12  < 2e-16 ***
AvgWind                5.66e+02   6.06e+01    9.35  < 2e-16 ***
PcntClr                2.96e+03   8.76e+02    3.38  0.00076 ***
SunlightDur           -1.05e+02   8.38e+00  -12.55  < 2e-16 ***
WeekDayFactor2        -4.25e+03   7.66e+02   -5.54  4.0e-08 ***
WeekDayFactor3        -5.70e+03   7.33e+02   -7.78  2.0e-14 ***
WeekDayFactor4        -5.26e+03   7.28e+02   -7.23  1.1e-12 ***
WeekDayFactor5        -4.72e+03   7.33e+02   -6.44  2.0e-10 ***
WeekDayFactor6        -5.37e+03   7.29e+02   -7.36  4.4e-13 ***
WeekDayFactor7        -2.57e+03   7.27e+02   -3.53  0.00043 ***
MonthFactor2          -1.12e+02   1.27e+03   -0.09  0.92922
MonthFactor3           2.79e+03   1.78e+03    1.57  0.11675
MonthFactor4           3.62e+03   2.51e+03    1.44  0.14908
MonthFactor5           5.06e+03   3.16e+03    1.60  0.10987
MonthFactor6           6.53e+03   3.52e+03    1.85  0.06398 .
MonthFactor7           4.29e+03   3.33e+03    1.29  0.19798
MonthFactor8          -2.60e+03   2.79e+03   -0.93  0.35126
MonthFactor9          -1.23e+04   2.11e+03   -5.82  8.2e-09 ***
MonthFactor10         -1.26e+04   1.40e+03   -8.95  < 2e-16 ***
MonthFactor11         -8.55e+03   1.07e+03   -8.03  3.3e-15 ***
MonthFactor12         -8.28e+02   1.11e+03   -0.75  0.45386
HolidayFactorWORKDAY  -3.42e+03   1.31e+03   -2.61  0.00920 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5670 on 855 degrees of freedom
Multiple R-squared: 0.974, Adjusted R-squared: 0.973
F-statistic: 1.33e+03 on 24 and 855 DF, p-value: <2e-16

Appendix F – Discussion of Akaike’s Information Criterion and Model Selection

AIC is a method of testing for the parsimony of a given model, a means of balancing the
conflicting aims of low bias and low variance in a model.[12] The figure below, from Posada and
Buckley (2004), shows this relationship.

Model selection should be primarily driven by a desire to make accurate predictions from new
information, separate from the data set used to train the model. For regression models, the main
drivers of error will be the size of the available dataset, its underlying variability, and the number
of parameters included during the model selection process.[13]

The overall error term can be broken down into three components: model bias, model variance,
and irreducible error. Model bias is the consistent deviation of a model’s predictions from the
truth; for example, a model that, on average, predicts values higher than the true values has an
upward bias. Model variance describes how much the model’s predictions vary around their own
average, that is, how consistent they are. Another way of thinking about these two terms is as the
“accuracy” and “precision” of the model, respectively, though this may misleadingly suggest that
one is more important than the other, whereas in reality a balance of bias and variance reduction
is critical.[12], [13] The formula for total model error is provided below.[13]
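
In its standard form, the three-part decomposition described above can be written as follows for the expected squared prediction error at a point x. The symbols f (the true relationship), \hat{f} (the fitted model), and \sigma^2 (the variance of the noise in y) are notation assumed for this sketch and may differ from the notation used in [13]. Note that bias enters the total as its square.

```latex
\mathrm{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathrm{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathrm{E}\big[(\hat{f}(x) - \mathrm{E}[\hat{f}(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible error}}
```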

An increase in the model parameter count (i.e. an increase in model complexity) will generally
reduce bias while increasing variance. To minimize overall error, a balance must be struck
between these two competing factors by adjusting the main driver of model complexity: the
number of parameters included. The following figure illustrates the conceptual “sweet spot”
where model complexity is positioned to minimize total error by balancing bias and
variance.[13]

In reality, the separate sources of error for a given model cannot be observed directly; we must
instead rely on measures of total model error when predicting the training sample. R² (“R
squared”) is one such measure; however, unadjusted R² measures only the discrepancies between
the model predictions and the training data set. This measure of model error therefore includes
“training optimism,” an over-estimation of the model’s ability to predict future values based
upon its success at predicting training values.
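
Training optimism can be illustrated with a small simulation. The sketch below is not taken from the thesis: it fits a straight line to pure noise, where the predictor has no real relationship to the response, and compares R² on the training sample with R² on a fresh sample. The function names, sample sizes, and random seed are all assumptions chosen for illustration.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def r_squared(x, y, a, b):
    """1 - SSE/SST for predictions a + b*x, with the mean of y as baseline."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst

rng = random.Random(42)  # arbitrary seed, for repeatability only

def noise(n):
    """n independent standard-normal draws."""
    return [rng.gauss(0, 1) for _ in range(n)]

# Predictor and response are independent, so the true R^2 is zero.
x_train, y_train = noise(20), noise(20)
x_test, y_test = noise(20), noise(20)

a, b = fit_line(x_train, y_train)
train_r2 = r_squared(x_train, y_train, a, b)  # never negative for a least-squares line
test_r2 = r_squared(x_test, y_test, a, b)     # can be negative on new data
print(f"training R^2 = {train_r2:.3f}, out-of-sample R^2 = {test_r2:.3f}")
```

Because a least-squares line with an intercept can never fit the training data worse than the training mean, the training R² is at least zero even though no real relationship exists. Any positive value here is pure optimism, and the out-of-sample figure carries no such guarantee.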
One method of attempting to evaluate total model error for out-of-sample predictions is adjusted
R² (“Adjusted R Square”), which incorporates a penalty for model complexity. The formula below
gives the calculation of adjusted R², where n is the number of observations and p is the number
of model parameters.[11]
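
In its usual form, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where p counts the predictors and excludes the intercept. The short Python sketch below applies that form; the function name is hypothetical. As a check, it uses the rounded values from the regression summary at the end of the preceding appendix (R² = 0.974, 24 predictors, 855 residual degrees of freedom, which implies n = 855 + 24 + 1 = 880).

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Rounded inputs from the regression summary: R^2 = 0.974, n = 880, p = 24.
adj = adjusted_r_squared(0.974, 880, 24)
print(round(adj, 3))  # 0.973
```

The result agrees with the reported adjusted R² of 0.973. With 880 observations and 24 predictors the penalty is small, which is one reason a stricter criterion can be useful when comparing models.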

Even adjusted R², however, tends to under-penalize model complexity and cannot be entirely
relied upon as an accurate measure of prediction error.[11] Akaike’s Information Criterion (AIC)
is one method for performing model selection among candidate models of varying complexity
and accuracy, in a search for the optimal model.[14] AIC provides a more accurate measurement
of the information loss of a model, as well as a more conservative penalization of model
complexity. AIC is useful in the practical application of stepwise regression, where an analytical
software tool, such as R, can go through multiple model iterations, adding or removing each of
the available parameters, before settling on an AIC-optimal model.
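
AIC is generally defined as 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. For a least-squares model with normally distributed errors this reduces, up to an additive constant that is the same for every model fitted to the same data, to n ln(RSS/n) + 2k. The sketch below compares three candidate models on that basis. The residual sums of squares, parameter counts, and sample size are hypothetical numbers chosen for illustration, not results from this study.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC for a Gaussian least-squares fit, omitting the constant shared by all models.

    rss: residual sum of squares, n: observations, k: estimated parameters.
    """
    return n * math.log(rss / n) + 2 * k

# Hypothetical candidates: name -> (residual sum of squares, parameter count).
candidates = {
    "small": (520.0, 3),
    "medium": (400.0, 5),
    "large": (398.0, 12),
}
n_obs = 100  # hypothetical sample size

scores = {name: aic_least_squares(rss, n_obs, k)
          for name, (rss, k) in candidates.items()}
best = min(scores, key=scores.get)  # lower AIC is preferred

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.2f}")
print("selected:", best)
```

In this illustration the largest model fits the training data slightly better than the medium one, but the small gain in fit does not offset the penalty for its seven extra parameters, so the medium model has the lowest AIC. A stepwise procedure repeats this kind of comparison each time a parameter is added or removed.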
