The impact of the model
structure in social simulations
O. Barthélémy
Centre for Policy Modelling, Manchester
Metropolitan University,
O.Barthelemy@mmu.ac.uk
The use of scenarios for planning
is quite common, whether for industries, nations, retail chains, or technologies.
They represent a specific approach to dealing with uncertainty: one can either
consider the whole (frequently continuous) range of alternatives, or specify
particular cases that capture key properties of this range.
Since their appearance in the
literature in the 1940s, scenarios have become more and more frequent. They were
generally extrapolated from past data and relationships until the 1980s, when
studies in innovation and diffusion showed that the future depends on changes
in social and economic systems whose paths are multiple, not fixed, and
themselves evolving. The goal of scenarios then became the analysis of
trends within a “possibility space”, and eventually the reduction of that space
by finding potential discontinuities, in order to improve the decision-making
process.
This is the case for climate change
and social behaviour over time. Scenarios are used because the future is
uncertain, and the ability to adapt to these future changes might have effects
over the longer term (e.g. technological changes).
A multi-agent system can be used
to represent these scenarios formally; its structure and components can then
be analysed in order to evaluate their impact upon the outcome.
This diagram from the CC:DEW report
shows the increasing uncertainty as the different components are integrated
into a model.
Fig 1: The presence of uncertainty at
multiple levels
Multiple scenario frameworks are
built on the Environmental Futures scenarios from the UK Foresight
programme. This programme tries to look beyond normal commercial horizons to
identify potential opportunities arising from new science and technologies. It
intends to deliver thorough and up-to-date information on, and analysis of,
recent developments in relevant science and technology; visions of the future
reflecting the potential impact of science and technology and of forecast
social and economic trends; recommendations for action; and networks of people
who recognise the importance of the issues addressed by the project.
The Foresight Futures scenarios
were created for the UK Foresight programme, in a project funded
by the Department of Trade and Industry and the Department of the
Environment, Transport and the Regions. The research was led by the Science
Policy Research Unit (SPRU) at the University of Sussex. These scenarios
represent a foresight tool, enabling institutions, businesses, and users more
generally to apprehend possible futures in order to improve their decision
making.
Generated via an iterative
participatory process, the scenario framework also draws on pre-existing work,
such as the scenarios developed by the Intergovernmental Panel on Climate
Change (IPCC 2000) to estimate future greenhouse gas emissions.
The UKCIP scenarios use computer
estimations of climate change, and attempt to assess their impact upon
the UK’s socio-economic structure. They differ from the Foresight scenarios by
being specifically designed for the timescale of the associated climate
scenario, and by providing details likely to be of use for regional
and sectoral studies. For example, they give greater emphasis to possible
changes in regions and in certain types of geographical domain. Classified
according to governance system and social values, their names match the
Foresight scenarios, apart from Provincial Enterprise, renamed National
Enterprise. Every scenario describes a plausible future, covering the shape of
water demand, agricultural trends, future transport, and economic
development.
The scenarios presented by the Environment Agency are based upon the
UKCIP scenarios, themselves based on the Foresight scenarios. The Foresight
scenarios were devised taking into account various sectors of the economy; the
UKCIP scenarios focused on climate change impacts, and the EA focused further
on water resources.
Fig 2: The different focuses of
studies originating from the future scenarios
Four distinct Foresight scenarios
were retained as typical cases. They are classified according to two
main indicators, or drivers of change: the social values of individuals,
and the governance structure in place.
Social values range from individualistic to
more community oriented, while the system of governance, covering the
structures of government and the decision-making process, ranges from autonomy
(power remaining at the national level) to interdependence (power moving to
other institutions, e.g. from the EU down to regional government).
Governance \ Values             | Individual                 | Community
Interdependence / Globalisation | World Markets (B)          | Global Sustainability (C)
Autonomy / Regionalisation      | Provincial Enterprise (A)  | Local Stewardship (D)
Each scenario has specific
general characteristics presenting the overall trends, with more detail on
economic and sectoral trends, employment and social trends, regional
development, health, welfare and education, and the environment.
As clearly expressed in
Herrington (1996), the statistical results obtained for a given period, and the
representations with the best goodness of fit, varied when applied to the
following time period, largely reducing the value of such results for
forecasting. One reason for this lack of success could be that the tools are
not appropriate for the subject of study.
The use of parametric statistics,
although very common, is subject to several conditions, including the
requirement that the variables have homogeneous levels of variance. Parametric
statistics use knowledge about the distribution the sample is drawn from as a
component of the tests that can be undertaken. Using these assumptions, they
can exploit mathematical properties and simplifications to reach, for example,
confidence intervals, and so qualify the generated or observed sample.
While this is sufficient to
analyse a finite set of data, inference from such a finite set can be
jeopardised by the method’s limits.
In most cases, the statistical
treatment of data is done with methods that assume the existence and stability
of the mean and the standard deviation of the underlying distribution. But
this assumption cannot be taken for granted in many cases. As shown in Moss
(2001) and Bak (1997), many phenomena may have underlying distributions that
do not present these properties.
This seems to be the case for
household water demand. The data provided by some water companies was analysed,
and showed that the relative changes are not normally distributed (i.e. the
mean, the standard deviation, or both, are not stable). The use of tools that
assume normality is consequently unsound.
Figure 3: Comparison of detrended
cumulative probability with a normal distribution
This figure shows the
relative change in household water demand for the Fairlight
region. If the relative changes were normally distributed, the dots would be
plotted along the horizontal line.
As the tests undertaken tend to
confirm that this sample does not fit a normal pattern, further
analysis indicates that the property shown, a relatively fat tail and thin
peak, is known as leptokurtosis.
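A check of this kind can be reproduced with standard tools. The following sketch, assuming a hypothetical demand series in place of the company data, tests the relative changes for normality and measures their excess kurtosis, which is positive for a fat-tailed, thin-peaked distribution.

```python
import numpy as np
from scipy import stats

def check_leptokurtosis(demand):
    """Test whether the relative changes of a demand series look normal."""
    changes = np.diff(demand) / demand[:-1]          # relative changes
    z = (changes - changes.mean()) / changes.std()   # standardise
    ks_stat, p_value = stats.kstest(z, "norm")       # one-sample KS test
    # Fisher definition: a normal distribution has excess kurtosis 0;
    # a positive value indicates the fat tails and thin peak described above.
    excess_kurtosis = stats.kurtosis(changes)
    return ks_stat, p_value, excess_kurtosis

# Hypothetical fat-tailed demand series standing in for the observed data.
rng = np.random.default_rng(0)
demand = 1000.0 + np.cumsum(rng.standard_t(3, size=500))
print(check_leptokurtosis(demand))
```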
Where normal distributions fail,
even with the time-dependent mean and variance considered by
economists, physicists such as Per Bak (Bak 1997)
used the power law devised by Pareto in 1893. If the observed data suit this
kind of distribution, the consequences are important. The probability density
function of the Paretian distribution has two parameters: a, the peakedness parameter, in the interval (0,2], and b, the skewness parameter, in the interval [-1,1]. The issue is that
these parameters have critical values. When a equals 2, the characteristic function of the Paretian distribution
reduces to that of the normal distribution. But for a < 2 the distribution has no finite variance, and for a ≤ 1 it has no finite mean.
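The practical consequence, that sample moments fail to settle when a < 2, can be illustrated with SciPy’s stable distribution (its alpha and beta parameters correspond to a and b above); this is an illustrative sketch, not part of the model itself.

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(0)

# alpha = 2 reduces to the normal case; alpha < 2 has no finite variance.
for alpha in (2.0, 1.5):
    sample = levy_stable.rvs(alpha, 0.0, size=100_000, random_state=rng)
    # The running sample variance converges for alpha = 2 but keeps
    # jumping as extreme draws arrive when alpha < 2.
    for n in (1_000, 10_000, 100_000):
        print(f"alpha={alpha}, n={n:>7}: variance so far = {sample[:n].var():.2f}")
```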
Moss (2001) has investigated the
different means of generating such a distribution. The three mutually exclusive
explanations are: a normal distribution with predictable time-varying
parameters; a stable Pareto distribution with infinite variance generated by a
self-organised critical social process; or a non-stable distribution generated
by a self-organised critical social process.
Self-organised critical processes
can be generated by systems that present social embeddedness, defined in
Edmonds (1999) as “the extent to which modelling the behaviour of an agent
requires the inclusion of other agents as individuals rather than an
undifferentiated whole”. This means that, formally, it is more relevant to model
an agent as part of the total system of agents and their interactions than as a
single agent interacting with an essentially unitary environment.
The classical statistical
approach is parametric statistics. Another option is non-parametric
statistics, which cannot really be used for inference, as one cannot
actually provide a non-parametric model for forecasting, but can be used
instead to characterise an existing data sample.
As seen earlier, the parametric
assumptions do not always hold, and that is likely the case here. Because the
assumptions required to use parametric statistics are at best debatable in the
current case, the reliability, hence the utility, of standard methods is in
jeopardy.
Nonparametric methods can then be
used to analyse the samples. Also called “distribution-free”
methods, they do not assume an underlying distribution for the variable
involved, and consequently none of its eventual properties; instead, they
are based on ordinal measurements. While nonparametric methods are less
powerful than parametric ones in the situations the latter are suited to, they
are much more powerful in cases such as this one, where the data cannot be
assumed to be normal.
The overall principle of the
nonparametric methods used in this research is that one sample can be compared
to another by verifying whether the distribution of their variations is either
consistent with a known distribution (used here to demonstrate that
the data, both observed and generated, are unlikely to match normality
assumptions), or similar to one another (used to demonstrate that the
results obtained by multiple simulations have the same underlying distribution,
and that this distribution can be considered as matching the observed data).
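Kolmogorov-Smirnov-type tests are one way of making both comparisons; a possible form is sketched below, with placeholder file names standing in for the observed and simulated demand series.

```python
import numpy as np
from scipy import stats

def matches_normal(changes):
    """One-sample test: are the relative changes consistent with a normal?"""
    z = (changes - changes.mean()) / changes.std()
    return stats.kstest(z, "norm").pvalue

def same_distribution(run_a, run_b):
    """Two-sample test: could two series share an underlying distribution?"""
    return stats.ks_2samp(run_a, run_b).pvalue

observed = np.loadtxt("observed_demand.csv")    # placeholder file names
simulated = np.loadtxt("simulated_demand.csv")
print("p(normal):", matches_normal(np.diff(observed) / observed[:-1]))
print("p(same distribution):", same_distribution(observed, simulated))
```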
Research into climate change
issues and their impact upon household water demand has led to the generation
of a model using Integrated Assessment (Moss, 2001). Multi-disciplinary inputs
have helped to make the representation as sensible as possible, and multi-agent
systems were used to represent the behaviour of households in terms
of water demand. The model is devised as a dynamic representation of a society
composed of reactive agents (Ferber, 1999) influenced by their environment,
both social and physical.
The model used is based on Moss
et al. (2000) and Barthélémy et al. (2002). It incorporates agents
representing several households and a policy agent. The households are
characterised by their location on a grid and, for their appliances, a set of
ownership, volume-per-use, and frequency-of-use values. The households can
observe and subjectively evaluate their neighbours, their activities, and their
appliances, and can therefore rate information differently depending on its
source.
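Purely as an indicative sketch (the attribute names are ours, not the model’s actual identifiers), a household of this kind might be structured as follows, with demand obtained as the sum over owned appliances of volume per use times frequency of use.

```python
from dataclasses import dataclass, field

@dataclass
class Appliance:
    name: str
    volume_per_use: float   # litres drawn each time the appliance is used
    frequency: float        # uses per month

@dataclass
class Household:
    x: int                  # location on the grid
    y: int
    appliances: list = field(default_factory=list)

    def monthly_demand(self) -> float:
        """Total demand: sum of volume per use x frequency over appliances."""
        return sum(a.volume_per_use * a.frequency for a in self.appliances)

home = Household(x=3, y=5, appliances=[Appliance("shower", 35.0, 20.0),
                                       Appliance("washing machine", 80.0, 8.0)])
print(home.monthly_demand())   # 35*20 + 80*8 = 1340.0 litres per month
```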
The policy agent reacts to the
climatic conditions, and broadcasts water-saving messages in case of
dryness. Innovation is also present, as some appliances can break down and be
replaced by newly available substitutes with different characteristics.
Amongst the processes embedded in
the model, particular attention is paid to the influence of innovation, its
presence and diffusion, as well as the importance of the population’s response
to the various signals, whether from the policy agent or from neighbouring
households.
The outputs of such simulations
can either be a straightforward analysis of the water demand generated by the
simulated households, or a more qualitative analysis of a specific phenomenon,
or of the characteristics or patterns (rather than the values themselves) of
the water demand. The latter is our focus, which is why it is necessary to
present the few assumptions that were found, or thought, to be of importance in
the model.
Some formal models are simple
enough that their few parameters are meaningful with respect to the object of
the modelling, for example when evaluating the weight category of a person from
current weight and physical characteristics. In more complex models, the
sophistication requires many aspects of a problem to be taken into account.
Some parameters integrated into the model may then influence the results
significantly while not being central to the modeller. Using a simple model,
Hales and Edmonds (2003) have given an example of modelling details that,
although not central to their issue, turned out to be decisive for the
behaviour of the model and the observed phenomenon.
During the verification and
validation processes, several particularities were observed. The first concerns
the innovation process and the diffusion of new technologies; the second
concerns the importance of initial conditions and the way they are implemented
in the model.
In
Duboz et al. (2001), for example, the multi-agent
system used is tested for boundary conditions, i.e. a change in the
algorithms involved in the spatial behaviour of agents (namely the way
an agent bounces off a wall). The distribution algorithms and the size of the
space are also tested, and the conclusion suggests that the choice of the
bouncing algorithm is a more important parameter than the distribution or size
of the population of agents.
In the current case, there are
many parameters and algorithms. The methods and values used were carefully
selected for having appropriate characteristics and (in general) a lack of
underlying assumptions.
A near-exhaustive view of
innovation diffusion can be found in Rogers (1995), who defines diffusion as
“the process by which an innovation is communicated through certain channels
over time among the members of a social system”.
Rogers presents nearly all
the aspects of innovation, from its generation to its consequences, through the
study of its diffusion via the characteristics of the technology, of the
user, and of the underlying network. Noticeably, the diffusion of innovation
then depends heavily on the nature of the innovation. Innovations have
specific characteristics that can explain the rate of adoption, i.e. the
success or failure of a given technology: relative advantage,
compatibility, complexity, trialability, and observability. Some even display
self-reinforcing dynamics, when the technology benefits from network
externalities; what is taken into account then is the global influence of the
number of adopters in the system, not only the local one (Blume 1993, Ellison
1993).
These characteristics obviously
depend tightly on the technology itself and hence will not be included
extensively in the present study, for the aforementioned reasons. The
literature, mainly composed of surveys, generally presents these
characteristics of the technology along with the characteristics of the
individuals to explain the success or failure of specific cases.
After many studies and debates, a scale and
a classification appeared. Proposed by Rogers in 1962, it is based on the
assumption that the frequency of adoption follows a bell-shaped curve, with an
S-shaped associated cumulative curve. Adopters are then categorised by their
time of adoption:
· The innovators are the first 2.5%
· The early adopters are the next 13.5%
· The early majority is the next 34%
· The late majority is the next 34%
· The laggards are the last 16%
These numbers are based on
intervals either side of the average time of adoption: the innovators adopt
more than two standard deviations before the average; the early adopters
between two and one standard deviations before; the early majority within one
standard deviation before the average; the late majority within one standard
deviation after; and the laggards more than one standard deviation after the
average. Trondsen (1996), Valente (1996) and Young (2002), amongst others,
refer to these values and present results that conform to these categories.
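These percentages follow directly from the normal curve, as the short computation below, using the standard normal CDF, confirms.

```python
from scipy.stats import norm

# Adoption time is assumed normal; the categories are cut at -2, -1, 0
# and +1 standard deviations around the mean adoption time.
cuts = [-2, -1, 0, 1]
cdf = [norm.cdf(c) for c in cuts]
print(f"innovators:     {cdf[0]:.1%}")           # ~2.3%, quoted as 2.5%
print(f"early adopters: {cdf[1] - cdf[0]:.1%}")  # ~13.6%, quoted as 13.5%
print(f"early majority: {cdf[2] - cdf[1]:.1%}")  # ~34.1%, quoted as 34%
print(f"late majority:  {cdf[3] - cdf[2]:.1%}")  # ~34.1%, quoted as 34%
print(f"laggards:       {1 - cdf[3]:.1%}")       # ~15.9%, quoted as 16%
```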
One issue with such a
theory is that one cannot know beforehand which agent falls into which
category, unless the size of the market is known in advance. It then becomes
impossible to use this approach to represent the diffusion of innovation in
markets that could be smaller than the whole population. In this model, the
adoption process for innovation is based on endorsements. In this matter, one
can argue that the subjectivity of the selected indicator can be such that it
directs the whole pattern of adoption.
It was observed, though, that with this
particular type of representation, when the choice had to be made at random,
the bell-shaped pattern emerged on multiple occasions.
Technically, the reason for this
random selection is that, amongst the endorsements recalled by an agent in
order to assess their values and select the highest, most were equivalent. In
certain cases, an agent looked for a specific type of endorsement, could not
find it, and then automatically considered every endorsement value to be zero.
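A minimal sketch of this selection mechanism (with invented names; the actual endorsement scheme is richer) shows how a random tie-break takes over when all recalled endorsement values are equivalent, or default to zero:

```python
import random

def select_option(candidates, endorsement_values):
    """Pick the candidate with the highest endorsement value. When several
    are tied at the top (e.g. every value defaulted to zero because the
    sought endorsement type was absent), choose at random among them."""
    scores = {c: endorsement_values.get(c, 0) for c in candidates}
    best = max(scores.values())
    tied = [c for c, s in scores.items() if s == best]
    return random.choice(tied)

# No endorsements recalled: all values default to zero, so the choice is
# effectively random -- the situation that produced the bell-shaped pattern.
print(select_option(["appliance A", "appliance B", "appliance C"], {}))
```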
The initial conditions for this
model are based upon real-life observation and research into the assumptions
made in the different scenarios generated by the Environment Agency.
Amongst the various assumptions,
three are of importance in this context.
The representation of the social
values of households does not imply anything about the modelling of the
structures and the environment; the implications concern the way households see
and judge that environment. The argument here is that someone caring about
community will put a greater emphasis on the community as a driver of their own
behaviour. Selecting appropriate weights of influence in the existing model can
then represent this indicator. The endorsements can be ranked from an
individualistic (self-centred) point of view to a more citizen-oriented
(globally influenced) one, and can therefore be used to represent the concern,
and influence, of a particular agent. The link made here is between an agent
being community oriented and its major influences lying in the “community”
around it, its neighbourhood. Nevertheless, while it is easy to argue that
individualism can be linked with the model’s references to self-centred beliefs
and rules, community is a bit more difficult within the framework of this
model. What is called community in the EA approach is referred to as
“community” within the model, meaning the immediate social environment of the
agent; yet in the EA description, community also seems to carry the meaning of
“citizenship”, a behaviour in line with the idea of not wasting limited
resources.
While the social values can be
represented in such a way that various value sets can be generated easily,
through a simple choice of different rankings and/or values of the endorsements
themselves, this is not possible for the governance structure. As expressed
earlier, issues with the meaning of this indicator prevent it from being
implemented in a single, specific way. Consequently, unlike the first part of
the influences, this classification of the scenarios will be done using the
detailed approach: for a given state of social values, the governance structure
will be identified (and the scenario defined) by the range of available
appliances, their associated values (ownership, frequency, volume, replacement
rate), and the presence or absence of technological regulation.
The representation of the
technological regulator is simple. Since regulations are enforced in the
scenarios, there is no need for dynamic adaptive regulation, i.e. an agent in
the model that would evaluate the situation and eventually decide that
intervention is needed. Like the emergence of new appliances, which was already
implemented, regulation is present as a constraint upon the appliances. Since
the scenarios state precisely when regulation happens, its influence on the
model is that, from a given date onwards, some devices are made available, or
unavailable, to the households.
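Since the scenarios state the dates exactly, such regulation can be implemented as a plain availability window per appliance. The sketch below is hedged: the appliance names and dates are hypothetical, not taken from the scenarios.

```python
# Hypothetical availability windows (start_year, end_year) per appliance;
# None means unbounded. A regulation simply opens or closes a window
# from the scenario-specified date onwards.
AVAILABILITY = {
    "standard cistern": (None, 2010),    # regulated out from 2010
    "low-flush cistern": (2005, None),   # becomes available in 2005
}

def available(appliance: str, year: int) -> bool:
    start, end = AVAILABILITY.get(appliance, (None, None))
    return (start is None or year >= start) and (end is None or year < end)

print(available("standard cistern", 2012))    # False: regulated out
print(available("low-flush cistern", 2012))   # True
```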
The layout of the support for the
agents can be important, and slightly different properties generate different
outcomes. In this case, the agents are situated on a regular, bounded grid.
What has been investigated, though, is the impact of the grid structure itself.
It can be considered as a bidimensional mapping, or a tridimensional one, the
latter being equivalent to a toroidal shape that allows agents to observe more
cells when they are located towards the edge of the grid.
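The difference between the two mappings comes down to whether coordinates wrap at the edge. The sketch below, with an illustrative grid size and vision radius, counts the cells visible to a corner agent under each convention.

```python
def visible_cells(x, y, size, radius, toroidal):
    """Count the grid cells within `radius` of (x, y), with or without wrap."""
    cells = set()
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue
            nx, ny = x + dx, y + dy
            if toroidal:
                cells.add((nx % size, ny % size))   # coordinates wrap round
            elif 0 <= nx < size and 0 <= ny < size:
                cells.add((nx, ny))                 # cells beyond the edge are lost
    return len(cells)

# A corner agent sees far fewer cells on a bounded grid than on a torus.
print(visible_cells(0, 0, size=8, radius=2, toroidal=False))   # 8
print(visible_cells(0, 0, size=8, radius=2, toroidal=True))    # 24
```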
The study of both situations
tends to show that although there is a significant difference in the links
between agents, it does not seem to generate large differences in behaviour.
The toroidal structure can be assimilated to an increased possibility of
communication amongst the agents.
Fig 4: Matrix of links for a grid with a toroidal
structure: in this case, the 20 agents are situated on a grid of size 8*8.
Fig 5: The equivalent matrix of links for the non-toroidal version.
Since the black cells show the
existence of an interaction between the agents listed horizontally and
vertically, it is easy to notice that the toroidal structure provides more
contacts for each agent.
A significant difference could
not be shown to result from such changes, as long as the situations were
otherwise similar (i.e. there were no multiple simultaneous parameter changes).
Nevertheless, their influence
upon the results of the model needs to be understood, in order to improve
comprehension of the model.
While using power-law-distributed frequencies and volumes
for the agents, some peaks of consumption appeared in the outputs. They were
initially thought to be caused by drought periods, since the model is devised
to capture this process. However, the fact that some peaks did not seem to
correspond to dry periods led to a more thorough investigation of all peaks.
The outcome was that the particular implementation of the model allowed, in
some rare cases, the agent representing the institutions to broadcast a message
encouraging households to use more water than they already were.
This phenomenon does not appear
in the alternative, where the frequency and volume are normally distributed,
but that alternative does not produce satisfactory outcomes.
Fig 6: Peaks in demand observed
in parallel with dryness duration in the model
One could expect the effect
of dryness to drive consumption down. The matching of these visual
indications reveals that the reaction to an exhortation from the policy agent
does not always produce the expected effect. Only the major dryness period
lasting nine months in 1991 had an important and significant effect upon water
consumption (as did the subsequent ones, but on a smaller scale). The other
dryness periods actually seem to have no effect; only when the level is already
very high can a decrease be observed.
This demonstrates that either the
behaviour of the agents is not implemented properly, or there are other
method-related issues.
Although this behaviour does not
seem to affect the vast majority of the runs, it requires investigation to
understand the cause of such variance within the simulations.
The detailed study of this run
shows exactly what is happening. As expressed in the description of the model,
the policy agent uses a kind of average of the frequency and volume data
observed from the households. Due to the initial conditions, the policy agent
can be biased by some extreme randomised household values, and would therefore
broadcast a message leading households to adapt towards patterns whose
recommended values are higher than those the households themselves are using.
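The mechanism is easy to reproduce. With fat-tailed initial draws, the mean the policy agent observes can sit well above what most households actually use, as in this hedged sketch (the model’s actual averaging rule is only approximated here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Fat-tailed initial frequencies: a Pareto draw yields a few extreme
# households that drag the mean far above the typical (median) value.
frequencies = rng.pareto(1.5, size=20) + 1.0

broadcast = frequencies.mean()    # the "average" use the policy agent observes
typical = np.median(frequencies)  # what most households actually do
print(f"broadcast 'average' use: {broadcast:.2f}")
print(f"median household use:    {typical:.2f}")
# If households adapt towards the broadcast value, the majority are being
# told, in effect, to use MORE water than they currently do.
print("fraction below broadcast:", (frequencies < broadcast).mean())
```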
Also, as one would expect, the
different cases of population repartition show that skewing the population
towards agents that are only marginally influenced by their environment tends
to generate patterns with significantly higher water demand levels. That can be
explained by the shift of influence from a citizen-centred view to a more
self-centred one: the messages from the institutions are no longer rated highly
enough to result in a change of behaviour by the households.
Finally, the global water demand generated by a
set of runs of this model can be displayed as follows:
Fig 7: Chart of multiple runs for
the same simulation settings
This shows the evolution of water
demand for the entire population, i.e. 20 agents in this case, located on a
grid of size 10, on which they can see what their neighbours are doing up to 4
cells away, without being limited by the edge of the grid.
Due to the (at this stage
unavoidable) effects of the power-law implementation, some of the runs are not
likely to be representative of standard behaviour. Still, the majority of the
runs are similar and can reasonably be considered representative of a scenario.
As the model is devised to analyse
the multiple issues surrounding the use of scenarios describing plausible
futures, the multi-agent system used demonstrates the importance of careful
implementation of objects and processes.
The multi-agent system can generate
patterns that are consistent, and that seem more accurate from a statistical
point of view than the traditional parametric approaches used in other studies
on the same subject.
While sensitivity to initial
conditions is generally what is investigated, it was found that, sometimes,
trivial modelling or even coding choices can have an important impact upon the
validity of the model itself.
During the validation and
verification processes, notable phenomena could be identified that demonstrated
interesting properties. Further investigation will be necessary to improve the
understanding of the reasons for this surprising diffusion process, as well as
to avoid the emergence of extreme scenarios.
Bak, P., 1997, How Nature Works: The Science of Self-Organized Criticality, Oxford: Oxford University Press.
Barthélémy, O., Moss, S., Downing, T. and Rouchier, J., 2002, Policy Modelling with ABSS: The Case of Water Demand Management, CPM Report 02-92.
Duboz, R., Ramat, E. and Preux, P., 2001, Towards a coupling of continuous and discrete formalisms in ecological modelling: influences of the choice of algorithms on results, in Proceedings of the 13th European Symposium on Simulation, Marseille, 18-20 October 2001, pp. 481-487.
Edmonds, B., 1999, Capturing Social Embeddedness: a Constructivist Approach, Adaptive Behavior, 7:323-348.
Environment Agency, 2001, A Scenario Approach to Water Demand Forecasting, Environment Agency.
Ferber, J., 1999, Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence, Addison-Wesley.
Hales, D. and Edmonds, B., 2003, M2M Communication, paper presented at the M2M conference, Marseille, 31 March-1 April 2003.
Herrington, P., 1996, Climate Change and the Demand for Water, London: HMSO.
Jensen, H., 1998, Self-Organized Criticality: Emergent Complex Behaviour in Physical and Biological Systems, Cambridge: Cambridge University Press.
Moss, S., 2000, Competition in Intermediated Markets: Statistical Signatures and Critical Densities, CPM Report 01-79.
Moss, S., Downing, T. and Rouchier, J., 2000, Demonstrating the Role of Stakeholder Participation: An Agent Based Social Simulation Model of Water Demand Policy and Response, CPM Report 00-76.
Moss, S., Pahl-Wostl, C. and Downing, T., 2001, Agent based integrated assessment modelling, Integrated Assessment, 2:17-34.
Rogers, E. M., 1995, Diffusion of Innovations, New York; London: Free Press.
Trondsen, T.J., 1996, Some characteristics of adopters of a major innovation in the computer field and its potential use in marketing, Industrial Marketing Management, vol. 25, issue 6.
Valente, T.W., 1996, Social network thresholds in the diffusion of innovations, Social Networks, vol. 18, pp. 69-89.
Young, P.H., 2002, The Diffusion of Innovation in Social Networks, working paper, The Santa Fe Institute.