UNIVERSITY
OF CALIFORNIA TRANSPORTATION CENTER
FINAL
REPORT
PROJECT
TITLE: Development of Estimation Procedures for
Activity-Based Model Forecasting
PRINCIPAL
INVESTIGATOR: Will Recker
DEPT
& CAMPUS ADDRESS: Institute of Transportation Studies
University
of California, Irvine
Irvine,
CA 92697-3600
TELEPHONE: (949)
824-5642
FAX
NUMBER: (949) 824-8385
E-mail Address: wwrecker@uci.edu
FUNDING: UCTC Year 12 Research Grant
OVERVIEW:
The work is based on previous activity-based
research conducted by the principal investigator and his colleagues and
directed toward developing a practical estimation procedure that will enable
the use of a mathematical programming activity-based model as an effective
demand forecasting tool. In previous
studies the modeling approach has been applied successfully to a number of
transportation applications to explore issues relating to such areas as vehicle
emissions, accessibility, trip-chaining, ridesharing and travel time reduction;
in all of these applications the specification of the objective function is
prescribed by the analyst; e.g., the minimization of emissions produced by
travel. Conversely, the typical problem
in demand modeling is focused on inferring the relative weights associated with
potential components of the utility function that are determinants to a
population's revealed selection of the decision variables. This particular aspect of the application of
the activity-based research approach has remained a challenge.
It is argued that the activity-based modeling
framework developed offers a real analytical option for estimating the relative
importance of factors associated with the spatial and temporal
interrelationships among the out-of-home activities that motivate a household's
need or desire to travel. Demand
estimation within the activity-based modeling framework can provide both the
necessary constraint considerations on the household's decision alternatives
within a utility-maximizing structure as well as a convenient mechanism for
generating the set of feasible alternatives that are likely to be
considered.
In this research, we implement an estimation
procedure for a particular mathematical programming activity-based model in
order to estimate the relative importance of factors associated with spatial
and temporal interrelationships among the out-of-home activities that motivate
a household’s need or desire to travel.
The method uses a genetic algorithm to estimate coefficient values of
the utility function, based on a particular multidimensional sequence alignment
method to deal with the nominal, discrete, attributes of the activity/travel
pattern (e.g., which household member performs which activity, which vehicle is
used, sequencing of activities), and a time sequence alignment method to handle
temporal attributes of the activity pattern (e.g., starting and ending time of
each activity and/or travel). The
estimation procedure is tested on data drawn from a well-know activity/travel
survey. The research has produced the first known activity-based model
framework that can be applied to empirical demand forecasting – an application
that has long eluded activity-based modeling proponents.
KEY FINDINGS:
In this research, we develop and implement an estimation procedure for
the Household Activity Pattern Problem (HAPP) model (a mathematical programming
activity-based model offered by Recker, 1995) in order to estimate the relative
importance of factors associated with the spatial and temporal
interrelationships among the out-of-home activities that motivate a household’s
need or desire to travel. The procedure
provides both the necessary constraint considerations on the household’s
decision alternatives within a utility-maximizing structure as well as a
convenient mechanism for generating the set of feasible alternatives that are
likely to be considered.
The HAPP model is in the form of a Mixed Integer Linear Programming
model (MILP), i.e., one comprising continuous variables (such temporal
attributes of an activity pattern as the starting times of the associated
activities) as well as discrete variables (e.g., attributes associated with the
sequencing of activities, travel modes used, and persons performing the
activities). Because of this
complexity, an estimation approach based on a heuristic algorithm is the best
(and, arguably, the only) available option for solution. From the collection of such heuristic
algorithms, a Genetic Algorithm (GA) approach (Holland, 1992 a,b) was used
because of its considerable advantages for this particular application. To deal with the comparison of discrete
attributes between the actual activity pattern and the predicted activity
pattern, we use a variation of the Multidimensional Sequence Alignment Method
(Arentze, et al, 2002). The comparison
of continuous attributes is based on a Time Sequence Method that was developed
as part of this research. We used data
drawn from a well-known activity-travel survey to test the implementation of
the proposed estimation procedure, and then list further research needed to
extend the estimation procedure.
The general form of the HAPP mathematical program formulation of the
travel/activity decisions for a particular household, say, i, during
some time period is represented by:
Minimize
Subject to:
![]()
In the above,
is a vector of
coefficients that defines the relative contributions of each of the decision
variables to the overall disutility of the travel regime to the household. Descriptively, the constraint sets
for this MILP are
classified into six groups: (a) routing constraints that define the allowable
spatial movement of vehicles and household members in completing the
household’s activity agenda; (b) scheduling constraints specify the
relationship of arrival time, activity begin time, and waiting time, and
continuity condition along the temporal dimension; (c) assignment constraints
that are applied to match the relations between activity participation and
vehicle usage as well as activity performers (household members); (d) time
window constraints that are used to specify available schedules for activity
participation; (e) coupling constraints that define the relations between
vehicle-related variables and member-related variables: and (f) side
constraints including budget, capacity and rules for ride-sharing behavior.
With the exception of the side constraints (i.e., classification “f” above”),
these constraints capture the physical conditions that ensure that each member
of the household, as well as each vehicle used by the household, have a
consistent, continuous, path through time-space that results in all of the
activities on the household’s agenda being successfully completed. The reader
interested in a detailed derivation and explanation of these constraints is
referred to the original work by Recker (1995).
We define
as a multi-objective
measure that indicates the fitness of any potential solution vector,
, in minimizing the discrepancies between the observed and
model values. Then, using standard
procedures in genetic algorithms, we generate an initial population of
candidate solutions
, p = 1,…, np and perform mating, crossover, and mutations (Holland, 1992a,
Michalewicz, 1994) according to pre-described probability rules based on
respective fitness scores until the population converges to within sufficient
tolerance of the observed values. At
the beginning of the GA procedure, we generate an initial population
randomly. Subsequently, new populations
are generated in the mating pool using the GA operations described above, based
on the given probabilities.
For this study, the HAPP Mixed-Integer Linear Programming model is
solved using the CPLEX algorithm in the GAMS software package developed by the
World Bank. Under the current
implementation, the algorithm takes as input the composition (e.g., number of
household members, number of vehicles by type), activity plan (e.g., number,
location and type of activities to be completed) and constraints (e.g.,
activity time windows, pre-assigned roles) faced by a particular household, and
a set of model coefficients (i.e., candidate utility weights), which are
decoded from the chromosome produced by the GA. (Each chromosome includes all coefficients in the HAPP model,
represented as a binary string.) Based
on this information, the HAPP model outputs the predicted activity pattern that
maximizes the objective utility function for the particular utility weights.
The first stage of the process optimally
(i.e., minimum effort) “re-constructs” the nominal parameters (i.e., integer
variables of the HAPP model) of the predicted household activity sequence to
perfectly align with the observed sequence of the particular household (i.e.,
to match the observed assignments of household member and vehicle to each
activity in the household’s activity plan, as well as the observed sequence in
which the respective activities were performed). This process uses the Multi-Dimensional Sequence Alignment Method
to compare the two activity sequences, and produces the minimal effort required
to perfectly align the two activity sequences.
The second stage of the process uses a Time Sequence Comparison Method
to compare the two activity sequences, and determine the similarity S (overlap time) between identical
activities in the two activity sequences.
(Because the activity durations are fixed, aligning the start times is a
sufficient condition for producing a complete match between the two activity
sequences, once the sequencing parameters have been matched.) This second stage has two distinct
components: 1) global temporal shift of the entire pattern, and 2) local
temporal adjustment of each activity start time.
To demonstrate the estimation procedure, we
randomly selected a few household activity patterns extracted from the
Southwest Washington and Oregon Area 1994 Activity and Travel Behavior Survey,
which contains sufficiently detailed information, including comprehensive travel/activity
diaries (with mode availability) and regional transportation network model, to
support an application of the HAPP model used in this study. Since the HAPP model represents
activity/travel behavior of the collective members of a household, detailed
information is required on travel and activity participation for each member,
as well as transportation supply information (including household vehicle
holdings and network travel times). To
help define the time window constraint space, the average activity starting and
ending times for each activity type have been computed for the whole sample to
provide benchmark information on the temporal flexibility of the
activities. Shortest path travel times
between all activity locations of each household have been generated for all
the households in the sample using TRANSCAD.
This procedure allows for the exploration of all possible
activity/travel linkages for each household, which is fundamental to the
optimization procedure.
The variables comprising the assumed disutility function is provided in
the table below.
Qualitative Interpretation of Components of
Disutility Function
|
Variable |
Interpretation |
|
|
Total cost of travel to the household |
|
|
Total travel time expended by members of the household |
|
|
The difference between the time that a household member arrives home
from a particular activity and the time he/she started the activity; a
measure of delay time in returning home from an activity because he/she went
to another activity prior to returning home (i.e., trip chained). |
|
|
A measure of how close a household member comes to being unable to
complete an activity before its window closes, i.e., the risk of not being
able to perform an activity if some stochastic delay (e.g., congestion) had
arisen. |
|
|
Same as |
|
|
Total extent of the travel day; i.e., how spread out the travel is. |
|
|
The "cost" associated with using more vehicles than
absolutely necessary; e.g., using two different vehicles, rather than one. |
|
|
Total waiting time (at destination) before performing activities. |
The results of the estimation were very encouraging. In all cases considered, the genetic
algorithm converged either to the actual observed pattern or to a very close
approximation to the observed pattern in relatively few iterations (four to
five).
Although we believe that the
procedures developed under this research represent an important step in moving
activity-based travel approaches beyond descriptive analysis toward a
foundation for practical demand modeling, there nonetheless remain significant
challenges. In the current
implementation we have addressed the inference of utility coefficients for one
activity pattern at a time; in a sense, we have managed only to find a
“feasible” set of coefficients that, if applied to the assumed utility
function, would replicate the observed pattern of that particular household.
Further work is needed to deal with multiple activity patterns simultaneously
so that a common estimated coefficient set can be applied to an entire sample
of activity patterns with the result that the predicted activity patterns are
closest to the actual activity patterns, within some stated confidence. The mechanical aspects of this step are
simple enough – it can be accomplished by redefining the fitness score as some
weighted average across all sample points, and adding a “global” loop on the
estimation process – but we may never be able to ferret out robust statistical
measures of model performance or, more importantly, confidence in applying such
coefficients to the population.
Because the total number of
possible multidimensional operation sets is equal to multiplication of the
numbers of all optimal paths in Uni-dimensional Sequence Alignment Method for
all attributes, the Multidimensional Sequence Alignment Method might be very
time-consuming when the number of activities considered is larger than those
considered in the three test cases reported here. Additionally, the Time Sequence Comparison technique that has
been employed, which simply offsets activities along the time dimension to
maximize the overlapped time or number of overlapped activities, may be too
simplistic. For purposes of
illustration, we have used a fitness score that is simply the difference
between the “edit distance” (i.e., the number of steps to equate two sequences)
and the degree of overlap along the temporal axis. The scaling problem is not addressed and the fitness score has no
practical meaning.
Finally, despite its complete
description (and analytical management) of the “hard” physical constraints
imposed on a household’s collection of activity/travel decisions, the HAPP
model has no explicit representation of such “soft” constraints as each
member’s role in the household, or their particular predilections and
idiosyncrasies, that may be imposed on top of these physical constraints. One promising avenue for incorporating these
considerations may be in adapting some of the procedures identified with the
relatively new field of uncertain programming (Liu, 1999).
Arentze, T.A., F. Hofman, C.-H. Joh, and H.
Timmermans (2002). Activity pattern similarity: a multidimensional sequence
alignment method. Transportation Research
Part B: Methodological, 36, pp. 385-403.
Holland, J.H. (1992a). Adaptation in
Natural and Artificial Systems, 2nd Ed., MIT Press, Cambridge,
MA.
Holland, J.H. (1992b). Genetic algorithms. Scientific
American, July, pp. 66-72.
Liu, B.
(1999). Uncertain Programming. John
Wiley & Sons, Inc. New York.
Michalewicz, Z. (1992). Genetic Algorithms
+ Data Structures = Evolution Programs. Springer-Verlag, New York.
Recker, W.W. (1995). The household activity
pattern problem: general formulation and solution. Transportation Research,
29B, 1, pp. 61-77.
REFERENCES
(Other papers arising from this and related UCTC work):
Recker, W.W. (2001). A bridge
between travel demand modeling and activity-based travel analysis. Transportation Research , Part B:
Methodological, 35, 481-506.