UNIVERSITY OF CALIFORNIA TRANSPORTATION CENTER

FINAL REPORT

 

PROJECT TITLE:                            Development of Estimation Procedures for Activity-Based Model Forecasting

PRINCIPAL INVESTIGATOR:      Will Recker

DEPT & CAMPUS ADDRESS:       Institute of Transportation Studies

                                                            University of California, Irvine

                                                            Irvine, CA 92697-3600

TELEPHONE:                                   (949) 824-5642

FAX NUMBER:                                (949) 824-8385

E-mail Address:                         wwrecker@uci.edu

FUNDING:                                        UCTC Year 12 Research Grant

OVERVIEW:

The work is based on previous activity-based research conducted by the principal investigator and his colleagues and directed toward developing a practical estimation procedure that will enable the use of a mathematical programming activity-based model as an effective demand forecasting tool.  In previous studies the modeling approach has been applied successfully to a number of transportation applications to explore issues relating to such areas as vehicle emissions, accessibility, trip-chaining, ridesharing and travel time reduction; in all of these applications the specification of the objective function is prescribed by the analyst; e.g., the minimization of emissions produced by travel.  Conversely, the typical problem in demand modeling is focused on inferring the relative weights associated with potential components of the utility function that are determinants to a population's revealed selection of the decision variables.  This particular aspect of the application of the activity-based research approach has remained a challenge.

It is argued that the activity-based modeling framework developed offers a real analytical option for estimating the relative importance of factors associated with the spatial and temporal interrelationships among the out-of-home activities that motivate a household's need or desire to travel.  Demand estimation within the activity-based modeling framework can provide both the necessary constraint considerations on the household's decision alternatives within a utility-maximizing structure as well as a convenient mechanism for generating the set of feasible alternatives that are likely to be considered. 

In this research, we implement an estimation procedure for a particular mathematical programming activity-based model in order to estimate the relative importance of factors associated with spatial and temporal interrelationships among the out-of-home activities that motivate a household’s need or desire to travel.  The method uses a genetic algorithm to estimate coefficient values of the utility function, based on a particular multidimensional sequence alignment method to deal with the nominal, discrete, attributes of the activity/travel pattern (e.g., which household member performs which activity, which vehicle is used, sequencing of activities), and a time sequence alignment method to handle temporal attributes of the activity pattern (e.g., starting and ending time of each activity and/or travel).  The estimation procedure is tested on data drawn from a well-know activity/travel survey.  The research has produced the first known activity-based model framework that can be applied to empirical demand forecasting – an application that has long eluded activity-based modeling proponents.

KEY FINDINGS:

In this research, we develop and implement an estimation procedure for the Household Activity Pattern Problem (HAPP) model (a mathematical programming activity-based model offered by Recker, 1995) in order to estimate the relative importance of factors associated with the spatial and temporal interrelationships among the out-of-home activities that motivate a household’s need or desire to travel.  The procedure provides both the necessary constraint considerations on the household’s decision alternatives within a utility-maximizing structure as well as a convenient mechanism for generating the set of feasible alternatives that are likely to be considered.

The HAPP model is in the form of a Mixed Integer Linear Programming model (MILP), i.e., one comprising continuous variables (such temporal attributes of an activity pattern as the starting times of the associated activities) as well as discrete variables (e.g., attributes associated with the sequencing of activities, travel modes used, and persons performing the activities).  Because of this complexity, an estimation approach based on a heuristic algorithm is the best (and, arguably, the only) available option for solution.  From the collection of such heuristic algorithms, a Genetic Algorithm (GA) approach (Holland, 1992 a,b) was used because of its considerable advantages for this particular application.  To deal with the comparison of discrete attributes between the actual activity pattern and the predicted activity pattern, we use a variation of the Multidimensional Sequence Alignment Method (Arentze, et al, 2002).  The comparison of continuous attributes is based on a Time Sequence Method that was developed as part of this research.  We used data drawn from a well-known activity-travel survey to test the implementation of the proposed estimation procedure, and then list further research needed to extend the estimation procedure.

The general form of the HAPP mathematical program formulation of the travel/activity decisions for a particular household, say, i, during some time period is represented by:

Minimize                                                                                                     

Subject to:

In the above,  is a vector of coefficients that defines the relative contributions of each of the decision variables to the overall disutility of the travel regime to the household.  Descriptively, the constraint sets  for this MILP are classified into six groups: (a) routing constraints that define the allowable spatial movement of vehicles and household members in completing the household’s activity agenda; (b) scheduling constraints specify the relationship of arrival time, activity begin time, and waiting time, and continuity condition along the temporal dimension; (c) assignment constraints that are applied to match the relations between activity participation and vehicle usage as well as activity performers (household members); (d) time window constraints that are used to specify available schedules for activity participation; (e) coupling constraints that define the relations between vehicle-related variables and member-related variables: and (f) side constraints including budget, capacity and rules for ride-sharing behavior. With the exception of the side constraints (i.e., classification “f” above”), these constraints capture the physical conditions that ensure that each member of the household, as well as each vehicle used by the household, have a consistent, continuous, path through time-space that results in all of the activities on the household’s agenda being successfully completed. The reader interested in a detailed derivation and explanation of these constraints is referred to the original work by Recker (1995).

We define  as a multi-objective measure that indicates the fitness of any potential solution vector, , in minimizing the discrepancies between the observed and model values.  Then, using standard procedures in genetic algorithms, we generate an initial population of candidate solutions , p = 1,…, np and perform mating, crossover, and mutations (Holland, 1992a, Michalewicz, 1994) according to pre-described probability rules based on respective fitness scores until the population converges to within sufficient tolerance of the observed values.  At the beginning of the GA procedure, we generate an initial population randomly.  Subsequently, new populations are generated in the mating pool using the GA operations described above, based on the given probabilities.

For this study, the HAPP Mixed-Integer Linear Programming model is solved using the CPLEX algorithm in the GAMS software package developed by the World Bank.  Under the current implementation, the algorithm takes as input the composition (e.g., number of household members, number of vehicles by type), activity plan (e.g., number, location and type of activities to be completed) and constraints (e.g., activity time windows, pre-assigned roles) faced by a particular household, and a set of model coefficients (i.e., candidate utility weights), which are decoded from the chromosome produced by the GA.  (Each chromosome includes all coefficients in the HAPP model, represented as a binary string.)  Based on this information, the HAPP model outputs the predicted activity pattern that maximizes the objective utility function for the particular utility weights.

The first stage of the process optimally (i.e., minimum effort) “re-constructs” the nominal parameters (i.e., integer variables of the HAPP model) of the predicted household activity sequence to perfectly align with the observed sequence of the particular household (i.e., to match the observed assignments of household member and vehicle to each activity in the household’s activity plan, as well as the observed sequence in which the respective activities were performed).  This process uses the Multi-Dimensional Sequence Alignment Method to compare the two activity sequences, and produces the minimal effort required to perfectly align the two activity sequences.

The second stage of the process uses a Time Sequence Comparison Method to compare the two activity sequences, and determine the similarity S (overlap time) between identical activities in the two activity sequences.  (Because the activity durations are fixed, aligning the start times is a sufficient condition for producing a complete match between the two activity sequences, once the sequencing parameters have been matched.)  This second stage has two distinct components: 1) global temporal shift of the entire pattern, and 2) local temporal adjustment of each activity start time.

To demonstrate the estimation procedure, we randomly selected a few household activity patterns extracted from the Southwest Washington and Oregon Area 1994 Activity and Travel Behavior Survey, which contains sufficiently detailed information, including comprehensive travel/activity diaries (with mode availability) and regional transportation network model, to support an application of the HAPP model used in this study.  Since the HAPP model represents activity/travel behavior of the collective members of a household, detailed information is required on travel and activity participation for each member, as well as transportation supply information (including household vehicle holdings and network travel times).  To help define the time window constraint space, the average activity starting and ending times for each activity type have been computed for the whole sample to provide benchmark information on the temporal flexibility of the activities.  Shortest path travel times between all activity locations of each household have been generated for all the households in the sample using TRANSCAD.  This procedure allows for the exploration of all possible activity/travel linkages for each household, which is fundamental to the optimization procedure.

The variables comprising the assumed disutility function is provided in the table below.

Qualitative Interpretation of Components of Disutility Function

Variable

Interpretation

Total cost of travel to the household

Total travel time expended by members of the household

The difference between the time that a household member arrives home from a particular activity and the time he/she started the activity; a measure of delay time in returning home from an activity because he/she went to another activity prior to returning home (i.e., trip chained).

A measure of how close a household member comes to being unable to complete an activity before its window closes, i.e., the risk of not being able to perform an activity if some stochastic delay (e.g., congestion) had arisen.

Same as  above, but with respect to returning home from an activity no later than that needed/desired.

Total extent of the travel day; i.e., how spread out the travel is.

The "cost" associated with using more vehicles than absolutely necessary; e.g., using two different vehicles, rather than one.

Total waiting time (at destination) before performing activities.

 

The results of the estimation were very encouraging.  In all cases considered, the genetic algorithm converged either to the actual observed pattern or to a very close approximation to the observed pattern in relatively few iterations (four to five).

Although we believe that the procedures developed under this research represent an important step in moving activity-based travel approaches beyond descriptive analysis toward a foundation for practical demand modeling, there nonetheless remain significant challenges.  In the current implementation we have addressed the inference of utility coefficients for one activity pattern at a time; in a sense, we have managed only to find a “feasible” set of coefficients that, if applied to the assumed utility function, would replicate the observed pattern of that particular household. Further work is needed to deal with multiple activity patterns simultaneously so that a common estimated coefficient set can be applied to an entire sample of activity patterns with the result that the predicted activity patterns are closest to the actual activity patterns, within some stated confidence.  The mechanical aspects of this step are simple enough – it can be accomplished by redefining the fitness score as some weighted average across all sample points, and adding a “global” loop on the estimation process – but we may never be able to ferret out robust statistical measures of model performance or, more importantly, confidence in applying such coefficients to the population.

Because the total number of possible multidimensional operation sets is equal to multiplication of the numbers of all optimal paths in Uni-dimensional Sequence Alignment Method for all attributes, the Multidimensional Sequence Alignment Method might be very time-consuming when the number of activities considered is larger than those considered in the three test cases reported here.  Additionally, the Time Sequence Comparison technique that has been employed, which simply offsets activities along the time dimension to maximize the overlapped time or number of overlapped activities, may be too simplistic.  For purposes of illustration, we have used a fitness score that is simply the difference between the “edit distance” (i.e., the number of steps to equate two sequences) and the degree of overlap along the temporal axis.  The scaling problem is not addressed and the fitness score has no practical meaning.

Finally, despite its complete description (and analytical management) of the “hard” physical constraints imposed on a household’s collection of activity/travel decisions, the HAPP model has no explicit representation of such “soft” constraints as each member’s role in the household, or their particular predilections and idiosyncrasies, that may be imposed on top of these physical constraints.  One promising avenue for incorporating these considerations may be in adapting some of the procedures identified with the relatively new field of uncertain programming (Liu, 1999).

References

Arentze, T.A., F. Hofman, C.-H. Joh, and H. Timmermans (2002). Activity pattern similarity: a multidimensional sequence alignment method. Transportation Research Part B: Methodological, 36, pp. 385-403.

Holland, J.H. (1992a). Adaptation in Natural and Artificial Systems, 2nd Ed., MIT Press, Cambridge, MA.

Holland, J.H. (1992b).  Genetic algorithms.  Scientific American, July, pp. 66-72.

Liu, B.  (1999).  Uncertain Programming.  John Wiley & Sons, Inc. New York.

Michalewicz, Z. (1992). Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, New York.

Recker, W.W. (1995). The household activity pattern problem: general formulation and solution.   Transportation Research, 29B, 1, pp. 61-77.

 

 

REFERENCES (Other papers arising from this and related UCTC work):

 

Recker, W.W. and A. Parimi (1999).   Development of a microscopic activity-based framework for analyzing the potential impact of TCMs on vehicle emissions. Transportation Research, Part D: Transport and Environment, 4, 357-378.

Recker, W.W., C. Chen and M.G. McNally (2001).   Measuring the impact of efficient household travel decisions on potential travel time savings and accessibility gains. Transportation Research, Part A: Policy and Practice, 35, 339-369.

Recker, W.W. (2001).  A bridge between travel demand modeling and activity-based travel analysis. Transportation Research , Part B: Methodological, 35, 481-506.