Collecting, Cleaning and Evaluating Pilot Project Data

Photo credit: Susana Hey, MBTA, 2019

From July 2015 through June 2016, the MBTA ran the Youth Pass Pilot, which tested a way to provide reduced-price MBTA passes to eligible young people without placing too much administrative burden on the MBTA. The results were published in the Youth Pass Final Report and presented to the Fiscal and Management Control Board (FMCB), which made two major changes based on the pilot research: adding the ability to purchase a Student Monthly Pass at fare vending machines (FVMs) with a Student CharlieCard, and making permanent the means-tested portion of the Youth Pass for those not in school.

Youth Pass Pilot Goals

Before the Youth Pass Pilot, some young people riding the T received reduced-price passes during the school year from their school, but these passes were limited to some schools and were only valid for the school year (and therefore did not cover the months of July and August). Demand for the Youth Pass came from students not able to access the Student Pass and young people not enrolled in middle school or high school. The Youth Pass Pilot was designed to evaluate three aspects of the proposed program: benefits to the participants, cost to the MBTA, and administrative feasibility of working with municipal partners. In order to answer these questions, the pilot included a robust data collection and analysis component.

Benefit to the participants was principally measured by the ability to take more trips on the MBTA. The table below shows MBTA usage for the participants who previously did not receive a reduced-price pass. During a school month, before receiving their Youth Pass, these participants traveled an average of 33 times on the MBTA. After receiving the Youth Pass, participants traveled an average of 58 times.

Average Unlinked Trips Per Month for School Months

Note: No student monthly LinkPass use in baseline data

| Participant Category | Baseline: School Month | Youth Pass: School Month | Change (Total) | Change (Percentage) |
| --- | --- | --- | --- | --- |
| Enrolled in School (Did not use Monthly Student Pass) | 27.4 | 54.1 | +26.9 | +97.4% |
| Not Enrolled in School | 37.3 | 62.2 | +24.9 | +66.8% |
| Average for All Participants | 32.6 | 57.6 | +25.0 | +76.7% |

Data Sources: MBTA Youth Pass pilot program application data; MBTA baseline AFC data; MBTA Youth Pass pilot AFC data.  
Source: MBTA Youth Pass Pilot Evaluation — Final Report
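
The two "Change" columns follow directly from the baseline and Youth Pass averages. As a minimal sketch (the function name `change` is ours, not from the report), the arithmetic for two of the rows can be reproduced like this:

```python
def change(baseline, pilot):
    """Return (absolute change, percentage change) relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    delta = pilot - baseline
    return delta, 100.0 * delta / baseline


# Average unlinked trips per school month, copied from the table above.
rows = {
    "Not Enrolled in School": (37.3, 62.2),
    "Average for All Participants": (32.6, 57.6),
}

for label, (baseline, pilot) in rows.items():
    delta, pct = change(baseline, pilot)
    print(f"{label}: {delta:+.1f} trips ({pct:+.1f}%)")
```

The percentage is always taken relative to the baseline month, so the same absolute increase counts for more in a group that started with fewer trips.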

On average, Youth Pass Pilot participants increased their usage of the MBTA by over 75% after enrolling in the program and purchasing a discounted MBTA pass. While the result is straightforward, collecting and analyzing the data behind it was more complicated. The remainder of this post describes how we gathered, cleaned and analyzed the data for the table above.

Collecting the data

Researchers designing the Youth Pass Pilot created a detailed research design to ensure they would have adequate data for evaluating the Pilot. The first step was to design an intake survey for potential participants that collected demographic information, school enrollment information, and transit usage patterns (frequency, purpose, payment type). The Youth Pass Pilot program was advertised by the MBTA, non-profit organizations and the municipal partners, and as a result over 4,000 potential participants applied online. Once randomly selected for the program, participants were contacted and brought their proof of eligibility to their applicable partner office. At this point, participants agreed to be research subjects and filled out paperwork agreeing to have their T usage tracked for the study. Each was assigned a participant number to anonymize their data for the rest of the study. To establish a baseline of usage data, participants provided their current CharlieCard number (or were given a new, empty card) and were asked to use the MBTA as they normally would for 30 days before they could purchase a Youth Pass. While a plurality of participants started at the outset of the pilot, additional youth were added on a rolling basis throughout the program; the baseline data in the table above therefore covers multiple months.

After completing the 30-day baseline period, participants were able to purchase a pilot Youth Pass at the partner office. These passes were not specially coded in the MBTA’s Automated Fare Collection (AFC) database, so each card serial number had to be manually recorded and matched with a participant in order to analyze the usage data. In cases where participants lost passes, the new numbers had to be matched as well. After matching pass numbers, analysts queried the MBTA AFC database to pull all the associated usage data from any serial number designated as belonging to a Youth Pass participant (many participants ended up using multiple cards during the Pilot period). These queries searched all the transactions in the AFC database (roughly 30 million each month) and returned all those made with a serial number associated with a Youth Pass participant.
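
The matching step is essentially a lookup from card serial number to participant number, applied to every transaction. The sketch below illustrates the idea on a handful of in-memory records; the field names, serial number format and participant numbers are all hypothetical, and the real queries ran against the AFC database rather than Python lists.

```python
from collections import defaultdict

# Hypothetical hand-recorded assignments: (card serial number, participant number).
# A participant appears more than once if a card was lost and replaced.
card_assignments = [
    ("5-1001", "P001"),
    ("5-1002", "P002"),
    ("5-1003", "P001"),  # replacement card for P001
]

# Hypothetical AFC transactions; real records carry many more fields.
transactions = [
    {"serial": "5-1001", "timestamp": "2015-10-05T07:42:00", "device": "faregate"},
    {"serial": "5-9999", "timestamp": "2015-10-05T07:43:00", "device": "farebox"},
    {"serial": "5-1003", "timestamp": "2015-10-20T15:10:00", "device": "farebox"},
]


def match_transactions(transactions, card_assignments):
    """Keep only transactions made with a participant's card, grouped by participant."""
    serial_to_participant = dict(card_assignments)
    by_participant = defaultdict(list)
    for txn in transactions:
        participant = serial_to_participant.get(txn["serial"])
        if participant is not None:  # drop cards that are not part of the pilot
            by_participant[participant].append(txn)
    return dict(by_participant)


matched = match_transactions(transactions, card_assignments)
for participant, txns in matched.items():
    print(participant, len(txns))
```

Keying the lookup on the serial number means a participant's original and replacement cards end up under the same participant number, which is what allows usage to be followed across multiple cards.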

Multiple layers of complexity and potential error came into play at this step. The serial numbers were recorded manually by the city partners and the cards do not come in any particular order, so a number could easily be copied down incorrectly. Data had to be checked to make sure the usage passed the “smell test”: that it fell in the correct time period and included a pass purchase at the partner office. If participants lost their cards, they could receive a replacement LinkPass on a ticket if they reported the loss early enough in the month. While this allowed participants to continue riding the T, their usage of any tickets was difficult to link to the rest of their usage data. Tickets are also often flashed (rather than inserted into the farebox) on board buses, so these trips would be absent from the data. Fortunately, the rate of lost passes was very low (less than 2%), so the effect of lost passes on the collected data was marginal.
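
A check like the “smell test” can be expressed as two conditions on the transactions linked to a card. The following is only an illustrative sketch: the record fields, the `kind` values and the partner office identifier are assumptions, and the date window simply reflects the pilot's July 2015 through June 2016 run.

```python
from datetime import date

PILOT_START = date(2015, 7, 1)
PILOT_END = date(2016, 6, 30)
PARTNER_OFFICES = {"partner-office-1"}  # hypothetical location identifier


def passes_smell_test(card_txns, partner_offices=PARTNER_OFFICES):
    """Return True if a card's usage looks like it belongs to a pilot participant."""
    # Condition 1: every transaction falls inside the pilot period.
    in_period = all(PILOT_START <= t["date"] <= PILOT_END for t in card_txns)
    # Condition 2: at least one pass purchase happened at a partner office.
    bought_at_partner = any(
        t["kind"] == "pass_purchase" and t["location"] in partner_offices
        for t in card_txns
    )
    return in_period and bought_at_partner


good_card = [
    {"date": date(2015, 10, 1), "kind": "pass_purchase", "location": "partner-office-1"},
    {"date": date(2015, 10, 2), "kind": "validation", "location": "faregate-17"},
]
miskeyed_card = [
    {"date": date(2014, 3, 9), "kind": "validation", "location": "faregate-17"},
]

print(passes_smell_test(good_card))
print(passes_smell_test(miskeyed_card))
```

A serial number that fails the check is a candidate for a transcription error and would need to be looked at by hand rather than dropped automatically.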

Analyzing the data

After the trip data was assembled, the final dataset included tables of trips made by Youth Pass participants, where each row represented one interaction with a fare device (usually a faregate or a farebox). For each interaction, we knew the date, time, type of interaction, location (if the device was stationary and not a farebox), amount of the transaction (usually $0 for a pass validation) and the signcode displayed if the interaction took place on a vehicle. We also knew the serial number of the card, and had joined that to a participant number, so we were able to divide participants into various groups (for example, which city they lived in or their household income) to explore their usage data.
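
One way to picture a row of this dataset is as a small record type. The sketch below is an assumption about shape only: the class name, field names and the `kind` values are ours, and the participant attributes ("City A", "City B") are placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class FareInteraction:
    participant: str           # anonymized participant number
    serial: str                # card serial number
    timestamp: datetime        # date and time of the interaction
    kind: str                  # hypothetical values, e.g. "pass_validation"
    location: Optional[str]    # None when the device is a farebox on a vehicle
    amount: Decimal            # usually 0 for a pass validation
    signcode: Optional[str]    # only present for interactions on a vehicle


def interactions_by_attribute(interactions, attribute_of):
    """Count interactions per participant attribute (for example, home city)."""
    return Counter(attribute_of[i.participant] for i in interactions)


interactions = [
    FareInteraction("P001", "5-1001", datetime(2015, 10, 5, 7, 42),
                    "pass_validation", "faregate-17", Decimal("0"), None),
    FareInteraction("P002", "5-1002", datetime(2015, 10, 5, 8, 5),
                    "pass_validation", None, Decimal("0"), "signcode-1"),
    FareInteraction("P001", "5-1001", datetime(2015, 10, 5, 16, 30),
                    "pass_validation", None, Decimal("0"), "signcode-2"),
]
home_city = {"P001": "City A", "P002": "City B"}  # placeholder attributes

print(interactions_by_attribute(interactions, home_city))
```

Because each interaction carries the participant number, any attribute collected on the intake survey can be used as the grouping key in the same way.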

Putting it together

One of the key questions we examined in the Youth Pass Pilot was whether access to the Youth Pass increased young people’s use of the system, and by how much. The full report examines this question in several ways, but the most striking result is in the table above. To make this table, we first excluded any participant who had reported receiving a student monthly pass. Most of this group received a pass from Boston Public Schools and were only interested in the Youth Pass for the summer (the changes to the student pass discussed above have addressed the needs of these students). The remaining participants were divided into two groups: those who were in school (but didn’t receive a student pass) and those who were not in school. We then divided the usage data into two buckets, “Summer” months (July and August) and “School” months, as we had noticed that travel patterns varied widely between the summer and the school year. School months showed more usage in general than the summer months, so those months are shown here. Trips that occurred during months when the participant did not use a Youth Pass were also removed from the data.
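
The exclusion and bucketing rules described here are simple enough to write down as two small functions. This is a sketch under assumed field names (`had_student_pass`, `in_school`); it is not the code used for the report.

```python
from datetime import date

SUMMER_MONTHS = {7, 8}  # July and August


def month_bucket(trip_date):
    """Classify a trip date as falling in a "Summer" or "School" month."""
    return "Summer" if trip_date.month in SUMMER_MONTHS else "School"


def participant_group(participant):
    """Return the table category, or None if the participant is excluded."""
    if participant["had_student_pass"]:
        return None  # already received a student monthly pass: left out of the table
    if participant["in_school"]:
        return "Enrolled in School"
    return "Not Enrolled in School"


# Hypothetical participants built from intake survey answers.
participants = [
    {"id": "P001", "in_school": True, "had_student_pass": True},
    {"id": "P002", "in_school": True, "had_student_pass": False},
    {"id": "P003", "in_school": False, "had_student_pass": False},
]

for p in participants:
    print(p["id"], participant_group(p))
print(month_bucket(date(2015, 8, 14)), month_bucket(date(2015, 10, 5)))
```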

Finally, the baseline data was averaged for each group and compared to the data from the study period. As the table shows, participants' usage of the T rose by roughly 77% on average with the Youth Pass. While an increase makes sense when one considers that participants previously had to pay more for these rides, its size points to substantial latent demand for MBTA service. The research also surveyed participants to find out the purpose of these trips and how they would have made them without a Youth Pass. This data, combined with the other aspects of the pilot, gave the FMCB the information it needed to decide whether to extend the pilot to a full program.
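
The averaging step can be sketched as a group-by over monthly trip counts. The numbers below are made up for illustration, and averaging over participant-months is our assumption about the unit of analysis; the report itself is the authority on the exact method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (group, phase, trips in one participant-month).
monthly_counts = [
    ("Enrolled in School", "baseline", 20),
    ("Enrolled in School", "baseline", 30),
    ("Enrolled in School", "pilot", 50),
    ("Enrolled in School", "pilot", 60),
    ("Not Enrolled in School", "baseline", 40),
    ("Not Enrolled in School", "pilot", 60),
]


def average_by_group(monthly_counts):
    """Average monthly trips for every (group, phase) combination."""
    buckets = defaultdict(list)
    for group, phase, trips in monthly_counts:
        buckets[(group, phase)].append(trips)
    return {key: mean(values) for key, values in buckets.items()}


averages = average_by_group(monthly_counts)
for group in sorted({group for group, _, _ in monthly_counts}):
    baseline = averages[(group, "baseline")]
    pilot = averages[(group, "pilot")]
    print(f"{group}: baseline {baseline:.1f}, Youth Pass {pilot:.1f}")
```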