Identify moved cars

First, we need to identify whether a car has moved or not. To do this, I removed rows with duplicated locations and kept only those where the location info had changed. Next, I took the rows where the car was first and last seen at each location. These give the timestamps of a trip: the last time a car is seen at one location is the start of a trip, and the first time it is seen at another location is the end of that trip.

Find and accurately plot the route

Then I use a for loop to go through the cars that have moved, and for each car I loop through its trips (most likely, a car will have multiple trips). For each trip, I get the route info from the Google Maps API (library(ggmap) does this). It also gives the distance between each turn. The route info is an approximation of the actual movement of a car. Car2go doesn't supply real-time GPS info while a car is moving; it only records when a customer checks out a trip. However, I think the approximated route is close enough to real life, assuming most Car2go customers use the service for transportation rather than for leisure and recreation.

Then I was able to plot the route for each car using the route data returned by Google Maps. The result is shown on the left below. While it roughly follows the route on the map, it fails for curvy roads with few turns. The reason is that route(output = 'simple') only gives an instruction for each turn, and between turns geom_path draws a straight line. To solve this problem, I found this article, which converts the polyline returned by the Google Maps API with route(output = 'all') into (lon, lat) coordinates. Now the path follows the actual route on the road, as shown in the right plot above.

All trips during a 24hr span

Next, I plot all the trips that happened between Dec 07 13:40 and Dec 08 13:36 (Monday - Tuesday). The covered area is, as expected, similar to the service area of Car2go Austin.
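As a side note on the polyline step from the routing section: the Google Maps API encodes each leg of a route as a compact string, and decoding it is what recovers the points between turns. The snippet below is a minimal Python sketch of that decoding, not the R code from the article I used. The function name decode_polyline is my own, and it assumes the default precision of 5 decimal places.

```python
def decode_polyline(encoded, precision=5):
    """Decode a Google encoded polyline into a list of (lon, lat) pairs.

    Hypothetical helper for illustration. The encoded format stores
    latitude first, so the pairs are swapped to (lon, lat) on output
    to match the plotting order used in the post.
    """
    factor = 10 ** precision
    coords = []
    index = 0
    lat = 0
    lon = 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # one latitude delta, then one longitude delta
            shift = 0
            result = 0
            while True:
                chunk = ord(encoded[index]) - 63
                index += 1
                result |= (chunk & 0x1F) << shift  # low 5 bits carry data
                shift += 5
                if chunk < 0x20:  # continuation bit not set: value complete
                    break
            # The lowest bit holds the sign; the rest is the magnitude.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]  # values are stored as offsets from the last point
        lon += deltas[1]
        coords.append((lon / factor, lat / factor))
    return coords
```

For example, the sample string from Google's polyline documentation, "_p~iF~ps|U_ulLnnqC_mqNvxq`@", decodes to three points, which in (lon, lat) order are (-120.2, 38.5), (-120.95, 40.7) and (-126.453, 43.252).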
Surprisingly, no trips took place on the UT campus during this period (except a very few on north campus). There could be several reasons: 1: limited parking space, 2: students are studying at home for final exams rather than attending classes, so there are significantly fewer people on campus, 3: Car2go is less popular than public transport among students. The actual reason is unknown from this set of data. More data (taken during the normal semester, during weekends when more parking is available, etc.) is needed.

Update: I just found that the UT campus is a stop-over area only, so this is not surprising at all.

Trip statistics

Next, let's take a look at trip statistics. Most trips are less than 5 miles and 50 minutes. Note that there is a significant number of trips shorter than 1 mile. While some of them are actual trips by customers, the rest could be noise in the data or cars being moved by Car2go itself.

Now we can take a look at the starting time of trips during a day. As shown in the above plot, most trips are for commuting (~8am and ~6pm), and very few trips took place around midnight.
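The trip extraction described at the start of the post, and the start-hour tally behind the last plot, could be sketched roughly as follows. This is a Python illustration rather than my original R code, and it makes some assumptions: the input is a list of (car_id, timestamp, lat, lon) snapshots, the function names extract_trips and start_hour_counts are made up for this example, and consecutive snapshots at the same location are grouped into a "stay" instead of dropping duplicate locations globally.

```python
from collections import Counter, defaultdict
from itertools import groupby


def extract_trips(snapshots):
    """Turn (car_id, timestamp, lat, lon) snapshots into a list of trips.

    A trip starts when a car is last seen at one location and ends when
    it is first seen at the next location. Timestamps are assumed to be
    datetime objects.
    """
    by_car = defaultdict(list)
    for car_id, ts, lat, lon in snapshots:
        by_car[car_id].append((ts, (lat, lon)))

    trips = []
    for car_id, rows in by_car.items():
        rows.sort(key=lambda r: r[0])  # order each car's snapshots by time
        stays = []
        # Collapse consecutive snapshots at the same location into one stay.
        for loc, group in groupby(rows, key=lambda r: r[1]):
            times = [ts for ts, _ in group]
            stays.append((loc, times[0], times[-1]))
        # Each pair of neighbouring stays is one trip.
        for (loc_a, _, last_a), (loc_b, first_b, _) in zip(stays, stays[1:]):
            trips.append({
                "car": car_id,
                "start": last_a,
                "end": first_b,
                "from": loc_a,
                "to": loc_b,
                "minutes": (first_b - last_a).total_seconds() / 60,
            })
    return trips


def start_hour_counts(trips):
    """Count how many trips start in each hour of the day (0-23)."""
    return Counter(trip["start"].hour for trip in trips)
```

A car that never changes location produces a single stay and therefore no trips, which matches the idea of only keeping cars that have moved. Note that the duration computed here is an upper bound: it runs from the last snapshot before the move to the first snapshot after it.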
2 Comments
1/7/2017 09:52:52 am
This is wonderful representation of car route and its preferred timings.
3/14/2017 01:30:51 am
This is quite interesting but scary at the same time. Does it do realtime updates as well? Because I don't think I'd like to publish something like this online.
Author

A mechanical engineer who also loves data.