Update: using Manhattan distance in k-means
Note the capital 'K' in the Kmeans function from the amap package.
Using Manhattan distance puts more weight on sparsely distributed locations. As a result, more of the proposed EV station locations land in remote areas and fewer in downtown.
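To see how the choice of metric can change which centroid a location belongs to, here is a minimal Python sketch (not the R script used for this post, and not a description of how amap's Kmeans works internally). The helper names `manhattan`, `euclidean`, and `nearest`, as well as the coordinates, are made up for illustration.

```python
def manhattan(p, q):
    """Manhattan (city-block) distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def euclidean(p, q):
    """Straight-line distance between two 2-D points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


def nearest(point, centroids, dist):
    """Index of the centroid closest to `point` under the metric `dist`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


# Hypothetical planar coordinates, chosen so the two metrics disagree.
car = (0, 0)
centroids = [(3, 3), (5, 0)]

by_euclidean = nearest(car, centroids, euclidean)  # compares 4.24 vs 5.0
by_manhattan = nearest(car, centroids, manhattan)  # compares 6 vs 5
```

The same car is assigned to different centroids depending on the metric, which is why switching metrics can move the proposed stations.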
The code is published on my GitHub.
Previously, I did some analysis on car2go's location data to find the most popular roads in Austin. But we can do much more. One question I have is: if car2go wanted to replace the entire Austin fleet with electric vehicles, where should the charging stations be? Can we use the existing public charging stations? How many more should we build, and where? In this article, I will try to answer these questions using the data I scraped. If car2go ever decides to do this, it would need a more thorough analysis than this one, especially on the business side. However, this could be a good starting point.
Fun fact: in the whole US, car2go's only all-EV fleet is in San Diego.
## Location data
I'll use the location data I scraped last month. A car will have multiple entries because it is not constantly moving. Those duplicated entries seem redundant at first. However, since charging an EV takes substantially more time than filling a tank, a car staying in one place for a prolonged time makes that place more suitable for a charging station. Therefore, these repeated entries act as extra weight in my algorithm later on.
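The effect of keeping the duplicates can be illustrated with a small Python sketch (not the R script behind this post). The function `weighted_centroid` and the two coordinates are hypothetical; the point is only that a location sampled three times pulls the average three times as hard as a location sampled once.

```python
from collections import Counter


def weighted_centroid(points):
    """Mean of (lat, lon) points; each repeated point counts once per repeat."""
    counts = Counter(points)
    total = sum(counts.values())
    lat = sum(p[0] * n for p, n in counts.items()) / total
    lon = sum(p[1] * n for p, n in counts.items()) / total
    return lat, lon


# Hypothetical samples: one car seen at A in three scrapes, another at B in one.
A = (30.40, -97.72)
B = (30.20, -97.76)
samples = [A, A, A, B]

centroid = weighted_centroid(samples)  # lies three quarters of the way from B to A
```

Dropping the duplicates would instead put the centroid exactly halfway between A and B.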
First, let's see those locations.
Note those remote home areas: the Domain, Far West, and the parking spot near the airport.
## Finding optimal location for charging stations
To locate optimal charging stations, we need to minimize the distance that car2go staff have to move a car from where it is returned to the nearest station. One method that immediately comes to mind is k-means. It does exactly what we need: it finds those locations (the centroids). So the next question is: how many charging stations? Can we use the data to determine the number? Let's plot the within-group sum of squares (WSS).
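For readers who want to see the mechanics, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) and of the WSS values behind such a plot. It is not the R script used for this post: the function names, the seed, and the synthetic three-cluster data are all assumptions for illustration, and it treats coordinates as planar.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, within-group sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[i].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, members in enumerate(clusters):
            if members:
                n = len(members)
                updated.append((sum(m[0] for m in members) / n,
                                sum(m[1] for m in members) / n))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    wss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wss


# Synthetic stand-in for the scraped locations: three tight groups of points.
rng = random.Random(42)
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(30)]

# WSS for each candidate number of stations; plotting this gives the elbow curve.
wss_by_k = {k: kmeans(points, k)[1] for k in range(1, 7)}
```

Plotting `wss_by_k` against k gives the curve whose "elbow" suggests a reasonable number of centroids.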
So it seems that beyond 10 centroids, adding more no longer reduces the overall WSS significantly. But is this the optimal number? It seems too few. We have to consider more aspects: the cost of a new charging station, the cost of moving vehicles per unit distance, the maximum range of a car, or even towing expenses. All of these require more data and a business mind. For the sake of this article, I will assume that building a charging station is relatively cheap and that the top priority is customer convenience. So let's take 50 charging stations.
The crosses in the figure are the proposed charging stations. The algorithm suggests deploying a station at each of those remote home areas: the Domain, Far West, and the parking spot near the airport. As expected, more stations should be deployed downtown.
## Using existing public charging stations
For those locations, can we use the existing charging stations in Austin? I downloaded EV station data from here: http://www.afdc.energy.gov/data_download/. Now let's plot the proposed (X) and existing (E) stations together.
Again, downtown is well covered. But residential areas like Barton Hills and South Lamar are not. The reason is that public EV stations are often built at places of interest (e.g. malls), while car2go's parking rules require the cars to be parked at street meters. If I have to park at a mall, I need to pay for the entire duration. Given this, it is not surprising that additional charging stations are needed.
The criterion for a new station is that no existing station is within 0.5 miles of the proposed station.
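This check can be sketched in a few lines of Python (again, not the R script behind this post). The sketch assumes great-circle (haversine) distance and a mean Earth radius of 3958.8 miles; the function names and the sample coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an assumption of this sketch


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def needs_new_station(proposed, existing, threshold_miles=0.5):
    """True if no existing station lies within the threshold of the proposed one."""
    return all(haversine_miles(proposed, e) > threshold_miles for e in existing)


# Hypothetical coordinates for illustration only.
existing = [(30.2672, -97.7431)]
near_existing = (30.2680, -97.7431)  # a few hundred feet from the existing station
far_away = (30.2500, -97.7700)       # well over a mile away

reuse = not needs_new_station(near_existing, existing)
build = needs_new_station(far_away, existing)
```

Applying `needs_new_station` to every proposed centroid splits them into stations that can reuse existing infrastructure and stations that have to be built.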
The light-blue crosses represent proposed stations that are very close to existing ones, and the dark-blue crosses are new ones to be built. There are 14 new ones in total.
## Conclusion
OK, there you have it. I used the k-means method to propose new charging stations in case car2go decides to deploy an all-EV fleet in Austin. There are 14 locations that require new charging stations. Most of these locations are residential areas far from downtown, where EV infrastructure is lacking.
Back to the R script: k-means is really easy to implement. The harder part is connecting the data with business insights.
4 Comments
Interesting read. Nice to see how less code is necessary to obtain those results meanwhile.
Thanks for your read!
Long
2/28/2016 06:40:53 pm
Even in k-means, the objective isn't Euclidean distance... replacing the objective with a sum of distances, e.g. Euclidean or Manhattan, would make the problem difficult.
Jun
2/28/2016 07:55:00 pm
Why is that?
## Author
A mechanical engineer who also loves data.