Visualization of Citi Bike station node influence

My group's midterm is based on pruning "weak" stations and/or adding supplementary stations to the Citi Bike network in the hope of improving the system overall. The first step in this process is to identify weak stations that might be candidates for pruning, so I decided to build a visualization that would surface them.

I used Gephi to create my visualization. Gephi worked best when I gave it separate data tables for nodes and edges. I loaded the Citi Bike station data (nodes) and trip data for February 2014 (edges) to form a graph.
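For anyone who wants to prepare similar tables, here is a minimal Python sketch of turning raw trip records into a node table and an edge table. The Id/Label and Source/Target/Weight headers are the column names Gephi's spreadsheet importer looks for; the function names and the (start station, end station) input format are assumptions made for illustration, not the preprocessing actually used for this post.

```python
import csv
import io
from collections import Counter


def build_gephi_tables(trips):
    """Aggregate (start_station, end_station) pairs into node and edge rows.

    Each distinct (start, end) pair becomes one edge whose Weight is the
    number of trips between those two stations.
    """
    weights = Counter()
    stations = set()
    for start, end in trips:
        weights[(start, end)] += 1
        stations.update((start, end))
    nodes = [{"Id": s, "Label": s} for s in sorted(stations)]
    edges = [
        {"Source": s, "Target": t, "Weight": w}
        for (s, t), w in sorted(weights.items())
    ]
    return nodes, edges


def to_csv(rows, fieldnames):
    """Serialize a list of row dicts to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Calling to_csv(nodes, ["Id", "Label"]) and to_csv(edges, ["Source", "Target", "Weight"]) would give the two tables as text, ready to be saved and imported separately.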

One of the first things I realized was that the nodes could be easily divided into Manhattan stations and Brooklyn stations just by looking at the network's modularity (the strength of division of a network into modules). There are of course thousands of trips that start in Manhattan and end in Brooklyn, and vice versa, but the vast majority of trips start and end in the same borough. The graphic below shows how Gephi grouped the nodes after I ran a modularity analysis (with a resolution of 2.0).


Because the two boroughs behave almost like separate networks, pruning the weakest node overall might not affect the network as much as we had hoped it would. Looking for the weakest nodes within each module separately is more likely to give useful results.
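To make the idea of modularity a little more concrete, here is a rough sketch of how the score for a given split (say, Manhattan vs. Brooklyn) could be computed by hand. It assumes trips are treated as undirected weighted edges, and its resolution argument simply scales the expected-weight term, which is one common convention and may not correspond to Gephi's resolution setting. The function name and input format are my own placeholders.

```python
from collections import defaultdict


def modularity(edges, module_of, resolution=1.0):
    """Modularity of a partition of an undirected weighted graph.

    edges: iterable of (u, v, weight) tuples, one per undirected edge.
    module_of: dict mapping each node to its module label.
    Returns the sum over modules of
        internal_weight / m - resolution * (module_degree / (2 * m)) ** 2
    where m is the total edge weight.
    """
    total = 0.0
    internal = defaultdict(float)  # weight of edges inside each module
    degree = defaultdict(float)    # summed node degrees per module
    for u, v, w in edges:
        total += w
        degree[module_of[u]] += w
        degree[module_of[v]] += w
        if module_of[u] == module_of[v]:
            internal[module_of[u]] += w
    if total == 0:
        return 0.0
    return sum(
        internal[c] / total - resolution * (degree[c] / (2 * total)) ** 2
        for c in degree
    )
```

A score well above zero means far more trips stay inside the modules than a random network with the same station degrees would predict, which is the pattern described above for the two boroughs.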

To determine the weakest nodes I looked at eigenvector centrality (Google's PageRank is a variant of this metric), which is "a measure of the influence of a node." Connections to higher-scoring nodes are valued higher than those to lower-scoring nodes. The results (after 1,000 iterations) can be seen below. Gephi has several layout options to optimally display the network graph based on the chosen measurement (in this case eigenvector centrality). The Brooklyn visualization used a different layout for improved legibility.


Red nodes have higher scores and blue nodes have lower scores. The visualization makes it obvious which nodes are candidates for pruning; Railroad Ave & Kay Ave in Brooklyn and Leonard St & Church St in Manhattan are the lowest-scoring nodes in their respective boroughs.
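For readers curious about what the metric does under the hood, here is a minimal power-iteration sketch of eigenvector centrality, plus a helper that picks the lowest-scoring node in each module. It assumes an undirected weighted graph and scales scores so the top node gets 1.0; Gephi's own implementation may differ in normalization and in how it handles direction. The function names, the tolerance, and the input format are assumptions for illustration.

```python
from collections import defaultdict


def eigenvector_centrality(edges, max_iter=1000, tol=1e-9):
    """Power iteration on an undirected weighted graph.

    edges: iterable of (u, v, weight) tuples.
    Returns {node: score}, scaled so the highest score is 1.0.
    """
    adj = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:
        adj[u][v] += w
        if u != v:
            adj[v][u] += w
    if not adj:
        return {}
    scores = {node: 1.0 for node in adj}
    for _ in range(max_iter):
        # Adding the previous score (an identity shift) keeps the iteration
        # from oscillating on bipartite graphs; the eigenvector it converges
        # to is unchanged.
        updated = {
            node: scores[node] + sum(w * scores[nbr] for nbr, w in nbrs.items())
            for node, nbrs in adj.items()
        }
        top = max(updated.values())
        updated = {node: s / top for node, s in updated.items()}
        change = sum(abs(updated[n] - scores[n]) for n in updated)
        scores = updated
        if change < tol:
            break
    return scores


def weakest_per_module(scores, module_of):
    """Return {module: (node, score)} for the lowest-scoring node in each module."""
    weakest = {}
    for node, score in scores.items():
        module = module_of[node]
        if module not in weakest or score < weakest[module][1]:
            weakest[module] = (node, score)
    return weakest
```

Each round, a node's score is refreshed from its neighbors' scores weighted by trip counts, so a station connected mostly to busy stations ends up scoring higher than one connected mostly to quiet stations.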

These results reflect only trips made in February 2014. To really get a sense of the all-time weakest nodes, we would need to run these algorithms on all available data. There are likely seasonal changes in ridership that make certain stations less utilized in the winter.
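One possible way to check for seasonal effects, sketched below, would be to score each month separately and count how often each station lands among the lowest-scoring ones. The function name, the month labels, and the per-month score dictionaries are hypothetical; nothing here was run on real Citi Bike data.

```python
from collections import Counter


def bottom_k_frequency(monthly_scores, k=5):
    """Count how many months each station appears among the k lowest scores.

    monthly_scores: dict mapping a month label to {station: score}.
    """
    counts = Counter()
    for scores in monthly_scores.values():
        lowest = sorted(scores, key=scores.get)[:k]
        counts.update(lowest)
    return counts
```

A station that shows up in the bottom group in nearly every month would be a stronger pruning candidate than one that only scores low in the winter.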
