NYC Subway
It is hard to mentally separate New York City from its iconic subway system, which can boast of having the most stations of any metro system in the world (424, or 472 if interchange stations, i.e. different sets of platforms, are counted separately; exactly one hundred more stations than Shanghai, which comes in second) as well as the highest ridership, most boardings per mile, longest track length, and most lines of any rapid transit system in North America.
All of that is a roundabout way of saying that, despite its mounting troubles, the subway system is a fundamental and dynamic part of New York City and its economy. But how exactly? We can explore this relationship using businesses, their addresses, and their functional tags, classified according to the Locus Classification System. Specifically, we compare the distributions of businesses around subway stations and in non-subway areas, and identify various economic communities in New York through the lens of subway stations.
The Locus Classification System (LCS) assigns economic entities a functional barcode, with each field in the barcode corresponding to a specific attribute of the classified entity. LCS applies a unified framework to classify both companies and occupations.
First, we identify the businesses surrounding a station. Take the Franklin St subway station in Tribeca as an example. We can define a radius (792 ft, roughly 3 blocks, or about the length of one avenue) around the station and identify all businesses that fall within that radius. We do this for each subway station in New York City, as well as for a corresponding number of randomly selected points (taking care to ensure that the radii of these points do not overlap with the radii of any subway station). We will call these subway areas and non-subway areas, respectively.
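The area-building step can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline actually used for the article: the function names, the dictionary layout of a business record, the bounding box, and the approximate Franklin St coordinates are all assumptions made for the example. Distances use the haversine formula, and a random point is accepted only if it lies more than two radii from every station, so that its circle cannot overlap a station's circle.

```python
import math
import random

RADIUS_FT = 792.0                 # radius quoted in the article
EARTH_RADIUS_FT = 20_902_231.0    # mean Earth radius (about 6,371 km) in feet


def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in feet."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))


def businesses_within(center, businesses, radius_ft=RADIUS_FT):
    """Return the businesses whose coordinates fall inside the circle."""
    lat, lon = center
    return [b for b in businesses
            if haversine_ft(lat, lon, b["lat"], b["lon"]) <= radius_ft]


def sample_non_subway_points(stations, n, bbox, radius_ft=RADIUS_FT,
                             seed=0, max_tries=100_000):
    """Rejection-sample up to n points whose circles overlap no station circle.

    bbox is (lat_min, lat_max, lon_min, lon_max); an assumed stand-in for
    the city boundary, which a real analysis would need to respect.
    """
    rng = random.Random(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    points = []
    for _ in range(max_tries):
        if len(points) == n:
            break
        p = (rng.uniform(lat_min, lat_max), rng.uniform(lon_min, lon_max))
        # Two circles of equal radius are disjoint when their centers
        # are more than two radii apart.
        if all(haversine_ft(p[0], p[1], s[0], s[1]) > 2 * radius_ft
               for s in stations):
            points.append(p)
    return points


# Illustrative data only: approximate station coordinates, made-up businesses.
franklin_st = (40.7193, -74.0069)
toy_businesses = [
    {"name": "near", "lat": 40.7203, "lon": -74.0069},  # ~365 ft north
    {"name": "far", "lat": 40.7223, "lon": -74.0069},   # ~1,094 ft north
]
subway_area = businesses_within(franklin_st, toy_businesses)
non_subway_points = sample_non_subway_points(
    [franklin_st], n=5, bbox=(40.70, 40.74, -74.03, -73.98))
```

A uniform draw over a bounding box will also land in water or parks, so a fuller version would presumably filter candidate points against a land or borough boundary as well.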
We then tag all businesses by their Locus functional attributes: what they do and for whom they do it. At the most basic level, all companies perform one of four activities:
By calculating the proportion of businesses in an area that perform a particular function, we can identify functional patterns. To illustrate, many investment banks (they allocate money) and management consultancies (they strategize) are located in Midtown. Sure enough, the proportion of businesses around the 51st St station that manage is 42%, vastly higher than most subway stations (the average is 14%), indicating a high concentration of this function (and business suits) there. We can do the same calculations for all functions in each area.
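As a rough sketch of this calculation, the share of each function in an area is simply its count divided by the number of businesses in that area. The record layout and the label "other" below are assumptions for illustration, and the toy counts are chosen only to reproduce the 42% figure quoted above; they are not the underlying data.

```python
from collections import Counter


def function_shares(businesses):
    """Map each function label to its share of the businesses in one area."""
    counts = Counter(b["function"] for b in businesses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


def mean_share(areas, label):
    """Average share of one function across several areas (0 if absent)."""
    shares = [function_shares(area).get(label, 0.0) for area in areas]
    return sum(shares) / len(shares)


# Toy areas: 21 of 50 businesses "manage" in the first, 7 of 50 in the second.
area_a = [{"function": "manage"}] * 21 + [{"function": "other"}] * 29
area_b = [{"function": "manage"}] * 7 + [{"function": "other"}] * 43

shares_a = function_shares(area_a)
average_manage = mean_share([area_a, area_b], "manage")
```

Comparing an area's share against the average across all areas is what flags a station such as 51st St as a hotspot for a given function.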
So how would the distribution of businesses in subway areas be different?
Well, since subways are a major mode of transportation for commuters and visitors in New York, we would expect businesses in the immediate vicinity of stations to be more reliant on foot traffic, visibility, and convenience for their customers.
For that we want to investigate businesses that primarily deal with
We see that this is the case with the proportion of businesses that
The proportion of businesses that
One might also suspect that businesses that deal with transportation and logistics may benefit from being close to the subway system, since being highly connected is essential for them.
We would hence expect these businesses to be highly concentrated in subway areas. This turned out to be true particularly for businesses that
Bearing in mind that these businesses represent only 3% of all businesses, they were still more likely to be found close to subway stations, making up 3.2% of businesses in subway areas compared to 2.7% in non-subway areas. This is true for the
While differences between the businesses in subway and non-subway areas exist, it turns out that subway areas are not all the same. In fact, these differences are more pronounced than the difference between subway and non-subway areas. This is not actually surprising. Most would agree that the concrete jungle of the Financial District looks very different from the residential landscape of the Upper East Side, in no small part due to the businesses that make up the neighborhood.
Visualizing, deconstructing, and understanding the boundaries between these different economic communities is more suitably done with subway stations than with zip codes, given how integral and noticeable the stations are to the daily lives of New Yorkers.
Professional Services and Governmental Offices
Let’s revisit the Financial District. Its subway stations (Wall St, Broad St, Fulton St, Rector St, and Cortlandt St) have a distinct functional distribution, similar to that of subway stations in Two Bridges (City Hall, Brooklyn Bridge-City Hall, Chambers St), Midtown (Times Square, 34th St, 50th St, Bryant Park, Lexington Ave-53rd St, Lexington Ave-59th St, Grand Central), and Brooklyn Heights (Borough Hall, Hoyt St, Court St, Jay St-MetroTech). Specifically, these economic communities are marked by high concentrations of businesses that
For example, while the median proportion of businesses that act on
It is possible to identify subclusters by being more specific about what businesses do, specifically: professional services, finance, and public administration. Professional services is high in businesses that
Tech and Information
Related to professional services are the technology and information functions, which have obvious geographic patterns. This is visible from the concentration of businesses that
The concentration emanates from Soho and Nolita (Houston St, Prince St, Bleecker St, Spring St), where the proportion of businesses that act on Information is 15-28%, compared to the median of 6%. There is also a second hotspot around the Flatiron District, sometimes referred to as Silicon Alley, with slightly lower proportions of around 18-20%. This concentration of Information businesses is not as pronounced immediately around these hotspots (e.g. in the Financial District or Midtown), and gets even lower as we move further out. In the other boroughs, it is almost non-existent.
A recent report found that apartments a ten-minute walk from a subway station rent for around 10% less (let’s be honest, in New York, that is a decent chunk of money). Why is this the case? While a shorter commute no doubt plays a role, it is not a stretch to imagine it being at least partly due to the economic composition of each area; the proportion of restaurants and bars it has, for example. There are undoubtedly many more interesting changes in the presence of a subway station. Perhaps as a natural experiment, we can track how the composition of businesses surrounding the recently completed 2nd Avenue subway stations on 72nd, 86th, and 96th St will have changed from their current state in five or ten years.
Additionally, as evidenced in the latter half of this article, cities contain various and distinct economic communities. Using subway stations as landmarks enables all of us to chart our daily path and explore the economic communities we interact with, and perhaps learn more about those with which we do not. A final question remains: do the businesses around the Wall St. station look similar to those around the Downtown Crossing station in Boston and the Canary Wharf station in London? Likewise, do Yankee Stadium and Fenway Park belong to the same economic community? With the right data, these questions can be easily answered.
Notes
Subway station data was obtained from NYCOpenData, while the business data was obtained from Dun & Bradstreet. To compare the differences between subway and non-subway areas, we used the independent samples t-test, controlling for multiple testing with the Benjamini-Hochberg false discovery rate. To estimate the changes in function with distance from subway stations, we used a multiple regression model and included a quadratic term for distance (to account for its exponential distribution) as well as the population density, average wage, and number of housing units of the zip code to which the coordinate belonged. The map visualizations were made with Bokeh in Python.
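For readers unfamiliar with the multiple-testing correction mentioned above, here is a minimal sketch of the Benjamini-Hochberg step-up procedure in plain Python. It is not the code used for the article. The p-values are made up for illustration; in practice they would come from one t-test per function (for instance via `scipy.stats.ttest_ind`), and the significance level of 0.05 is an assumption.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value sits under its threshold.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            reject[idx] = True
    return reject


# Made-up p-values, one per hypothetical function being compared.
toy_p_values = [0.03, 0.9, 0.036, 0.035]
rejected = benjamini_hochberg(toy_p_values, alpha=0.05)
```

Note the step-up behavior in the toy example: the two smallest p-values miss their own thresholds, but because the third-ranked value passes its threshold, all three are rejected together.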