New York: Adding data to the debate
Airbnb has become a go-to place for people to lease or rent short-term lodging. It has seen meteoric growth since its inception in 2008, with the number of rentals listed on its website growing exponentially each year. Whether you are looking to rent out a room in your house or find a cabin in the woods, you can use Airbnb. New York City has been one of the hottest markets for Airbnb, with over 48,000 listings. That works out to over 40 homes being rented out per square km in NYC on Airbnb!
A Look at the Data:
In this post, I perform an exploratory analysis of the Airbnb dataset sourced from the Inside Airbnb website to understand the rental landscape in NYC through various static and interactive visualizations, and I predict the prices of Airbnb properties with machine learning tools. Specifically, two main datasets were used in the analysis: the listings data and the calendar data for New York.
After looking at the data, I arrived at four questions covering several topics of interest: availability, cost, property types, commonly listed streets, and neighborhoods. Specifically, I looked at:
What is the average price of the rental properties? Are there any seasonal spikes in pricing?
Can we use some of the features in the listing data to predict property pricing? What are the most important factors affecting the cost of the properties?
Let's take a look at our first question of interest.
Both datasets were required to analyze this question. Based on the listings data, I analyzed the most common property types available for renting in New York.
Not surprisingly, apartments and houses make up an overwhelming majority of all listings, followed by townhouses and condominiums. There are 36 property types available for renting in New York in total; the ones displayed in the graph above have the most listings.
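A minimal sketch of how such a ranking could be computed with pandas is shown here. The column name `property_type` and the toy rows are assumptions for illustration, not the actual data behind the graph.

```python
import pandas as pd

# Toy stand-in for the listings data; `property_type` is an assumed column name.
listings = pd.DataFrame({
    "property_type": ["Apartment", "House", "Apartment", "Townhouse",
                      "Apartment", "House", "Condominium", "Apartment"],
})

# Count listings per property type, most common first.
type_counts = listings["property_type"].value_counts()

# Number of distinct property types and the top few by listing count.
n_types = listings["property_type"].nunique()
top_types = type_counts.head(3)

print(n_types)
print(top_types)
```

On the real listings file, `head(3)` would be replaced by however many types the chart is meant to show.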
To analyze the availability of properties throughout the year, the calendar dataset was used. The plot below shows the proportion of homes available in New York over the months. There are a few interesting takeaways from this plot.
The highest availability of Airbnb homes is in February 2020, with 40% of properties available for renting, while the lowest is in September 2019, with just 10–15% of homes available. Availability increases substantially after September 2019. It dips by about 10 percentage points after February 2020 and then shows no major ups and downs, staying in the range of 25–32%.
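The monthly availability proportion described above can be sketched roughly as follows. The column names `listing_id`, `date`, and `available` (with `"t"`/`"f"` flags) are assumptions about the calendar file, and the rows are made up.

```python
import pandas as pd

# Toy calendar rows; column names and the "t"/"f" encoding are assumptions.
calendar = pd.DataFrame({
    "listing_id": [1, 2, 1, 2, 1, 2, 1, 2],
    "date": ["2019-09-12", "2019-09-12", "2019-09-13", "2019-09-13",
             "2020-02-01", "2020-02-01", "2020-02-02", "2020-02-02"],
    "available": ["f", "f", "t", "f", "t", "t", "f", "t"],
})

calendar["date"] = pd.to_datetime(calendar["date"])
calendar["is_available"] = calendar["available"].eq("t")

# Share of listing-days marked available within each calendar month.
monthly_availability = (
    calendar.groupby(calendar["date"].dt.to_period("M"))["is_available"].mean()
)

print(monthly_availability)
```

Grouping by `calendar["date"]` instead of the monthly period would give a day-by-day series of the same proportion.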
A more detailed analysis of availability from September 2019 to September 2020 can be seen in the plot below.
Day-1 corresponds to 09/12/2019 and Day-360 corresponds to 09/11/2020.
Below you can see the average listing price for each property in New York.
Most of the properties fall in the range of 0–200 dollars. As the price increases, the number of homes drops off. The average price of homes in New York is around $152, while the median and highest prices are around $105 and $10,000, respectively.
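As a rough sketch, the mean, median, and maximum price can be computed like this. The `"$1,234.00"`-style price strings and the column name `price` are assumptions about the raw file, and the values are invented.

```python
import pandas as pd

# Toy listings; the string price format and column name are assumptions.
listings = pd.DataFrame(
    {"price": ["$80.00", "$105.00", "$150.00", "$1,200.00", "$65.00"]}
)

# Strip the currency symbol and thousands separators, then convert to float.
listings["price_usd"] = (
    listings["price"].str.replace(r"[$,]", "", regex=True).astype(float)
)

# Mean, median, and maximum in one pass.
summary = listings["price_usd"].agg(["mean", "median", "max"])

print(summary)
```

Even in this tiny example, a single expensive listing pulls the mean well above the median, the same pattern as the $152 average against the $105 median.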
Seasonal spikes in the pricing of homes can be seen in the graph below.
Similar to availability, we can also see how listing prices change over time. The highest average property price is in December 2019, at just above $162. The lowest is in February 2020, likely due to the highest availability of homes in that period. Prices fall sharply after December and keep falling until February 2020, after which they tend to rise until April 2020. There is no major fluctuation in pricing from April 2020 to September 2020.
A more detailed analysis of pricing over the year can be seen in the plot below.
Day-1 corresponds to 09/12/2019 and Day-360 corresponds to 09/11/2020.
There are over 318 streets in New York with Airbnb properties available for renting. After analyzing the listings data, I found the most common streets among them.
Here are the costliest streets in New York for renting.
Forest Hills and Chelsea dominate property pricing in New York, with average prices of about $1005 and $800, respectively.
Pricing of the homes with respect to the neighborhood group can be seen in the graph below.
Manhattan is the costliest neighborhood group for Airbnb properties, with an average price of about $200. The Bronx seems to be the cheapest, with an average price of approximately $80. Brooklyn, Queens, and Staten Island sit in between, at $100–$125.
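A comparison like this boils down to a group-by on the neighborhood group. The sketch below assumes hypothetical column names `neighbourhood_group` and `price_usd` and uses made-up prices; the same pattern would apply to a street column.

```python
import pandas as pd

# Toy listings; column names and prices are assumptions for illustration.
listings = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Bronx",
                            "Brooklyn", "Bronx", "Brooklyn"],
    "price_usd": [220.0, 180.0, 70.0, 120.0, 90.0, 110.0],
})

# Average price per neighborhood group, costliest first.
avg_price_by_group = (
    listings.groupby("neighbourhood_group")["price_usd"]
    .mean()
    .sort_values(ascending=False)
)

print(avg_price_by_group)
```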
After analyzing and visualizing the first three questions and drawing insights from them, the toughest challenge is to predict the prices of new properties based on the data. There are more than 90 columns in the data, so selecting the appropriate columns for predicting price is the challenging part.
To answer this question, I developed a machine learning model that achieved a median absolute error of just 20. The median is used as the error metric because of the many outliers present in the data.
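To illustrate why a median-based error is less sensitive to outliers than a mean-based one, here is a small standard-library sketch with made-up prices and predictions. It does not reproduce the model above; scikit-learn offers the same metric as `sklearn.metrics.median_absolute_error`.

```python
from statistics import mean, median


def median_absolute_error(y_true, y_pred):
    """Median of |actual - predicted|; barely moved by a few huge errors."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_absolute_error(y_true, y_pred):
    """Mean of |actual - predicted|; a single huge error can dominate it."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Made-up prices: four ordinary listings and one extreme outlier.
actual = [100, 150, 80, 200, 10000]
predicted = [110, 140, 100, 180, 500]

print(median_absolute_error(actual, predicted))
print(mean_absolute_error(actual, predicted))
```

The one badly mispredicted $10,000 listing dominates the mean absolute error, while the median absolute error still reflects the typical listing.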
According to my model, here are the most important factors affecting the property price.
Longitude turns out to be the most important factor affecting the property price, which suggests that prices differ substantially from east to west across New York City. Latitude is also one of the most important factors, which reflects how property prices vary from north to south. Here is a little insight into pricing across longitudes and latitudes.
Room type is the second most important factor. There are four room types listed: entire home, private room, shared room, and hotel room. Here is some insight into how the room type affects pricing.
Next is accommodates (the number of guests a listing accommodates): pricing tends to go up as it increases, which is quite obvious. The same applies to bedrooms, the fourth most important factor, followed by latitude and the availability of the property across the year.
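The post does not say how the importance ranking was obtained, so the following is only one common approach, permutation importance, sketched on synthetic data with a plain least-squares model. The feature names, coefficients, and model are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: price depends strongly on `accommodates`, weakly on
# `bedrooms`, and not at all on `noise`. Names and coefficients are made up.
n = 500
X = np.column_stack([
    rng.integers(1, 9, n),   # accommodates
    rng.integers(0, 5, n),   # bedrooms
    rng.normal(size=n),      # noise
]).astype(float)
feature_names = ["accommodates", "bedrooms", "noise"]
y = 30 * X[:, 0] + 5 * X[:, 1] + rng.normal(scale=5, size=n)

# Fit ordinary least squares with an intercept as a stand-in model.
A = np.column_stack([X, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def predict(features):
    return np.column_stack([features, np.ones(len(features))]) @ coef


def median_abs_err(features):
    return np.median(np.abs(y - predict(features)))


baseline = median_abs_err(X)

# Permutation importance: shuffle one column at a time and record
# how much the error grows relative to the unshuffled baseline.
importance = {}
for j, name in enumerate(feature_names):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    importance[name] = median_abs_err(X_perm) - baseline

ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

A feature whose shuffling barely changes the error contributes little to the predictions, which is how a column like `noise` ends up at the bottom of the ranking.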
- We looked at the different types of properties and how their availability changes over time.
- We then looked at the average price of the properties, which turned out to be around $152. We also looked at seasonal spikes in pricing.
- We looked at the most common and costliest streets in New York. Forest Hills is the costliest street to rent on, while New York Street is the most commonly listed in the data. We saw how pricing changes across neighborhood groups: Manhattan is the costliest neighborhood group, while the Bronx is the cheapest.
- Finally, we confirmed that we can predict the prices of the new properties based on the data we analyzed. Longitude, Room type, Accommodates, Bedrooms, Latitude and Availability turned out to be the most important factors affecting the pricing of the properties.
In your view, what are the most important factors affecting the pricing of Airbnb properties?
To see more about this analysis, take a look at my GitHub repository, available here.