Project 2 – AirBnB Western Australia 2023-2024
This project takes the perspective of an investor looking to invest in property and use for AirBnB. The investor will want to know where the best locations are. The Western Australian AirBnB dataset was donwloaded from AirBnB and comprised 2 files:
The original plan was to load the csv files into excel, cleanup the data and then import into Tableau.
1. Combine Data
The calendar.csv file had 4.2M rows. Excel can only handle 1,048,576 rows.
– split the calendar.csv into 5 files, each with approximately 1M rows.
– created a new CSV file called wa-airbnb.xlsx and added 6 sheets called calendar1-5 & listings.
– copied each group of 1M rows from the calendar files into a unique sheet
– copied listings.csv data into the listings sheet
2. AirBnB Addresses
The listings data contains addresses of the AirBnB properties. This data was inconsistent or blank. To do any location analysis I needed to clean and simplify the data.
– ’name’ column identified as having the most consistent location information.
– formula used in ‘neighbourhood_group_cleansed’ column to extract location from ’name’ column (between two specific charcaters)
=MID(A1, SEARCH(“in”, A1)+3, SEARCH(“Â”, A1) – SEARCH(“in”, A1) -3)
– converted cells with formula to text (copy/paste values only)
Now that the location was correct I wanted the associated postcodes
– created new sheet called ‘postcodes’ and added all postcodes and associated locations
– added ‘postcode’ column to listings sheet and used VLOOKUP to populate
3. Import into Tableau
Imported xlsx file
Union of 5 sheets labelled calendar1-5
Joined union and listings sheet
4. Create Worksheets (for Dashboard)
– Price (avg) by postcode (bar chart) To see how expensive each postcode is
– Price (avg) by postcode (map) As above but using a map to represent the data. Also shows neighbouring postcodes
– Future booking revenue Determine how much revenue could be achieved over the next year
– Price (avg) per bedroom Determine best ROI
– Bedroom count per listing Shows how much competition there is for the number of bedrooms
5. Create Dashboard
The dashboard displays all the sheets. The views are linked so interacting with one data point will impact the data displayed by the others.
This project could be enhanced by including the median house price per postcode. This would enable a review of purchase cost verses expected returns and possibly how quickly the financial outlay could be recovered.
There was a single postcode (Rottnest Island) that distorted the results. It could be beneficial to remove Rottnest from the dashboard.