Bhanu Patlolla is the Director of Data Science and AI systems at Extra Space Storage. He joined the company in 2016 as a Data Scientist after working in ecommerce and supply chain.
Bhanu started his career in ecommerce as a business analyst, developing expert knowledge of SQL, Excel, and Python. During that time, he became increasingly interested in supply chain and operations research, which led him to pursue a master’s in industrial engineering at Georgia Tech.
During his master’s, he interned at Cardlytics, a marketing platform that does machine learning on customer shopping trends and provides insights to different companies. “That’s where I gained my first experience with big data and machine learning,” Bhanu shared.
After completing his degree, he worked on the supply chain team at The Home Depot but found he wasn’t particularly interested in the work there. That’s when an opportunity opened up at Extra Space Storage.
Drupad Kumar Khublani, a senior data scientist and member of the Extra Space Data Science team, recently sat down with Bhanu to talk about the evolution of data-driven insights in the self storage industry. Part one of this two-part interview covers how data science has changed over the past several years, and how moving to a cloud-based system improved the ability to leverage data and enhanced decision-making among customer acquisition teams at Extra Space Storage.
How important is data for the self storage industry?
We see data-driven insights as one of our core competencies. Our team is influential within the company, largely because we put a lot of effort into understanding data and driving decisions based on it. Whether it be pricing tests, speed tests, or marketing tests—we never make a decision based on a hunch, because hunches often end up being wrong. That approach has given leadership more confidence to test ideas rather than go off what someone feels. Especially in an industry where many people think data is not that important, I would say data has given us a performance edge over our competitors.
How has the use of data changed over time within our company and the industry?
Looking at our competitors, we have seen them start to invest more resources in internal data science teams. Take pricing, for example. Some of them did not have in-house pricing teams. They would hire external software providers to do their pricing. But now, they have moved to setting up their own pricing teams. We have also seen a shift in attracting talent. Some of our large competitors have started looking to major tech companies and even hiring Ph.D. holders to lead their data science teams. So, there has definitely been a significant change in the industry.
Within the company, what I have seen since joining Extra Space is a shift in the number of very data-passionate teams. Early on, it was mostly the pricing, website testing, and paid search teams that relied heavily on data. Now, the data science team has influenced other business areas like the National Solutions Center, Customer Experience, Acquisitions, and SEO.
What are three significant changes in technology and data science that you’ve seen at Extra Space Storage?
The first that comes to my mind, because it’s more recent, is switching over to a cloud-based provider. We used to have an environment where we had servers in our basement. The team became expert at coding in SQL, and later Python and R, to build our models. Now, we have adopted Azure as the platform for our cloud-based environment, and the team has been working with Databricks for the last two years. A few of us are now very well-versed in Spark technologies, which make processes faster: we’re able to create better pipelines, compare models more easily, and bring more insights to test new ideas. So that’s an area of technology where we’ve seen significant change.
Second, the number of teams we have started influencing has significantly changed. We started small overseeing pricing, but now, as I mentioned, we oversee pricing and support many other teams—Paid Search, Web, Brand, SEO, Call Center, Customer Experience, Product, and Acquisitions. That’s a substantial change in the number of teams we touch.
And third, I would say the people. We started with a small team of just four data scientists. Now the team has grown to about 14 people, each specializing in a particular field and bringing diversity of talent.
What did the data infrastructure look like before moving to a cloud-based provider?
Before the shift to cloud, we worked closely with the data management team, and everything was stored in SQL databases. We used to code a lot in SQL. We would set up stored procedures to run one after another to transform the data, feed the consolidated data into a Python model, and finally push the outputs back into SQL, which was the endpoint. From there, other teams would use our recommendations, either in reporting or on our website.
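The batch flow described above can be sketched roughly as follows. This is a hypothetical illustration only: sqlite3 stands in for the on-premise SQL databases, plain SQL statements stand in for the chained stored procedures, and the table names and "model" rule are invented.

```python
import sqlite3

# Hypothetical sketch of the pre-cloud batch flow: sqlite3 stands in for
# the on-premise SQL databases; table names and the model rule are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_rentals (store_id INTEGER, units_rented INTEGER);
    INSERT INTO raw_rentals VALUES (1, 40), (1, 35), (2, 60);
""")

# Step 1: "stored procedures" run one after another to transform the data.
conn.execute(
    "CREATE TABLE consolidated AS "
    "SELECT store_id, SUM(units_rented) AS total "
    "FROM raw_rentals GROUP BY store_id"
)

# Step 2: feed the consolidated data into a Python model
# (a trivial placeholder rule here, not a real pricing model).
rows = conn.execute(
    "SELECT store_id, total FROM consolidated ORDER BY store_id"
).fetchall()
recommendations = [(store, int(total * 1.1)) for store, total in rows]

# Step 3: push the model outputs back into SQL, the endpoint that
# reporting and the website would read from.
conn.execute("CREATE TABLE recommendations (store_id INTEGER, rec INTEGER)")
conn.executemany("INSERT INTO recommendations VALUES (?, ?)", recommendations)
```

The key property of this style is that SQL is both the source and the sink, with Python doing only the modeling step in the middle.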
Now it has expanded. Our traditional SQL databases were migrated to Azure SQL Database, but we also now have access to a data lake, which the team uses as its primary storage; some production-level datasets continue to reside in SQL. So that’s how our data storage has evolved. Processing now happens mostly in the cloud: we use Databricks to set up notebooks that run SQL, R, or Python, and these have effectively replaced what we used to do with stored procedures.
What tools was the team using before cloud, and how has it changed?
It would depend on the data scientist. Each data scientist had a reasonably fast personal laptop, so to some extent they could run some of these models, or at least test versions of their algorithms on smaller datasets, locally. Beyond that, we had a remote machine with much more computational power; if the datasets were huge, we would run the models on that remote machine. Now, in the new Databricks environment, you can create a computation cluster of whatever size you want and run your models there.
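Defining a right-sized cluster in Databricks comes down to a small specification like the one below, shaped like the payload the Databricks Clusters REST API (`POST /api/2.0/clusters/create`) accepts. The cluster name, runtime version, VM size, and worker counts here are assumptions for illustration, not Extra Space Storage's actual configuration, and no request is actually sent.

```python
import json

# Illustrative cluster definition in the shape the Databricks Clusters
# REST API accepts. All values are assumptions, and nothing is submitted.
cluster_spec = {
    "cluster_name": "pricing-model-training",    # hypothetical name
    "spark_version": "11.3.x-scala2.12",         # a Databricks LTS runtime
    "node_type_id": "Standard_DS3_v2",           # an Azure VM size
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "autotermination_minutes": 30,               # release idle compute
}
payload = json.dumps(cluster_spec)
```

Autoscaling bounds plus auto-termination are what make this different from the old fixed remote machine: compute grows with the job and goes away when idle.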
What were the steps taken to improve the data infrastructure?
Responsibilities were split between our team and the data management team. On the Data Science side, we had a lot of meetings, internally and with external consultants, to talk about the right approach, the right cloud platform, and whether the cloud was the right step at all. Once we established that, figuring out the right provider was challenging. Our company has long relied on Microsoft products, so it was a logical step to use Azure. Though it’s the number two cloud provider, it does a decent job of what we’re trying to accomplish, and it didn’t require us to change many things internally the way something like GCP or AWS would have.
That made a lot of things easier for us when we started and in getting to the next step. Once we had the cloud in place, we decided to go with a data-first approach, meaning all the datasets had to be migrated in a format that made sense to all the teams. Only then would individual teams like Data Science move their applications onto the cloud.
It was about a year-long project for the Data Management team to migrate their portion, and then this year (2022), we started migrating our applications, and it’s been a year-long project for our team as well.
Were there differences in how much time it took to migrate to the cloud for different teams or projects?
Yes. Within our team, two major systems needed to be migrated. First was the paid search bidding system. That took us around three to five months, and I would say it went quickly because the people working on the paid search system had recently built a new model, so what needed to be done to migrate to the cloud was fresh in their minds.
With the pricing system, the challenge was that it’s a 10-year-old system, so there were many nuances and applications dependent on it. We went about it by first figuring out the scope and all the different parts of the system that needed to be migrated. We also had to consider all the teams that rely on the system and keep them in the loop. Once we scoped the project, we estimated the resources we would need and how much time the migration would take. We started with one resource on the project and then brought a couple of new hires up to speed. This work was done alongside their day-to-day responsibilities supporting business teams and test executions. It is still in progress, but we are about 90% complete.
Were there other elements you had to consider when moving to the cloud?
Yes. With the paid search bidding system, we decided to lift and shift, meaning we essentially took every element and put it on the cloud as is. We didn’t necessarily change much.
With the pricing system, it was a once-in-a-lifetime opportunity for us to fix a few fundamental things. The old pricing model had too many moving parts: there were quarterly processes, monthly processes, and daily processes. We redesigned elements so that we could run all processes daily. Making such fundamental changes takes time and brainstorming, so a project takes much longer when it involves fundamentally redoing things rather than just a lift and shift. And because we are a data-driven company, we didn’t want to claim something was better just because it was on a newer system or because the logic seemed to make more sense. We wanted to test the new algorithm and prove that it worked better than the previous one. When you account for the time it takes to test the new recommendations and redesign the fundamental elements of the systems, it can take a year or even longer.
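The move from mixed quarterly, monthly, and daily cadences to a single daily run can be sketched as below. This is an invented illustration, not the actual pricing system: each formerly slow-cadence step is rewritten to be cheap enough to run on every daily invocation, so there is only one schedule to operate.

```python
import datetime

# Illustrative sketch (not the actual pricing system) of consolidating
# quarterly/monthly/daily jobs into one daily pipeline. Names are invented.

def refresh_demand_curves(run_date):       # previously a quarterly process
    return f"demand curves as of {run_date}"

def refresh_seasonality(run_date):         # previously a monthly process
    return f"seasonality factors as of {run_date}"

def set_prices(run_date, demand, seasonality):  # always ran daily
    return f"prices for {run_date} from [{demand}] and [{seasonality}]"

def daily_pipeline(run_date):
    # Every step runs on every invocation, so recommendations never
    # depend on a stale quarterly or monthly artifact.
    demand = refresh_demand_curves(run_date)
    seasonality = refresh_seasonality(run_date)
    return set_prices(run_date, demand, seasonality)

result = daily_pipeline(datetime.date(2022, 6, 1))
```

The design trade-off is more daily compute in exchange for fresher inputs and a much simpler dependency graph between processes.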
Since moving to the cloud, how have things improved for stakeholders making data-driven decisions?
When we started, most things were done through Excel tools: we would set up macros to pull in SQL datasets and build views on top of them. Over time, especially for reports with more eyes on them, we partnered with the Financial Planning & Analysis team to set up reports in Power BI. We also used internal web-based applications to report test metrics. So there were three different tools when we started. With the shift to the cloud, it has become much easier for us to surface insights through Power BI. And the Databricks platform itself has excellent dashboarding capabilities, making ad hoc analysis and reporting very easy and removing the need for web-based applications and Excel-based reporting.
Interested in a data science career with Extra Space Storage? Learn more about job opportunities at careers.extraspace.com.