A Startup CTO’s Guide to Structuring Your Data Infrastructure

Business decisions and operational planning are now driven largely by analytics and data. However, using data effectively to gain real business insight is neither easy nor free. It takes experience and expertise, and also a good deal of tooling and infrastructure, which engineering has to build and maintain. For the CTO of an emerging business, a major challenge is planning and investing in data infrastructure appropriately so that a data-driven culture can take hold.

A steadily growing start-up goes through frequent organizational changes, and the CTO has to absorb each of them while keeping the company in line with this data-driven culture. Each stage of a start-up’s evolution calls for a different strategy, whether for employee management, security administration, allocation of human and technical resources, or product development. Here are a few things technology officers need to know to handle these changes well.

Early-stage startup with one to three employees

This is the very beginning of the business, when you may not yet see steady revenue. A typical company website at this phase has a homepage and some limited traffic, and there is not much other data to look at. The CTO does not need to worry about a large data infrastructure setup yet.

However, this is the ideal time to think ahead and make sure you have a proper mechanism for retaining data with future analytical needs in mind. Follow good database design practices, and restrict updates and deletes, which can end up destroying valuable information. As new features are built, think about how you will analyze them and prepare the code accordingly.
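One way to restrict updates and deletes is to store changes as an append-only log of events rather than overwriting rows. The following is a minimal sketch using Python's built-in `sqlite3` module. The `subscription_events` table, its columns, and the helper names are hypothetical, chosen only for illustration; a production database would normally enforce the same rule with its own permissions or triggers.

```python
import sqlite3


def open_event_store(path=":memory:"):
    """Open a database whose (hypothetical) event table only accepts inserts."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS subscription_events (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id    INTEGER NOT NULL,
            plan       TEXT    NOT NULL,
            event      TEXT    NOT NULL,
            created_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    # Triggers reject UPDATE and DELETE, so history cannot be destroyed by accident.
    conn.execute(
        """
        CREATE TRIGGER IF NOT EXISTS subscription_events_no_update
        BEFORE UPDATE ON subscription_events
        BEGIN
            SELECT RAISE(ABORT, 'subscription_events is append-only');
        END
        """
    )
    conn.execute(
        """
        CREATE TRIGGER IF NOT EXISTS subscription_events_no_delete
        BEFORE DELETE ON subscription_events
        BEGIN
            SELECT RAISE(ABORT, 'subscription_events is append-only');
        END
        """
    )
    return conn


def record_event(conn, user_id, plan, event):
    """Append one event; existing rows are never modified."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO subscription_events (user_id, plan, event) VALUES (?, ?, ?)",
            (user_id, plan, event),
        )


def current_plan(conn, user_id):
    """Derive the current state from the latest event instead of a mutable column."""
    row = conn.execute(
        "SELECT plan, event FROM subscription_events "
        "WHERE user_id = ? ORDER BY id DESC LIMIT 1",
        (user_id,),
    ).fetchone()
    if row is None or row[1] == "cancelled":
        return None
    return row[0]
```

Because every change is kept as its own row, later questions such as how long users stayed on a plan before upgrading can still be answered from the stored history.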

This is also a good phase to set up a Google Analytics account and to establish a clear hierarchy and accountability, which paves the way to a data-driven business culture. Even now, your business users may ask you to pull some data from the database. However, as per RemoteDBA.com, such requests are likely to be few, and it is fine to handle them ad hoc at this phase.

Establishments with four to ten employees

At this stage, you start to see some real data flowing in, and you are trying to find product-market fit. Company leadership should keep tabs on the basic KPIs of the business, and different stakeholders will be trying to understand the real dynamics of the products and operations, which increases the number of data requests.

So it is important for the technical team to make the best possible tools available so that users can explore and report on relevant data in time. A fair part of this challenge is understanding the business users, who can be anyone inside or outside the company. On the infrastructure side, set up a slave database (a read replica) and give out read-only accounts, so that analytic queries can run without interfering with production. Next, learn how your business users want to interact with the data.
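The idea of a separate, read-only copy for analytics can be illustrated in a single process. The sketch below is only a stand-in under stated assumptions: it uses SQLite from Python's standard library, copies the production data with `Connection.backup`, and marks the copy read-only with `PRAGMA query_only`. A real deployment would rely on the database server's own replication and read-only roles instead. The function name `snapshot_for_analytics` is hypothetical.

```python
import sqlite3


def snapshot_for_analytics(production_conn):
    """Return a read-only, in-memory copy of the production database.

    Analysts query the copy, so a heavy or mistaken query cannot
    slow down or modify production.
    """
    replica = sqlite3.connect(":memory:")
    production_conn.backup(replica)           # copy every page to the replica
    replica.execute("PRAGMA query_only = ON")  # reject writes on this connection
    return replica
```

A write attempted through the returned connection fails with an `sqlite3` error, while `SELECT` statements run as usual. Calling the function again produces a fresh snapshot, which plays the role of the replica catching up with production.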

There are myriad tools, free as well as paid, that you can use right away for this purpose without rolling your own systems. Some users prefer to get SQL results into Excel, whereas others prefer visual tools that abstract away the raw SQL. Understand the needs of your business users well and choose the tools that make sense for them.
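For users who want SQL results in a spreadsheet, a small export helper is often enough at this stage. The sketch below assumes a SQLite connection and returns the result of a query as CSV text, a format that Excel and similar tools can open. The function name `query_to_csv` and the example table in the usage note are hypothetical.

```python
import csv
import io
import sqlite3


def query_to_csv(conn, sql, params=()):
    """Run a query and return its result as CSV text with a header row."""
    cursor = conn.execute(sql, params)
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    # cursor.description holds one tuple per result column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()
```

For example, `query_to_csv(conn, "SELECT plan, COUNT(*) AS signups FROM signups GROUP BY plan")` would return one header line followed by one line per plan, which can be saved to a `.csv` file and handed to the business user.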

Even though you should aim for the best with an eye on the future, focus on tools that don’t require a huge investment. You may have three to four engineers, and they should need only minimal working hours to finish the database features. It is usually not ideal to dedicate more than 25% of your productive engineering time to analytics unless your specific industry requires it at baseline.

Eleven to thirty employees

You will see data volumes picking up at this phase. There should be at least one business user dedicated to data analysis. If you have made a proper effort to put analytical tools in the hands of business users, your data may by now be spread across various silos. Given this, you may need more tools, more data, and faster processing.

This is the right time to think about a long-term data strategy. If you plan to build it yourself, do it properly by dedicating a skilled engineering resource to the data infrastructure needs. Scaling up to absorb the increased inflow of data and to handle database queries gets harder at this stage.

For the time being, a few strategic indices on the slave database may be enough, but soon afterwards you may find that you need a scaled-up data warehouse. You may also need extract-transform-load (ETL) processes, which aggregate data from various sources, transform and clean it, and load it into the querying engines.
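The three ETL steps can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the two sources (rows from the application and rows from a billing system), the field names `date` and `amount_cents`, and the `daily_revenue` warehouse table are all hypothetical, and SQLite stands in for the querying engine.

```python
import sqlite3
from collections import defaultdict


def extract(app_rows, billing_rows):
    """Extract: pull raw records from each source and tag where they came from."""
    for row in app_rows:
        yield {"source": "app", **row}
    for row in billing_rows:
        yield {"source": "billing", **row}


def transform(records):
    """Transform: drop incomplete rows, normalize values, total revenue per day."""
    totals = defaultdict(int)
    for record in records:
        day = (record.get("date") or "").strip()
        cents = record.get("amount_cents")
        if not day or cents is None:
            continue  # clean: skip rows missing a date or an amount
        totals[day] += int(cents)
    return sorted(totals.items())


def load(conn, daily_totals):
    """Load: write the aggregated rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_revenue ("
        "day TEXT PRIMARY KEY, revenue_cents INTEGER NOT NULL)"
    )
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT OR REPLACE INTO daily_revenue (day, revenue_cents) VALUES (?, ?)",
            daily_totals,
        )
```

A run is then `load(warehouse, transform(extract(app_rows, billing_rows)))`. Using `INSERT OR REPLACE` keyed on the day means the job can be re-run for the same period without duplicating rows.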

At this phase, you also have the option of hiring a third-party consultant to build a custom solution or outsourcing your data to an external data warehouse service. However, there is no off-the-shelf solution that will get your team fully off the hook.

Conclusion

As data volumes increase and use cases keep growing, database management becomes both more complicated and more important. Top corporations like Twitter or Amazon have big teams of data infrastructure engineers, and ultimately you may need one too. So, as your organization grows, start thinking about an “Analytics Director” role: someone who can plan a long-term strategy and deliver a visionary database infrastructure for your organization. Remember that switching costs keep growing over time, so without a long-term plan and a proper strategy, you will be in trouble when it comes to data management.
