In a nutshell, a hack against customers of cloud storage firm Snowflake looks like it may turn into one of the most monumental data breaches in history.
Hacker group ShinyHunters claims to be selling 560 million records from Ticketmaster and 30 million from Santander.
Snowflake, a public cloud services company that allows companies to store massive datasets on its servers, revealed that hackers had been attempting to access its customers’ accounts using login details stolen from one of its employees.
Snowflake initially said that a “limited number” of customer accounts had been accessed. Since then, however, cybercriminals have publicly claimed to be selling stolen data relating to Snowflake accounts, and outlets including TechCrunch and Wired have reported that hundreds of Snowflake customer passwords have been found online.
There’s also a snowball effect at play here. In recent days, a BreachForums account with the handle Sp1d3r has named two more companies whose data it claims is related to the Snowflake incident:
1) Financial services company LendingTree and its subsidiary QuoteWizard (alleged 190 million customers’ details).
2) Automotive giant Advance Auto Parts (alleged 380 million customers’ details).
Not a good look for Snowflake, and certainly not a good outcome for the victim customers.
Let’s get into how situations like this can be prevented by understanding the difference between public vendor and private cloud ETL tools.
Firstly, some quick context:
– Public clouds involve hosting data infrastructure with internet-facing (public) endpoints on top of a cloud provider such as AWS, Azure or Google Cloud, which the likes of Snowflake use to provide services to Ticketmaster and Santander. Essentially, cloud computing resources are fully managed by these third parties in multi-tenant environments.
– A private cloud, on the other hand (also known as an internal cloud or corporate cloud), is a cloud computing environment in which all hardware and software resources are dedicated exclusively to a single company and any authorised partners.
– ETL stands for Extract, Transform, and Load. ETL tools are a set of software tools that are used to extract data from one or more sources, transform it into a consistent and clean format, and load it into a target system or database.
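To make the three ETL stages concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular vendor's tool works: the sample CSV data, the `customers` table and the function names are all hypothetical, and an in-memory SQLite database stands in for the target data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a CSV export with inconsistent formatting.
RAW_CSV = """name,email,signup_date
 Alice Smith ,ALICE@EXAMPLE.COM,2024-01-15
Bob Jones,bob@example.com,2024-02-01
,missing-name@example.com,2024-03-10
"""


def extract(raw_text):
    """Extract: read rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: trim whitespace, lower-case emails, drop rows with no name."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue  # skip incomplete records
        email = row["email"].strip().lower()
        cleaned.append((name, email, row["signup_date"].strip()))
    return cleaned


def load(rows, connection):
    """Load: write the cleaned rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(name TEXT, email TEXT, signup_date TEXT)"
    )
    connection.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    connection.commit()


def run_pipeline(raw_text, connection):
    """Run extract, transform and load in order, then read back the result."""
    load(transform(extract(raw_text)), connection)
    query = "SELECT name, email FROM customers ORDER BY name"
    return connection.execute(query).fetchall()


if __name__ == "__main__":
    # An in-memory SQLite database stands in for the data warehouse.
    conn = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, conn))
```

Whether a pipeline like this runs on a vendor's public cloud or inside your own private network, the three stages stay the same. What changes is where the data sits while it is being processed and who holds the credentials to it.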
Public Vendor Cloud ETL Tools are offered by third-party vendors and are hosted on a public cloud platform, just as Snowflake hosts its database infrastructure. These tools allow users to manage their data flow via one interface which links to a variety of data sources and destinations. Public Vendor Cloud ETL Tools offer a wide range of integrations, and examples include Stitch Data, Fivetran, Rivery, Keboola and many others.
Private Cloud ETL Tools, on the other hand, are hosted on a company’s private cloud infrastructure and offer more control over data security and compliance, as the data resides within the company’s private network. Private cloud ETL tools are often thought of as being less flexible than public vendor cloud ETL tools, but as you’ll see in the next sections, this is definitely not always the case.
Public Vendor Cloud ETL Tools: Pros
Public Vendor Cloud ETL Tools: Cons
Private Cloud ETL Tools: Pros
Private Cloud ETL Tools: Cons
Central to this decision are, of course, company size, budget, compliance requirements and general ‘data risk’ (which is higher for larger companies in general, and especially in industries like healthcare), but these are the key considerations when weighing up cloud ETL tools:
It’s true that most ETL tools harnessing private cloud infrastructure come with cost and maintenance burdens, so the approach and choice of partner should always be central to the decision. Most of the time, these burdens are a result of poor planning and sub-optimal architecture.
Because the team at Matatika understands the merits of both options, and is well-versed in harnessing ETL tools in various cloud contexts and industries, we offer both private and public cloud solutions. A key difference lies in the fact that we offer both with the same high levels of support, security, upgrades, performance, scalability, and operational stability.
In fact, Matatika provides the complete set of ETL tools to rapidly load data from 500+ sources into your data warehouse. From there, we specialise in not only ensuring data from multiple sources is accurate and consolidated, but also transforming the data in innovative ways using reputable ETL tools. From smart visualisation to leading BI insights delivered by advanced AI analysis, Matatika delivers solutions which are efficient, scalable, built-for-purpose and, most importantly, ready for the future of tech and AI.
If you’d like to get more detailed information, feel free to download the “Ultimate Guide to Getting More from Your Data”, or get in touch to discuss any data opportunities your company has right now and get some helpful consultation.
Interested in learning more? Get in touch and speak with one of our experts to see how we can help your business.