When we talk with customers, we hear that they want to harness insights from data to make timely, impactful, and actionable business decisions. A common pattern with data-driven organizations is that they have many different data sources they need to ingest into their analytics systems. This requires them to build manual data pipelines spanning their operational databases, data lakes, streaming data, and data within their warehouse. As a consequence of this complex setup, it can take data engineers weeks or even months to build data ingestion pipelines. These data pipelines are costly, and the delays can lead to missed business opportunities. Additionally, data warehouses are increasingly becoming mission-critical systems that require high availability, reliability, and security.
Amazon Redshift is a fully managed petabyte-scale data warehouse used by tens of thousands of customers to easily, quickly, securely, and cost-effectively analyze all their data at any scale. This year at re:Invent, Amazon Redshift has announced a number of features to help you simplify data ingestion and get to insights easily and quickly, within a secure, reliable environment.
In this blog post, I introduce some of these new features, which fit into two main categories:
- Simplify data ingestion
  - Amazon Redshift now supports auto-copy from Amazon S3 (available in preview). With this new capability, Amazon Redshift automatically loads the files that arrive in an Amazon Simple Storage Service (Amazon S3) location that you specify into your data warehouse. The files can use any of the formats supported by the Amazon Redshift COPY command, such as CSV, JSON, Parquet, and Avro. In this way, you don’t need to manually or repeatedly run copy procedures. Amazon Redshift automates file ingestion and takes care of data-loading steps under the hood.
  - With Amazon Aurora zero-ETL integration with Amazon Redshift, you can use Amazon Redshift for near real-time analytics and machine learning on petabytes of transactional data stored on Amazon Aurora MySQL databases (available in limited preview). With this capability, you can choose the Amazon Aurora databases containing the data you want to analyze with Amazon Redshift. Data is then replicated into your data warehouse within seconds after transactional data is written into Amazon Aurora, eliminating the need to build and maintain complex data pipelines. You can replicate data from multiple Amazon Aurora databases into the same Amazon Redshift instance to run analytics across multiple applications. With near real-time access to transactional data, you can leverage Amazon Redshift’s analytics and capabilities, such as built-in machine learning (ML), materialized views, data sharing, and federated access to multiple data stores and data lakes, to derive insights from transactional and other data.
  - With the general availability of Amazon Redshift Streaming Ingestion, you can now natively ingest hundreds of megabytes of data per second from Amazon Kinesis Data Streams and Amazon MSK into an Amazon Redshift materialized view and query it in seconds. Learn more in this post.
- Make your data warehouse more secure and reliable
  - You can now improve the availability of your data warehouse by choosing multiple Availability Zone (AZ) deployments. Multi-AZ deployments for your Amazon Redshift clusters are available in preview and reduce recovery times to seconds through automatic recovery. In this way, you can build solutions that are more compliant with the recommendations of the Reliability Pillar of the AWS Well-Architected Framework.
  - With dynamic data masking (available in preview), you can protect sensitive information stored in your data warehouse and ensure that only the relevant data is accessible by users based on their roles. You can limit how much identifiable data is visible to users by using multiple levels of policies, so different users and groups can have different levels of data access without having to create multiple copies of data. Dynamic data masking complements other granular access control capabilities in Amazon Redshift, including row-level and column-level security and role-based access controls. In this way, dynamic data masking helps you meet requirements for GDPR, CCPA, and other privacy regulations.
  - Amazon Redshift now supports central access controls for data sharing with AWS Lake Formation (available in public preview). You can now use Lake Formation to simplify governance of data shared from Amazon Redshift and centrally manage granular access across all data-sharing consumers.
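To make the auto-copy capability from the list above more concrete, here is a minimal sketch of what setting up such a job might look like. It assumes the `JOB CREATE ... AUTO ON` clause of the COPY command as documented for the preview, so the exact syntax may differ in your environment. The Python helper only assembles the SQL text. The function name, table, bucket, role ARN, and job name are hypothetical placeholders, and you would still run the resulting statement through a SQL client connected to your cluster.

```python
# Hypothetical helper: builds the text of a COPY statement that registers
# an auto-copy job. It does not connect to Amazon Redshift.

# Formats named in the post as examples of what COPY supports.
SUPPORTED_FORMATS = {"CSV", "JSON", "PARQUET", "AVRO"}


def build_auto_copy_job(table, s3_prefix, iam_role_arn, job_name, file_format="CSV"):
    """Return a COPY ... JOB CREATE statement as a string.

    Assumption: the inputs are trusted identifiers and literals. This sketch
    performs no SQL escaping, so do not pass untrusted values to it.
    """
    fmt = file_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {file_format}")

    # JSON and Avro take a field-mapping option; 'auto' maps fields by name.
    format_clause = f"FORMAT AS {fmt}"
    if fmt in {"JSON", "AVRO"}:
        format_clause += " 'auto'"

    return "\n".join(
        [
            f"COPY {table}",
            f"FROM '{s3_prefix}'",
            f"IAM_ROLE '{iam_role_arn}'",
            format_clause,
            # Assumed preview syntax: register the COPY as a job that
            # loads new files automatically as they arrive.
            f"JOB CREATE {job_name}",
            "AUTO ON;",
        ]
    )


if __name__ == "__main__":
    print(
        build_auto_copy_job(
            table="store_sales",
            s3_prefix="s3://example-bucket/store_sales/",
            iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
            job_name="job_store_sales",
        )
    )
```

Once a job like this exists, files that land under the S3 prefix are expected to be loaded without further COPY commands, which is the manual step that auto-copy removes.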
There was other interesting news for Amazon Redshift at re:Invent that you might have already heard about:
- The general availability of Amazon Redshift integration for Apache Spark makes it easy to build and run Spark applications on Amazon Redshift and Redshift Serverless, opening up the data warehouse for a broader set of AWS analytics and machine learning solutions.
- AWS Backup now supports Amazon Redshift. AWS Backup allows you to define a central backup policy to manage data protection of your applications and can also protect your Amazon Redshift clusters. In this way, you have a consistent experience when managing data protection across all supported services.
Availability and Pricing
Multi-AZ deployments, central access control for data sharing with AWS Lake Formation, auto-copy from Amazon S3, and dynamic data masking are available in preview in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Europe (Ireland), and Europe (Stockholm).
There is no additional cost for using auto-copy from Amazon S3 and near real-time analytics on transactional data. There is no additional charge for dynamic data masking and central access control for data sharing. For more information, see Amazon Redshift pricing.
These new capabilities take you one step further in analyzing all your data across data sources with simple data ingestion capabilities, while improving the security and reliability of your data warehouse.