vdwh-logo

Data Solution Design Patterns

Implementation and Automation

Training with Roelant Vos


Register now!


"For a data warehouse, we do not have enough time."

… sounds familiar to you?

Automation and code generation enables faster, and more flexible, data solution implementation. Learn the revolutionary approach for a fully automated solution from Roelant Vos.

  • Implement a Persistent Staging Area
  • Apply hybrid modelling techniques and patterns based on Data Vault
  • Define a metadata model for automation, code generation, and virtualization
  • Apply DevOps, testing and control frameworks for an automated solution
  • Ensure that the delivered data meets the consumers’ expectations

This practical design and implementation training provides you with everything you need to build and maintain an automated data solution from start to finish.


What can Data Solution Automation offer?


Working with data can be complex, and often the ‘right’ answer for the purpose is the result of a series of iterations where business subject matter experts (SMEs) and data professionals collaborate.

This is an iterative process by its very nature. Even with the best effort and available knowledge, the resulting data model will be subject to progressive understanding that is inherent in working with data.

In other words, the data solution model is not always something you always can get right in one go. In fact, it can take a long time for a model to stabilise, and in the current fast-paced environments this may even never be the case.

Choosing the right design patterns for your data solution helps maintain both the mindset and capability for the solution to keep evolving with the business, the technology, and to reduce technical debt on an ongoing basis.

This mindset also enables some truly fascinating opportunities such as the ability to maintain version control of the data model, the design metadata, and their relationship - to be able to represent the entire data solution as it was at a certain point in time - or to even allow different data models for different business domains.

This idea, combined with the capability to automatically (re)deploy different structures and interpretations of data as well as the data logistics to populate or deliver these we call ‘Data Solution Virtualisation’.

The idea of an automated virtual data solution was conceived while working on improvements for generating Data Warehouse loading processes. It is, in a way, an evolution in ETL generation thinking. Combining Data Vault with a Persistent Staging Area (PSA) provides additional functionality because it allows the designer to refactor all, or parts, of the solution.

Being able to deliver a virtual data solution provides options. It does not mean you have to virtualise the entire solution, but you can pick-and-choose which approach works best for the given scenario and change technologies and models over time.

To allow ideas to grow, creators need an immediate connection to what they are creating. This means that, as a creator, you need to be able to directly see what the effect of your changes are on what you are working on.

This is what the virtual data solution as a concept and mindset intends to enable: to enable a direct connection to data to support any kind of exploration and enabling creativity while using it.

Thinking of Data Warehousing in terms of virtualisation is in essence about following the guiding principle to establish a direct connection to data. It is about finding ways to seek simplification, to keep working on removing barriers to deliver data and information. It is about enabling ideas to flourish because data can be made available for any kind of discovery or assertion.

Virtual Data Warehousing is the ability to present data for consumption directly from a raw data store by leveraging data warehouse loading patterns, information models and architecture. In many data solutions, it is already considered a best practice to be able to ‘virtualise’ Data Marts in a similar way. The Virtual Data Warehouse takes this approach one step further by allowing the entire data solution to be refactored based on the original raw transactions.

This ability requires a Persistent Staging Area (PSA), also known as a Persistent Historized Data Store, where the data that is received is stored as it has been received, at the lowest level of detail. If data is retained this way, everything you do with your data can always be repeated at any time – deterministically. In the best implementations, the virtual data solution allows you to work at the level of simple metadata mappings, modelling and interpretation "business logic", abstracting away the more technical details.

A virtual data solution is not the same as data virtualisation. These two concepts are fundamentally different. Data virtualisation, by most definitions, is the provision of unified direct access to data across many ‘disparate’ data stores.

It is a way to access and combine data without having to physically move the data across environments. Data virtualisation does not however focus on loading patterns and data architecture and modelling.

The virtual data solution, on the other hand, is a flexible and manageable approach towards solving data integration and time variance topics using data warehouse concepts, essentially providing a defined schema-on-read.

The Virtual Data Warehouse is enabled by virtue of combining the principles of data logistics generation, hybrid data warehouse modelling concepts and a Persistent Staging Area (PSA). It is a way to create a more direct connection to the data because changes made in the metadata and models can be immediately represented in the information delivery.

Persisting data in a more traditional Data Warehouse sense is always still an option, and may be required to deliver the intended performance. The deterministic nature of a Virtual Data Warehouse allows for dynamic switching between physical and virtual structured, depending on the requirements.

In many cases, this mix of physical and virtual objects in the Data Warehouses changes over time itself, when business focus changes. A good approach is to ‘start virtual’, and persist where required.


Download PDF

Your Trainer

Roelant Vos has been active in Data Warehousing (DWH) and Business Intelligence (BI) for more than 20 years, and is well known as an expert in the Data Vault community.

For more than 10 years, he has been sharing his ideas, tips, and thoughts on his blog roelantvos.com.

Having worked as a software developer, consultant, trainer, and decision maker in the corporate world, Roelant has observed data management from various distinctly different points of view.

The common theme has always been a passion for automation, code generation, reusable patterns, and model-driven design – the key to making data solutions manageable and flexible.

His focus is now on providing training, consultancy, and open-source software development to make the delivery of robust data solutions easier.

You want to ...

  • Learn what kind of solution architecture supports flexible data delivery that can evolve with the business
  • Fully understand the concepts behind the essential data loading patterns, what options can be considered, and how to implement these
  • Leverage generation techniques for data logistics (‘ETL’), to be able to spend more time on more value-adding work such as data modelling and improving data delivery
  • Work on a Do-It-Yourself (DIY) data solution framework, or have adopted a Data Warehouse Automation (DWA) product and seek deeper understanding on the used patterns and modelling approaches
  • Get a full overview of all components that are necessary for a robust and manageable data solution

    This course covers advanced modelling and implementation techniques, and applies to a wide range of data professionals including Data Warehouse professionals, data modelers, architects, and data engineers.

Prerequisites

  • Sufficient understanding of English (the course language is English)
  • Understanding of data engineering, for example Data Warehousing and ETL development
  • Knowledge of SQL (e.g. joining, window functions)
  • Some scripting / programming experience
  • Familiarity with data modeling techniques for data warehousing (e.g. Dimensional Modeling, Ensemble Logical Modelling techniques including Data Vault)

Is this for me?

By adopting Data Vault patterns on top of a Persistent Staging Area (PSA) – a historised record of all original transactions – an unparalleled level of flexibility in implementing and maintaining a data solution can be achieved. The repetitive aspects of data preparation are reduced, and it becomes easier to adjust the solution to ever-changing business- and technical requirements.

These patterns are seemingly straightforward – almost deceptively so.

But, in fact, every pattern requires far-reaching considerations at a technical and conceptual level to truly match the business expectations.

Data Vault modelling provides elegant features to manage complexities, but success still depends on correct modelling of the data, and correct application of the patterns. Leveraging data logistics (‘ETL’) generation and virtualisation techniques allows for a great degree of flexibility, because you can quickly refactor and test different modelling approaches to understand which one fits best for your use-case.

This enables you to spend more time on more value-adding work such as improving the data models and delivery of data.

This advanced training is relevant for anyone seeking to understand how to leverage ‘model-driven-design’ and ‘pattern-based code-generation’ techniques to accelerate development. The content applies to a wide range of data professionals including Data Warehouse specialists, data modellers and architects as well as data engineers and data integration developers.

Flexible design and implementation

The intent of the training is to cover the architecture and concepts for a flexible data solution, with a focus to ‘deep dive’ into the patterns and practical implementation techniques as quickly as possible.

To facilitate this, the training discusses the implementation of the main Data Vault modelling concepts including their various edge-cases and considerations. The mechanisms to deliver information for consumption by business users (i.e. ‘marts’) will also be covered, including details on how to produce the ‘right’ information by implementing business logic and managing multiple timelines for reporting (‘bitemporal’).

The training provides tools and configurations which you can use to start automating your own development – or understand the approaches used in commercial ‘off-the-shelf’ software so that these can be fully utilised.

Training content and schedule

Day 1

  • Pattern-based design
  • Solution prerequisites and essential components
  • Solution architecture
  • Data staging concepts
  • Investigate the source and target models
  • Introducing design metadata
  • Core Business Concept pattern

Day 2

  • Natural Business Relationships pattern
  • Context pattern
  • Historisation
  • Technical considerations
  • Scheduling, workflows, and parallelism
  • Continuous loading
  • DevOps and versioning

Day 3

  • Temporality concepts
  • Data delivery – Dimensions and Facts
  • Application of business logic
  • Completing the solution

Optional Evening Sessions

  • The training course includes one or two optional evening ‘hands-on’ sessions where anyone interested can work with some of the technology, create their own patterns, or write a simple code generator.
  • Each evening session usually runs for 1-2 hours (depending on interest) and typically is scheduled between 1800 and 2000.
  • What workshop we do depends on the group’s interest. This will be discussed and agreed on the first day of the training.
  • The prerequisite for the optional hands-on sessions is to have a pre-installed local environment with SQL Server 2016 or 2019, Integration Services (SSIS) and Visual Studio 2019 and/or 2022 with SQL Server Data Tools (SSDT). This covers most of the scenarios with minimal additional installation.


Overview PDF

Workshop available world-wide

This workshop will be offered globally by Roelant Vos, also in-house on request. Please have a look on the dates or contact us.

Dates & Prices

Hanover
  • planned for
    28th to 30th August 2023
  • EUR 3.451 incl. VAT
EUR 2.900 plus VAT
Inhouse
Price on request

Registration





If you have any further questions, please contact us:

info@dwhpatterns.com


Copyright: Roelant Vos

Imprint | Privacy Policy | Image Sources