This book describes Kettle and how it can be implemented, applied and managed, including an extensive collection of use cases and best practices. A major part of the book is based on Kimball's 34 ETL subsystems. (Note that the book does not assume prior Kettle or ETL knowledge, which makes it an ideal start for anyone wanting to learn an ETL tool.)

The book covers all distinct components that make up the Kettle product and shows how they can be applied to real-world scenarios. It uses a solutions-oriented approach, meaning that the available toolset is discussed not from the tool perspective but from the solution perspective (i.e. what someone can accomplish using the product).

The first half of the book (parts 1, 2 and 3) is devoted to the basic Kettle functionality and how it can be applied to get ETL solutions up and running. Parts 2 and 3 follow the '34 ETL subsystems' as described by Ralph Kimball. The 34 subsystems cover the entire ETL lifecycle and make for an excellent guideline to cover all parts of data warehousing with Kettle. The second half of the book (parts 4, 5 and 6) covers more advanced or specialized topics like clustering, extensibility and loading a data vault model.

For every subject, a real-life example is used that people can easily relate to, but because the chapters are so diverse there is no single overall case that illustrates all the concepts. The variety of examples also ensures a more lively discussion of the different topics. The book and the samples in it cover everything from simple single-table data migration to complex multi-system clustered data integration tasks.

After reading this book, readers will have learned the following:
What ETL and data integration are, and why they are needed
The components that form the Kettle ETL tool set (and how these components fulfill particular data integration needs)
How to install and configure Kettle, and how to connect it to various data sources and targets
How to design and build every aspect of an ETL solution using Kettle
How to build and load a data warehouse with Kettle
How to deploy and schedule ETL solutions
How to integrate and extend Kettle
How to run and scale Kettle solutions using a distributed 'cloud' environment