Pentaho Data Integration Fundamentals

Classes marked with Confirmed are guaranteed to run. Sign up now while there is still space available!

Classes marked with Full are full, and no additional registrations are accepted. If you cannot find another class that suits your schedule, feel free to request a class and we will do our best to accommodate your needs.


Pentaho Data Integration Fundamentals Ratings

Averaged from 76 responses.

Training Course

The volume, variety and velocity of data are increasing rapidly. Organizations need fast and easy-to-use tools to harness data for actionable insight. One of the biggest challenges facing organizations today is the requirement to provide a consistent, single version of the truth across all sources of information in an analytics-ready format.

With powerful data extract, transform and load (ETL) capabilities, an intuitive and rich graphical design environment, and an open and standards-based architecture, Pentaho Data Integration is increasingly the choice over proprietary and homegrown data integration tools.

Description

Id: DI1000
Level: Introductory
Audience: Data Analyst
Delivery Method: Instructor-led online, Private on-site, Public classroom
Duration: 3 Days
Cost: $1,950.00 USD
Credits: 3
Category: Pentaho Data Integration

Pentaho Data Integration provides a full ETL solution, including:

  • Rich graphical designer to empower ETL developers
  • Broad connectivity to any type of data, including diverse and big data
  • Enterprise scalability and performance, including in-memory caching
  • Big data integration, analytics and reporting, including Hadoop, NoSQL, traditional OLTP & analytic databases
  • Modern, open, standards-based architecture

Through a series of lectures and hands-on exercises covering theory, best practices, and design patterns, Pentaho Data Integration Fundamentals provides students the skills they need to maximize the value of data to the organization. This course helps prepare you for the Pentaho Data Integration Certification Exam.

Duration

3 Days

Upcoming Classes

Schedule for Mar 2015 through Jul 2015. Classes in bold are guaranteed to run!

Online (instructor-led online training)

  • Online: Mar 31 – Apr 2 (two classes), May 12 – May 14, Jun 9 – Jun 11, Jun 29 – Jul 1
  • Online - EMEA: Jun 2 – Jun 4

Italy

  • Florence (Italian language): Apr 13 – Apr 15
  • Rome (Italian language): May 25 – May 27

Germany

  • Karlsruhe (German language): Apr 14 – Apr 16, Jun 23 – Jun 25

France

  • Paris (French language): Apr 20 – Apr 22, Jun 8 – Jun 10

United States

  • San Francisco, CA: Apr 21 – Apr 23

Switzerland

  • Geneva (English language): May 4 – May 6, Jun 22 – Jun 24

Course Benefits

  • Improve productivity by giving your data integration team the skills they need to succeed with Pentaho Data Integration
  • Learn to deliver data to a wide variety of applications using Pentaho's out-of-the-box data standardization, enrichment and quality capabilities
  • Interactive, hands-on training materials significantly improve skill development and maximize retention

Skills Achieved

At the completion of this course, you should be able to:

  • Create, preview, and run basic transformations containing steps and hops
  • View transformation results in the Step Metrics view and the Log view
  • Configure the Pentaho Enterprise Repository, including basic security
  • Use the Pentaho Enterprise Repository to create folders; store transformations and jobs; and move, lock, revise, delete, and restore artifacts
  • Configure error handling for transformation steps
  • Create a database connection and use Database Explorer to interact with data sources
  • Create transformations that involve configuring the following steps: Table input, Table output, Text file output, CSV file input, Insert/Update, Add constants, Filter, Value Mapper, Stream lookup, Join rows, Merge join, Sort rows, JavaScript, Database Lookup, Set Environment Variables
  • Use transformation steps to perform complex calculations on the data stream
  • Create reusable transformations using parameterized values and environment variables
  • Use Pentaho Data Integration to cleanse and correct data
  • Load data from and write data to different data sources
  • Create Pentaho Data Integration jobs that: run multiple transformations, use variables, contain sub-jobs, provide built-in error notification, load and process multiple text files, and convert files into Microsoft Excel format
  • Configure logging for transformation steps and for job entries and examine the logged data
  • Schedule and monitor the execution of a transformation in Pentaho Data Integration and in the Pentaho Enterprise Console

This course is the third course in the Data Analyst learning path. Students with prior database development or administration experience who are new to Pentaho Data Integration should take this course.

There are no prerequisites for this course, but some ETL experience is preferred.
Though not a requirement, attendees would benefit from taking Business Analytics User Console (BA1000) prior to taking this class to gain an overview of the Pentaho Business Analytics interface.

Students attending classroom courses in the United States are provided with a PC to use during class. Students attending courses outside the US should contact the Authorized Training Provider regarding PC requirements for Pentaho courses.

In general, if your training provider requires you to bring a PC to class, it must meet the following requirements. You can also verify your system against the Compatibility Matrix: List of Supported Products topic in the Pentaho Documentation site.

  • Windows XP or Windows 7 desktop operating system (for Macintosh support, please contact your Customer Success Manager)
  • RAM: at least 4GB
  • Hard drive space: at least 2GB for the software, and more for solution and content files
  • Processor: dual-core AMD64 or Intel EM64T
  • USB port

Online courses require a broadband Internet connection, the use of a modern Web browser (such as Microsoft Internet Explorer or Mozilla Firefox), and the ability to connect to GoToTraining. For more information on GoToTraining requirements, see http://www.gotomeeting.com/online/training. Online courses use Pentaho’s cloud-based exercise environment. Students are provided access to a virtual machine used to complete the exercises.

For online courses, students are provided with a secured, electronic course manual. Printed manuals are not provided for online courses. When an electronic manual is provided, students are encouraged to print the exercise book before class begins, though this is not required.

Students attending this course on-site should contact their Customer Success Manager for hardware and software requirements. You can also email us at training@pentaho.com for more information regarding on-site training requirements.

Day 1

Module 1: Introduction to Pentaho Data Integration

  Lesson 1: Objectives & Class Logistics

  Lesson 2: What is Pentaho Data Integration (PDI)?


Module 2: Transformation Basics

  Lesson 1: Learning the PDI User Interface

  Lesson 2: Creating Transformations

      Exercise 1: Generate Rows, Sequence, Select Values

  Lesson 3: Error Handling & Logging Introduction

  Lesson 4: Introduction to Repositories


Module 3: Reading & Writing Files

  Lesson 1: Input & Output Steps

  Lesson 2: Parameters & kettle.properties

      Exercise 2: CSV Input to Multiple Text Output Using Switch/Case

      Exercise 3: Serializing Multiple Text Files

      Exercise 4: De-serialize a File

Day 2

Module 4: Working with Databases

  Lesson 1: Connecting to & Exploring a Database

  Lesson 2: Table Input & Output

      Exercise 5: Reading & Writing to Database Tables

  Lesson 3: Insert, Update, & Delete Steps

  Lesson 4: Data Cleansing

  Lesson 5: Using Parameters & Arguments in SQL

      Exercise 6: Input with Parameters & Table Copy Wizard


Module 5: Data Flows & Lookups

  Lesson 1: Copying and Distributing Data

      Exercise 7: Parallel Processing

  Lesson 2: Lookups

      Exercise 8: Lookups & Data Formatting

  Lesson 3: Merging Data

Day 3

Module 6: Calculations

  Lesson 1: Using the Group By Step

  Lesson 2: Calculator

      Exercise 9: Calculating & Aggregating Order Quantity

  Lesson 3: Regular Expression

  Lesson 4: User Defined Java Expression

  Lesson 5: JavaScript


Module 7: Job Orchestration

  Lesson 1: Introduction to Jobs

      Exercise 10: Loading JVM Data into a Table

  Lesson 2: Sending Alerts

  Lesson 3: Looping & Conditions

      Exercise 11: Creating a Job with a Loop

  Lesson 4: Executing Jobs from a Terminal Window (Kitchen)


Module 8: Scheduling

  Lesson 1: Setting up the Scheduler

  Lesson 2: Monitoring Scheduled Tasks


Module 9: Exploring Data Integration Repositories

  Lesson 1: The Pentaho Data Integration Repository

      Exercise 12: Using the Pentaho Enterprise Repository


Module 10: Detailed Logging

  Lesson 1: Detailed Logging throughout Execution
