This is a simple Java program that loads the transformation "first_transformation.ktr" and executes it.
Create a simple Test.java file and run it:
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

/**
 * Runs the first_transformation.ktr transformation.
 */
public class Test
{
    public static void main( String[] args )
    {
        try {
            // Initialize the Kettle environment (plugins, logging, etc.)
            KettleEnvironment.init();
            // Load the transformation definition from the .ktr file
            TransMeta metaData = new TransMeta( "first_transformation.ktr" );
            Trans trans = new Trans( metaData );
            // Execute and block until all steps have finished
            trans.execute( null );
            trans.waitUntilFinished();
            if ( trans.getErrors() > 0 ) {
                System.out.println( "Error executing transformation" );
            }
        } catch ( KettleException e ) {
            e.printStackTrace();
        }
    }
}
This is the simple transformation, created using the Spoon tool:
filename: first_transformation.ktr
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>first_transformation</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>/</directory>
<parameters>
</parameters>
<log>
<trans-log-table><connection/>
<schema/>
<table/>
<size_limit_lines/>
<interval/>
<timeout_days/>
<field><id>ID_BATCH</id><enabled>Y</enabled><name>ID_BATCH</name></field><field><id>CHANNEL_ID</id><enabled>Y</enabled><name>CHANNEL_ID</name></field><field><id>TRANSNAME</id><enabled>Y</enabled><name>TRANSNAME</name></field><field><id>STATUS</id><enabled>Y</enabled><name>STATUS</name></field><field><id>LINES_READ</id><enabled>Y</enabled><name>LINES_READ</name><subject/></field><field><id>LINES_WRITTEN</id><enabled>Y</enabled><name>LINES_WRITTEN</name><subject/></field><field><id>LINES_UPDATED</id><enabled>Y</enabled><name>LINES_UPDATED</name><subject/></field><field><id>LINES_INPUT</id><enabled>Y</enabled><name>LINES_INPUT</name><subject/></field><field><id>LINES_OUTPUT</id><enabled>Y</enabled><name>LINES_OUTPUT</name><subject/></field><field><id>LINES_REJECTED</id><enabled>Y</enabled><name>LINES_REJECTED</name><subject/></field><field><id>ERRORS</id><enabled>Y</enabled><name>ERRORS</name></field><field><id>STARTDATE</id><enabled>Y</enabled><name>STARTDATE</name></field><field><id>ENDDATE</id><enabled>Y</enabled><name>ENDDATE</name></field><field><id>LOGDATE</id><enabled>Y</enabled><name>LOGDATE</name></field><field><id>DEPDATE</id><enabled>Y</enabled><name>DEPDATE</name></field><field><id>REPLAYDATE</id><enabled>Y</enabled><name>REPLAYDATE</name></field><field><id>LOG_FIELD</id><enabled>Y</enabled><name>LOG_FIELD</name></field></trans-log-table>
<perf-log-table><connection/>
<schema/>
<table/>
<interval/>
<timeout_days/>
<field><id>ID_BATCH</id><enabled>Y</enabled><name>ID_BATCH</name></field><field><id>SEQ_NR</id><enabled>Y</enabled><name>SEQ_NR</name></field><field><id>LOGDATE</id><enabled>Y</enabled><name>LOGDATE</name></field><field><id>TRANSNAME</id><enabled>Y</enabled><name>TRANSNAME</name></field><field><id>STEPNAME</id><enabled>Y</enabled><name>STEPNAME</name></field><field><id>STEP_COPY</id><enabled>Y</enabled><name>STEP_COPY</name></field><field><id>LINES_READ</id><enabled>Y</enabled><name>LINES_READ</name></field><field><id>LINES_WRITTEN</id><enabled>Y</enabled><name>LINES_WRITTEN</name></field><field><id>LINES_UPDATED</id><enabled>Y</enabled><name>LINES_UPDATED</name></field><field><id>LINES_INPUT</id><enabled>Y</enabled><name>LINES_INPUT</name></field><field><id>LINES_OUTPUT</id><enabled>Y</enabled><name>LINES_OUTPUT</name></field><field><id>LINES_REJECTED</id><enabled>Y</enabled><name>LINES_REJECTED</name></field><field><id>ERRORS</id><enabled>Y</enabled><name>ERRORS</name></field><field><id>INPUT_BUFFER_ROWS</id><enabled>Y</enabled><name>INPUT_BUFFER_ROWS</name></field><field><id>OUTPUT_BUFFER_ROWS</id><enabled>Y</enabled><name>OUTPUT_BUFFER_ROWS</name></field></perf-log-table>
<channel-log-table><connection/>
<schema/>
<table/>
<timeout_days/>
<field><id>ID_BATCH</id><enabled>Y</enabled><name>ID_BATCH</name></field><field><id>CHANNEL_ID</id><enabled>Y</enabled><name>CHANNEL_ID</name></field><field><id>LOG_DATE</id><enabled>Y</enabled><name>LOG_DATE</name></field><field><id>LOGGING_OBJECT_TYPE</id><enabled>Y</enabled><name>LOGGING_OBJECT_TYPE</name></field><field><id>OBJECT_NAME</id><enabled>Y</enabled><name>OBJECT_NAME</name></field><field><id>OBJECT_COPY</id><enabled>Y</enabled><name>OBJECT_COPY</name></field><field><id>REPOSITORY_DIRECTORY</id><enabled>Y</enabled><name>REPOSITORY_DIRECTORY</name></field><field><id>FILENAME</id><enabled>Y</enabled><name>FILENAME</name></field><field><id>OBJECT_ID</id><enabled>Y</enabled><name>OBJECT_ID</name></field><field><id>OBJECT_REVISION</id><enabled>Y</enabled><name>OBJECT_REVISION</name></field><field><id>PARENT_CHANNEL_ID</id><enabled>Y</enabled><name>PARENT_CHANNEL_ID</name></field><field><id>ROOT_CHANNEL_ID</id><enabled>Y</enabled><name>ROOT_CHANNEL_ID</name></field></channel-log-table>
<step-log-table><connection/>
<schema/>
<table/>
<timeout_days/>
<field><id>ID_BATCH</id><enabled>Y</enabled><name>ID_BATCH</name></field><field><id>CHANNEL_ID</id><enabled>Y</enabled><name>CHANNEL_ID</name></field><field><id>LOG_DATE</id><enabled>Y</enabled><name>LOG_DATE</name></field><field><id>TRANSNAME</id><enabled>Y</enabled><name>TRANSNAME</name></field><field><id>STEPNAME</id><enabled>Y</enabled><name>STEPNAME</name></field><field><id>STEP_COPY</id><enabled>Y</enabled><name>STEP_COPY</name></field><field><id>LINES_READ</id><enabled>Y</enabled><name>LINES_READ</name></field><field><id>LINES_WRITTEN</id><enabled>Y</enabled><name>LINES_WRITTEN</name></field><field><id>LINES_UPDATED</id><enabled>Y</enabled><name>LINES_UPDATED</name></field><field><id>LINES_INPUT</id><enabled>Y</enabled><name>LINES_INPUT</name></field><field><id>LINES_OUTPUT</id><enabled>Y</enabled><name>LINES_OUTPUT</name></field><field><id>LINES_REJECTED</id><enabled>Y</enabled><name>LINES_REJECTED</name></field><field><id>ERRORS</id><enabled>Y</enabled><name>ERRORS</name></field><field><id>LOG_FIELD</id><enabled>N</enabled><name>LOG_FIELD</name></field></step-log-table>
</log>
<maxdate>
<connection/>
<table/>
<field/>
<offset>0.0</offset>
<maxdiff>0.0</maxdiff>
</maxdate>
<size_rowset>10000</size_rowset>
<sleep_time_empty>50</sleep_time_empty>
<sleep_time_full>50</sleep_time_full>
<unique_connections>N</unique_connections>
<feedback_shown>Y</feedback_shown>
<feedback_size>50000</feedback_size>
<using_thread_priorities>Y</using_thread_priorities>
<shared_objects_file/>
<capture_step_performance>N</capture_step_performance>
<step_performance_capturing_delay>1000</step_performance_capturing_delay>
<step_performance_capturing_size_limit>100</step_performance_capturing_size_limit>
<dependencies>
</dependencies>
<partitionschemas>
</partitionschemas>
<slaveservers>
</slaveservers>
<clusterschemas>
</clusterschemas>
<modified_user>-</modified_user>
<modified_date>2011/08/31 19:03:08.937</modified_date>
</info>
<notepads>
</notepads>
<order>
<hop> <from>Generate Rows</from><to>Write to log</to><enabled>Y</enabled> </hop> </order>
<step>
<name>Generate Rows</name>
<type>RowGenerator</type>
<description/>
<distribute>Y</distribute>
<copies>1</copies>
<partitioning>
<method>none</method>
<schema_name/>
</partitioning>
<fields>
<field>
<name>Test</name>
<type>String</type>
<format/>
<currency/>
<decimal/>
<group/>
<nullif>Hello World!</nullif>
<length>-1</length>
<precision>-1</precision>
</field>
</fields>
<limit>10</limit>
<cluster_schema/>
<remotesteps> <input> </input> <output> </output> </remotesteps> <GUI>
<xloc>123</xloc>
<yloc>213</yloc>
<draw>Y</draw>
</GUI>
</step>
<step>
<name>Write to log</name>
<type>WriteToLog</type>
<description/>
<distribute>Y</distribute>
<copies>1</copies>
<partitioning>
<method>none</method>
<schema_name/>
</partitioning>
<loglevel>log_level_basic</loglevel>
<displayHeader>Y</displayHeader>
<fields>
<field>
<name>Test</name>
</field>
</fields>
<cluster_schema/>
<remotesteps> <input> </input> <output> </output> </remotesteps> <GUI>
<xloc>331</xloc>
<yloc>212</yloc>
<draw>Y</draw>
</GUI>
</step>
<step_error_handling>
</step_error_handling>
<slave-step-copy-partition-distribution>
</slave-step-copy-partition-distribution>
<slave_transformation>N</slave_transformation>
</transformation>
-------
Assuming all the dependent jars are included in the classpath, the above program should produce the following output:
INFO 31-08 19:14:46,992 - first_transformation - Dispatching started for transformation [first_transformation]
INFO 31-08 19:14:47,024 - first_transformation - This transformation can be replayed with replay date: 2011/08/31 19:14:47
INFO 31-08 19:14:47,039 - Generate Rows - Finished processing (I=0, O=0, R=0, W=10, U=0, E=0)
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 1------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 2------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 3------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 4------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 5------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 6------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 7------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 8------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 9------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log -
------------> Linenr 10------------------------------
Test = Hello World!
====================
INFO 31-08 19:14:47,039 - Write to log - Finished processing (I=0, O=0, R=10, W=10, U=0, E=0)
The POM file used to build and run this example is:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.ameeth.poc</groupId>
<artifactId>pentaho</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>pentaho</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<pentaho.kettle.version>4.0.1-GA</pentaho.kettle.version>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.16</version>
</dependency>
<dependency>
<groupId>pentaho.kettle</groupId>
<artifactId>kettle-core</artifactId>
<version>${pentaho.kettle.version}</version>
</dependency>
<dependency>
<groupId>pentaho.kettle</groupId>
<artifactId>kettle-db</artifactId>
<version>${pentaho.kettle.version}</version>
</dependency>
<dependency>
<groupId>commons-vfs</groupId>
<artifactId>commons-vfs</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>pentaho.kettle</groupId>
<artifactId>kettle-engine</artifactId>
<version>${pentaho.kettle.version}</version>
</dependency>
<dependency>
<groupId>pentaho.kettle</groupId>
<artifactId>kettle-ui-swt</artifactId>
<version>${pentaho.kettle.version}</version>
</dependency>
<dependency>
<groupId>pentaho-library</groupId>
<artifactId>libformula</artifactId>
<version>1.1.7</version>
<exclusions>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging-api</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.codehaus.janino</groupId>
<artifactId>janino</artifactId>
<version>2.5.16</version>
</dependency>
<dependency>
<groupId>rhino</groupId>
<artifactId>js</artifactId>
<version>1.7R2</version>
</dependency>
<dependency>
<groupId>javax.mail</groupId>
<artifactId>mail</artifactId>
<version>1.4.1</version>
</dependency>
</dependencies>
</project>
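With this POM in place, one way to build and launch the example is via Maven's exec plugin. This is a hedged sketch, not the author's documented workflow: the exec-maven-plugin is not declared in the POM above (Maven resolves it by its goal prefix), and the Kettle artifacts must be resolvable from a repository your settings point at (historically the Pentaho Artifactory repository).

```shell
# Compile the example (Test.java under src/main/java, next to first_transformation.ktr)
mvn compile

# Run the Test class; -Dexec.mainClass names the class to launch
mvn exec:java -Dexec.mainClass=Test
```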
Thanks a LOT! I am a Java newbie and office clerk, trying to ease my work with "automated synchronization" (read: Mulesoft.org ESB) and other weekend projects. ;)
Your post (thanks for linking it in the Pentaho wiki! That made it SO much easier to find as "reliable" with search engines) goes very well together with http://pentahodev.blogspot.com/2009/08/developdebug-kettle-plugin-in-eclipse.html
Here is my command line on Windows 7:
C:\foo>"C:\Program Files (x86)\Java\jdk1.7.0\bin\javac.exe" -cp .;lib\kettle-engine.jar;lib\kettle-core.jar;libext\*;libext\pentaho\*;libext\commons\*;lib\kettle-db.jar Test.java
I copied all the referenced folders/files from a download of Data%20Integration/4.2.0-stable/pdi-ce-4.2.0-stable.zip (not the source version, although I did experiment with it).
Then
C:\foo>"C:\Program Files (x86)\Java\jdk1.7.0\bin\java.exe" -cp .;lib\kettle-engine.jar;lib\kettle-core.jar;libext\*;libext\pentaho\*;libext\commons\*;lib\kettle-db.jar Test
spat out the log lines! *hoooray*
Two weeks! Without your post and that other one it would have been im-pos-si-ble!..
I am trying, but got the following error:
C:\Documents and Settings\vj\Desktop\software\data-integration>java -cp .;lib\kettle-engine.jar;lib\kettle-core.jar;libext\*;libext\pentaho\*;libext\commons\*;lib\kettle-db.jar "C:\projects\test\test-kettle\com\testme\TestMe"
Exception in thread "main" java.lang.NoClassDefFoundError: C:\projects\test\test-kettle\com\testme\TestMe
Caused by: java.lang.ClassNotFoundException: C:\projects\test\test-kettle\com\testme\TestMe
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
Could not find the main class: C:\projects\test\test-kettle\com\testme\TestMe. Program will exit.
It's not able to find the main class ("Could not find the main class: C:\projects\test\test-kettle\com\testme\TestMe. Program will exit."). The java command expects a fully-qualified class name found on the classpath, not a file system path. Please check how to run a Java program.
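For reference, a hedged sketch of the corrected invocation: compile from the source root, put that root (.) on the classpath along with the Kettle jars, and pass the dotted class name. The paths are the hypothetical ones from the comment above, and `<kettle jars>` stands in for the same `lib\...;libext\...` list used earlier.

```shell
REM Compile from the directory that contains the package root:
cd C:\projects\test\test-kettle
javac -cp <kettle jars> com\testme\TestMe.java

REM Run with the package root (.) on the classpath and the dotted class name:
java -cp .;<kettle jars> com.testme.TestMe
```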
help me
Where do I download the API?
Answering myself: I downloaded the stable version from here; as of now it is 4.3.0:
http://repository.pentaho.org/artifactory/pentaho/
English please
What about running a job, saved in the enterprise repository, from a Java class?
ReplyDeleteHi,
ReplyDeleteThis example give me a error on this line:
KettleEnvironment.init();
The error is the following:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Appender
at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:69)
at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:53)
at teste.Teste.main(Teste.java:24)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Appender
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 3 more
Java Result: 1
How do I resolve that?
log4j is already included in the path; let me know how you are running the application.
Good morning, Ameeth.
First, a great 2013.
I wonder if you ever needed to create a Java applet to call a job created in Kettle?
Because when I run a job, it does nothing: it only starts and then gets lost.
If you can answer, I thank you.
Gelson
Exception in thread "main" java.lang.NoSuchMethodError: org.pentaho.di.i18n.BaseMessages.getString(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/String;)Ljava/lang/String;
at org.pentaho.di.core.logging.LogLevel.(LogLevel.java:39)
at org.pentaho.di.core.logging.DefaultLogLevel.(DefaultLogLevel.java:36)
at org.pentaho.di.core.logging.DefaultLogLevel.getInstance(DefaultLogLevel.java:41)
at org.pentaho.di.core.logging.DefaultLogLevel.getLogLevel(DefaultLogLevel.java:50)
at org.pentaho.di.core.logging.LogChannel.(LogChannel.java:40)
at org.pentaho.di.core.logging.LogChannel.(LogChannel.java:28)
at org.pentaho.di.core.plugins.BasePluginType.(BasePluginType.java:71)
at org.pentaho.di.core.plugins.BasePluginType.(BasePluginType.java:82)
at org.pentaho.di.core.plugins.StepPluginType.(StepPluginType.java:76)
at org.pentaho.di.core.plugins.StepPluginType.getInstance(StepPluginType.java:82)
at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:83)
at com.penta.practice.App.main(App.java:17)
This exception can be due to a library version mismatch. Are you using a Maven build?
Unable to load class for step/plugin with id . Check if the plugin is available in the plugins subdirectory of the Kettle distribution.
When I am trying to run a transformation from Java code, I am getting such an error. Thanks beforehand!
Please post a complete stack trace.
Say "first_transformation.ktr" has many parameters. Then how will I pass parameter/value pairs along with the transformation call?
-- Chandrajit Samanta
chandrajit.samanta@gmail.com
You can call trans.setParameterValue(key, value)
and then trans.activateParameters();
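Putting the two calls together: a hedged sketch rather than code from the post, assuming the transformation declares a named parameter (START_DATE here is a hypothetical name your .ktr would need to define). It needs the same Kettle jars on the classpath as the main example.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class ParamTest {
    public static void main( String[] args ) throws Exception {
        KettleEnvironment.init();
        TransMeta metaData = new TransMeta( "first_transformation.ktr" );
        Trans trans = new Trans( metaData );

        // Set each declared parameter before execution...
        trans.setParameterValue( "START_DATE", "2011-08-31" ); // hypothetical parameter name
        // ...then activate them so the values become visible to the steps.
        trans.activateParameters();

        trans.execute( null );
        trans.waitUntilFinished();
    }
}
```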
hi Ameeth, I need to talk to you about Pentaho. I stay in Pune; 9158552080 is my contact number. Contact me as soon as possible; I need your help with Pentaho since I am working on that tool.
ReplyDeleteHi, I am trying to learn on running Pentaho with a Java application. I hope you could assist me. Where can I get the packages for:
ReplyDeleteorg.pentaho.di.core.KettleEnvironment;
org.pentaho.di.core.exception.KettleException;
org.pentaho.di.trans.Trans;
org.pentaho.di.trans.TransMeta;
? Error message upon trying to run the program: Test.java:1: error: package org.pentaho.di.core does not exist
Thank you for this blog.
Hi,
import org.pentaho.di.core.KettleEnvironment;
For the above import, which jar file is required? I didn't find any API.
I am using Java code in Eclipse to run the .ktr. Please do help if you have any idea.
Thanks
Tabrez
If you are familiar with the Maven build, then the post includes the content of the pom.xml file, which has all the required dependencies.
I am not aware of Maven, only heard the name. So what do I need to do? Is a plugin enough, or do I need to fully install the Maven tool?
Thanks
Tabrez
I did a quick dependency check; below is the list of libraries, with their versions, that will be required for it to work correctly:
log4j:log4j:jar:1.2.16
pentaho.kettle:kettle-core:jar:4.0.1-GA
pentaho.kettle:kettle-db:jar:4.0.1-GA
commons-vfs:commons-vfs:jar:1.0
commons-logging:commons-logging:jar:1.0.4
pentaho.kettle:kettle-engine:jar:4.0.1-GA
pentaho.kettle:kettle-ui-swt:jar:4.0.1-GA
pentaho-library:libformula:jar:1.1.7
pentaho-library:libbase:jar:1.1.6
org.codehaus.janino:janino:jar:2.5.16
rhino:js:jar:1.7R2
javax.mail:mail:jar:1.4.1
javax.activation:activation:jar:1.1
Either get some understanding of Maven and use it, OR download these libraries with the given versions.
Hope this helps.
Hi Ameeth, I would like to thank you for the great post. I would like to generate a report by passing an HDFS path programmatically using the Kettle API. Is it possible?
Thanks in advance
Manasa
Check the latest version of Pentaho; they support HDFS integration:
http://wiki.pentaho.com/display/BAD/Hadoop
Hi Ameeth,
I have created a Maven Java project and added the Kettle dependencies.
But whenever I execute the .ktr file, I get the following error:
Unable to load class for step/plugin with id [ConcatFields]. Check if the plugin is available in the plugins subdirectory of the Kettle distribution.
So please suggest how to fix it.
Thanks in advance
Pratik
The sample transformation code doesn't give any errors even when the transformation has wrong DB credentials. Please help?
ReplyDeleteHi Amith
ReplyDeleteCan you please let me know how to store .ktr file into MySql db.I have done all the setup with Maven and added all the dependencies. However, I'm not sure how to get the Connection object using Kettle API.Kindly provide me a sample program to get Connection Object and execution of Sql stmts.
Thanks
Sirisha
Hi…!!!
I do not speak much English,
so I'm using GOOGLE TRANSLATOR.
I need your help if you can help me.
I need to pass some parameters to a transformation,
but I have a situation: I want to pass several values belonging to the same variable.
—————————————————————————————————————————————–
TransMeta metaData = new TransMeta("Logic.ktr");
Trans trans = new Trans(metaData);
for (int i = 0; i < num; i++) {
trans.setVariable("DATA", text[i]);
}
trans.execute(null);
trans.prepareExecution(null);
trans.startThreads();
trans.waitUntilFinished();
—————————————————————————————————————————————–
for example, if I send A, B, C, D,
the code only sends D.
I wish to send A, B, C, D.
Do you know what I can do?
Thanks so much for any help
You can create a CSV file for the list of data and send the file name as a parameter. (Each call to trans.setVariable("DATA", ...) overwrites the previous value, which is why only D arrives.) You can then use the CSV file input step to read from the file in your transformation.
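The suggestion above can be sketched as follows: write the values to a CSV file with plain JDK I/O, then hand only the file name to the transformation as a single parameter. The Kettle calls at the end are commented out and hedged; CSV_FILE is a hypothetical parameter name your .ktr would need to declare, with a "CSV file input" step reading the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvParamExample {

    // Write the values to a temporary CSV file and return its path.
    static Path writeCsv(String[] values) throws IOException {
        Path csv = Files.createTempFile("kettle_data", ".csv");
        List<String> lines = new ArrayList<>();
        lines.add("DATA"); // header row, consumed by the CSV file input step
        lines.addAll(Arrays.asList(values));
        Files.write(csv, lines);
        return csv;
    }

    public static void main(String[] args) throws IOException {
        Path csv = writeCsv(new String[] { "A", "B", "C", "D" });
        System.out.println("CSV written to: " + csv);

        // Then pass only the file name to the transformation
        // (hedged sketch; CSV_FILE is a hypothetical parameter name):
        //   trans.setParameterValue("CSV_FILE", csv.toString());
        //   trans.activateParameters();
        //   trans.execute(null);
    }
}
```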
ReplyDeletehi Ameeth, i have implemented same example its running for hello world but when i m using mongo database to process data not able read and write log. i m getting following log:
ReplyDelete2017/12/27 16:46:29 - MongoDB_Run.0 - Released server socket on port 0
2017/12/27 16:46:29 - Write to log 2 2.0 - Released server socket on port 0
2017/12/27 16:46:29 - Read_Project_Run - Step [MongoDB_Run.0] initialized flawlessly.
2017/12/27 16:46:29 - Read_Project_Run - Step [Write to log 2 2.0] initialized flawlessly.
2017/12/27 16:46:29 - MongoDB_Run.0 - Starting to run...
2017/12/27 16:46:29 - Write to log 2 2.0 - Starting to run...
2017/12/27 16:46:29 - MongoDB_Run.0 - Signaling 'output done' to 1 output rowsets.
2017/12/27 16:46:29 - MongoDB_Run.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=0)
2017/12/27 16:46:29 - Read_Project_Run - Transformation has allocated 2 threads and 1 rowsets.
2017/12/27 16:46:29 - Write to log 2 2.0 - Signaling 'output done' to 0 output rowsets.
2017/12/27 16:46:29 - Write to log 2 2.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=0)
While executing the job from Spoon, it is a success:
2017/12/27 14:57:00 - Write to log 2 2.0 - Finished processing (I=0, O=0, R=198, W=198, U=0, E=0)
Hi Ameeth, I have an issue in our program. We are using a generic ktr for our tables (around 30). We recently noticed that, while transforming, Pentaho is rounding some of the values (not all) for data type Number with a format like #.#, e.g. 2.25 -> 2.2, 2.75 -> 2.8. Could you please tell me how to configure this decimal format in the ktr files?
ReplyDeleteSatish I have not tried this but you should be able to configure the formatting or conversion of data type using "Select Values" step
Delete