Thursday, 26 January 2017

Teradata NPARC (Named Pipe ARChive)

Brief Introduction to Teradata:
Teradata is a fully scalable relational database management system produced by Teradata Corp. It is widely used to manage large data warehousing operations. 

The Teradata database system is based on off-the-shelf symmetric multiprocessing technology combined with communication networking, connecting symmetric multiprocessing systems to form large parallel processing systems.

What happens when business data grows?

Sometimes, working in IT we need to upgrade entire configurations of our systems or servers to get in sync with business requirements. As a growing IT, data is also growing and for that we may need to get rid of old configuration systems or servers and for that we need to migrate the existing business data.

Moving/Migrating entire datasets or say full system from one Teradata system to another Teradata system or any kind of data system like Oracle or SQL  is a bothersome task even when the data is already present on the disk or tapes and we are provided with all the load utilities(Teradata fast export , Teradata MultiLoad, Teradata Parrallel Transporter or by Basic Teradata queries (i.e fexp, mload, tpt, bteq etc).


We oftenly meets up to the scenario when there is no more disk space left to hold the entire dataset, we contact our respective IT team to add some more disk space onto the server, IT team denies, we still want to move data. Challenging right!!! Is there any way to accomplish this? I believe yes.

Teradata Archive/Recovery Utility i.e Teradata ARC can be a solution here. One of the Teradata Archive/Recovey process named NPARC (Named Pipe ARC) serves this purpose. It is a way to get data from one platform to another platform or migarting data on the same machine without concerning about the disk space.


What is NPARC and How it works?

NPARC (Named Pipe ARChive) is an NCR utility that transfers data between two Teradata systems.  It is basically ARC using named pipes instead of tape, making data transfers much faster.  The "donor" system runs an ARC DUMP into the pipes and the "receiver" system runs an ARC COPY with the pipes as its input.  The bynets of both systems are connected via LAN cable through your node's LAN ports.

I’ve seen some cases while moving data from one platform to another, Here is few of it.

Scenario one: When we move the data within the system. Below NPARC steps helps in it

Essentials:
·                     Two different users with archive(dump) and restore access rights on the databases TeradataSource and TeradadatTarget.
·                     The user used for restore/copy should have additional access right which is create table access right.
·                     Two databases(source and target). Source database is where the data is being archived and the target where the data is being restored/copied.
·                     A named pipe
·                     Dump/Archive and Restore/Copy scripts
·                     Teradata Server (any version) up and running
·                     Teradata arcmain installed on a Linux machine that can connect to the Teradata Server. Also,Windows machine can be used but requires few more softwares such as Teradata npaxsmod used to create a named pipe on Windows. Using Linux for simplicity.
Firstly, we create the source(TeradataSource) database and target(TeradataTarget) database:

create database TeradataSource as perm=10e7;
grant dump,restore on TeradataSource to TeradataSource;

create database TeradataSource as perm=10e7;

grant dump,restore on TeradataSource to TeradataSource;

Now, add some data to the source(TeradataSource) database:

A) create a table named t1 in TeradataSource database and collect the statistics as well.

create table TeradataSource.t1(i int, ch varchar(100));
insert into TeradataSource.t1(1,'aaaa');
insert into TeradataSource.t1(2,'bbbb');
insert into TeradataSource.t1(3,'cccc');
insert into TeradataSource.t1(4,'dddd');
insert into TeradataSource.t1(5,'eeee');

collect stats on TeradataSource.t1 column i;

collect stats on t1 column ch;

B) create one more table t2 same as table t1.

create table TeradataSource.t2 as TeradataSource.t1 with data and stats;

Now, create arc login users:


create user arch1 as perm=0 and password=arch1;
grant dump, restore on TeradataSource to arch1;

create user arch2 as perm=0 and password=arch2;

grant dump, restore on TeradataTarget to arch2;
grant create table on TeradataTarget to arch2;


Named Pipe creation: 

Using mknod command to create the named pipe, you can also use mkfifo command as well.

mknod NPARC p

Dump/Archive script(nparc.arc):

logon arch1,arch1;

archive data tables

  (TeradataSource) all,
release lock,
file=NPARC;

logoff;

Restore/Copy script(nparc.cpy):
logon arch2,arch2;

copy data tables
  (TeradataTarget) (from(TeradataSource)),
release lock,
file=NPARC;

logoff;

Note: Please make sure you keep all the scripts and the Named Pipe in the same directory. If not, you need to use different ways to make things work.



Finally, run the arcmain jobs.

Note: please make sure you run consumer process first, it will block the producer process.


The consumer job:

arcmain < nparc.cpy




The producer job:


arcmain < nparc.arc


As you can see, the data has been migrated to the target database with ease and without much of the disk usage. This can be stretched to move data across Teradata systems.


Technical team:
Pankaj Chahar         

References:

http://www.teradata.com
https://en.wikipedia.org

http://developer.teradata.com/blog/jim/2011/10/using-a-named-pipe-archive-to-copy-a-database-from-one-system-to-another

No comments:

Post a Comment