Teradata Intelligent Memory (TIM) - Running at the Speed of Business
Overview of Teradata
Teradata is an RDBMS (Relational Database Management System). The system is built on off-the-shelf symmetric multiprocessing (SMP) technology combined with communication networking, connecting SMP systems to form large massively parallel processing (MPP) systems. It is primarily used to manage large data warehousing operations. Teradata acts as a single data store that can accept a large number of concurrent requests from multiple client applications.
Parallelism with load distribution shared among several users, execution of complex queries with up to 256 joins, parallel efficiency, complete scalability, and Teradata Intelligent Memory are some of Teradata's most notable features.
Shared Nothing Architecture: a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. More specifically, none of the nodes share memory or disk storage.
Architectural View of Teradata: How Does Shared Nothing Work?
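To make the shared-nothing idea concrete, here is a minimal sketch in plain Python rather than anything Teradata-specific: rows are hash-distributed to independent nodes that each own their own memory and storage. The Node class and node_for function are illustrative names, not Teradata APIs.

```python
import hashlib

class Node:
    """One shared-nothing unit: private memory and private storage, shared with nobody."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.memory = {}     # this node's private in-memory structures
        self.storage = {}    # this node's private disk (simulated)

    def insert(self, key, row):
        self.storage[key] = row

    def lookup(self, key):
        return self.storage.get(key)

def node_for(key, nodes):
    """Hash the distribution key so every row belongs to exactly one node."""
    h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

# Four independent SMP units networked into a tiny MPP system.
nodes = [Node(i) for i in range(4)]
for cust_id in range(20):
    node_for(cust_id, nodes).insert(cust_id, {"cust_id": cust_id})

# A lookup is routed to the single owning node; no other node's memory or disk is touched.
print(node_for(7, nodes).lookup(7))
```

Because each node only ever touches its own data, adding nodes adds capacity and throughput without creating a shared bottleneck, which is the point of the shared-nothing design.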
What Is Teradata Intelligent Memory?
Teradata is a "temperature aware" database technology.
Intelligent Memory is the latest in the list of temperature-aware features in the Teradata Database. It automatically and transparently keeps the "hottest", most frequently accessed data in memory for the fastest query performance possible, at a fraction of the cost of an in-memory database. Teradata Intelligent Memory combines RAM and disk for high-performance big data processing without requiring exclusively in-memory operation.
Why and How Is Teradata Memory Called "Intelligent"?
The Teradata Database continuously tracks the "temperature" of all its data. The most frequently accessed information is identified inside the database on a "very hot" data list. Teradata Intelligent Memory automatically copies information into its extended memory block whenever data on the very hot list is fetched from disk for query processing and analysis. When that data is needed for another query, the database automatically looks to Teradata Intelligent Memory first. This eliminates the need to access a physical disk, where I/O can be up to 3,000 times slower. Because this temperature-aware behaviour happens without any tuning, it is called Teradata "Intelligent" Memory. -- Automatic Intelligence
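A minimal sketch of that promotion behaviour, with invented names and a simple access counter standing in for Teradata's actual temperature tracking:

```python
from collections import Counter

EXTENDED_MEMORY_SLOTS = 3   # assumed size of the extended memory area
access_counts = Counter()   # stand-in for per-block temperature tracking
extended_memory = {}        # block_id -> in-memory copy of the block

def read_from_disk(block_id):
    # Placeholder for a slow physical disk read.
    return f"contents of block {block_id}"

def very_hot_list():
    # The most frequently accessed blocks count as "very hot".
    return {b for b, _ in access_counts.most_common(EXTENDED_MEMORY_SLOTS)}

def read_block(block_id):
    access_counts[block_id] += 1
    if block_id in extended_memory:       # served from memory, no disk I/O needed
        return extended_memory[block_id]
    data = read_from_disk(block_id)       # cold path: fetch from disk
    if block_id in very_hot_list():       # promote very hot data after the fetch
        extended_memory[block_id] = data
        for b in list(extended_memory):   # demote whatever fell off the very hot list
            if b not in very_hot_list():
                del extended_memory[b]
    return data

for _ in range(5):
    read_block(1)                         # repeated reads make block 1 very hot
print(1 in extended_memory)               # True: later reads skip the disk
```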
Advantages:
Input/output for fetching data from disk and CPU processing cycles are the two major system resources whose constraints generally impact system performance. By keeping the most frequently used data in memory, and thereby eliminating I/O that would otherwise be needed, performance is enhanced. -- Relief from I/O Constraints
The Teradata Database file system (TDFS) knows what data is available in memory and automatically uses that copy, just as it would use data out of cache, instead of going to disk. -- Make the Most of Memory
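As an illustration of that read path (a sketch with assumed structures, not the real TDFS interface), a read is satisfied from whichever in-memory copy exists before any physical I/O is issued:

```python
fsg_cache = {}           # stand-in for the short-lived FSG cache
intelligent_memory = {}  # stand-in for the longer-lived Intelligent Memory copies

def tdfs_read(block_id, read_disk):
    """Return a block, preferring any in-memory copy over a physical read."""
    if block_id in fsg_cache:
        return fsg_cache[block_id]
    if block_id in intelligent_memory:
        return intelligent_memory[block_id]
    data = read_disk(block_id)        # only now pay for physical I/O
    fsg_cache[block_id] = data
    return data

disk_reads = []
fetch = lambda b: disk_reads.append(b) or f"block {b}"
tdfs_read(42, fetch)
tdfs_read(42, fetch)                  # second read is served from memory
print(len(disk_reads))                # 1 physical read in total
```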
Teradata Intelligent Memory exploits that behaviour in an innovative manner to achieve high database performance without the cost of buying enough memory to store the entire database. By keeping a replica of the most used data in memory, physical disk (pdisk) I/O to solid-state drives (SSD) and hard disk drives (HDD) can be reduced dramatically, which helps in running at the speed of business.
Making sure that "hot" data can be accessed for fast processing enhances query as well as system performance, which in turn provides business leaders with more timely insights to improve decision making. -- Rapid Access Equals Better Business Decisions
In today's world we can get a terabyte or more of memory onto a server, if we are willing to pay the price. For big data analytics, this means we would need tens or even hundreds of servers to load anything that qualifies as big data, and we would need software that could integrate those resources well, assuming we wanted to hold all of that data in memory.
That is why most companies think of in-memory technology as something deployed to pin OLTP databases in memory and run business intelligence queries off the same database. We can mirror the servers involved and get a huge rise in speed. It may really be the best use of in-memory, or rather intelligent memory, technology.
And So?
Data analyst transactions are not simple. They vary significantly according to the goal and the behaviour of the data being analyzed, and we cannot model them the way we model OLTP transactions. But we know a couple of things for sure. First, a transaction will run much faster if the most frequently accessed data is held in memory and only has to be read from disk once. Second, it will run faster if we employ as much parallelism as possible, and in Teradata we have both parallelism and in-memory processing. This means that in-memory technology and big data will play nicely together, whether we like it or not.
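To illustrate the parallelism point, here is a small sketch in plain Python, not Teradata: in-memory partitions are scanned in parallel so the work is spread across workers, much as an MPP system spreads it across its units of parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

# Assume the frequently accessed rows are already in memory, split into per-worker partitions.
partitions = [list(range(i, 1_000_000, 4)) for i in range(4)]

def scan_partition(rows):
    """Each worker scans only its own partition, shared-nothing style."""
    return sum(1 for r in rows if r % 7 == 0)

with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    partial_counts = list(pool.map(scan_partition, partitions))

# The answer is just the combination of the independent per-partition results.
print(sum(partial_counts))
```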
How Does Cache Benefit Teradata Intelligent Memory?
As with most computer caching techniques, data is kept in cache for short periods, seconds or at most minutes. Teradata employs a caching approach as well, which it calls the FSG (File Segment) cache. Teradata is smart enough to make sure that no data kept in the FSG cache is moved into Intelligent Memory, and vice versa.
TIM works alongside the cache but emphasizes long-run data usage. The extended memory blocks complement the existing File Segment cache by holding heat-ranked data collections whose usage stabilizes over longer periods of time. -- Cache Partner
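A tiny sketch of that division of labour, with invented structures: short-lived reads sit in the FSG cache, long-run hot blocks sit in the extended memory, and a block is never held in both at once.

```python
fsg_cache = {}        # short term: recently read blocks
extended_memory = {}  # long term: blocks that stay hot over time

def place_in_extended_memory(block_id, data):
    """Promote a long-run hot block, keeping the two areas disjoint."""
    fsg_cache.pop(block_id, None)        # never hold the same block in both places
    extended_memory[block_id] = data

def place_in_fsg_cache(block_id, data):
    if block_id not in extended_memory:  # already in extended memory: no duplicate copy
        fsg_cache[block_id] = data

place_in_fsg_cache("b1", "...")
place_in_extended_memory("b1", "...")
print("b1" in fsg_cache, "b1" in extended_memory)   # False True
```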
How Does Teradata Categorize Multi-Temperature Data?
In Teradata, the frequency at which data is accessed for read/write operations is often described as its "heat intensity" or "temperature". Analyzing and managing data by its temperature can open up opportunities to provide value across the entire enterprise. Data is therefore categorized by access frequency (temperature) as follows:
Data Temp | Definition of Usage | Business Examples
White Hot | Continuous, often expected spikes of repeated data access | Live campaign, data repeatedly queried and accessed
Hot | Frequently accessed data | Initially for a live campaign, to analyze sales figures or see trends using reports
Warm | Data accessed less frequently and usually with less urgency | In a month or two, the campaign is changed or closed
Cold | Historical data, usually seen in data mining and analysis activities | Campaign ended, reports completed, data now accessed only for yearly reviews
Dormant | Data that has not been touched for a considerable period of time, or not at all | Data archived, long-finished campaign
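A rough sketch of how such a categorization might be computed from access statistics; the thresholds below are invented for illustration and are not Teradata's actual values.

```python
def classify_temperature(accesses_last_day, days_since_last_access):
    """Map simple access statistics to the temperature bands in the table above.
    All thresholds are illustrative only."""
    if accesses_last_day >= 1000:
        return "White Hot"     # continuous, repeated access, e.g. a live campaign
    if accesses_last_day >= 50:
        return "Hot"
    if days_since_last_access <= 60:
        return "Warm"
    if days_since_last_access <= 365:
        return "Cold"
    return "Dormant"

print(classify_temperature(accesses_last_day=5000, days_since_last_access=0))  # White Hot
print(classify_temperature(accesses_last_day=0, days_since_last_access=400))   # Dormant
```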
What Happens to the Rest of the Data After Teradata Intelligent Memory?
After the "hot" data has been placed in memory, Teradata leaves the "cold" data behind on disk. That cold data is handled using the Teradata Virtual Storage concept, which compresses the data left behind.
Then comes the Compress on Cold feature:
The Teradata Database has improved the use of hybrid storage to achieve more intelligent multi-temperature data management, where "hot" data is the most frequently used and "cold" data is the least used or dormant. It is the industry's only intelligent virtual storage solution that automatically migrates and compresses, or decompresses, data between drive types to achieve optimum performance and storage utilization. This keeps data from turning into dead data. -- Block Level Compression (BLC)
The Teradata Database intelligently manages data to maximize performance while optimizing the return on system resources. It automatically compresses the coldest, least frequently used data on the system to save disk storage space and free that space for other uses. Keeping data in its natural, decompressed format when it is frequently used maximizes performance by avoiding repeated decompression processing. Automatically compressing the data when it is less frequently used enables the storage of the most data at the most effective cost. No DBA intervention is required with Teradata's automated, self-managing design.
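A minimal sketch of the compress-on-cold idea using zlib; this is illustrative only and is not Teradata's Block Level Compression implementation.

```python
import zlib

blocks = {}   # block_id -> (is_compressed, bytes)

def store(block_id, data: bytes):
    blocks[block_id] = (False, data)      # frequently used data stays decompressed

def cool_down(block_id):
    """Compress a block once it has become cold, to reclaim disk space."""
    compressed, data = blocks[block_id]
    if not compressed:
        blocks[block_id] = (True, zlib.compress(data))

def read(block_id) -> bytes:
    compressed, data = blocks[block_id]
    return zlib.decompress(data) if compressed else data

store("sales_2010", b"old campaign rows " * 1000)
cool_down("sales_2010")                            # cold data: stored compressed
assert read("sales_2010") == b"old campaign rows " * 1000   # transparent on read
```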
With big data analytics becoming widespread, there is a critical need for a database to be smart enough to dynamically judge how "hot" or "cold" data is across the entire enterprise. The hotter, more popular data needs to be located on the fastest storage devices, while less active, cooler data can be pushed onto slower media. Cold data is compressed by up to five times to minimize storage cost. With the Teradata Database, Teradata Virtual Storage extends this intelligent management of data by automatically decompressing and relocating once-cold data onto faster storage as demand for the data heats up. For example, the Teradata Database will recognize when monthly year-over-year data should be cycled in or out of archival media as needed, without laborious database administration intervention.
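The relocation side can be sketched in the same spirit, again with invented names and tier numbers: when demand for once-cold data rises, it is decompressed and moved back to a faster tier.

```python
import zlib

# Tier 0 = SSD (fastest), tier 1 = HDD, tier 2 = archive (slowest); invented layout.
placement = {"ytd_sales": {"tier": 2, "compressed": True,
                           "data": zlib.compress(b"year-over-year rows")}}

def on_temperature_change(block_id, new_temperature):
    """Move a block to a tier matching its new temperature, decompressing if it heats up."""
    block = placement[block_id]
    target_tier = {"hot": 0, "warm": 1, "cold": 2}[new_temperature]
    if new_temperature == "hot" and block["compressed"]:
        block["data"] = zlib.decompress(block["data"])
        block["compressed"] = False
    block["tier"] = target_tier

on_temperature_change("ytd_sales", "hot")   # demand heats up: decompress and promote
print(placement["ytd_sales"]["tier"], placement["ytd_sales"]["compressed"])   # 0 False
```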
Benefits of TIM (Teradata Intelligent Memory)
TIM uses highly developed algorithms that age, track, and rank data to ensure effective data management and support for user queries. Inside TIM, data can be stored and compressed in columns and rows, which increases the amount of data available in the memory space. TIM places only the very hottest data in the new extended memory area. Organizations make full use of it by being able to access their most current data rapidly from system memory to satisfy the vast majority of their queries, which also achieves a better financial return on investment (ROI).
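One generic way to "age, track, and rank" usage is an exponentially decayed score; the sketch below illustrates that idea with an assumed half-life and is not Teradata's actual algorithm.

```python
import time

DECAY_HALF_LIFE = 3600.0   # assumed: an access score halves after an hour of inactivity
scores = {}                # block_id -> (score, time_of_last_update)

def record_access(block_id, now=None):
    """Add one access, decaying whatever score had accumulated before."""
    now = time.time() if now is None else now
    score, last = scores.get(block_id, (0.0, now))
    decayed = score * 0.5 ** ((now - last) / DECAY_HALF_LIFE)
    scores[block_id] = (decayed + 1.0, now)

def rank(now=None):
    """Return block ids ordered from hottest to coldest at time `now`."""
    now = time.time() if now is None else now
    def current_score(item):
        _, (score, last) = item
        return score * 0.5 ** ((now - last) / DECAY_HALF_LIFE)
    return [b for b, _ in sorted(scores.items(), key=current_score, reverse=True)]

for _ in range(20):
    record_access("stale_block", now=0.0)       # heavily used two hours ago
for _ in range(10):
    record_access("recent_block", now=7200.0)   # used half as much, but just now
print(rank(now=7200.0))   # ['recent_block', 'stale_block']: old accesses fade out
```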
Teradata storage can be increased at a lower price than is possible in the absence of virtual storage. Storage can now be added to a clique without adding both nodes and storage. The ability to mix drive sizes in each clique makes it possible to place large volumes of archived or other cold data within the Enterprise Data Warehouse (EDW). This raises the utilization of the EDW by bringing deep history data within reach of analysis, enhancing the ROI of the EDW.
The configuration flexibility of Virtual Storage now allows storage in a clique to be expanded in a broad range of size increments, since the restrictions on drives per AMP are eliminated. Expansion can be achieved by adding the desired number of drives and performing a restart. The system Reconfig process is not needed since AMP assignments are not typically changed. Because the Virtual Storage approach does not usually require added AMPs, only a system restart (which takes just a few minutes) is required after new storage is added to a system based on TVS.
When is TIM not appropriate?
This Intelligent Memory product release does not suit every Teradata solution and is not appropriate for some systems. Enterprise Data Warehouse systems that are focused mainly or solely on operational or Active EDW environments contain mostly hot data. As data ages and cools, it is moved to archive since it no longer offers apparent business value. There is no benefit from Intelligent Memory in this case if the cold data is simply no longer used, since it works best for "hot" data but not "dormant" data.
The "fat memory" configuration is available on the 670 Data Mart Appliance, the 2700 Data Warehouse Appliance, and the 6700 Active Enterprise Data Warehouse, and will soon be available on the flash-accelerated Extreme Data Appliance.
The future possibilities of TIM:
Although at this time it is not possible to say when or how TIM will be enhanced beyond the current initial release, there are definitely many possibilities for the future. Among the possibilities for expanding the capability of TIM is greater sharing of data across more disks.
Teradata's latest in-memory architecture is integrated with its management of data heat intensity. This is very significant, because the hottest data is located automatically in the in-memory layer, Teradata Intelligent Memory; the next hottest data moves itself to solid-state disk; and so on. Teradata also provides the column storage and data compression that amplify the value of data in memory. The customer sees increased performance without having to make decisions about which data is placed in memory.
Summary
Teradata's temperature-aware feature accelerates warehouse query performance and increases the value of system space by ensuring that the most frequently used data is kept in memory. It is Teradata's new approach to multi-temperature data management. Intelligent Memory enables flexible configurations of mixed drive capacities within one system and clique. It also gives cost-effective and simple expansion of storage in a system without having to add further Teradata nodes. It allows the use of mixed storage on a system. Specifically, disks of different sizes and types can be mixed in a disk array, and different disk array models can be mixed in a clique. This allows the system to reuse old disks in a new configuration, or to mix and match larger, lower-performance disks with smaller, faster storage.
The idea behind the new Intelligent Memory feature is to make fine adjustments to the underlying Teradata Database so it keeps the very hot data in an extended area of memory alongside the FSG cache. With main memory access being on the order of 3,000 times faster than going out to the disk drives on a server node of the Teradata parallel database, it makes sense to spend a bit more on main memory.
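As a back-of-the-envelope illustration of that trade-off, the latency figures and hit rates below are assumed numbers, using only the roughly 3,000-times ratio mentioned above.

```python
# Assumed figures for illustration: memory ~100 ns per access, disk ~3,000x slower.
memory_ns = 100
disk_ns = memory_ns * 3000

def effective_access_ns(memory_hit_rate):
    """Average cost per access when a fraction of reads is served from memory."""
    return memory_hit_rate * memory_ns + (1 - memory_hit_rate) * disk_ns

for hit_rate in (0.0, 0.8, 0.95, 0.99):
    print(f"hit rate {hit_rate:.0%}: {effective_access_ns(hit_rate):,.0f} ns per access")
# Even a 95% in-memory hit rate cuts the average access cost by roughly 20x.
```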
The Intelligent Memory process automatically and transparently places data on storage by considering its thermal characteristics: hot, warm, or cold. It makes good use of large-capacity drives for cold or dormant data storage. Data placement is optimized automatically and transparently by moving the most frequently accessed data ('hot data') to faster storage, while moving rarely accessed data ('cold data') to slower storage units or shared disks.
With Intelligent Memory, Teradata aims to provide the highest-performing integrated data warehouse as part of the Unified Data Architecture (UDA). Intelligent Memory uses main memory to provide the highest query performance without the cost of an in-memory database. It provides the best of both worlds: it keeps the frequently used, current or "hot" data in memory to achieve high query and system performance, without the need to restrict the available data to what will fit in the available memory. With Teradata Intelligent Memory, the Teradata Database continues to make the full scope of data available by keeping cooler data economically stored on disk. In addition, the Teradata Database delivers many other features and capabilities that provide high performance, including the industry's best optimizer, efficient indexes, and several intelligent scan techniques that reduce the amount of data that must be read during query processing.