Monday, 28 November 2016

Teradata Intelligent Memory (TIM)- Running at the speed of Business

Overview of Teradata
Teradata is an RDBMS (Relational Database Management System). The system is based on off-the-shelf (ready-made) symmetric multiprocessing (SMP) technology combined with communication networking that connects SMP systems to form large massively parallel processing (MPP) systems. It is used primarily to manage large data warehousing operations, acting as a single data store that can accept a large number of concurrent requests from multiple client applications.
Teradata's widely cited features include parallelism with load distribution shared among many users, execution of complex queries with up to 256 joins, parallel efficiency, complete scalability, and Teradata Intelligent Memory.
Shared Nothing Architecture: a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. More specifically, none of the nodes share memory or disk storage.

Architectural View of Teradata: How Does Shared Nothing Work?


What is Teradata Intelligent Memory?
Teradata is a “temperature-aware” database technology.
Intelligent Memory is the latest in the list of temperature-aware features in the Teradata Database. It automatically and transparently keeps the “hottest,” most frequently accessed data in memory for the fastest query performance possible, at a fraction of the cost of an in-memory database. Teradata Intelligent Memory combines RAM and disk to deliver high performance on big data without requiring exclusively in-memory operation.
Why and How Is Teradata Memory Called “Intelligent”?
The Teradata Database continuously tracks the “temperature” of all its data. The most frequently accessed information is identified inside the database on a “very hot” data list. Whenever data on that list is fetched from disk for query processing and analysis, Teradata Intelligent Memory automatically places a copy in its extended memory block. When that data is needed by another query, the database automatically looks to Intelligent Memory first. This eliminates the need to access a physical disk, where I/O can be up to 3,000 times slower. Because this temperature-aware feature makes access that much faster, the memory is called “Intelligent.” --Automatic Intelligence
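The promote-on-second-fetch behaviour described above can be sketched in Python. This is a minimal toy model, not Teradata's actual internals: the class name, the threshold, and the dict-as-disk are all assumptions for illustration.

```python
from collections import Counter

class HeatAwareCache:
    """Toy sketch of temperature-aware caching: blocks whose access
    count crosses a threshold ("very hot") get a copy kept in memory."""

    def __init__(self, hot_threshold=2):
        self.access_counts = Counter()   # per-block "temperature"
        self.memory = {}                 # extended memory block (hot copies)
        self.hot_threshold = hot_threshold

    def read(self, block_id, disk):
        self.access_counts[block_id] += 1
        if block_id in self.memory:      # memory hit: no disk I/O needed
            return self.memory[block_id]
        data = disk[block_id]            # slow path: physical disk read
        if self.access_counts[block_id] >= self.hot_threshold:
            self.memory[block_id] = data # promote hot block into memory
        return data

disk = {"b1": "row data 1", "b2": "row data 2"}
cache = HeatAwareCache(hot_threshold=2)
cache.read("b1", disk)   # first read: served from disk
cache.read("b1", disk)   # second read: block promoted, now in memory
```

After the second read, `b1` sits in the in-memory copy while the never-requested `b2` stays only on disk, mirroring the idea that only demonstrably hot data earns a place in memory.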
Advantages:
I/O for fetching data from disk and CPU processing cycles are the two major system resources whose constraints generally limit performance. Keeping the most frequently used data in memory eliminates much of that I/O, so performance improves. --Relief From I/O Constraints
The Teradata Database file system (TDFS) knows what data is available in memory and automatically uses that copy, just as it would use data from cache, instead of going to disk. --Make the Most of Memory
Teradata Intelligent Memory exploits this fact in an innovative way to achieve high database performance without the cost of buying enough memory to hold the entire database. By keeping a replica of the most-used data in memory, physical disk (pdisk) I/O to solid-state drives (SSDs) and hard disk drives (HDDs) can be reduced dramatically, helping the system run at the speed of business.
Ensuring that “hot” data is available for fast processing enhances query and system performance, which in turn gives business leaders more timely insights to improve decision making. --Rapid Access Equals Better Business Decisions
Today we can put a terabyte, a petabyte, or more of memory onto a server, if we are willing to pay the price. For big data analytics, this means we would need tens or even hundreds of servers to load anything that qualifies as big data, along with software that could integrate those resources well, assuming we wanted to hold all that data in memory.
That is why many companies see in-memory technology as something deployed to pin OLTP databases in memory and run business intelligence queries off the same database. Mirror the servers involved and we get a huge rise in speed. That may really be the best use of in-memory, or say intelligent memory, technology.
And So?
The “data analyst transaction” is not a simple one. Such transactions vary significantly according to the goal and the behaviour of the data being analyzed, and they cannot simply be modeled the way OLTP transactions can. But we know a couple of things for sure. First, the transaction will go much faster if the most frequently accessed data is held in memory and only has to be read from disk once. Second, it will go faster if we employ as much parallelism as possible, and in Teradata we have both parallelism and in-memory transactions.
And this means that in-memory technology and big data, like it or not, will play nicely together.
How Does Cache Benefit Teradata Intelligent Memory?
As with most computer caching techniques, data is kept there for short periods, seconds or at most minutes. Teradata employs a caching approach as well, which it calls the FSG (File Segment) cache. Teradata is smart enough to ensure that no data kept in the FSG cache is also placed in Intelligent Memory, and vice versa.
TIM complements the cache by emphasizing long-run data usage. Its extended memory blocks work alongside the existing File Segment cache, capturing heat-intensity-based data collections that serve queries over a growing time period and make their performance more stable. --Cache Partner



How Does Teradata Categorize Multi-Temperature Data?
In Teradata, the frequency at which data is accessed for read/write operations is often described as its “heat intensity” or “temperature.” Analyzing and managing data by its temperature can open up opportunities to provide value across the entire enterprise. Hence, data is categorized by access frequency (temperature) as follows:
Data Temp | Definition of Usage | Business Example
White Hot | Continuous access, often with expected spikes of repeated access | Live campaign; data repeatedly queried and accessed
Hot | Frequently accessed data | Early in a live campaign, to analyze sales figures or see trends in reports
Warm | Data accessed less frequently and usually with less urgency | In a month or two, the campaign is changed or closed
Cold | Historical information, usually seen in data mining and analysis activities | Campaign ended, reports completed; data now accessed for yearly reviews
Dormant | Data not touched for a considerable period of time, or at all | Archived data from a long-finished campaign
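The temperature bands above can be sketched as a simple classifier over access frequency and recency. The thresholds below are invented for illustration; Teradata does not publish its actual cutoffs.

```python
def classify_temperature(accesses_per_day, days_since_last_access):
    """Illustrative temperature banding by access frequency and recency.
    Thresholds are assumptions for the sketch, not Teradata's real values."""
    if days_since_last_access > 365:
        return "Dormant"          # untouched for a very long time
    if accesses_per_day >= 100:
        return "White Hot"        # continuous, spiky repeated access
    if accesses_per_day >= 10:
        return "Hot"              # frequently accessed
    if accesses_per_day >= 1:
        return "Warm"             # less frequent, less urgent access
    return "Cold"                 # historical, mining/review access

print(classify_temperature(500, 0))   # live campaign data
print(classify_temperature(0.1, 30))  # yearly-review data
```

In the real database this judgment is continuous and automatic; the point of the sketch is only that each block's band follows directly from how often and how recently it is touched.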




What Happens to the Rest of Memory with Teradata Intelligent Memory?
After serving the “hot” data, Teradata leaves the “cold” data behind. That cold data is handled through the Teradata Virtual Storage concept, which compresses the data left behind.
Then comes the Compress on Cold feature:
The Teradata Database has improved the use of hybrid storage to achieve more intelligent multi-temperature data management, where “hot” data is the most frequently used and “cold” data is the least used or dormant. It is the industry's only intelligent virtual storage solution that automatically migrates and compresses, or decompresses, data between drive types to achieve optimum performance and storage utilization. This keeps data from turning into dead data. --Block Level Compression (BLC)
The Teradata Database intelligently manages data to maximize performance while optimizing the return on system resources. It automatically compresses the coldest, least frequently used data on the system to save disk storage space, freeing the remaining space for other uses. Keeping data in its natural, decompressed format while it is frequently used maximizes performance by avoiding repeated decompression processing. Automatically compressing data once it is less frequently used enables storing the most data at the most effective cost. No DBA intervention is required with Teradata's automated, self-managing design.
With big data analytics becoming extensive, there is a critical need for a database to be smart enough to dynamically judge how “hot” or “cold” data is for the entire enterprise. The hotter, more popular data needs to be located on the fastest storage devices, while less active, cooler data can be pushed onto slower media. Cold data is compressed up to five times to minimize storage cost. With the Teradata Database, Teradata Virtual Storage increases intelligent management of data by automatically decompressing and relocating once-cold data onto faster storage as demand for it heats up. For example, the Teradata Database will recognize when monthly year-over-year data should be cycled in or out of archival media as needed, without laborious database administration intervention.
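A compress-on-cold policy can be sketched as a toy model using `zlib`. This is only an analogy for the behaviour described above; the real feature is block-level compression inside the Teradata file system, and the class and method names here are invented.

```python
import zlib

class ColdCompressor:
    """Toy compress-on-cold store: cold blocks are held compressed,
    and are transparently decompressed (re-warmed) when read again."""

    def __init__(self):
        self.blocks = {}   # block_id -> (payload, is_compressed)

    def store(self, block_id, data):
        self.blocks[block_id] = (data, False)      # new data starts warm

    def mark_cold(self, block_id):
        data, compressed = self.blocks[block_id]
        if not compressed:                         # compress to save space
            self.blocks[block_id] = (zlib.compress(data), True)

    def read(self, block_id):
        data, compressed = self.blocks[block_id]
        if compressed:                             # demand returned: re-warm
            data = zlib.decompress(data)
            self.blocks[block_id] = (data, False)
        return data

store = ColdCompressor()
store.store("q4_campaign", b"campaign rows " * 100)
store.mark_cold("q4_campaign")      # yearly data goes cold, gets compressed
store.read("q4_campaign")           # transparent decompress on access
```

Note the key property the sketch shares with the described feature: the caller never asks whether a block is compressed; the store decides based on temperature and reads stay transparent.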
Benefits of TIM (Teradata Intelligent Memory)
TIM uses highly developed algorithms that age, track, and rank data on their own to ensure effective data management and support for user queries. Inside TIM, data can be stored and compressed in both columns and rows, which increases the amount of data available in the memory space. TIM places only the very hottest data in the new extended memory space. Organizations make full use of it by accessing the most current data rapidly from system memory to satisfy the vast majority of their queries, which also achieves a better financial return on investment (ROI).
Teradata storage can be expanded at a lower price than was possible without virtual storage. Storage can now be added to a clique without adding both nodes and storage. The ability to mix drive sizes in each clique opens up configurations where large volumes of archived or other cold data can be kept within the Enterprise Data Warehouse. This raises the utilization of the EDW for deep historical data analysis, enhancing the ROI of the EDW.
The configuration flexibility of Virtual Storage now allows storage in a clique to be expanded in a broad range of size increments, since the restrictions on drives per AMP are eliminated. Expansion is achieved by adding the desired number of drives and performing a restart. The system Reconfig process is not needed, since AMP count assignments typically do not change. Because the Virtual Storage (in-memory based) approach does not usually require added AMPs, only a system restart of a few minutes is required after new storage is added to a system based on TVS.
When is TIM not appropriate?
This initial Intelligent Memory release does not suit every Teradata solution and is not appropriate for some systems. The many Enterprise Data Warehouse systems focused mainly or solely on operational or Active EDW environments would contain mostly hot data. As that data ages and cools, it is moved to archive since it no longer offers apparent business value. There is no benefit from Intelligent Memory in this case: if the cold data is simply no longer used, there is nothing for it to accelerate, as it works best for “hot” data, not “dormant” data.
The fat memory is available on the 670 Data Mart Appliance, the 2700 Data Warehouse Appliance, and on the 6700 Active Enterprise Data Warehouse, and will soon be available on the flash-accelerated Extreme Data Appliance.

The future possibilities of TIM:
Although at this time it is not possible to say when or how TIM will be enhanced beyond the current initial release, there are many possibilities for the future. Among them is greater sharing of data across more disks.
Teradata’s latest in-memory architecture is integrated with its management of data heat intensity. This is significant because the hottest data locates automatically to the in-memory layer, Teradata Intelligent Memory; the next hottest data moves itself to solid-state disk; and so on. Teradata also provides the columnar storage and data compression that amplify the value of data in memory. The customer sees increased performance without having to decide which data is placed in memory.
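The automatic tiering just described can be sketched as a hottest-first assignment over a ladder of storage tiers. The tier names, capacities, and spill rule below are assumptions for illustration, not Teradata's placement algorithm.

```python
# Illustrative storage tiers, fastest to slowest (names assumed).
TIERS = ["intelligent_memory", "ssd", "fast_hdd", "capacity_hdd"]

def place(blocks_by_heat, capacities):
    """Assign blocks (sorted hottest-first) to tiers top-down, spilling
    to the next slower tier once a tier's capacity is exhausted."""
    placement, tier_idx, used = {}, 0, 0
    for block in blocks_by_heat:
        while tier_idx < len(TIERS) - 1 and used >= capacities[TIERS[tier_idx]]:
            tier_idx, used = tier_idx + 1, 0   # spill to next slower tier
        placement[block] = TIERS[tier_idx]
        used += 1
    return placement

blocks = ["b1", "b2", "b3", "b4", "b5"]        # hottest first
caps = {"intelligent_memory": 1, "ssd": 2, "fast_hdd": 10, "capacity_hdd": 10}
print(place(blocks, caps))
```

With these capacities, `b1` lands in the memory layer, the next two on SSD, and the rest on HDD, which is the shape of the hierarchy the text describes.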
Summary
Teradata’s temperature-aware feature accelerates warehouse query performance and increases the value of system space by ensuring that the most frequently used data is kept in memory. It is Teradata’s new approach to multi-temperature data management. Intelligent Memory enables flexible configurations of mixed drive capacities within one system and clique. It also gives cost-effective, simple expansion of storage in a system without having to add further Teradata nodes. It allows the use of mixed storage on a system: disks of different sizes and types can be mixed in a disk array, and different disk array models can be combined in a clique. This allows the system to reuse old disks in a new configuration, or to mix and match larger, lower-performance disks with smaller, faster storage.
The brains behind the new Intelligent Memory feature are fine adjustments to the underlying Teradata database so that it stores the very hot data in an extended memory space alongside the FSG cache. With main memory access being on the order of 3,000 times faster than going out to the disk drives on a server node with the Teradata parallel database, it makes sense to spend a bit more on main memory.
The Intelligent Memory process automatically and transparently places data on storage according to its thermal characteristics: hot, warm, or cold. It makes relevant, effective use of large-capacity drives for cold or dormant data storage. Data placement is self-optimizing and transparent: the most frequently accessed (“hot”) data moves to faster storage, while rarely accessed (“cold”) data moves to slower storage units or shared disks.

With Intelligent Memory, Teradata sets out to provide the highest-performing integrated data warehouse as part of the UDA (Unified Data Architecture). Intelligent Memory uses main memory to provide the highest query performance without the cost of in-memory databases. It provides the best of both worlds: it keeps the frequently used, current, or “hot” data in memory to achieve high query and system performance, without restricting the available data to what will fit in memory. With Teradata Intelligent Memory, the Teradata Database continues to make the full scope of data available by keeping cooler data economically stored on disk. In addition, the Teradata Database delivers many other features and capabilities that provide high performance, including the industry’s best optimizer, efficient indexes, and several intelligent scan techniques that reduce the amount of data read during query processing.
