This is the second edition of a user’s guide to the Cray T3E massively parallel supercomputer installed at the Center for Scientific Computing. 11 2 Using the Cray T3E at CSC 13 Logging in. The components of Cray T3E node. The DEC Alpha processor architecture. . The CRAY T3E is a scalable shared-memory multiprocessor based on the DEC Alpha Section 2 provides a brief overview of the system architecture.
|Country:||Saint Kitts and Nevis|
|Published (Last):||15 March 2012|
|PDF File Size:||17.92 Mb|
|ePub File Size:||16.74 Mb|
|Price:||Free* [*Free Regsitration Required]|
The Cray T3E was Cray Research ‘s second-generation massively parallel supercomputer architecture, launched in late November Like the previous Cray T3Dit was a fully distributed memory machine using a 3D torus topology interconnection network.
A processor T3E was the first supercomputer to achieve a performance of arcchitecture than 1 teraflops running a computational science application, in After Cray Research was acquired by Silicon Graphics in Archihecturedevelopment of new Alpha-based systems was stopped.
The MC models were housed in one or more liquid-cooled cabinet separately from the host, there was also a liquid-cooled MCN model which had an alternative interconnect wiremat allowing non-power-of-2 numbers of PEs.
The first T3D delivered was a prototype installed at the Pittsburgh Supercomputing Center in early Septemberthe supercomputer was formally introduced on 27 September Alpha — The Alphaalso known by its code name, EV5, is a microprocessor developed and fabricated by Digital Equipment Corporation that implemented the Alpha instruction set architecture. It was introduced in Januarysucceeding the Alpha A as Digitals flagship microprocessor and it was succeeded by the Alpha in First silicon of the Alpha was produced in Februaryand it was sampled in late and was introduced in January at MHz.
A MHz version was introduced in Marchthe final Alphaa MHz version, was announced on 2 Octoberavailable in sample quantities. The Alpha was replaced by the Alpha A as Digitals flagship microprocessor in when a MHz version became available in volume quantities, Digital used the Alpha operating at various clock frequencies in their AlphaServer servers, AlphaStation workstations. Third parties such as DeskStation also built using the Alpha The Alpha is a superscalar microprocessor capable of issuing a maximum of four instructions per clock cycle to four execution units.
The integer pipeline is seven stages long, and the floating-point pipeline is ten stages long, the implemented a bit virtual address and a bit physical address. It was therefore capable of addressing 8 TB of virtual memory and 1 TB of physical memory, the integer unit consisted h3e two integer pipelines and the integer register file. The multiply pipeline exclusively executes shift, store, and multiply instructions, the add pipeline exclusively executes branch instructions.
Except for branch, conditional move, and multiply instructions, all other instructions begin, branch and conditional move instructions are executed during stage six so they can be issued with a compare instruction whose result they depend on.
The integer register file contained forty bit registers, of archifecture thirty-two are specified by the Alpha Architecture, the register file has four read ports and two write ports evenly divided between the two integer pipelines.
The floating-point unit consisted of two floating-point pipelines and the floating point register file, the two pipelines are not identical, one executed all floating-point instructions except for multiply, and the other executed only multiply instructions.
A non-pipelined floating-point divider is connected to the add pipeline, all floating-point instructions except for divide have four-cycle latency. Divides have variable latency that depends on whether the operation is being performed on single or on double precision floating-point numbers and numbers, including overhead, single precision divides have a to cycle latency, whereas double precision divides have a to cycle latency.
The has three levels of cache, two on-die and crwy external and optional, the caches and the associated logic consisted of 7. Microprocessor — A microprocessor is a computer processor which incorporates the functions of a computers central processing unit on a single integrated circuit, or at most a few integrated circuits.
Microprocessors contain both combinational logic and sequential digital logic, Microprocessors operate on numbers and symbols represented in the binary numeral system. The integration of a whole CPU onto a chip or on a few chips greatly reduced the cost of processing power. Integrated circuit processors are produced in numbers by highly automated processes resulting in a low per unit cost.
Single-chip processors increase reliability as there are many electrical connections to fail. As microprocessor designs get better, the cost of manufacturing a chip generally stays the same, before microprocessors, small computers had been built using racks of circuit boards ardhitecture many medium- and small-scale integrated circuits.
Microprocessors combined this into one or a few large-scale ICs, the internal arrangement of a microprocessor varies depending on the age of the design and the intended purposes of the microprocessor. Advancing technology makes more complex and powerful chips feasible to manufacture, a minimal hypothetical microprocessor might only include an arithmetic logic unit and a control logic section.
The ALU performs operations such as addition, subtraction, and operations archiecture as AND or OR, each operation of the ALU sets one or more flags in a status register, which indicate the results of the last operation. The control logic retrieves instruction codes from memory and initiates the sequence of operations required for the ALU to carry out the instruction, a single operation code might affect many individual data paths, registers, and other elements of the processor.
As integrated circuit technology craj, it was feasible to manufacture more and more complex processors on a single chip, the size of data objects became larger, allowing more transistors on a chip allowed word sizes to increase from 4- and 8-bit words up to todays bit words.
Additional features were added to the architecture, more on-chip registers sped up programs. Floating-point arithmetic, for example, was not available on 8-bit microprocessors.
Cray Research Incorporated
Integration of the point unit first as a separate integrated circuit and aarchitecture as part of the same microprocessor chip. Occasionally, physical limitations of integrated circuits made such practices as a bit slice approach necessary, instead of processing all of a long word on one integrated circuit, multiple circuits in parallel processed subsets of each data word. With the ability to put large numbers of transistors on one chip and this CPU cache has the advantage of faster access than off-chip memory, and increases the processing speed of the system for many applications.
Processor clock frequency has increased more rapidly than external memory speed, except in the recent past, a microprocessor is a general tt3e system. Several specialized processing devices have followed from the technology, A digital signal processor is specialized for signal processing, graphics processing units are h3e designed primarily for realtime rendering of 3D images. Dynamic random-access memory — Dynamic random-access memory is a type of random-access memory that stores each bit of data in a separate capacitor within an integrated circuit.
The capacitor can be charged or discharged, these two states are taken to represent the two values of a bit, conventionally called 0 and 1. Since even nonconducting transistors always leak a small amount, the capacitors will slowly discharge, because of this refresh requirement, it is a dynamic memory as opposed to static random-access memory and other static types of memory.
Unlike flash memory, DRAM is volatile memory, since it loses its data quickly when power is removed, however, DRAM does exhibit limited data remanence. DRAM is widely used in digital electronics where low-cost and high-capacity memory is required, one of the largest applications for DRAM is the main memory in modern computers, and as the main memories of components used in these computers such as graphics cards. The advantage of DRAM is its simplicity, only one transistor.
This allows DRAM to reach high densities. The transistors and capacitors used are small, billions can fit on a single memory chip. Due to the nature of its memory cells, DRAM consumes relatively large amounts of power. Paper tape was read and the characters on it were remembered in a dynamic store, the store used a large bank of capacitors, which were either charged or not, a charged capacitor representing cross and an uncharged capacitor dot.
Since the charge gradually leaked away, a pulse was applied to top up those still charged. InArnold Farber and Eugene Schlig, working for IBM, created a hard-wired memory cell, using a transistor gate and they replaced the latch with two transistors and two resistors, a configuration that became known as the Farber-Schlig cell. f3e
He was granted U. Silicon Graphics — Silicon Graphics, Inc.
The Cray Inc. T3E.
Early systems were based on the Geometry Engine tt3e Clark and Marc Hannah had developed archutecture Stanford University, for much of its history, the company focused on 3D imaging and were a major supplier of both hardware and software in this market. They reincorporated as a Delaware corporation in Januarythrough the mid to lates, the rapidly improving performance of commodity Wintel machines began to erode SGIs stronghold in the 3D market.
The porting of Maya to other platforms is an event in this process. Under Belluzzos leadership a number of initiatives were taken which are considered to have accelerated the corporate decline, one such initiative was trying to sell workstations running Windows NT called Visual Workstations instead of just ones which ran IRIX, the companys version of UNIX. This put the company in more direct competition with the likes of Dell.
Inin an attempt to clarify their current market position as more than a company, Silicon Graphics Inc. The new logo drew criticism for wasting the professional associated with the previous cube archtecture. SGI continued to use the Silicon Graphics name for its product line. In NovemberSGI announced that it had been delisted from the New York Stock Exchange because its common stock had fallen below the share price for listing on the exchange.
In FebruarySGI noted that it could run out of cash by the end of the year, in mid, SGI hired Alix Partners to advise it on returning to profitability and received a new line of credit. SGI announced it was postponing its scheduled annual December stockholders meeting until March and it proposed a reverse stock split to deal with the de-listing from the New York Stock Exchange.
History of supercomputing — The CDC, released inis generally considered the first supercomputer. Progress in the first decade of the 21st century was dramatic and supercomputers adchitecture over 60, processors appeared, the term Super Computing was first used in the New York World in to refer to large custom-built tabulators that IBM had made for Columbia University.
In Cray completed the CDC, one of the first solid state computers, around Cray decided to crzy a computer that would be the fastest in the world by a large margin. After four years of experimentation along with Jim Thornton, and Dean Roush, Cray switched from germanium to silicon transistors, built by Fairchild Semiconductor, that used the planar process. These did not have the drawbacks of the silicon transistors. He ran them very fast, and the speed of light restriction forced a very compact design with severe overheating problems, which were solved by introducing refrigeration, designed by Dean Roush.
Given that the outran all computers of the time by about 10 times, it was dubbed a supercomputer, the gained speed by farming out work to peripheral computing elements, freeing the CPU to process actual data. In Cray completed the CDC, again the fastest computer in the world, at 36 MHz, the had about three and a half times the clock speed of thebut ran significantly faster due to other technical innovations.
They only sold about 50 of the s, not quite a failure, Cray left CDC in to form his own company. Archhitecture was said that whenever Englands Atlas went offline half of the United Kingdoms computer capacity was lost, Atlas also pioneered the Atlas Supervisor, considered by many to be the first recognizable modern operating system.
All three floating point pipelines on the X-MP could operate simultaneously, the Cray-2 released in was a 4 processor liquid cooled computer totally immersed in a tank of Fluorinert, which bubbled as it operated. It could perform to 1. The Cray 2 was a new design and did not use chaining and had a high memory latency.
That trend was partly responsible for an away from the architecturr. It was announced in as the cleaned up successor to the Cray-1, the principal designer was Steve Chen.
The X-MPs main improvement over the Cray-1 was that it was a architecturr vector processor. It housed two CPUs in a mainframe that was identical in outside appearance to the Cray The X-MP initially supported 2 million bit words of memory in 16 banks.
This configuration was first used for Cray Researchs UNIX port, inimproved models of the X-MP were announced, consisting of one, two, and four-processor systems with 4 and 8 million word configurations. Cray-1 — The Cray-1 was a supercomputer designed, manufactured and marketed by Cray Research. The first Cray-1 system was installed at Los Xrchitecture National Laboratory in and it went on to one of the best known.
From to Seymour Cray of Control Data Corporation worked on the CDC, the was essentially made up of four s in a box with an additional special mode that allowed them to operate lock-step in a SIMD fashion. In fact the main processor of the STAR had less performance than thebythe had reached a dead end, the machine was so incredibly complex that it was impossible to get one working properly. Even a single faulty component would render the machine non-operational, Cray went to William Norris, Control Datas CEO, saying crayy a redesign from scratch was needed.
At the time the company was in financial trouble, and with the STAR in the pipeline as well. The company expected to sell perhaps a dozen of the machines, and set the selling price accordingly, the machine made Seymour Cray a celebrity and his company a success, lasting until the supercomputer crash in the early s.
Based on a recommendation by William Perrys study, the NSA purchased a Cray-1 for theoretical research in cryptanalysis. As a comparison standpoint, the processor in a typical smartphone performs at roughly 1 GFLOPS, typical scientific workloads consist of reading in large data sets, transforming them in some way and then writing them back out again.
Normally the transformations being applied are identical across all of the points in the set. Cray-2 — The Cray-2 is a supercomputer with four vector processors built with emitter-coupled logic and made by Cray Research starting in With the successful launch of his famed Cray-1, Seymour Cray turned to the design of its successor. By he had become fed up with management interruptions in what was now a large company, and as he had done in the past, decided to resign his management post and move to form a new lab.
Working as an independent consultant at these new Cray Labs, he put together a team and this Lab would later close, and a decade later a new facility in Colorado Springs would open. Unfortunately the density needed to achieve this cycle time led to the machines downfall, one solution to this problem, one that most computer vendors had already moved to, was to use integrated circuits archjtecture of individual components.