3 PP flynn pararchitectures: parallel and distributed processing



Document information

PART 1: PARALLEL COMPUTING
  Chapter 1: Architecture and Types of Parallel Computers
  Chapter 2: Components of Parallel Computers
  Chapter 3: Introduction to Parallel Programming
  Chapter 4: Parallel Programming Models
  Chapter 5: Parallel Algorithms
PART 2: PARALLEL PROCESSING OF DATABASES (further reading)
  Chapter 6: Overview of Parallel Databases
  Chapter 7: Parallel Query Optimization
  Chapter 8: Optimal Scheduling for Parallel Queries

Thoai Nam
Khoa Công Nghệ Thông Tin, Đại Học Bách Khoa Tp.HCM

Flynn's Taxonomy
Classification of Parallel Computers Based on Architectures

Based on notions of instruction and data streams:
  SISD (Single Instruction stream, Single Data stream)
  SIMD (Single Instruction stream, Multiple Data streams)
  MISD (Multiple Instruction streams, Single Data stream)
  MIMD (Multiple Instruction streams, Multiple Data streams)
Popularity: MIMD > SIMD > MISD

SISD
  Conventional sequential machines
  IS: Instruction Stream
  DS: Data Stream
  CU: Control Unit
  PU: Processing Unit
  MU: Memory Unit
  [Figure: SISD architecture: CU, PU, MU, and I/O linked by IS and DS]

SIMD
  Vector computers, processor arrays
  Special-purpose computations
  PE: Processing Element
  LM: Local Memory
  [Figure: SIMD architecture with distributed memory: CU, PE1..PEn, LM1..LMn; program loaded from host, data sets loaded from host]

MISD
  Systolic arrays
  Special-purpose computations
  [Figure: MISD architecture (the systolic array): Memory (Program, Data), CU1..CUn, PU1..PUn, I/O]

MIMD
  General-purpose parallel computers
  [Figure: MIMD architecture with shared memory: CU1..CUn, PU1..PUn, Shared Memory, I/O]

Classification Based on Architecture
  Pipelined Computers
  Dataflow Architectures
  Data Parallel Systems
  Multiprocessors
  Multicomputers

Pipelined Computers
  Instructions are divided into a number of steps (segments, stages).
  At the same time, several instructions can be loaded in the machine and executed in different steps.

  IF:  instruction fetch
  ID:  instruction decode and register fetch
  EX:  execution and effective address calculation
  MEM: memory access
  WB:  write back

  Cycle:            1    2    3    4    5    6    7    8    9
  Instruction i     IF   ID   EX   MEM  WB
  Instruction i+1        IF   ID   EX   MEM  WB
  Instruction i+2             IF   ID   EX   MEM  WB
  Instruction i+3                  IF   ID   EX   MEM  WB
  Instruction i+4                       IF   ID   EX   MEM  WB

[...]

  [...] b_i := a_i * f_i
        c_i := b_i + c_(i-1)
  end
  output a, b, c

  [Figure: dataflow graph for i = 1..4 with inputs d_i, e_i, f_i and c_0; a "/" node on d_i and e_i produces a_i, a "*" node on a_i and f_i produces b_i, and a "+" node on b_i and c_(i-1) produces c_i, ending at c_4]

Execution on a Control Flow Machine
  Assume all the external inputs are available before entering the do loop.
  + : 1 cycle, * : 2 cycles, / : 3 cycles
  [Figure: timeline a1 b1 c1 a2 b2 c2 ... a4 b4 c4] Sequential execution on a uniprocessor...
  How long will it take to execute this program on a dataflow computer with 4 processors?

Execution on a Dataflow Machine
  [Figure: timeline of a1..a4, b1..b4, c1..c4] Data-driven execution on a 4-processor dataflow computer in 9 cycles
  Can we further reduce the execution time of this program?

[...]

Current Types of Multicomputers
  MPP (Massively Parallel Processing): total number of processors > 1000
  Cluster: each node in the system has fewer than 16 processors
  Constellation: each node in the system has more than 16 processors

MPP (Massively Parallel Processing)
  P/C: Microprocessor & Cache
  MB: Memory Bus
  ...

[...]

  ... between the connected nodes
  Firing rule: a node can be scheduled for execution if and only if its input data become valid for consumption
  Dataflow languages: Id, SISAL, Silage, LISP
    Single assignment, applicative (functional) language
    Explicit parallelism

  z = (a + b) * c
  [Figure: the dataflow representation of an arithmetic expression]
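The dataflow timing in the loop example above can be checked with a small simulation. The following is only a sketch under stated assumptions: the scheduler is a simple greedy one that fires any node whose inputs are valid as soon as a processor is free (the slides do not specify a scheduling policy), the division a_i := d_i / e_i is read off the dataflow graph, and the function names build_graph, sequential_cycles, and dataflow_cycles are hypothetical. The uniprocessor total is computed from the stated latencies and is not quoted from the slides.

```python
# Operation latencies from the slide: + takes 1 cycle, * takes 2, / takes 3.
LATENCY = {"/": 3, "*": 2, "+": 1}


def build_graph(n=4):
    """Return {result: (op, [operands])} for the loop body, i = 1..n."""
    graph = {}
    for i in range(1, n + 1):
        graph[f"a{i}"] = ("/", [f"d{i}", f"e{i}"])    # assumed from the graph
        graph[f"b{i}"] = ("*", [f"a{i}", f"f{i}"])
        graph[f"c{i}"] = ("+", [f"b{i}", f"c{i-1}"])
    return graph


def sequential_cycles(graph):
    """One operation at a time: the total is the sum of all latencies."""
    return sum(LATENCY[op] for op, _ in graph.values())


def dataflow_cycles(graph, processors=4):
    """Greedy data-driven schedule (an assumption, not the slides' policy)."""
    done = {}             # node -> cycle at which its result became valid
    running = {}          # node -> cycle at which it will finish
    pending = set(graph)  # nodes that have not fired yet
    t = 0
    while pending or running:
        # Retire operations that have finished by cycle t.
        for node, finish in list(running.items()):
            if finish <= t:
                done[node] = finish
                del running[node]
        # Firing rule: a node may fire once all its inputs are valid.
        # Names that are not graph nodes (d_i, e_i, f_i, c0) are external
        # inputs, assumed available from the start.
        for node in sorted(pending):
            if len(running) >= processors:
                break
            op, inputs = graph[node]
            if all(x not in graph or x in done for x in inputs):
                running[node] = t + LATENCY[op]
                pending.discard(node)
        if pending or running:
            t += 1
    return max(done.values())


graph = build_graph(4)
print("uniprocessor cycles:", sequential_cycles(graph))
print("4-processor dataflow cycles:", dataflow_cycles(graph, processors=4))
```

Under these assumptions the four divisions run in parallel (3 cycles), then the four multiplications (2 cycles), and the additions form a chain of four 1-cycle steps because each c_i needs c_(i-1). That chain is the part a further transformation of the program would have to shorten.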
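The five-stage timing chart for instructions i through i+4 can be reproduced with a few lines of code. This is a minimal sketch assuming an ideal pipeline (one instruction enters per cycle, no stalls or hazards); the function name pipeline_chart is hypothetical. With k stages and n instructions, the last instruction leaves the pipeline in cycle k + n - 1.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_chart(n_instructions, stages=STAGES):
    """Return one row per instruction.

    row[c] is the stage occupied during cycle c + 1, or "" if the
    instruction is not in the pipeline during that cycle.
    """
    total_cycles = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        # Instruction i enters IF in cycle i + 1 and advances one stage per cycle.
        for s, name in enumerate(stages):
            row[i + s] = name
        rows.append(row)
    return rows


chart = pipeline_chart(5)
for i, row in enumerate(chart):
    label = "i" if i == 0 else f"i+{i}"
    print(f"{label:<4}" + " ".join(f"{cell:<3}" for cell in row))
print("total cycles:", len(chart[0]))
```

For 5 instructions and 5 stages this gives 9 cycles, matching the cycle axis of the chart, whereas executing the same instructions one after another without overlap would take 5 * 5 = 25 cycles.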

Posted: 14/10/2014, 20:03
