Search: "GEMM"

Found 4 theses containing the word GEMM.

  1. Evaluation of FPGA-based High Performance Computing Platforms

    Master's thesis, Linköping University/Computer Engineering

    Author: Martin Frick-Lundgren; [2023]
    Keywords: FPGA; High performance computing; BUDE; GEMM; CPU; GPU;

    Abstract: High performance computing is a topic that has risen to the top in the era of digitalization, AI and automation. Therefore, the search for more cost- and time-effective ways to implement HPC work is always a subject of extensive research. One part of this is to have hardware that is capable of improving on these criteria.

  2. Register Caching for Energy Efficient GPGPU Tensor Core Computing

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Qiran Qian; [2023]
    Keywords: Computer Architecture; GPGPU; Tensor Core; GEMM; Energy Efficiency; Register File; Cache; Instruction Scheduling;

    Abstract: The General-Purpose GPU (GPGPU) has emerged as the predominant computing device for extensive parallel workloads in the fields of Artificial Intelligence (AI) and Scientific Computing, primarily owing to its adoption of the Single Instruction Multiple Thread architecture, which not only provides a wealth of thread context but also effectively hides the latencies exposed in single-thread execution. As computational demands have evolved, modern GPGPUs have incorporated specialized matrix engines, e. ...

  3. AXI-PACK: Near-memory Bus Packing for Bandwidth-Efficient Irregular Workloads

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Chi Zhang; [2022]
    Keywords: General-purpose processor; on-chip bus protocol; irregular memory access; ASIC digital circuit design;

    Abstract: General-purpose processors (GPPs) are expected to deliver high performance in data-intensive applications such as deep learning and high-performance computing (HPC), where kernels like GEMM (general matrix-matrix multiply) and SPMV (sparse matrix-vector multiply) are used intensively. The performance of these data-intensive applications is bounded by memory bandwidth, which is limited by the coupling of computation and memory access and by the memory wall effect.

  4. Efficient LU Factorization for Texas Instruments Keystone Architecture Digital Signal Processors

    Master's thesis, KTH/School of Computer Science and Communication (CSC)

    Author: Gilbert Netzer; [2015]
    Keywords: LU factorization; digital signal processors; Texas Instruments; Keystone architecture; high-performance LINPACK; benchmark; performance; energy efficiency; software-pipelined loops; direct memory access; optimization;

    Abstract: The energy consumption of large-scale high-performance computer (HPC) systems has become one of the foremost concerns of both data-center operators and computer manufacturers. This has renewed interest in alternative computer architectures that could offer substantially better energy efficiency.