//===-- llvm/MC/MCSchedule.h - Scheduling -----------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file defines the classes used to describe a subtarget's machine model
// for scheduling and other instruction cost heuristics.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_MC_MCSCHEDULE_H
#define LLVM_MC_MCSCHEDULE_H

#include "llvm/Config/llvm-config.h"
#include "llvm/Support/DataTypes.h"
#include <cassert>

namespace llvm {

template <typename T> class ArrayRef;
struct InstrItinerary;
class MCSubtargetInfo;
class MCInstrInfo;
class MCInst;
class InstrItineraryData;

/// Define a kind of processor resource that will be modeled by the scheduler.
struct MCProcResourceDesc { … };

/// Identify one of the processor resource kinds consumed by a
/// particular scheduling class for the specified number of cycles.
struct MCWriteProcResEntry { … };

/// Specify the latency in cpu cycles for a particular scheduling class and def
/// index. -1 indicates an invalid latency. Heuristics would typically consider
/// an instruction with invalid latency to have infinite latency. Also identify
/// the WriteResources of this def. When the operand expands to a sequence of
/// writes, this ID is the last write in the sequence.
struct MCWriteLatencyEntry { … };

/// Specify the number of cycles allowed after instruction issue before a
/// particular use operand reads its registers. This effectively reduces the
/// write's latency. Here we allow negative cycles for corner cases where
/// latency increases. This rule only applies when the entry's WriteResource
/// matches the write's WriteResource.
///
/// MCReadAdvanceEntries are sorted first by operand index (UseIdx), then by
/// WriteResourceIdx.
struct MCReadAdvanceEntry { … };

/// Summarize the scheduling resources required for an instruction of a
/// particular scheduling class.
///
/// Defined as an aggregate struct for creating tables with initializer lists.
struct MCSchedClassDesc { … };

/// Specify the cost of a register definition in terms of the number of physical
/// registers allocated at the register renaming stage. For example, AMD Jaguar
/// natively supports 128-bit data types, and operations on 256-bit registers
/// (i.e. YMM registers) are internally split into two COPs (complex operations)
/// and each COP updates a physical register. Basically, on Jaguar, a YMM
/// register write effectively consumes two physical registers. That means
/// the cost of a YMM write in the BtVer2 model is 2.
struct MCRegisterCostEntry { … };

/// A register file descriptor.
///
/// This struct describes processor register files. In particular, it
/// helps describe the size of the register file, as well as the cost of
/// allocating a register file at the register renaming stage.
/// FIXME: this struct can be extended to provide information about the number
/// of read/write ports to the register file. A value of zero for field
/// 'NumPhysRegs' means: this register file has an unbounded number of physical
/// registers.
struct MCRegisterFileDesc { … };

/// Provide extra details about the machine processor.
///
/// This is a collection of "optional" processor information that is not
/// normally used by the LLVM machine schedulers, but that can be consumed by
/// external tools like llvm-mca to improve the quality of the performance
/// analysis.
struct MCExtraProcessorInfo { … };

/// Machine model for scheduling, bundling, and heuristics.
///
/// The machine model directly provides basic information about the
/// microarchitecture to the scheduler in the form of properties.
/// It also optionally refers to scheduler resource tables and itinerary
/// tables. Scheduler resource tables model the latency and cost for each
/// instruction type. Itinerary tables are an independent mechanism that
/// provides a detailed reservation table describing each cycle of instruction
/// execution. Subtargets may define any or all of the above categories of data
/// depending on the type of CPU and selected scheduler.
///
/// The machine independent properties defined here are used by the scheduler as
/// an abstract machine model. A real micro-architecture has a number of
/// buffers, queues, and stages. Declaring that a given machine-independent
/// abstract property corresponds to a specific physical property across all
/// subtargets can't be done. Nonetheless, the abstract model is
/// useful. Furthermore, subtargets typically extend this model with processor
/// specific resources to model any hardware features that can be exploited by
/// scheduling heuristics and aren't sufficiently represented in the abstract.
///
/// The abstract pipeline is built around the notion of an "issue point". This
/// is merely a reference point for counting machine cycles. The physical
/// machine will have pipeline stages that delay execution. The scheduler does
/// not model those delays because they are irrelevant as long as they are
/// consistent. Inaccuracies arise when instructions have different execution
/// delays relative to each other, in addition to their intrinsic latency. Those
/// special cases can be handled by TableGen constructs such as ReadAdvance,
/// which reduces latency when reading data, and ReleaseAtCycles, which consumes
/// a processor resource when writing data for a number of abstract
/// cycles.
///
/// TODO: One tool currently missing is the ability to add a delay to
/// ReleaseAtCycles. That would be easy to add and would likely cover all cases
/// currently handled by the legacy itinerary tables.
///
/// A note on out-of-order execution and, more generally, instruction
/// buffers. Part of the CPU pipeline is always in-order. The issue point, which
/// is the point of reference for counting cycles, only makes sense as an
/// in-order part of the pipeline. Other parts of the pipeline are sometimes
/// falling behind and sometimes catching up. It's only interesting to model
/// those other, decoupled parts of the pipeline if they may be predictably
/// resource constrained in a way that the scheduler can exploit.
///
/// The LLVM machine model distinguishes between in-order constraints and
/// out-of-order constraints so that the target's scheduling strategy can apply
/// appropriate heuristics. For a well-balanced CPU pipeline, out-of-order
/// resources would not typically be treated as a hard scheduling
/// constraint. For example, in the GenericScheduler, a delay caused by limited
/// out-of-order resources is not directly reflected in the number of cycles
/// that the scheduler sees between issuing an instruction and its dependent
/// instructions. In other words, out-of-order resources don't directly increase
/// the latency between pairs of instructions. However, they can still be used
/// to detect potential bottlenecks across a sequence of instructions and bias
/// the scheduling heuristics appropriately.
struct MCSchedModel { … };

} // namespace llvm

#endif