1/21/2007

SH4 Notes - 03

Memory Management Unit (MMU)

Overview
  • 29-bit external memory space, with 8-bit address space identifiers (ASIDs)
  • 32-bit logical (virtual) address space
  • virtual address -> MMU -> physical address
  • 4 instruction TLB (ITLB) entries
  • 64 unified TLB (UTLB) entries
  • UTLB copies are stored in the ITLB by hardware
  • The SH-4 supports 4 page sizes: 1 kbyte, 4 kbytes, 64 kbytes and 1 Mbyte.
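
As a rough illustration of the overview above, here is a minimal sketch of how a 32-bit virtual address splits into a page number and an offset for each of the four page sizes, and how a physical page number is recombined into a 29-bit physical address. The names (PageShift, vpn_of, offset_of, physical_of) are made up for this sketch and are not taken from the SH-4 manual.

```c
#include <assert.h>
#include <stdint.h>

/* The four page sizes listed above, expressed as log2(bytes). */
typedef enum {
    PAGE_1K  = 10,
    PAGE_4K  = 12,
    PAGE_64K = 16,
    PAGE_1M  = 20
} PageShift;

/* Upper bits of the virtual address: the virtual page number. */
uint32_t vpn_of(uint32_t vaddr, PageShift shift)
{
    return vaddr >> shift;
}

/* Lower bits of the virtual address: the byte offset inside the page. */
uint32_t offset_of(uint32_t vaddr, PageShift shift)
{
    return vaddr & ((UINT32_C(1) << shift) - 1u);
}

/* Recombine a physical page number with the offset, keeping 29 bits. */
uint32_t physical_of(uint32_t ppn, uint32_t vaddr, PageShift shift)
{
    uint32_t paddr = (ppn << shift) | offset_of(vaddr, shift);
    return paddr & UINT32_C(0x1FFFFFFF);
}
```

With a 4-kbyte page, the low 12 bits pass through unchanged and only the upper 20 bits are translated; with a 1-Mbyte page, the low 20 bits pass through.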
Register descriptions
  • 6 MMU-related registers.
  • PTEH (Page table entry high register) : 32 bits
    - 0xFF00 0000 (P4)
    - 0x1F00 0000 (Area 7)
  • PTEL (Page table entry low register) : 32 bits
  • PTEA (Page table entry assistance register) : 32 bits
  • TTB (Translation table base register) : 32 bits
  • TEA (TLB exception address register) : 32 bits
  • MMUCR (MMU control register) : 32 bits

1/14/2007

Intel VT Notes - 01

  • VMX (Virtual Machine Extensions)
  • VMM (Virtual Machine Monitor)
  • Guest Software

[Basic VT Architecture]
Guest Software(VM)
---------
VMM(Virtual Machine Monitor)
---------
Hardware

  • VMX Operation
    - VMX root operation
    - VMX non-root operation
  • VMX Transition (VMX root operation <-> VMX non-root operation)
    - VM entries (VMX root operation ->VMX non-root operation)
    - VM exits (VMX non-root operation -> VMX root operation)
  • There is no software-visible bit that indicates whether a logical processor is
    in VMX non-root operation.
  • Guest software can run at the privilege level for which it was originally designed.
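
The two VMX transitions listed above can be sketched as a tiny state machine. This is only a model of the terminology, not of any real VMX instruction; the type and function names (VmxMode, VmxTransition, vmx_step) are invented for illustration.

```c
#include <assert.h>

typedef enum { VMX_ROOT, VMX_NON_ROOT } VmxMode;
typedef enum { VM_ENTRY, VM_EXIT } VmxTransition;

/* Apply one transition. A transition that does not apply to the
 * current mode leaves the mode unchanged in this model. */
VmxMode vmx_step(VmxMode mode, VmxTransition t)
{
    if (mode == VMX_ROOT && t == VM_ENTRY)
        return VMX_NON_ROOT;   /* VMM hands control to the guest */
    if (mode == VMX_NON_ROOT && t == VM_EXIT)
        return VMX_ROOT;       /* guest traps back to the VMM */
    return mode;
}
```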

1/13/2007

About Virtualization


[Type 1 Hypervisor]

//Without Host OS

Guest OS
----------
Hypervisor
----------
Hardware

[Type 2 Hypervisor]

//With Host OS

Guest OS
----------
Hypervisor
----------
Host OS
----------
Hardware


Thoughts:

PS3 Linux is presumably running on a Type 1 hypervisor right now.
But what if every SPE ran its own VM? I wonder how that would turn out. Hehe~

-END-

xptcall for SH4

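Today I successfully finished the Invoke part of xptcall;
the Stub part will have to wait until next Monday.
If everything works, it can be contributed back to Mozilla.

Writing down the key points first so I don't forget them later (I really am too forgetful).

1. SH4 Calling Convention
- R0 carries the return value.
- R1...R3 are free for any use.
- R4...R7 carry integer and pointer arguments, but XPCOM methods always need R4 set to that (the this pointer).
- FR4...FR11 carry float and double arguments.
- Arguments that do not fit in registers go on the stack. A 64-bit argument is never split with one half in a register and the other half on the stack.

2. jsr requires 2-byte alignment

3. Requires GCC 3.1 or later

4. R14 is used as the base pointer (call frame pointer), and R15 is the stack pointer.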

5. JavaScript Component -> xptcinvoke -> XPCOM Component

6. XPCOM Component -> xptcstubs -> JavaScript Component
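
The register-assignment rules in point 1 can be sketched as a small classifier. This is a simplified model under loud assumptions, not the actual xptcinvoke code: it counts register slots only (R5..R7 for integers, since R4 holds `this`, and FR4..FR11 for floating point), ignores any register-pair alignment the real ABI may require, and assumes a later 32-bit argument may still take a leftover register after a 64-bit one has spilled. All names (ArgKind, ArgPlace, place_args) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ARG_INT32, ARG_INT64, ARG_FLOAT, ARG_DOUBLE } ArgKind;
typedef enum { IN_INT_REG, IN_FP_REG, ON_STACK } ArgPlace;

enum {
    INT_SLOTS = 3, /* R5..R7: R4 is already taken by `this` */
    FP_SLOTS  = 8  /* FR4..FR11 */
};

/* Decide, for each argument, whether it travels in an integer register,
 * a floating-point register, or on the stack. A 64-bit argument needs
 * two free slots of its class; otherwise the whole argument spills. */
void place_args(const ArgKind *kinds, size_t n, ArgPlace *out)
{
    int int_used = 0;
    int fp_used = 0;

    for (size_t i = 0; i < n; i++) {
        int is_fp = (kinds[i] == ARG_FLOAT || kinds[i] == ARG_DOUBLE);
        int width = (kinds[i] == ARG_INT64 || kinds[i] == ARG_DOUBLE) ? 2 : 1;
        int *used = is_fp ? &fp_used : &int_used;
        int limit = is_fp ? FP_SLOTS : INT_SLOTS;

        if (*used + width <= limit) {
            *used += width;
            out[i] = is_fp ? IN_FP_REG : IN_INT_REG;
        } else {
            out[i] = ON_STACK; /* never half in a register, half on the stack */
        }
    }
}
```

For a method taking (int, double, long long, int, long long) after `this`, this model puts the first three in registers and the last two on the stack.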

1/05/2007

SH4 Note - 02

Programming model

  • 2 processor modes : user mode and privileged mode
  • 4 kinds of registers:
    - general registers (R0 - R15, where R0 - R7 are banked registers)
    - system registers : access to these registers does not depend on the processor mode.
    - control registers
    - floating-point registers
    (FR0–FR15 and XF0–XF15 = FPR0_BANK0–FPR15_BANK0 and FPR0_BANK1–FPR15_BANK1).
General registers
  • R0_BANK0–R7_BANK0:
    - In user mode (SR.MD = 0), R0–R7 are always assigned to R0_BANK0–R7_BANK0.
    - In privileged mode (SR.MD = 1), R0–R7 are assigned to R0_BANK0–R7_BANK0 only when SR.RB = 0.
    Notes: SR (=Status Register). MD(=Mode)
  • R0_BANK1–R7_BANK1:
    - In user mode, R0_BANK1–R7_BANK1 cannot be accessed.
    - In privileged mode, R0–R7 are assigned to R0_BANK1–R7_BANK1 only when
    SR.RB = 1.
    Notes: RB (=General Register Bank specifier in privileged mode)
  • Programming Note:
    As the user’s R0–R7 are assigned to R0_BANK0–R7_BANK0, and after an exception or interrupt R0–R7 are assigned to R0_BANK1–R7_BANK1, it is not necessary for the interrupt handler to save and restore the user’s R0–R7
    (R0_BANK0–R7_BANK0).
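
The bank-selection rules above reduce to one condition: bank 1 is visible only when both SR.MD and SR.RB are set. A minimal sketch, using the SR bit positions given in the control register list of these notes (the function name active_bank is made up):

```c
#include <assert.h>
#include <stdint.h>

#define SR_RB (UINT32_C(1) << 29) /* register bank specifier */
#define SR_MD (UINT32_C(1) << 30) /* processor mode: 1 = privileged */

/* Returns which bank (0 or 1) the names R0..R7 refer to for a given SR. */
int active_bank(uint32_t sr)
{
    int privileged = (sr & SR_MD) != 0;
    int rb = (sr & SR_RB) != 0;

    /* In user mode RB is ignored and bank 0 is always used. */
    return (privileged && rb) ? 1 : 0;
}
```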
System registers
  • MACH (32bit) : Multiply-and-accumulate register high
  • MACL (32bit) : Multiply-and-accumulate register low
  • PR (32bit) : Procedure register
    The return address is stored in PR when a subroutine is called using a BSR, BSRF or JSR instruction. PR is referenced by the subroutine return instruction (RTS).
  • PC (32bit) : Program Counter
  • FPSCR (32bit) : Floating-point status/control register
  • FPUL (32bit) : Floating-point communication register
    Data transfer between FPU registers and CPU registers is carried
    out via the FPUL register. The FPUL register is a system register, and is accessed from the CPU side by means of LDS and STS instructions. For example, to convert the integer stored in general register R1 to a single-precision floating-point number,
    the processing flow is as follows:
    R1 → (LDS instruction) → FPUL → (single-precision FLOAT instruction) → FR1
Control registers
  • SR (32bit) : Status Register
    - SR.T (bit[0]) : True/False condition or carry/borrow bit.
    - SR.S (bit[1]) : Specifies a saturation operation for a MAC instruction.
    - SR.IMASK (bit[4:7]) : Interrupt mask level.
    - SR.Q (bit[8]) : State for divide step.
    - SR.M (bit[9]) : State for divide step.
    - SR.FD (bit[15]) : FPU disable bit (cleared to 0 by a reset).
    - SR.BL (bit[28]) : Exception/interrupt block bit
    - SR.RB (bit[29]) : General register bank specifier in privileged mode
    - SR.MD (bit[30]) : Processor mode (MD=0 : User mode, MD=1 : Privileged mode)
    - SR.RES (bit[2:3], [10:14], [16:27], [31]) : Reserved
  • SSR (32bit) : Saved Status Register
    The current contents of SR are saved to SSR in the event of an exception or interrupt.
  • SPC (32bit) : Saved Program Counter
    The address of an instruction at which an interrupt or exception occurs is saved to SPC.
  • GBR (32bit) : Global Base Register
    GBR is referenced as the base address in a GBR-referencing MOV instruction.
  • VBR (32bit) : Vector Base Register
    VBR is referenced as the branch destination base address in the event of an exception or interrupt.
  • SGR (32bit) : Saved General Register
    The contents of R15 are saved to SGR in the event of an exception or interrupt.
    Notes: R15 is used as the stack pointer (R14 serves as the base pointer).
  • DBR (32bit) : Debug Base Register
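
The SR bit layout listed above can be turned into a small decoder. This is a sketch that follows the bit positions from these notes; the struct and function names (SrFields, sr_decode) and the SR_RESERVED_MASK constant are introduced here for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Reserved bits: [2:3], [10:14], [16:27], [31]. */
#define SR_RESERVED_MASK UINT32_C(0x8FFF7C0C)

typedef struct {
    unsigned t;     /* bit 0      */
    unsigned s;     /* bit 1      */
    unsigned imask; /* bits 4..7  */
    unsigned q;     /* bit 8      */
    unsigned m;     /* bit 9      */
    unsigned fd;    /* bit 15     */
    unsigned bl;    /* bit 28     */
    unsigned rb;    /* bit 29     */
    unsigned md;    /* bit 30     */
} SrFields;

/* Split a raw 32-bit SR value into its named fields. */
SrFields sr_decode(uint32_t sr)
{
    SrFields f;
    f.t     = (sr >> 0)  & 1u;
    f.s     = (sr >> 1)  & 1u;
    f.imask = (sr >> 4)  & 0xFu;
    f.q     = (sr >> 8)  & 1u;
    f.m     = (sr >> 9)  & 1u;
    f.fd    = (sr >> 15) & 1u;
    f.bl    = (sr >> 28) & 1u;
    f.rb    = (sr >> 29) & 1u;
    f.md    = (sr >> 30) & 1u;
    return f;
}
```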
Floating-point registers
  • Floating-point registers, FPRn_BANKi (32 registers)
  • Single-precision floating-point registers, FRi (16 registers)
    - FPSCR.FR = 0 : FR0–FR15 are assigned to FPR0_BANK0–FPR15_BANK0.
    - FPSCR.FR = 1 : FR0–FR15 are assigned to FPR0_BANK1–FPR15_BANK1.
  • Double-precision floating-point registers or single-precision floating-point
    register pairs, DRi (8 registers):
    DR0 = {FR0, FR1}, DR2 = {FR2, FR3}, DR4 = {FR4, FR5}, DR6 = {FR6, FR7},
    DR8 = {FR8, FR9}, DR10 = {FR10, FR11}, DR12 = {FR12, FR13}, DR14 = {FR14, FR15}
  • Single-precision floating-point vector registers, FVi (4 registers): An FV register
    comprises four FR registers:
    FV0 = {FR0, FR1, FR2, FR3}, FV4 = {FR4, FR5, FR6, FR7},
    FV8 = {FR8, FR9, FR10, FR11}, FV12 = {FR12, FR13, FR14, FR15}
  • Single-precision floating-point extended registers, XFi (16 registers)
    - FPSCR.FR = 0 : XF0-XF15 are assigned to FPR0_BANK1-FPR15_BANK1.
    - FPSCR.FR = 1 : XF0-XF15 are assigned to FPR0_BANK0-FPR15_BANK0.
  • Single-precision floating-point extended register pairs, XDi (8 registers): An XD
    register comprises two XF registers.
    XD0 = {XF0, XF1}, XD2 = {XF2, XF3}, XD4 = {XF4, XF5}, XD6 = {XF6, XF7},
    XD8 = {XF8, XF9}, XD10 = {XF10, XF11}, XD12 = {XF12, XF13}, XD14 = {XF14, XF15}
  • Single-precision floating-point extended register matrix, XMTRX: XMTRX
    comprises all 16 XF registers.

    Notes: So cool! The Japanese designers' ideas are really fun.
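
To make the FR/XF bank swap concrete, here is a minimal model of the 32 physical registers with two views selected by FPSCR.FR. It only models naming, not arithmetic; FpuRegs, fr and xf are hypothetical names, and FPSCR.FR is kept as a plain flag instead of a real FPSCR bit.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t fpr[2][16]; /* FPRn_BANK0 and FPRn_BANK1 */
    int fr_bit;          /* value of FPSCR.FR, 0 or 1 */
} FpuRegs;

/* FRn names bank 0 when FPSCR.FR = 0 and bank 1 when FPSCR.FR = 1. */
uint32_t *fr(FpuRegs *f, int n)
{
    assert(n >= 0 && n < 16);
    return &f->fpr[f->fr_bit ? 1 : 0][n];
}

/* XFn always names the bank that FRn does not. */
uint32_t *xf(FpuRegs *f, int n)
{
    assert(n >= 0 && n < 16);
    return &f->fpr[f->fr_bit ? 0 : 1][n];
}
```

Flipping FPSCR.FR swaps the two views without moving any data, so a value written through FR3 becomes visible as XF3.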
Memory-mapped registers
  • The control registers are double-mapped to the following two memory areas.
    All registers have two addresses.
    - 0x1F00 0000-0x1FFF FFFF
    - 0xFF00 0000-0xFFFF FFFF
  • 0x1F00 0000–0x1FFF FFFF
    - This area must be accessed in address translation mode using the TLB.
  • 0xFF00 0000–0xFFFF FFFF
    - Access to area 0xFF00 0000-0xFFFF FFFF in user mode will cause an address error.
    - Memory-mapped registers can be referenced in user mode by means of access that involves address translation.
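
The two address ranges above differ only in their top three bits, which suggests a simple conversion between the two views of a register. The sketch below assumes that relationship holds for every register in the range (it matches the PTEH pair 0xFF00 0000 / 0x1F00 0000); the helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define P4_REG_BASE    UINT32_C(0xFF000000)
#define AREA7_REG_BASE UINT32_C(0x1F000000)

/* P4 view -> Area 7 view: keep the low 29 bits. */
uint32_t p4_to_area7(uint32_t p4_addr)
{
    assert(p4_addr >= P4_REG_BASE);
    return p4_addr & UINT32_C(0x1FFFFFFF);
}

/* Area 7 view -> P4 view: set the top three bits. */
uint32_t area7_to_p4(uint32_t area7_addr)
{
    assert(area7_addr >= AREA7_REG_BASE &&
           area7_addr <= UINT32_C(0x1FFFFFFF));
    return area7_addr | UINT32_C(0xE0000000);
}
```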
Data format in registers
  • Register operands are always longwords (32 bits). When a memory operand is only a
    byte (8 bits) or a word (16 bits), it is sign-extended into a longword when loaded into
    a register.
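
A minimal sketch of the sign extension described above, written with unsigned arithmetic so the result is fully defined in C. The function names load_byte and load_word are illustrative and do not correspond to any SH-4 mnemonic.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend an 8-bit memory operand into a 32-bit register value. */
uint32_t load_byte(uint8_t mem)
{
    /* Flip the sign bit, then subtract it back: wraps to the sign-extended value. */
    return ((uint32_t)mem ^ UINT32_C(0x80)) - UINT32_C(0x80);
}

/* Sign-extend a 16-bit memory operand into a 32-bit register value. */
uint32_t load_word(uint16_t mem)
{
    return ((uint32_t)mem ^ UINT32_C(0x8000)) - UINT32_C(0x8000);
}
```

So a byte 0x80 in memory arrives in the register as 0xFFFFFF80, while 0x7F stays 0x0000007F.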
Data formats in memory
  • Memory can be accessed in 8-bit byte, 16-bit word, or 32-bit longword form.
  • A word operand must be accessed starting from a word boundary(even address of a
    2-byte unit: address 2n)
  • A longword operand must be accessed starting from a longword boundary (even address of a 4-byte unit: address 4n).
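
The boundary rules above amount to checking the low address bits against the access size. A small sketch (the helper name is_aligned is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when addr is a legal start address for an access
 * of the given size in bytes (1 = byte, 2 = word, 4 = longword). */
int is_aligned(uint32_t addr, unsigned size)
{
    assert(size == 1 || size == 2 || size == 4);
    return (addr & (size - 1u)) == 0;
}
```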
Processor states
  • Reset state:
    - power-on reset will cause all system components to be reset,
    - manual reset may, for example, avoid resetting DRAM controllers so that
    memory contents are preserved.
  • Exception-handling state:
    - In the case of a reset, the CPU branches to address 0xA000 0000 and starts
    executing the user-coded exception handling program.
    - In the case of a general exception or interrupt, the program counter (PC) contents are saved in the saved program counter (SPC), the status register (SR) contents are saved in the saved status register (SSR), and the R15 contents are saved in saved general register 15 (SGR). The CPU branches to the start address of the user-coded exception service routine, found from the sum of the contents of the vector base address and the vector offset.
  • Program execution state:
    CPU executes program instructions in sequence.
  • Power-down state:
    The power-down state is entered by executing a SLEEP instruction.
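
The general exception/interrupt entry sequence described above can be sketched as follows. The Cpu struct and enter_exception are invented for this model. Setting SR.MD and SR.RB follows from the earlier note that R0–R7 switch to bank 1 after an exception; setting SR.BL as well is my understanding of the hardware behaviour and should be treated as an assumption, as should the example offset used in the usage note.

```c
#include <assert.h>
#include <stdint.h>

#define SR_BL (UINT32_C(1) << 28)
#define SR_RB (UINT32_C(1) << 29)
#define SR_MD (UINT32_C(1) << 30)

typedef struct {
    uint32_t pc, sr, r15; /* live state */
    uint32_t spc, ssr, sgr; /* saved copies */
    uint32_t vbr;           /* vector base */
} Cpu;

/* Model of what the hardware does on a general exception or interrupt. */
void enter_exception(Cpu *c, uint32_t vector_offset)
{
    c->spc = c->pc;   /* PC  -> SPC */
    c->ssr = c->sr;   /* SR  -> SSR */
    c->sgr = c->r15;  /* R15 -> SGR */

    /* Handler runs privileged, on register bank 1, with further
     * exceptions blocked (BL is an assumption, see lead-in). */
    c->sr |= SR_MD | SR_RB | SR_BL;

    /* Branch target is the vector base plus the vector offset. */
    c->pc = c->vbr + vector_offset;
}
```

For example, with VBR = 0x8C00 0000 and a hypothetical offset of 0x100, the handler starts at 0x8C00 0100.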

SH4 Note - 01

Overview

  • 32-bit RISC microprocessor
  • 16-bit fixed-length instruction set
  • 1 instruction cache
  • 1 operand cache (copy-back/write-through)
  • MMU (memory management unit) with 4-entry fully associative instruction TLB and 64-entry fully associative shared TLB.

CPU
  • 32-bit internal data bus
  • 16 general registers (32-bit)
  • 8 shadow registers(32-bit)
  • 7 control registers(32-bit)
  • 4 system registers(32-bit)
  • RISC
  • Load-store architecture
  • Delayed branch instructions
  • Conditional execution
  • Superscalar architecture: parallel execution of two instructions, including FPU instructions
  • C-based instruction set
  • Instruction execution time: Maximum 2 instructions/cycle
  • Virtual address space: 4 Gbytes (448-Mbyte external memory space)
  • Space identifier ASIDs: 8 bits, 256 virtual address spaces
  • On-chip multiplier
  • Five-stage pipeline
FPU
  • On-chip floating-point coprocessor
  • Supports single-precision (32 bits) and double-precision (64 bits)
  • IEEE754-compliant
  • Two rounding modes: Round to Nearest and Round to Zero
  • Floating-point registers: 32 bits x 16 words x 2 banks
    (single-precision x 16 words or double-precision x 8 words) x 2 banks
  • 32-bit CPU-FPU floating-point communication register (FPUL)
  • Supports FMAC (multiply-and-accumulate), FDIV (divide) and FSQRT (square root) instructions.
  • Supports FLDI0/FLDI1 (load constant 0/1) instructions
  • Instruction execution times:
    - Latency (FMAC/FADD/FSUB/FMUL): 3 cycles (single-precision), 8 cycles (double-precision)
    - Pitch (FMAC/FADD/FSUB/FMUL): 1 cycle (single-precision), 6 cycles
    (double-precision)
    - Note: FMAC is supported for single-precision only.
  • 3-D graphics instructions (single-precision only):
    - 4-dimensional vector conversion and matrix operations (FTRV): 4 cycles
    (pitch), 7 cycles (latency)
    - 4-dimensional vector (FIPR) inner product: 1 cycle (pitch), 4 cycles (latency)
  • Five-stage pipeline
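
To see what the latency and pitch figures above mean in practice, here is an idealized estimate for a run of n independent, back-to-back operations of the same kind: the first result appears after the latency, and each further one follows one pitch later. This ignores dependencies, stalls and dual issue, and the function name cycles_for is made up.

```c
#include <assert.h>

/* Idealized cycle count for n independent operations issued back to back. */
unsigned cycles_for(unsigned n, unsigned latency, unsigned pitch)
{
    if (n == 0)
        return 0;
    return latency + (n - 1u) * pitch;
}
```

Under this model, four single-precision FADDs (latency 3, pitch 1) finish in 6 cycles, while four double-precision FADDs (latency 8, pitch 6) take 26.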
Power-down
  • Sleep mode
  • Standby mode
  • Module standby function
MMU
  • 4-Gbyte address space, 256 address space identifiers (8-bit ASIDs)
  • Single virtual memory mode and multiple virtual memory mode
  • Supports multiple page sizes: 1 kbyte, 4 kbytes, 64 kbytes, 1 Mbyte
  • 4-entry fully-associative TLB for instructions
  • 64-entry fully-associative TLB for instructions and operands
  • Supports software-controlled replacement and random-counter replacement algorithm
  • TLB contents can be accessed directly by address mapping
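
A fully associative TLB with ASIDs can be modelled as a search over all entries for a matching (ASID, virtual page number) pair. The sketch below is a software approximation only: real hardware compares all entries in parallel, and this model leaves out sharing, protection bits and the single/multiple virtual memory mode distinction. The struct layout and names (TlbEntry, tlb_lookup) are hypothetical, not the actual UTLB entry format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTLB_ENTRIES 64

typedef struct {
    int      valid;
    uint8_t  asid;   /* 8-bit address space identifier */
    unsigned shift;  /* log2 of page size: 10, 12, 16 or 20 */
    uint32_t vpn;    /* virtual address >> shift */
    uint32_t ppn;    /* physical address >> shift */
} TlbEntry;

/* Returns 1 and writes *paddr on a hit, returns 0 on a miss. */
int tlb_lookup(const TlbEntry *tlb, size_t n, uint8_t asid,
               uint32_t vaddr, uint32_t *paddr)
{
    for (size_t i = 0; i < n; i++) {
        const TlbEntry *e = &tlb[i];

        if (!e->valid || e->asid != asid)
            continue;
        if ((vaddr >> e->shift) != e->vpn)
            continue;

        /* Hit: keep the in-page offset, swap in the physical page number. */
        uint32_t offset = vaddr & ((UINT32_C(1) << e->shift) - 1u);
        *paddr = ((e->ppn << e->shift) | offset) & UINT32_C(0x1FFFFFFF);
        return 1;
    }
    return 0;
}
```

Because any entry can hold any page, the replacement policy (software-controlled or random counter) only decides which slot to overwrite on a miss; the lookup itself is unaffected.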