Lecture 33 - Memory Hierarchy

Goals

  • learn the basics of the memory hierarchy

The last stop on our tour of the innards of the computer is memory. We have already had some experience with a couple of different kinds of memory -- notably the registers and main memory. There are actually many kinds of memory in the computer, and we are going to try to get a sense of what they are and how they work together.

There are a number of characteristics we can talk about with respect to memory:

  • location - where is the memory
    • inside the computer
      • inside the CPU
        • registers and some cache
      • main memory and some cache
    • outside the computer
      • drives, tapes, cloud
  • capacity - how much can be stored
  • unit of transfer - how much can be moved at once
  • access method - how we reach the stored data (a short sketch contrasting sequential and random access follows this list)
    • sequential access - everything must be read in order
      • access time will vary based on what we are looking for and where it is
    • direct access - we can jump to regions, but then we need to do a small sequential search
    • random access - we can jump immediately to what we want (constant time lookups)
    • associative - random access based on a key
  • performance
    • access time - elapsed time from start of operation to read or write completion
    • memory cycle time - we may have to wait for memory to recover between operations
    • transfer rate - once found, how quickly can we move the data in or out
  • physical type
    • semiconductor, magnetic, optical, holographic, etc...
  • persistence
    • volatile - needs power to retain values
    • erasable - can be erased
    • decaying - data needs to be refreshed or it will become corrupted
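
To make the sequential vs. random access distinction concrete, here is a small illustrative C sketch (not from the lecture): an array supports random access, so element 5 is reached in one step, while a linked list is a sequential structure, so we have to walk past every earlier node to reach the same position.

    #include <stdio.h>
    #include <stdlib.h>

    /* Sequential access: a linked list must be walked node by node,
       so reaching the i-th element takes i steps. */
    struct node {
        int value;
        struct node *next;
    };

    int main(void) {
        enum { N = 8 };

        /* Random access: an array lets us jump straight to any index. */
        int arr[N];
        for (int i = 0; i < N; i++)
            arr[i] = i * 10;
        printf("array[5] = %d (one step)\n", arr[5]);

        /* Build a list holding 0, 10, 20, ... and walk to index 5. */
        struct node *head = NULL;
        for (int i = N - 1; i >= 0; i--) {
            struct node *n = malloc(sizeof *n);
            n->value = i * 10;
            n->next = head;
            head = n;
        }
        struct node *cur = head;
        int steps = 0;
        while (steps < 5) {          /* pass over the first five nodes */
            cur = cur->next;
            steps++;
        }
        printf("list[5]  = %d (%d steps)\n", cur->value, steps);

        /* Clean up the list. */
        while (head != NULL) {
            struct node *next = head->next;
            free(head);
            head = next;
        }
        return 0;
    }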

Memory hierarchy

We have been thinking about just two types of memory -- the registers and main memory. We have somewhat idealized main memory (to the point that some of you are still confusing it with the registers, or can't see why we need both). We think of it as unlimited, fast, and ideally cheap as well -- in reality it is none of these things.

In general we can have fast, small and expensive, or large, slow, and cheap.

To create a system with a balance of these things, we build a hierarchy of different kinds of memory, with the fast, small and expensive memory near the processor and the large, slow, cheap memory several layers away.

basic technologies:

  • SRAM (static random access memory)
    • used for caches
    • these are, in essence, the flip-flop circuits I showed you
    • low density, so they take up a proportionally large amount of space, making them expensive
  • DRAM (dynamic random access memory)
    • main memory (what ad copy is referring to when it says "8 Gigs of RAM")
    • dynamic because it needs to be refreshed every 10-100ms (each bit is stored in a capacitor)
  • Flash
    • in thumb drives and SSD
    • slower than DRAM
  • magnetic disk
    • hard drives
    • very large, very cheap, very slow

Stats for my computer

To find the cache sizes on the computer I used sysctl hw|grep cache

tech    access time      $/GB     in laptop
SRAM    0.5-2.5 ns       $1500    4.1875 MB
DRAM    50-70 ns         $25      32 GB
SSD     80,000 ns        $0.40    1 TB
HD      9,000,000 ns     $0.02    4 TB (not in the computer)
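
As a rough back-of-the-envelope sketch (illustrative only), multiplying the $/GB column by the capacities above shows where the money goes -- the few megabytes of SRAM cost almost nothing in total, while the 32 GB of DRAM is the single most expensive tier:

    #include <stdio.h>

    /* Rough cost of each tier using the approximate $/GB and capacity
       figures from the table above (illustrative only). */
    int main(void) {
        struct { const char *tech; double dollars_per_gb; double gigabytes; } tiers[] = {
            { "SRAM", 1500.00, 4.1875 / 1024.0 },  /* 4.1875 MB of cache  */
            { "DRAM",   25.00,        32.0     },  /* 32 GB main memory   */
            { "SSD",     0.40,      1024.0     },  /* 1 TB                */
            { "HD",      0.02,      4096.0     },  /* 4 TB external drive */
        };
        for (int i = 0; i < 4; i++)
            printf("%-5s %10.4f GB  x  $%8.2f/GB  =  $%8.2f\n",
                   tiers[i].tech, tiers[i].gigabytes, tiers[i].dollars_per_gb,
                   tiers[i].gigabytes * tiers[i].dollars_per_gb);
        return 0;
    }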

Time comparison

You are working in the kitchen area and you have a question...

  • The answer is in your notes right in front of you
    • we will call this SRAM
    • say this takes 10 seconds -- with SRAM at roughly 2.5 ns, this gives us a conversion factor of 1 ns of computer time <=> 4 seconds of human time
  • DRAM equivalent?
    • 70 ns, so 70 * 4 = 280 seconds, roughly 5 minutes
    • you walk down the hall and ask me
  • SSD equivalent?
    • 80,000 ns, so 80,000 * 4 = 320,000 seconds, or about 3.7 days
    • you have time to book a flight to England and go ask a professor at Cambridge in person
  • HD equivalent?
    • 9,000,000 ns, so 9,000,000 * 4 = 36,000,000 seconds, or about 10,000 hours, 417 days -- a bit over a year
    • time to take a year off and immerse yourself in the subject until you can answer the question yourself

Note that these numbers are a little misleading, because once we have found the data we can read out a very large chunk at one time... but then you will probably answer a bunch of other questions while you are away, so the comparison is still fair.
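
The arithmetic above is easy to reproduce. Here is a small sketch (using the same representative access times and the 1 ns <=> 4 seconds conversion factor) that prints the human-time equivalent of each technology:

    #include <stdio.h>

    /* Scale memory access times (in ns) to "human time" using the
       conversion 1 ns of computer time ~= 4 seconds of human time. */
    int main(void) {
        const double SECONDS_PER_NS = 4.0;   /* from the SRAM ~ 10 s analogy */
        struct { const char *tech; double ns; } mem[] = {
            { "SRAM",       2.5 },
            { "DRAM",      70.0 },
            { "SSD",    80000.0 },
            { "HD",   9000000.0 },
        };
        for (int i = 0; i < 4; i++) {
            double secs = mem[i].ns * SECONDS_PER_NS;
            printf("%-5s %12.1f ns  ->  %12.0f s  (%.1f days)\n",
                   mem[i].tech, mem[i].ns, secs, secs / 86400.0);
        }
        return 0;
    }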

Working set theory

Denning proposed something called the "working set model".

It is based on the idea of locality of reference:

  • memory references in a program tend to cluster temporally and spatially
    • example: loops and functions
    • the cluster of instructions will change, but slowly, so we can consider most memory accesses to be local

The two kinds of locality

  • temporal locality - if we reference something, we will probably reference it again soon
    • loops revisit the same instructions
    • variables are an obvious example
  • spatial locality - if we access an address, we are likely to access something near it
    • the next instruction, the next element of an array, or the next field of a struct (see the sketch after this list)
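
Locality is easy to see in code. The following is an illustrative C sketch (not from the lecture): summing a 2-D array row by row touches consecutive addresses (spatial locality) and reuses the same accumulator and loop counters on every iteration (temporal locality), while traversing the same array column by column jumps thousands of bytes between accesses and typically runs noticeably slower on real hardware because it makes poor use of the cache.

    #include <stdio.h>

    #define ROWS 1024
    #define COLS 1024

    static double grid[ROWS][COLS];   /* stored row by row in memory */

    int main(void) {
        double sum = 0.0;

        /* Row-major traversal: consecutive iterations touch adjacent
           addresses (spatial locality), and sum/i/j are reused on
           every iteration (temporal locality). */
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                sum += grid[i][j];

        /* Column-major traversal: same work, but each access jumps
           COLS * sizeof(double) bytes ahead, so it makes poor use of
           the cache and is typically noticeably slower. */
        for (int j = 0; j < COLS; j++)
            for (int i = 0; i < ROWS; i++)
                sum += grid[i][j];

        printf("sum = %f\n", sum);
        return 0;
    }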

Mechanical level

vocabulary

  • locality of reference

Skills


Last updated 05/12/2023