
Another OS concepts thread... (registers, parameters, and blocks)

787 Views 18 Replies 4 Participants Last post by  rabidgnome229
This one comes from my "Modern Operating System Concepts" text, and it still confuses me. I thought that the register was the smallest type of memory storage, closest to the actual CPU, even more so than the L1 cache? I have a hunch that the register holds the information that the CPU is currently working on as well.

I looked up parameters and they are conditions or arguments for a program. This particular section is explaining how the system-call interface interprets the API and executes the proper system call. This little segment is the only explanation it provides about how this program passes on parameters to the OS:

Three general methods are used to pass parameters to the operating system. The simplest approach is to pass the parameters in registers. In some cases, however, there may be more parameters than registers. In these cases, the parameters are generally stored in a block, or table, in memory, and the address of the block is passed as a parameter in a register (Figure 2.4). This is the approach taken by Linux and Solaris. Parameters also can be placed, or pushed, onto the stack by the program and popped off the stack by the operating system. Some operating systems prefer the block or stack method, because those approaches do not limit the number or length of parameters being passed.
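To make the block method from that passage concrete, here is a minimal C sketch. The names `param_block`, `os_sum_direct` and `os_sum_block` are made up for illustration, and an ordinary function call stands in for the real trap into the kernel. It only shows the difference between handing over the values themselves and handing over one address that points at a table of values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter block: the caller fills this table in memory. */
struct param_block {
    size_t count;        /* how many values are in use */
    long   values[8];    /* the parameters themselves */
};

/* Method 1: few parameters, each one passed directly
   (a compiler would typically keep these in registers). */
long os_sum_direct(long a, long b)
{
    return a + b;
}

/* Method 2: only the ADDRESS of the block is passed; the callee
   reads as many parameters as it needs from memory. */
long os_sum_block(const struct param_block *blk)
{
    assert(blk != NULL);
    long total = 0;
    for (size_t i = 0; i < blk->count && i < 8; i++) {
        total += blk->values[i];
    }
    return total;
}
```

With the block version the number of parameters is limited by the size of the table, not by the number of registers, which is the point the book is making.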
1 - 19 of 19 Posts
A register can be stored in the local cache, or in CPUs like the x86 or x86-64, they are stored in what is called a register file, though it depends on the architecture of the processor. But I believe most embedded processors and microprocessors just use the local cache, as they often have high register counts (128, 256, 512 or more!). But it has been many years since I've read a hardware doc on an embedded microprocessor. A common optimization of registers is called register renaming, as it allows the execution pipeline to execute instructions out of order, which leads to dual-, triple-, and quad-piping instructions! Quad-piping is the main reason why the old AMD Barton processors could outperform the Intels at the time, even though the Intels were clocked much higher!

A Register is pretty much what you think it is, a small amount of data that the CPU was working on, is currently working on, or is about to work on.

As for system calls, old 386/486 processors needed to place the API call ID into the EAX register and then interrupt the CPU. This was a slow way of going about it, so I think in the 686 era Intel came up with two new instructions called SYSENTER and SYSEXIT (AMD's counterparts are SYSCALL and SYSRET), which allow for much faster system calls without all the overhead of an interrupt.

http://manugarg.googlepages.com/syst...nlinux2_6.html
Could you kindly explain to me exactly what a register is? From what I gather it is a table of information perhaps, relevant to the operating of a program? Is register a name for two different things; the first being the area of data closest to the CPU, the second being information housing parameters contained largely within the CPU cache and memory?

As a side question, would it be accurate to say that a CPU's Cache is usually filled largely with registers?
Quote:

Originally Posted by mothergoose729 View Post
Could you kindly explain to me exactly what a register is? From what I gather it is a table of information perhaps, relevant to the operating of a program? Is register a name for two different things; the first being the area of data closest to the CPU, the second being information housing parameters contained largely within the CPU cache and memory?
A register is just where the cpu puts values that it's working on. It's not in memory anywhere (e.g. a register doesn't have an address), it's internal to the CPU. The register file is different from the cache. Every line of the cache represents a memory location, and a register is just a temporary working value. For example, take the following C code

Code:
//...
int x, y, z;

// do stuff, put values in x and y

z = (x + y)/2;

//...
When the CPU is executing the addition, it fetches the values of x and y from memory (or the cache) and puts them into registers (say rax and rcx). It then executes an instruction that adds the contents of rax to the contents of rcx and puts the result into rax. It then divides rax by two and puts the result in rax. It then puts the value in rax into the memory location that represents z. No program variable ever is equal to the intermediate value (x+y), but the register rax is at some point. The register is just very fast temporary storage that the CPU uses to perform computations.
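If it helps, the same walk-through can be written out in C with one statement per "instruction". This is only a sketch: the locals `rax` and `rcx` just stand in for the registers named above, the function name is made up, and a real compiler picks its own registers and instruction order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walks through z = (x + y) / 2 one "instruction" at a time.
   The locals rax and rcx only stand in for CPU registers. */
int32_t average_like_the_cpu(const int32_t *x_in_memory,
                             const int32_t *y_in_memory,
                             int32_t *z_in_memory)
{
    assert(x_in_memory != NULL && y_in_memory != NULL && z_in_memory != NULL);

    int32_t rax = *x_in_memory;   /* load x from memory (or the cache) */
    int32_t rcx = *y_in_memory;   /* load y from memory (or the cache) */

    rax = rax + rcx;              /* rax now holds the intermediate x + y */
    rax = rax / 2;                /* divide the register by two */

    *z_in_memory = rax;           /* store the result into z's memory location */
    return rax;
}
```

Notice that the intermediate sum only ever lives in `rax`; neither `x`, `y` nor `z` holds it at any point.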

Quote:
As a side question, would it be accurate to say that a CPU's Cache is usually filled largely with registers?
Not really. The CPU takes values from the cache and puts them in registers. It then performs computations on the registers. When it's done it writes the result back into memory (which initially writes the change to the cache)
Quote:

Originally Posted by mothergoose729 View Post
Could you kindly explain to me exactly what a register is? From what I gather it is a table of information perhaps, relevant to the operating of a program? Is register a name for two different things; the first being the area of data closest to the CPU, the second being information housing parameters contained largely within the CPU cache and memory?
A register is only one thing, it's the same thing in both situations you mention. Registers are the fastest way to access/work on data, so using registers to pass parameters to function calls and API calls is the fastest way of making said call. This type of calling convention is often called FASTCALL. The standard calling convention (STDCALL) has you push the parameters in reverse order, then make the call:

STDCALL:
push param4
push param3
push param2
push param1
call api

FASTCALL:
There are different FASTCALL conventions, though Microsoft has you move the first 2 parameters into registers ECX and EDX, then push the rest:

push param4
push param3
mov edx, param2
mov ecx, param1
call api

Calling conventions describe the interface of called code:
  • The order in which parameters are allocated
  • Where parameters are placed (pushed on the stack or placed in registers)
  • Which registers may be used by the function
  • Whether the caller or the callee is responsible for unwinding the stack on return

http://en.wikipedia.org/wiki/X86_calling_conventions

Registers are storage locations that do not have memory addresses; however, they do have names, such as EAX, or R1, or XMM3, or ST(4), and as mentioned in my first post, the hardware can rename a register so that the name maps to a different physical storage location.

There are no set rules for Registers or Calling Conventions. As a manufacturer, you can make up whatever rule or system you want. As a programmer, you can invent whatever Calling Convention you want to use. There is no right or wrong way to do these things.
So if I am understanding this correctly, the register is the place that houses the most up-to-date calculations and associated values of the CPU? It can contain variables, their values, and equations. The register is not housed in the CPU cache or system memory, but is instead internal to the CPU. Ok.

The book cites the existence of register(s), plural. Does this mean that there is more than one register in each CPU (like for each core?) or does this only apply to systems with multiple processors?

When the parameters of a program or the number of processes is greater than what the register can hold, that information is then allocated to the CPU cache or main memory, right?
Quote:


Registers are conceptually a special working area within the processor that are faster than memory operands and are designed to work with the processors opcodes.

Registers in an Intel or compatible processor are a very limited resource when writing assembler, you have eight general purpose registers, EAX, EBX, ECX, EDX, ESI, EDI, ESP and EBP. In most instances ESP and EBP should be left alone as they are mainly used for entry and exit of procedures.

This means, effectively, you have six 32-bit registers to write your code with, plus any other memory locations that are useful in the procedure. ESI and EDI can be used in the normal manner in most instances, but neither can be accessed at a BYTE level; you can read the low WORD of ESI as SI and the low WORD of EDI as DI.

Understanding the size of registers and the data that you can place in them is very important when using assembler. A 32 bit Intel or compatible processor has three native data sizes that can be used by the normal integer instructions, BYTE, WORD and DWORD corresponding to 8 bit, 16 bit and 32 bit.

This can be shown in HEX notation.
BYTE 00
WORD 00 00
DWORD 00 00 00 00

In terms of registers, this corresponds to the three sizes that can be addressed with the normal integer registers. Intel and compatible processors are backwards compatible with older code that uses 8 and 16 bit registers and it is done by accessing any of the general purpose registers in three different ways. Using the EAX register as an example,

8 bit = AL or AH
16 bit = AX
32 bit = EAX
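
The AL / AH / AX / EAX relationship can be mimicked in C with masks and shifts. This is just a sketch to show which bits each name refers to; the helper names are made up, and a plain `uint32_t` stands in for the register.

```c
#include <assert.h>
#include <stdint.h>

/* Views of a 32-bit register value, mirroring EAX / AX / AH / AL. */
uint16_t low_word(uint32_t eax)  { return (uint16_t)(eax & 0xFFFFu); }      /* AX */
uint8_t  low_byte(uint32_t eax)  { return (uint8_t)(eax & 0xFFu); }         /* AL */
uint8_t  high_byte(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }  /* AH */

/* Writing AL replaces only the lowest 8 bits and leaves the rest alone. */
uint32_t write_low_byte(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}

/* Small self-check showing how the views relate for EAX = 0x12345678. */
void check_register_views(void)
{
    uint32_t eax = 0x12345678u;
    assert(low_word(eax)  == 0x5678u);
    assert(high_byte(eax) == 0x56u);
    assert(low_byte(eax)  == 0x78u);
    assert(write_low_byte(eax, 0xFFu) == 0x123456FFu);
}
```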


Quote:


There are three basic types of operands that can be placed in a register, immediate, memory or another register.

An IMMEDIATE operand is usually a number but it can also be a string literal in the form "a" which is converted by the assembler to its ASCII equivalent.
mov al, "a" ; string literal
mov edx, 0 ; numeric immediate

A MEMORY operand is an address in memory of some form of data.
mov al, [esi] ; copy byte at address in ESI into AL
mov edx, lpMemvar; copy variable address into EDX

A REGISTER operand is a register with a value in it.
mov ecx, edx ; copy EDX into ECX

The actions that can be performed are determined by the available opcodes, trying to move one memory operand into another directly does not work because there is no opcode in the processor to do it.
mov mVar, lpMem ; this fails, no opcode to do it.

If you have a spare register, you make an indirect copy through that register,
mov eax, lpMem ; copy memory value into register.
mov mVar, eax ; copy register into memory value.

If you don't have a spare register, it can be done another way but it is slightly slower,
push lpMem ; push memory value onto the stack.
pop mVar ; pop it back off as another memory value.

So on a 32-bit x86 processor, you have 6 general-purpose registers, each of which can hold 32 bits of information, aka 4 bytes. You also have eight 80-bit registers on the floating point unit (FPU) available for floating point math, eight 64-bit registers available for MMX (multimedia extensions), and eight 128-bit registers available for SSE/SSE2/SSE3/SSE4. But in general-purpose x86 programming, you only have 6 registers to work with.

The 32-bit Windows API does not use registers for passing parameters, it uses the stack. This allows it to pass a practically unlimited number of parameters. If an OS chooses to use 1 or more registers to pass parameters, then additional params that would not fit into the registers would need to be passed using the stack.

In Windows, the called procedure is responsible for preserving all of the registers except EAX, ECX, and EDX. This means that whatever data you place into the other registers will be preserved once control is returned to your program. However, since EAX, ECX, and EDX are not preserved, these registers can (but will not always) contain completely random data. EAX will hold the data that the called procedure wants to return to the calling program, which means the only data a procedure can ever return directly is 4 bytes. So if you want to return more than 4 bytes, then the 4 bytes you return should be a memory address pointing to a table of data.
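
Here is a rough C sketch of that last point, returning the address of a table instead of the data itself. The struct and function names are invented for the example. On 32-bit x86 the returned pointer is the 4-byte value that comes back in EAX; on a 64-bit build the pointer is 8 bytes, but the idea is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "table of data": three results, 12 bytes in total,
   so it cannot come back in a single 32-bit register. */
struct calc_table {
    int32_t sum;
    int32_t difference;
    int32_t product;
};

/* Fills a caller-supplied table and returns its ADDRESS.
   Only that one pointer-sized value is the actual return value. */
struct calc_table *calc_all(int32_t a, int32_t b, struct calc_table *out)
{
    assert(out != NULL);
    out->sum = a + b;
    out->difference = a - b;
    out->product = a * b;
    return out;
}
```

The caller owns the table here, which avoids the question of who frees it. That is a design choice for the sketch, not something the convention requires.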
Quote:


Originally Posted by mothergoose729
View Post

So if I am understanding this correctly, the register is the place that houses the most up-to-date calculations and associated values of the CPU? It can contain variables, their values, and equations. The register is not housed in the CPU cache or system memory, but is instead internal to the CPU. Ok.

Yup. FYI the L1 cache is usually on-die as well, but that isn't pertinent to your questions.

Quote:


The book cites the existence of register(s), plural. Does this mean that there is more than one register in each CPU (like for each core?) or does this only apply to systems with multiple processors?

It would be impossible to get any work done with only one register, so the CPU has a register file with multiple registers. A 32-bit x86 CPU has 16 normal registers (i.e. not SSE or floating point), of which 8 are general purpose. A 64-bit x86 CPU has 24 (IIRC), of which 16 are general purpose. A 32-bit ARM CPU has 17 (one is a status register).

Quote:


When the parameters of a program or the number of processes is greater than what the register can hold, that information is then allocated to the CPU cache or main memory, right?

Sounds like you're almost there, but are still a bit confused. The fastest type of storage is registers, so programs use registers to pass function arguments when they can. According to the 64-bit Windows calling convention, four registers are set aside for passing parameters. If there are more than four arguments to be passed, the rest are put on the stack.

The stack is a region of main memory. At any given time, portions of the stack may reside in one of the CPU caches, but nothing is ever 'allocated to the CPU cache.' Main memory gets allocated. When a region of main memory is read or written to, it is cached. It will then reside in the CPU cache until it is evicted.
Quote:


Originally Posted by rabidgnome229
View Post

It would be impossible to get any work done with only one register.

The Nintendo NES only had 3 registers you could really work with: A, X and Y.

The Parallax SX has 136 8-bit registers, has no address bus or RAM, and its only other memory source besides the 136 registers is a 6kb read-only flash ROM to grab program code from! On top of all this, it can only access 32 registers at once, and it must constantly switch between 8 register banks to access the 8 different sets of upper 16 registers.

Exciting!
Quote:


Originally Posted by rabidgnome229
View Post

The stack is a region of main memory. At any given time, portions of the stack may reside in one of the CPU caches, but nothing is ever 'allocated to the CPU cache.' Main memory gets allocated. When a region of main memory is read or written to, it is cached. It will then reside in the CPU cache until it is evicted.

So the CPU cache copies data from the main memory and keeps it there indefinitely. No information is ever passed on to the CPU cache without first residing in the memory... So just like the CPU cannot access any form of memory except the RAM, no piece of hardware can access the CPU without first sending data through the main memory, up through the cache, and into the registers for processing?
Quote:


Originally Posted by mothergoose729
View Post

So the CPU cache copies data from the main memory and keeps it there indefinitely.

Well, until the cache is full, then it's swapped out for more recent memory accesses. Let's give an example:

mov eax, dword ptr [400000]

This is a typical instruction that will move 4 bytes into the EAX register. It will also move the 16 bytes starting at offset 400000 into the cache. This means subsequent accesses to 400004, 400008, etc. will be much faster, as that data no longer needs to be transferred from memory; it already was. The cache has built-in logic that attempts to manage the little bit of cache memory in the best possible way, to minimize the number of memory transfers that need to be done.
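
As a quick illustration of why 400004 and 400008 come along for free, here is a small C sketch that works out which cache line an address falls in. The 16-byte line size is simply the figure used above (real CPUs often use larger lines), and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SIZE 16u   /* assumed line size, taken from the 16 bytes mentioned above */

/* Address of the first byte of the cache line that contains addr. */
uint32_t line_base(uint32_t addr)
{
    return addr & ~(LINE_SIZE - 1u);
}

/* Nonzero when both addresses fall in the same line, i.e. the second
   access can be served from the cache after the first one loaded it. */
int same_line(uint32_t a, uint32_t b)
{
    assert((LINE_SIZE & (LINE_SIZE - 1u)) == 0);  /* line size must be a power of two */
    return line_base(a) == line_base(b);
}
```

With these assumptions, 400000 through 400015 share one line, and 400016 starts the next one.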

Another example:

mov eax, 50
mov dword ptr [400000], eax
add eax, 50 ; eax now equals 100
mov dword ptr [400000], eax

In this example, the 2nd instruction would cause the eax register to get written to memory address 400000; however, it would more than likely just get written to the cache, with the CPU having no intention of sending it to memory right away, the reason being that it will just be overwritten 2 instructions later anyway. On modern CPUs, that first store may never reach main memory at all, since it can be overwritten in the cache before anything is written back.

The point is, just because you write a value to memory doesn't mean it's actually going to get written to memory; it may exist purely in the cache. Or it may get written to memory 15 seconds later when another process causes the local cache to get flushed so its own data can fit in it.

There are also special instructions for making memory accesses skip over the cache altogether. As a programmer, there are times when you *know* a set of data does not need to be cached, either for read or for write, because it will be quickly used and then never again, for example.

Quote:


No information is ever passed on to the CPU cache without first residing in the memory... So just like the CPU cannot access any form of memory except the RAM, no piece of hardware can access the CPU without first sending data through the main memory, up through the cache, and into the registers for processing?

Devices communicate with the CPU through IO ports. IO ports are a bit like internet ports. A CPU has a physical limit on the number of ports: the more ports, the more physical pins are needed on the CPU. This is why data controllers like USB and PCI exist. They allow a CPU to use a very small number of IO pins to communicate with an almost unlimited number of devices. The controller itself is another CPU. The main CPU sends a signal to the controller telling it which device it wants to communicate with, the controller selects that device, and then the CPU can send IO data out to the controller, which will relay it to the device. It's pretty much like a router in an internet system. Some devices can even access the system memory without going through the CPU. Sound Blaster is an example of an early piece of hardware that used this to its advantage when playing sound effects in games. By being able to access the system memory directly, it does not need its own onboard memory to store sound data.
Ok, I see, thank you. Both main system memory and CPU cache are dynamic, so one can contain data the other doesn't yet have, or won't get, etc., making it possible for data to go straight to cache, then registry, or from memory to registry, etc. Is it possible that a process can direct data from the hard disk or virtual memory directly to the registry, without ever residing in the CPU cache or memory?
If you want to play with registers, download a MIPS Simulator: http://pages.cs.wisc.edu/~larus/spim.html

There are dozens of sources and text on MIPS. Read about it: http://en.wikipedia.org/wiki/MIPS_architecture

Play around with it.

For your final exam, solve the 8-Queens problem: http://en.wikipedia.org/wiki/Eight_queens_puzzle

That was one of my final projects in that course. I have the solution at home...
Quote:


Originally Posted by mothergoose729
View Post

Ok, I see, thank you. Both main system memory and CPU cache are dynamic, so one can contain data the other doesn't yet have, or won't get, etc., making it possible for data to go straight to cache, then registry, or from memory to registry, etc. Is it possible that a process can direct data from the hard disk or virtual memory directly to the registry, without ever residing in the CPU cache or memory?

The way the caching system works is that the CPU attempts to read a memory address. Assuming the address is valid, first the L1 cache is checked. If the L1 cache contains the data at the requested address it returns it. If not, it asks the L2 cache. If the L2 cache contains the data it gives the block containing that data to the L1 cache, which stores it and gives the requested data to the CPU. Now both the L2 and L1 cache contain the data, and it is also in a register. If the L2 cache does not contain the requested data, it checks RAM. Same thing happens - if the data is contained in RAM it gives it to the L2 cache, the L1 cache, and the CPU. If the data isn't in RAM a pagefault is raised and the page containing data gets put into RAM, the L2 cache gets a block containing it, the L1 cache gets a smaller block containing it, and whatever register the CPU chose to put it in gets the requested data.

If the address isn't valid (refers to a page that has not been allocated) a segfault or similar condition is raised.

**EDIT**
Obviously if there is a different layout than L1 cache-> L2 cache-> RAM -> HDD this will be a bit different, but the principles stay the same
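
For anyone who wants to poke at the lookup order described above, here is a toy C model of it. Everything in it is an assumption made for the sketch: the tiny direct-mapped caches, the slot counts, and the names. It leaves out page faults and invalid addresses entirely and only tracks which level ends up supplying a block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define L1_LINES 4u    /* made-up sizes, far smaller than real caches */
#define L2_LINES 16u

enum served_by { FROM_L1 = 1, FROM_L2 = 2, FROM_RAM = 3 };

struct toy_cache {
    uint32_t tag[L2_LINES];    /* which block each slot holds */
    int      valid[L2_LINES];  /* 0 = slot is empty */
    uint32_t lines;            /* number of slots actually used */
};

struct toy_hierarchy {
    struct toy_cache l1;
    struct toy_cache l2;
};

void hierarchy_init(struct toy_hierarchy *h)
{
    assert(h != NULL);
    memset(h, 0, sizeof *h);
    h->l1.lines = L1_LINES;
    h->l2.lines = L2_LINES;
}

static int cache_lookup(const struct toy_cache *c, uint32_t block)
{
    uint32_t slot = block % c->lines;
    return c->valid[slot] && c->tag[slot] == block;
}

static void cache_fill(struct toy_cache *c, uint32_t block)
{
    uint32_t slot = block % c->lines;
    c->tag[slot] = block;      /* evicts whatever was in the slot before */
    c->valid[slot] = 1;
}

/* Returns which level supplied the block, filling the levels above it. */
enum served_by hierarchy_read(struct toy_hierarchy *h, uint32_t block)
{
    assert(h != NULL);
    if (cache_lookup(&h->l1, block)) {
        return FROM_L1;
    }
    if (cache_lookup(&h->l2, block)) {
        cache_fill(&h->l1, block);   /* L2 hands the block up to L1 */
        return FROM_L2;
    }
    cache_fill(&h->l2, block);       /* RAM hands the block to L2 ... */
    cache_fill(&h->l1, block);       /* ... and L1 gets it as well */
    return FROM_RAM;
}
```

Reading the same block twice in a row gives FROM_RAM and then FROM_L1. Reading a block that maps to the same L1 slot in between pushes the first one out of L1, so the next read of it is served from L2.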
Ok, so it is possible for data to be written straight to the registry or anywhere else, but all the levels of memory in between get a piece or copy as well, right?
Quote:


Originally Posted by mothergoose729
View Post

Ok, so it is possible for data to be written straight to the registry or anywhere else, but all the levels of memory in between get a piece or copy as well, right?

Yes, data can be written straight to memory. All levels of memory don't need to get a copy. If all levels needed to get a copy, it would slow down all the memory to the speed of the slowest one on any write.
Quote:


Originally Posted by mothergoose729
View Post

Ok, I see, thank you. Both main system memory and CPU cache are dynamic, so one can contain data the other doesn't yet have, or won't get, etc., making it possible for data to go straight to cache, then registry, or from memory to registry, etc. Is it possible that a process can direct data from the hard disk or virtual memory directly to the registry, without ever residing in the CPU cache or memory?

What do you mean by Registry? If you're talking about the Windows Registry, it is nothing more than a file on your hard drive that stores the settings of the OS and some programs. So you don't move data into the Registry, you are actually moving it to a location on the hard drive. The only way to have data written to or read from a hard drive is through the CPU. The only way anything moves around in a computer is through the CPU. There are a few special-case scenarios, like a sound card or video card being able to access system memory for massive increases in performance.
I meant CPU register, excuse me.
Quote:

Originally Posted by mothergoose729 View Post
Ok, so it is possible for data to be written straight to the registry or anywhere else, but all the levels of memory in between get a piece or copy as well, right?
Caching handles reads and writes differently. Think of the memory hierarchy as a pyramid. At the bottom is the huge, slow hard disk. Above that is RAM, which doesn't hold as much, but is much faster. Above that is the L2 cache, then L1, then the register file which contains the registers. When a read is performed, the system grabs the value from the highest place where it can find the requested data, and gives it to the CPU's register file. Along the way it gets put in every intermediate level. So if the data is not in the L2 or L1 cache, it is fetched from RAM and put in the caches as well as in the register file.

Writes are a bit more complicated, because different caches handle writes differently. There are two main methods of handling writes - write through and write back. In a write through cache, when a value is written it is written to the cache, and the level below the cache as well. In a write back cache, only the cache is updated. In a write back system, the cache must keep track of which values have changed, and if they are evicted from the cache the next level is updated at that time. This allows for faster writes. Reads may be slower, though, because a read may cause a changed value to be evicted, which forces the next level of memory to be updated before the read can complete.

Most of the small, fast caches use write back so when you write a value to memory, initially only the L1 cache gets updated. The change may not propagate down to RAM for a very long time, if ever
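
The difference between the two write policies is easy to see in a toy C model. This sketch assumes a one-entry "cache" in front of a single word of "RAM", and all of the names are invented. It is nowhere near how real hardware is built, but it shows when the lower level gets updated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum write_policy { WRITE_THROUGH, WRITE_BACK };

/* A one-entry "cache" sitting in front of a single word of "RAM". */
struct toy_store {
    enum write_policy policy;
    uint32_t ram;      /* the level below the cache */
    uint32_t cached;   /* the cached copy */
    int dirty;         /* write back only: cached copy is newer than ram */
};

void store_write(struct toy_store *s, uint32_t value)
{
    assert(s != NULL);
    s->cached = value;
    if (s->policy == WRITE_THROUGH) {
        s->ram = value;        /* lower level updated on every write */
    } else {
        s->dirty = 1;          /* remember that ram is now stale */
    }
}

/* Eviction: a write-back cache must flush a dirty value first. */
void store_evict(struct toy_store *s)
{
    assert(s != NULL);
    if (s->dirty) {
        s->ram = s->cached;
        s->dirty = 0;
    }
}
```

With WRITE_BACK, writing 50 and then 100 leaves `ram` untouched until `store_evict` runs, at which point it jumps straight to 100 and the 50 never reaches it. With WRITE_THROUGH, `ram` follows every write.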