Name: taar
Created: Dec 16, 2017
Updated: Mar 4, 2021
SVN: No files checked in
Bugs: 1 reported / 0 solved

Other project properties

Category: System on Chip
Development status: Planning
Additional info:
WishBone compliant: No
WishBone version: n/a
License: GPL

General project overview

This project comprises an SoC-type microprocessor and a microkernel-based operating system, both of which are open source. Both are called Taar. The name Taar ( with a soft 'T' ) is derived from the Hindi colloquialism for the old telegram method of communication, and relates to the telegram's position as a real-time method relative to other, older methods of communication.

The projects are still being designed and I welcome anyone to contribute. This document is a design specification based on which further processor and OS architecture will be defined.

The Taar processor is the first processor, and the Taar OS the second OS project, to be locally designed in India. The first OS was a very simple one written by me a few years ago. The Taar processor and OS are being designed to complement each other.

In this document the processor's details are given first and then the OS's.

General processor overview

The Taar processor is meant to be a simplified, multi-core, clock-less design. It is presently meant to be used in wearable computers, embedded systems and server systems. The design, beyond what is common to all processor designs, is not related to any specific processor. The processor is meant to be deterministic ( real-time ) from the ground up.

The chip is styled as a System-on-Chip ( SoC ) whose I/O subsystems are an LED display interface, USB, a WiFi transceiver, a timer, simple 0-to-1 output pins, simple 0-to-1 input pins, an ADC channel ( mic ) and a DAC channel ( speaker ).

Other than these, there will be a System Reset Timer which will reset the system in case of system hangs.

There will be a 3D graphics processing unit. It is also fixed that the operating system will allow only four streams of multimedia playback, i.e. in four windows, with each stream processed by one processor core. There will be extra instructions for this.

At the moment there are fewer than 25 instructions, but as the processor core design, the GPU design and the kernel design progress, more instructions will be added. Only after the first round of design will the processor be ready for an FPGA implementation.

Project contributors

There used to be one contributor to the processor project, Vishal Zuluk, and his contributions were :

  1. The USB subsystem is the result of general discussions with him.

Register set for each ALU

The ALU contains seven types of registers, as listed below along with their bit-lengths :

ALU busy status -------- 1 bit

Instruction ------------ 160 bits ( to contain the entire instruction )

R1 --------------------- 32 bits ( use explained later )

R2 --------------------- 32 bits ( use explained later )

R3 --------------------- 32 bits ( use explained later )

Instruction pointer ---- 32 bits ( to hold the next instruction's address )

Loop counter ----------- 32 bits

Memory Management

The MMU is paging-based.

[ To be done ]

Instructions format

At present there are 23 instructions, given below as a single list :

load-imm-r1, load-imm-r2, load-mem-r1, load-mem-r2, load-ptr-r1, load-ptr-r2, store-r1, store-r2, store-ptr-r1, store-ptr-r2, add, sub, and, or, xor, lsh, rsh, inv, loop-continue-forever, load-loop-counter, loop-next, sysenter, sysexit

Each instruction is described in detail further below. Every Taar processor instruction is 160 bits long. All instructions have this fixed length and a fixed format, which lets the processor logic read instructions at a determinate rate and keeps things simple.

The instruction format is listed below vertically. The first line is the first field of the instruction format from the right. The operands are presented without names here because their usage differs between Operation Codes ( henceforth called opcodes ). The numbers within the brackets are the bit-lengths :

Operation code -------- ( 32 bits )
Operand1 -------------- ( 32 bits )
Operand2 -------------- ( 32 bits )
Operand3 -------------- ( 32 bits )
Operand4 -------------- ( 32 bits )

The opcode field contains the precise instruction number that has to be executed by the execution core. Any user-mode program that supplies an invalid opcode will be trapped at this point and terminated. A code section in the Main Control Program ( the kernel ) could also supply an invalid opcode. In that case, the entire processor should halt and an appropriate external pin should become active or low. Only the first byte of the opcode field actually contains the instruction number; the rest of the field should be filled with zeroes by software ( the compiler ).
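The fixed five-word layout can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the opcode numbers in `OPCODES` are hypothetical ( this document does not assign any yet ), and little-endian byte order is assumed so that the instruction number lands in the first byte of the opcode word.

```python
import struct

WORD_MASK = 0xFFFFFFFF

# Hypothetical opcode numbers, for illustration only.
OPCODES = {"load-imm-r1": 0x01, "load-mem-r1": 0x03}


def encode(opcode, op1=0, op2=0, op3=0, op4=0):
    """Pack one instruction into 20 bytes ( 160 bits ), opcode word first."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("instruction number must fit in the first opcode byte")
    operands = (op1, op2, op3, op4)
    if any(not 0 <= value <= WORD_MASK for value in operands):
        raise ValueError("operands must be unsigned 32-bit values")
    # "<5I" means five little-endian 32-bit words ( byte order is assumed ).
    return struct.pack("<5I", opcode, *operands)


def decode(raw):
    """Unpack 20 bytes and reject a non-zero upper part of the opcode field."""
    opcode_word, op1, op2, op3, op4 = struct.unpack("<5I", raw)
    if opcode_word > 0xFF:
        raise ValueError("upper three bytes of the opcode field must be zero")
    return opcode_word, (op1, op2, op3, op4)


insn = encode(OPCODES["load-imm-r1"], 0xFEEDF00D)
```

Here `insn` is 20 bytes long, and `decode(insn)` gives back the opcode number and the four operand words. The check in `decode` mirrors the rule that software fills the rest of the opcode field with zeroes.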

load-imm-r1, load-imm-r2 instructions

Copy into register R1 or R2 the immediate value from Operand1.

load-imm-r1 and load-imm-r2 :

Usage ( lsb on top ) :
-------> opcode ( load-imm-r1 or load-imm-r2 )
-------> immediate value
-------> zero
-------> zero
-------> zero

Example :
---> load-imm-r1 0xfeedf00d, 0, 0, 0

load-mem-r1, load-mem-r2 instructions

Copy into register R1 or R2 the value from the memory address given by Operand1 ( the starting address ) plus Operand2 ( the offset ).

load-mem-r1 and load-mem-r2 :

Usage ( lsb on top ) :
-------> opcode ( load-mem-r1 or load-mem-r2 )
-------> the starting address
-------> the offset
-------> zero
-------> zero

Example :
---> load-mem-r1 [0xf00df00d], 20, 0, 0

load-ptr-r1, load-ptr-r2 instructions

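Copy into register R1 or R2 the value from the memory address that is second-level-pointed-to via Operand1 : Operand1 is the address of a pointer, and the offset in Operand2 is applied to the address that pointer holds.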

load-ptr-r1 and load-ptr-r2 :

Usage ( lsb on top ) :
-------> opcode ( load-ptr-r1 or load-ptr-r2 )
-------> the pointer address
-------> the offset into the second-level address
-------> zero
-------> zero

Example :
---> load-ptr-r1 [0xfeedfade], 20, 0, 0

The load instructions and their counterpart, the store instructions ( explained next ), exist because the arithmetic and logic instructions do not access memory directly. Such partitioning keeps the ISA simple and clean. It also allows faster arithmetic and logic instruction execution in one core alongside relatively slower memory access in another core.
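The three load forms can be compared side by side in a small Python model. This is a sketch, not the hardware behaviour: memory is modelled as a dictionary from address to 32-bit word, the offset is simply added to the address ( its unit is not defined in this document ), and the class and method names are made up for the illustration.

```python
class Alu:
    """Toy model of the R1 / R2 registers of one ALU."""

    def __init__(self, memory):
        self.memory = memory      # dict : address -> 32-bit word ( assumed model )
        self.r = {1: 0, 2: 0}     # registers R1 and R2

    def load_imm(self, reg, value):
        # load-imm-rN : the operand itself is the value.
        self.r[reg] = value & 0xFFFFFFFF

    def load_mem(self, reg, start, offset):
        # load-mem-rN : read the word at starting address + offset.
        self.r[reg] = self.memory[start + offset]

    def load_ptr(self, reg, pointer_address, offset):
        # load-ptr-rN : first read the pointer, then read at pointer + offset.
        second_level = self.memory[pointer_address]
        self.r[reg] = self.memory[second_level + offset]


memory = {
    0xF00DF00D + 20: 111,     # target of the load-mem example
    0xFEEDFADE: 0x1000,       # the pointer itself
    0x1000 + 20: 222,         # target of the load-ptr example
}
alu = Alu(memory)

alu.load_imm(1, 0xFEEDF00D)
after_imm = alu.r[1]
alu.load_mem(1, 0xF00DF00D, 20)
after_mem = alu.r[1]
alu.load_ptr(1, 0xFEEDFADE, 20)
after_ptr = alu.r[1]
```

The only difference between `load_mem` and `load_ptr` is the extra memory read that fetches the second-level address.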

store-r1, store-r2 instructions

Copy the word held in register R1 or R2 into a memory address.

The arithmetic and logic instructions do not write their result back to memory after execution, so code must use this instruction whenever a memory write-back is needed. Keeping the write-back separate lets most instructions access memory without too much queuing, thus increasing hardware-level parallelism. A second effect is that code which only compares, and does not need the result value in memory, need not spend time on a write-back.

Usage ( lsb on top ) :
-------> opcode ( store-r1 or store-r2 )
-------> the starting address
-------> the offset
-------> zero
-------> zero

Example :
---> store-r1 [0xdead0000], 10, 0, 0

store-ptr-r1, store-ptr-r2 instructions

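Copy the value from register R1 or R2 into the memory address that is second-level-pointed-to via Operand1 : Operand1 is the address of a pointer, and the offset in Operand2 is applied to the address that pointer holds.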

store-ptr-r1 and store-ptr-r2 :

Usage ( lsb on top ) :
-------> opcode ( store-ptr-r1 or store-ptr-r2 )
-------> the pointer address
-------> the offset into the second-level address
-------> zero
-------> zero

Example :
---> store-ptr-r1 [0xfeedfade], 20, 0, 0
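The two store forms can be sketched in Python in the same spirit. The memory model ( a dictionary from address to word ), the plain addition of the offset and the function names are assumptions made for this illustration.

```python
def store(memory, value, start, offset):
    # store-rN : write the register value at starting address + offset.
    memory[start + offset] = value & 0xFFFFFFFF


def store_ptr(memory, value, pointer_address, offset):
    # store-ptr-rN : read the pointer first, then write at pointer + offset.
    second_level = memory[pointer_address]
    memory[second_level + offset] = value & 0xFFFFFFFF


memory = {0xFEEDFADE: 0x2000}   # the pointer used by store_ptr
r1 = 0x12345678

store(memory, r1, 0xDEAD0000, 10)
store_ptr(memory, r1, 0xFEEDFADE, 20)
```

After these two calls the value of `r1` is found at `0xDEAD0000 + 10` and at `0x2000 + 20`.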

The eight arithmetic and logic instructions : add, sub, and, or, xor, lsh, rsh, inv

The inputs for these instructions are taken from the registers R1 and R2. The result of the operation is copied from the R3 register to the R1 register, which allows a chain of arithmetic or logic instructions to continue after a branch.

add == addition of two numbers
sub == subtraction of two numbers
and == logical and'ing of two numbers
or == logical or'ing of two numbers
xor == logical exclusive or'ing of two numbers
lsh == left shifting of a number by the number of places given in R2
rsh == right shifting of a number by the number of places given in R2
inv == inversion of a number

Usage ( lsb on top ) :
-------> opcode ( the arithmetic or logic instruction )
-------> equal compare value
-------> less-than compare value
-------> jump on less-than
-------> jump on greater-than

After the operation, the instruction will act as below :

a. If the result equals the equal compare value, execution automatically continues at the address of the instruction immediately after the current one.

b. If the result is less than the less-than compare value, the jump on less-than field is used to automatically jump to the relevant instruction's address.

c. Otherwise, the result is considered greater than the above values and the jump on greater-than field is used to automatically jump to the relevant instruction's address.

This system is called “Conditional Jumps”. The automatically jumped-to addresses are absolute addresses.
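The three-way decision can be written out as a short Python sketch. Several points are assumptions for the illustration and are not fixed by this document : results are treated as unsigned 32-bit values, inv is taken to be a bitwise inversion of R1, and the "next instruction" is taken to lie 20 bytes ( 160 bits ) after the current one.

```python
MASK = 0xFFFFFFFF
INSTRUCTION_BYTES = 20   # 160 bits; byte addressing is an assumption

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "lsh": lambda a, b: a << b,
    "rsh": lambda a, b: a >> b,
    "inv": lambda a, b: ~a,      # assumed : bitwise inversion of R1
}


def execute(op, r1, r2, ip, equal_value, less_value, jump_less, jump_greater):
    """Return ( new R1, address of the next instruction to run )."""
    r3 = OPS[op](r1, r2) & MASK          # result lands in R3, wrapped to 32 bits
    if r3 == equal_value:                # case a : fall through
        next_ip = ip + INSTRUCTION_BYTES
    elif r3 < less_value:                # case b : jump on less-than
        next_ip = jump_less
    else:                                # case c : jump on greater-than
        next_ip = jump_greater
    return r3, next_ip                   # R3 is copied to R1


equal_case = execute("sub", 7, 7, 0x100, 0, 0, 0x200, 0x300)
less_case = execute("add", 2, 3, 0x100, 0, 10, 0x200, 0x300)
greater_case = execute("add", 20, 3, 0x100, 0, 10, 0x200, 0x300)
```

With these inputs `equal_case` falls through to `0x114`, `less_case` jumps to `0x200` and `greater_case` jumps to `0x300`.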

The design of these eight instructions could have included another word to supply the R2 value, but adding a sixth word to the five already in the instruction would probably add latency, which is not desired. If the R2 register needs to change, the load-imm-r2 or load-mem-r2 instructions can be used. This keeps the system simple.

Loop instructions

These instructions allow for program loops.

loop-continue-forever : This is a simple unconditional jump to the instruction at the start of the loop. Its equivalent in the C language is the 'do while(1)' loop.

Usage for loop-continue-forever ( lsb on top ) :
-------> opcode ( loop-continue-forever )
-------> loop start
-------> zero
-------> zero
-------> zero

load-loop-counter : Loads the given value into the loop counter register. The counter must be a non-zero number :

Usage for load-loop-counter ( lsb on top ) :
-------> opcode ( load-loop-counter )
-------> counter
-------> zero
-------> zero
-------> zero

loop-next : Subtracts one from the counter and, if the result is non-zero, goes back to the loop start; otherwise execution continues with the instruction after the loop. Its equivalent in the C language is the 'do while(counter is non-zero)' loop :

Usage for loop-next ( lsb on top ) :
-------> opcode ( loop-next )
-------> loop start
-------> zero
-------> zero
-------> zero
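How load-loop-counter and loop-next work together can be sketched in Python. The function name and the use of a callable for the loop body are inventions of this sketch; only the counting behaviour follows the descriptions above.

```python
def run_counted_loop(count, body):
    """Model load-loop-counter followed by a loop closed by loop-next."""
    if count == 0:
        raise ValueError("the loop counter must be non-zero")
    loop_counter = count          # load-loop-counter
    while True:
        body()                    # instructions between loop start and loop-next
        loop_counter -= 1         # loop-next : subtract one from the counter
        if loop_counter == 0:     # zero : continue after the loop
            break
    return loop_counter


calls = []
remaining = run_counted_loop(3, lambda: calls.append("tick"))
```

Because the test comes after the body, the body always runs at least once, which matches the 'do while' comparison.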

Thread context instructions

[ To be done ]

sysenter and sysexit instructions

The sysenter instruction changes the thread's execution path from user mode to kernel mode so that the kernel can perform various actions according to the provided arguments, which should be at the top of the thread's syscall arguments page. The sysexit instruction changes the thread's execution path back to user mode.

Usage for sysenter ( lsb on top ) :
-------> opcode ( sysenter )
-------> zero
-------> zero
-------> zero
-------> zero

Usage for sysexit ( lsb on top ) :
-------> opcode ( sysexit )
-------> zero
-------> zero
-------> zero
-------> zero

Example :
---> sysenter 0, 0, 0, 0
---> sysexit 0, 0, 0, 0

List of instructions that can be executed only in kernel mode


[ To be done ]

The Taar operating system overview

The OS, as said earlier, is based on a microkernel architecture, which means the kernel has just a few facilities : process / thread creation and scheduling, synchronous IPC and asynchronous notification for a few things, interrupt redirection, timers and critical section synchronization. Everything else is provided by user-mode server processes.

Below are syscalls and other OS elements whose shape may change as the processor ISA and the OS design develop :

Process / Thread management calls

Minimum number of pages allotted to the first thread of a process : 6 pages = 24 KB : page directory, page table, data page, code page, thread descriptor, syscall communications page.

Minimum number of pages allotted to subsequent threads : 2 pages = 8 KB : thread descriptor, syscall communications page.

Structure of the syscall communications page : At address zero will be the syscall number, followed by the arguments. At address 1023 will be the error code, followed by the return values. So unlike in other OSes such as Linux, in Taar a syscall can return a number of values in this page.

[ To be done ]

The synchronous IPC calls

Each service ( Open, Send, Receive, Control etc ) can have a number of channels, each of which is associated with a separate client buffer attachment address, so that the main server thread loop can assign a separate thread to each message-sending client, thus enabling a multi-threaded server. Each serving thread can call msgReply() to unblock the requesting client thread.

For example, a web browser limited to 8 tabs could be served by the multi-threaded network server through a service with eight client buffer attachment addresses :

msgAddService(Service Name Word Count, Service Name, Client Buffer Attachment Address 1, Client Buffer Attachment Address 2, Client Buffer Attachment Address 3, Client Buffer Attachment Address 4, Client Buffer Attachment Address 5, Client Buffer Attachment Address 6, Client Buffer Attachment Address 7, Client Buffer Attachment Address 8, Max Buffer Size)

Service Number = msgConnectToService(Service Name, Buffer Size, Buffer Address, Call Timeout Duration)

msgSend(Service Number)

Buffer Address and Transaction Number = msgWait()

msgReply(Transaction Number)
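The blocking send / wait / reply cycle can be imitated in user space with Python threads. This is only a sketch of the synchronous pattern, not of the kernel calls themselves : the `Service` class, its method names and the single-channel simplification are assumptions, and the buffer is passed to `msg_send` directly rather than attached at connect time.

```python
import queue
import threading


class Service:
    """Toy stand-in for one registered service with a single channel."""

    def __init__(self):
        self._pending = queue.Queue()   # requests waiting for the server
        self._waiting = {}              # transaction number -> Event
        self._next_txn = 0
        self._lock = threading.Lock()

    def msg_send(self, buffer):
        """Client side : queue the request and block until the reply."""
        done = threading.Event()
        with self._lock:
            txn = self._next_txn
            self._next_txn += 1
            self._waiting[txn] = done
        self._pending.put((buffer, txn))
        done.wait()

    def msg_wait(self):
        """Server side : block until a request arrives."""
        return self._pending.get()

    def msg_reply(self, txn):
        """Server side : unblock the client that owns this transaction."""
        with self._lock:
            done = self._waiting.pop(txn)
        done.set()


def server_loop(service, requests):
    for _ in range(requests):
        buffer, txn = service.msg_wait()
        buffer["reply"] = buffer["request"].upper()   # stand-in for real work
        service.msg_reply(txn)


service = Service()
server = threading.Thread(target=server_loop, args=(service, 1))
server.start()

client_buffer = {"request": "get index.html"}
service.msg_send(client_buffer)   # returns only after msg_reply
server.join()
```

The client thread stays blocked inside `msg_send` until the server calls `msg_reply`, so the reply is already in the shared buffer when the call returns.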

[ To be done ]

Mutual exclusion calls

mutexTake(Mutex Number)

mutexRelease(Mutex Number)

There is no need to create a mutex explicitly. Every process, when created, is allocated eight mutexes, which can be shared by all the threads of the process.
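The idea of a fixed set of eight pre-allocated mutexes per process can be sketched with Python threads. The `Process` class, the method names and the numbering of the mutexes from 0 to 7 are assumptions made for this illustration.

```python
import threading

MUTEXES_PER_PROCESS = 8


class Process:
    """Toy process that owns its eight mutexes from creation."""

    def __init__(self):
        self._mutexes = [threading.Lock() for _ in range(MUTEXES_PER_PROCESS)]

    def mutex_take(self, number):
        self._mutexes[number].acquire()

    def mutex_release(self, number):
        self._mutexes[number].release()


process = Process()
counter = [0]   # shared state protected by mutex number 0


def worker():
    for _ in range(1000):
        process.mutex_take(0)
        counter[0] += 1
        process.mutex_release(0)


threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Every thread of the process refers to a mutex by number only, so no creation or hand-over step is needed before the first `mutex_take`.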

Program executable file format


The four data types supported by Taar OS are :

  1. Word - four bytes.

  2. NameBuffer[Number Of Words]

  3. ClientBuffer[Number Of Page Tables - 1 to 3]

  4. OtherBuffer[Number Of Pages]

[ To be done ]