This repository summarizes the progress made on the RISC-V-based project.
A tool developed by Chipcron Private Ltd was used to convert the design C file to a netlist (processor.v).
This project aims to create a system that automates the appliances in a room based on the user's proximity. Most rooms are not equipped to automatically turn essential appliances such as lights and fans ON when people enter and OFF when they leave. Automating this saves electricity and improves cost efficiency. This particular design is tailored to office and school spaces (working spaces) rather than homes.
Here we use a PIR sensor. A passive infrared (PIR) sensor is an electronic sensor that measures infrared (IR) light radiating from objects in its field of view; such sensors are most often used in PIR-based motion detectors. The sensor detects the presence of any individuals in the vicinity. If someone is present, the board is programmed to turn ON the light(s) in a defined space; if not, it turns them OFF automatically. We also include an ON-OFF main switch for the system.
The circuit shown below is an assumed one (not entirely accurate).
- Open a terminal window
- Navigate to the directory where the .c file is present
- Compile the code using gcc and verify the output.
gcc motion.c
./a.out
The following output is observed after compiling with GCC.
When input from the sensor and main switch are both HIGH:
When input from the sensor is LOW:
int main()
{
    int led_pin,sensor_pin,led_pin_reg,i,j,reset_high;
    int mask1 = 0xFFFFFFFB;
    led_pin = 0; // initialize the output pin as LOW initially
    led_pin_reg = led_pin*4;
    asm volatile(
        "and x30, x30, %1\n\t"
        "or x30, x30, %0\n\t"
        :
        : "r"(led_pin_reg), "r"(mask1)
        : "x30"
    );
    //for(int z=0;z<1;z++)
    while(1)
    {
        //Off switch- if the reset_high is low it should be off fully.
        asm volatile(
            "andi %0, x30, 1\n\t"
            :"=r"(reset_high)
            :
            :
        );
        //"Motion sensor-based room light control started"
        asm volatile(
            "andi %0, x30, 2\n\t"
            :"=r"(sensor_pin)
            :
            :
        );
        //sensor_pin=1;
        //reset_high=1;
        if ((sensor_pin) && (reset_high)) {
            // Motion detected, turn on the room light
            //digitalWrite(LIGHT_PIN, HIGH);
            //printf("Motion detected. Light turned ON.\n");
            led_pin = 1;
            led_pin_reg = led_pin*4;
            asm volatile(
                "and x30, x30, %1\n\t"
                "or x30, x30, %0\n\t"
                :
                : "r"(led_pin_reg), "r"(mask1)
                : "x30"
            );
            // You can add a delay here to control how long the light stays on
            for (i = 0; i < 100; i++) {
            }
        }//end of if statement
        else
        {
            //No motion detected or manual switch is off, turn off the room light
            //digitalWrite(LIGHT_PIN, LOW);
            //printf("No motion detected. Light turned OFF.\n");
            led_pin = 0;
            led_pin_reg = led_pin*4;
            asm volatile(
                "and x30, x30, %1\n\t"
                "or x30, x30, %0\n\t"
                :
                : "r"(led_pin_reg), "r"(mask1)
                : "x30"
            );
        }//end of else statement
    }//end while loop
    return 0;
}
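The way the program above uses register x30 can be modelled in a few lines of Python. This is only a sketch for reasoning about the bit layout as read from the C code (bit 0 for the main switch, bit 1 for the sensor, bit 2 for the light); the function names are hypothetical and nothing here runs on the core.

```python
MASK32 = 0xFFFFFFFF
LED_CLEAR_MASK = 0xFFFFFFFB  # mask1 in the C program: every bit set except bit 2


def read_inputs(x30):
    """Mirror the two `andi` reads: bit 0 is the main switch, bit 1 the sensor."""
    reset_high = x30 & 1
    sensor_pin = x30 & 2
    return reset_high, sensor_pin


def write_led(x30, led_pin):
    """Mirror the `and`/`or` pair: clear bit 2, then OR in led_pin * 4."""
    led_pin_reg = led_pin * 4
    return ((x30 & LED_CLEAR_MASK) | led_pin_reg) & MASK32


def step(x30):
    """One pass of the while loop body, without the delay loop."""
    reset_high, sensor_pin = read_inputs(x30)
    led_pin = 1 if (sensor_pin and reset_high) else 0
    return write_led(x30, led_pin)


if __name__ == "__main__":
    for inputs in range(4):
        print(f"inputs={inputs:02b} -> x30={step(inputs):03b}")
```

Because the `and` with the mask runs before the `or`, bit 2 is cleared for one instruction on every write, even when the light is already ON.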
The above C program is compiled using the RISC-V GNU toolchain, and the assembly code is dumped into a text file.
The following commands are run in the terminal to get the assembly code.
riscv64-unknown-elf-gcc -mabi=ilp32 -march=rv32i -ffreestanding -nostdlib -o ./out motion.c
riscv64-unknown-elf-objdump -d -r out > asm.txt
out: file format elf32-littleriscv
Disassembly of section .text:
00010054 <main>:
10054: fd010113 addi sp,sp,-48
10058: 02812623 sw s0,44(sp)
1005c: 03010413 addi s0,sp,48
10060: ffb00793 li a5,-5
10064: fef42423 sw a5,-24(s0)
10068: fe042223 sw zero,-28(s0)
1006c: fe442783 lw a5,-28(s0)
10070: 00279793 slli a5,a5,0x2
10074: fef42023 sw a5,-32(s0)
10078: fe042783 lw a5,-32(s0)
1007c: fe842703 lw a4,-24(s0)
10080: 00ef7f33 and t5,t5,a4
10084: 00ff6f33 or t5,t5,a5
10088: 001f7793 andi a5,t5,1
1008c: fcf42e23 sw a5,-36(s0)
10090: 002f7793 andi a5,t5,2
10094: fcf42c23 sw a5,-40(s0)
10098: fd842783 lw a5,-40(s0)
1009c: 04078a63 beqz a5,100f0 <main+0x9c>
100a0: fdc42783 lw a5,-36(s0)
100a4: 04078663 beqz a5,100f0 <main+0x9c>
100a8: 00100793 li a5,1
100ac: fef42223 sw a5,-28(s0)
100b0: fe442783 lw a5,-28(s0)
100b4: 00279793 slli a5,a5,0x2
100b8: fef42023 sw a5,-32(s0)
100bc: fe042783 lw a5,-32(s0)
100c0: fe842703 lw a4,-24(s0)
100c4: 00ef7f33 and t5,t5,a4
100c8: 00ff6f33 or t5,t5,a5
100cc: fe042623 sw zero,-20(s0)
100d0: 0100006f j 100e0 <main+0x8c>
100d4: fec42783 lw a5,-20(s0)
100d8: 00178793 addi a5,a5,1
100dc: fef42623 sw a5,-20(s0)
100e0: fec42703 lw a4,-20(s0)
100e4: 06300793 li a5,99
100e8: fee7d6e3 bge a5,a4,100d4 <main+0x80>
100ec: 0240006f j 10110 <main+0xbc>
100f0: fe042223 sw zero,-28(s0)
100f4: fe442783 lw a5,-28(s0)
100f8: 00279793 slli a5,a5,0x2
100fc: fef42023 sw a5,-32(s0)
10100: fe042783 lw a5,-32(s0)
10104: fe842703 lw a4,-24(s0)
10108: 00ef7f33 and t5,t5,a4
1010c: 00ff6f33 or t5,t5,a5
10110: f79ff06f j 10088 <main+0x34>
The above assembly code was passed through a Python script to find the different instructions used:
Number of different instructions: 11
List of unique instructions:
slli
sw
lw
addi
and
li
andi
or
beqz
bge
j
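The script used for this count is not included here. A minimal sketch of such a counter, assuming the objdump line layout shown in the listing (address, colon, encoding, mnemonic), could look like this; `unique_mnemonics` and `SAMPLE` are hypothetical names.

```python
import re

# Matches disassembly lines such as "10054: fd010113 addi sp,sp,-48".
# Header lines like "00010054 <main>:" do not match because no colon
# directly follows the address.
INSN_LINE = re.compile(r"^\s*[0-9a-f]+:\s+[0-9a-f]+\s+(\S+)")


def unique_mnemonics(disassembly):
    """Return the distinct mnemonics in first-seen order."""
    seen = []
    for line in disassembly.splitlines():
        match = INSN_LINE.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# A few lines copied from the listing, used only as a demonstration input.
SAMPLE = """\
00010054 <main>:
   10054: fd010113 addi sp,sp,-48
   10058: 02812623 sw s0,44(sp)
   1005c: 03010413 addi s0,sp,48
   10060: ffb00793 li a5,-5
"""

if __name__ == "__main__":
    names = unique_mnemonics(SAMPLE)
    print("Number of different instructions:", len(names))
    for name in names:
        print(name)
```

To process the full dump, pass the contents of asm.txt instead of `SAMPLE`, for example `unique_mnemonics(open("asm.txt").read())`.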
The Spike simulation is run using the following commands.
riscv64-unknown-elf-gcc -march=rv64i -mabi=lp64 -ffreestanding -o out motion.c
spike pk out
Here we have two inputs and only one output, so there are only four test cases. Only one of the four results in a HIGH output; in the other three, the output is expected to be LOW. For the sake of simulation in Spike, we do not use an infinite loop but just one iteration of it.
For spike simulation, the two inputs are hard coded for the four test cases.
This is the only case in which the output is HIGH. There are very brief logic LOWs within the HIGH output because of the masking we have done.
In this case the output is LOW, as no motion is sensed and the sensor input is LOW.
In these two cases the main switch is OFF, so the output stays LOW regardless of the sensor input.
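The four hard-coded test cases amount to a two-input AND. As a quick cross-check of the expected outputs, a sketch like the following enumerates them; `light_output` and `truth_table` are hypothetical helper names, not part of the project code.

```python
from itertools import product


def light_output(sensor, main_switch):
    """Expected output: HIGH only when both the sensor and the main switch are HIGH."""
    return int(bool(sensor) and bool(main_switch))


def truth_table():
    """All (sensor, main_switch, output) combinations for the two inputs."""
    return [(s, m, light_output(s, m)) for s, m in product((0, 1), repeat=2)]


if __name__ == "__main__":
    print("sensor switch output")
    for sensor, switch, output in truth_table():
        print(f"{sensor:>6} {switch:>6} {output:>6}")
```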
Now that the core is generated, we check its functionality in GTKWave. The necessary modifications must be made in the testbench; the following observations were made.
That is, if both input pins are high, the output will be high. The delay in the output is due to a delay given in the testbench.
The output becomes high at instruction fe042623, as shown below.
The screenshot below shows how the instruction changes over time.
We see that input 11 produces a wider output waveform because there is a small delay in the design. This makes the output HIGH for input 11 wider than what is seen for the other inputs (00, 01, and 10).
The output becomes high at instruction fe042623, as shown below. This is the same instruction at a different instance where the output becomes 1.
When the output goes low, the instruction is 00ff6f33, as shown below.
Here the input is 01, which leads to a LOW output.
We use https://en.wikichip.org/wiki/risc-v/registers as the reference for register names.
$signal$43, $signal$45, and $signal$58 are essentially registers. $signal$43 holds the hardwired zero register (x0), $signal$45 holds the stack pointer (x2), and $signal$58 represents the a5 register. This can be inferred from the above link.
Some of the instructions in the above assembly code were tested in GTKWave and verified.
Here we consider the assembly code line:
10054: fd010113 addi sp,sp,-48
$signal$45 is the stack pointer, and its value at instruction fd010113 is 000000CF, which is 207.
Here we consider the assembly code line:
10060: ffb00793 li a5,-5
$signal$58 is the a5 register, and its value at instruction ffb00793 is FFFFFFFB, which is -5.
1005c: 03010413 addi s0,sp,48
This instruction comes after the addi sp,sp,-48 and sw s0,44(sp) instructions.
For the instruction 03010413, the register s0, which is $signal$53, holds 000000FF, which is 255.
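These register values can be cross-checked by decoding the three instruction words by hand. The sketch below assumes the standard RV32I I-type field layout; the helper names are hypothetical. The initial stack pointer of 0xFF is not stated anywhere above, it is inferred from the observed 0xCF after subtracting 48.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_i_type(word):
    """Split a 32-bit RV32I I-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": sign_extend(word >> 20, 12),
    }


if __name__ == "__main__":
    for word in (0xFD010113, 0xFFB00793, 0x03010413):
        fields = decode_i_type(word)
        print(f"{word:08x}: rd=x{fields['rd']} rs1=x{fields['rs1']} imm={fields['imm']}")

    sp_before = 0xCF + 48           # inferred initial sp, given sp = 0xCF after the addi
    print(hex(sp_before))           # value s0 should hold after addi s0,sp,48
    print(hex(-5 & 0xFFFFFFFF))     # 32-bit pattern of -5 loaded into a5
```

Decoding shows `li a5,-5` is encoded as an `addi` from x0 into x15 (a5), and that `addi s0,sp,48` restores the value sp held before the frame was allocated, which matches the 000000FF seen on $signal$53.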
We synthesize the processor with Yosys using the following commands:
read_liberty -lib sky130_fd_sc_hd__tt_025C_1v80_256.lib
read_verilog processor.v
synth -top wrapper
dfflibmap -liberty sky130_fd_sc_hd__tt_025C_1v80_256.lib
abc -liberty sky130_fd_sc_hd__tt_025C_1v80_256.lib
write_verilog <filename.v>
The following command is used to compile the synthesized netlist along with the primitives for simulation.
iverilog -o test testbench.v synth_processor_test.v sky130_sram_1kbyte_1rw1r_32x256_8.v sky130_fd_sc_hd.v primitives.v
The following waveforms are from the gate-level simulation (GLS), viewed in GTKWave; the output matches that of the functional verification.
The following screenshot shows the wrapper module, generated with the following command in Yosys:
show wrapper
Place and Route (PnR) is the core of any ASIC implementation, and the OpenLane flow integrates several key open-source tools that perform the respective stages of PnR. Below are the stages and the tasks OpenLane performs at each:
- Synthesis
- Generating gate-level netlist
- Performing cell mapping
- Performing pre-layout STA
- Floorplanning
- Defining the core area for the macro as well as the cell sites and the tracks
- Placing the macro input and output ports
- Generating the power distribution network
- Placement
- Performing global placement
- Performing detailed placement to legalize the globally placed components
- Clock Tree Synthesis
- Synthesizing the clock tree
- Routing
- Performing global routing to generate a guide file for the detailed router
- Performing detailed routing
- GDSII
- Streaming out the final GDSII layout file from the routed def
Preparing the design and including the LEF files: the commands to prepare the design, overwrite the reports and results in an existing run folder, and include the LEF files are given below:
sed -i 's/max_transition :0.04/max_transition :0.75/' */*.lib
make mount
./flow.tcl -interactive
% package require openlane 0.9
% prep -design project
Logic synthesis uses the RTL netlist to perform HDL technology mapping. The synthesis process is normally performed in two major steps:
- GTECH Mapping – Consists of mapping the HDL netlist to generic gates that are used to perform logical optimization based on AIGERs and other topologies created from the generic mapped netlist.
- Technology Mapping – Consists of mapping the post-optimized GTECH netlist to standard cells described in the PDK.
To synthesize the code run the following command.
run_synthesis
Synthesis report:
The goal of floorplanning is to plan the silicon area and create a robust power distribution network (PDN) to power each of the individual components of the synthesized netlist. In addition, macro placement and blockages must be defined before placement occurs to ensure a legalized GDS file. In power planning, we create a ring that is connected to the pads and brings power around the edges of the chip. We also include power straps that bring power to the middle of the chip using higher metal layers, which reduces IR drop and electromigration problems.
The following command runs the floorplan:
run_floorplan
We can check the layout with magic with the following command.
magic -T /home/emil/.volare/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.nom.lef def read wrapper.def &
The next step in the OpenLANE ASIC flow is placement: the standard cells of the synthesized netlist are placed on the floorplan rows, aligned with the sites defined in the technology LEF file. Global placement tries to find an optimal position for every cell, but the cells may overlap and may not be aligned to rows. Detailed placement then legalizes the result while staying as close as possible to the global placement. Placement is performed in 2 stages:
- Global Placement
- Detailed Placement
Run the following command to run the placement:
run_placement
After placement, the layout is available in the placement directory. We can view it in Magic with the following command.
magic -T /home/emil/.volare/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.nom.lef def read wrapper.def &
Clock tree synthesis (CTS) creates the clock distribution network that delivers the clock to all sequential elements. The main goal is to reach every element with minimal skew across the chip. H-trees are a common network topology used to achieve this goal.
The following command is used to run CTS.
run_cts
The command gen_pdn is used to generate the power distribution network.
Routing implements the interconnect between standard cells using the metal layers that remain available after CTS and PDN generation. It is performed on routing grids to keep DRC errors to a minimum.
OpenLANE uses the TritonRoute tool for routing. There are 2 stages of routing:
- Global Routing
- Detailed Routing
In Global Routing, the routing region is divided into rectangular grids, which are represented as coarse 3D routes (FastRoute tool).
In Detailed Routing, finer grids and routing guides are used to implement the physical wiring (TritonRoute tool).
Run the following command to run the routing:
run_routing
After completion, it should show the following messages. Here we see no DRC violations.
We can check the routed layout with magic with the following command.
magic -T /home/emil/.volare/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.nom.lef def read wrapper.def &
Given a clock period of 25 ns in the JSON file, the setup slack we got after routing is 5.59 ns.
Max Performance = 1 / (clock period - slack(setup))
Max Performance = 51.5198 MHz
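The figure above follows from the 25 ns clock period and the 5.59 ns setup slack. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def max_frequency_mhz(clock_period_ns, setup_slack_ns):
    """Maximum clock frequency in MHz from the period and setup slack in ns."""
    min_period_ns = clock_period_ns - setup_slack_ns  # shortest period that still meets setup
    return 1000.0 / min_period_ns  # 1 / ns = GHz, so scale by 1000 for MHz


if __name__ == "__main__":
    print(f"{max_frequency_mhz(25.0, 5.59):.4f} MHz")
```

With these inputs the minimum period is 19.41 ns, which gives roughly 51.52 MHz.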
The following commands are run after routing, with no DRC violations reported:
run_magic
run_magic_spice_export
run_magic_drc
run_antenna_check
Run the following commands to do the non-interactive flow.
cd Desktop/OpenLane
make mount
./flow.tcl -design project
- Kunal Ghosh, Co-founder, VSD Corp. Pvt. Ltd.
- Mayank Kabra, Founder, Chipcron Pvt. Ltd.
- Sumanto Kar, VSD Corp.
- Alwin Shaju, Colleague, IIIT-Bangalore
- Kanish R, Colleague, IIIT-Bangalore