909 Asynchronous Self-Timed Multiplier

909 : Asynchronous Self-Timed Multiplier

Design render

How it works

This design is an exercise in a particular flavor of asynchronous logic, known as "bundled data". In bundled data logic there is no global clock; instead stages communicate when data is valid and stages are ready.

See Introduction to Asynchronous Circuit Design for a more in-depth discussion.

Why?

Asynchronous, or "self-timed", logic admits an area overhead for inter-stage control, but in return provides some benefits, including:

  • [average] cycle time is a function of the stages dynamically involved with the computation, not the worst case as it is in ("normal") synchronous logic,

  • only circuits dynamically involved with the computation toggles, potentially saving power,

  • there is essentially no simultaneous switching (unlike in synchronous logic), dramatically reducing the requirements on the supplies to provide instant current.

Request and Acknowledge

For historical reasons we denote data validity with "request" and stage readiness is signaled by the receiving stages with the acknowledge signal, denoting that data have been received (and latched). For well defined behavior it is required that request can only transition once acknowledge have transitioned (and vice versa). This means that the request/acknowledge pair will cycle through 00, 10, 11, and 01, and only these values.

Delays

It is essential that data are stable before request is asserted. This is achieved via a delay which must be longer than the longest delay of any data. The delays determine the transition speed of the overall circuit so ideally we would want to make them as tight as possible. Alas, the static timing tools are not designed for this use case and due to time-constraints we had to guess at sufficient delays. This is the major risk in this (current) design. To Be Improved In a Future Tapeout

Joins and Forks

In synchronous logic, all signals are guaranteed valid at the clock boundary and may thus the combined freely. Not so in asynchronous logic in which each state is effectively its own clock domain. Thus when we join values from multiple paths we must use appropriate synchronizers. This is true for forks as well as we can't proceed until all receivers are ready.

This example design

This design computes a sequence of r = x^2+x, for x=0,1,2,... on the outputs using the handshake protocol. We use 26-bits of internal precision, but we only have 15-bits for outputs so what is actually emitted is r ^ (r >> 15).

The very naive algorithm (with the body unrolled once) is

x = 0
loop:
  x = x + 1
  a = b = c = x
  while b != 0:
    if (b & 1) == 1:
      c += a
    a *= 2
    b /= 2
    if (b & 1) == 1:
      c += a
    a *= 2
    b /= 2
  output (c)

which was hand translated (roughly following Introduction to Asynchronous Circuit Design) into a token flow graph:

token-flow graph

Note, we use a simpler, less expensive, construction for the conditional iteration as having independent control-flow for the trivial condition is overkill.

How to test

As mentioned, the data is presented using the standard 4-phase (RTZ) protocol (idle, Req, Req+Ack, Ack, idle, ...). To test, drive ack manually (any sufficiently low frequency should be work) and observe the changes in req and data. If everything works correctly, the circuit should generate 0, 2, 6, ..., x(x+1), but presented on result as x ^ (x >> 15).

To get a continuous stream, simply tie ack to req (however it's possible that data will transition to quickly for the output bandwidth -- TBD).

External hardware

Results are directly observable from the RP2040.

IO

#InputOutputBidirectional
0ackreqresult_7
1result_0result_8
2result_1result_9
3result_2result_10
4result_3result_11
5result_4result_12
6result_5result_13
7result_6result_14

Chip location

Controller Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux tt_um_chip_rom (Chip ROM) tt_um_factory_test (TinyTapeout 8 Factory Test) tt_um_oscillating_bones (Oscillating Bones) tt_um_urish_charge_pump (Dickson Charge Pump) tt_um_bgr_agolmanesh (Bandgap Reference) tt_um_tnt_diff_rx (TT08 Differential Receiver test) tt_um_rejunity_vga_logo (VGA Tiny Logo (1 tile)) tt_um_tommythorn_maxbw (Asynchronous Multiplier) tt_um_mattvenn_r2r_dac_3v3 (Analog 8 bit 3.3v R2R DAC) tt_um_urish_simon (Simon Says memory game) tt_um_rebeccargb_universal_decoder (Universal Binary to Segment Decoder) tt_um_mattvenn_rgb_mixer (RGB Mixer demo5) tt_um_rebeccargb_hardware_utf8 (Hardware UTF Encoder/Decoder) tt_um_find_the_damn_issue (Find The Damn Issue) tt_um_brandonramos_VGA_Pong_with_NES_Controllers (VGA Pong with NES Controllers) tt_um_kb2ghz_xalu (4-bit minicomputer ALU) tt_um_rebeccargb_intercal_alu (INTERCAL ALU) tt_um_a1k0n_demo (Demo by a1k0n) tt_um_rburt16_bias_generator (Bias Generator) tt_um_zec_square1 ("SQUARE-1": VGA/audio demo) tt_um_jmack2201 (Sprite Bouncer with Looping Background Options) tt_um_ran_DanielZhu (Dice) tt_um_gfg_development_tinymandelbrot (TinyMandelbrot) tt_um_LnL_SoC (Lab and Lectures SoC) tt_um_htfab_pi_snake (Pi Snake) tt_um_tt08_aicd_playground (AICD Playground) tt_um_toivoh_demo (Sequential Shadows [TT08 demo competition]) tt_um_quarren42_demoscene_top (asic design is my passion) tt_um_crispy_vga (Crispy VGA) tt_um_MichaelBell_canon (TT08 Pachelbel's Canon demo) tt_um_shuangyu_top (Calculator) tt_um_wokwi_407306064811090945 (DDR throughput and flop aperature test) tt_um_08_sws (Sine Wave Synthesizer) tt_um_favoritohjs_scroller (VGA Scroller) tt_um_tt08_wirecube (Wirecube) tt_um_vga_glyph_mode (Glyph Mode) tt_um_a1k0n_vgadonut (VGA donut) tt_um_roy1707018 (RO) tt_um_analog_factory_test (TT08 Analog Factory Test) tt_um_sign_addsub (CMOS design of 4-bit Signed Adder Subtractor) tt_um_tinytapeout_logo_screensaver (VGA Screensaver with Tiny Tapeout Logo) tt_um_patater_demokit (Patater Demo Kit Waggling Rainbow on a Chip) tt_um_algofoogle_tt08_vga_fun (TT08 VGA FUN!) tt_um_simon_cipher (simon_cipher) tt_um_thexeno_rgbw_controller (Color Controller) tt_um_demosiine_sda (DemoSiine) tt_um_bytex64_munch (Munch) tt_um_alexjaeger_ringoscillator (5MHz Ring Oscillator) tt_um_cfib_demo (cfib Demoscene Entry) tt_um_wokwi_407852791999030273 (Simple 8 Bit ALU) tt_um_Richard28277 (4-bit ALU) tt_um_betz_morse_keyer (Morse Code Keyer) tt_um_nvious_graphics (nVious Graphics) tt_um_tiny_pll (Tiny PLL) tt_um_ezchips_calc (8-Bit Calculator) tt_um_hack_cpu (HACK CPU) tt_um_noritsuna_Vctrl_LC_oscillator (Voltage Controlled LC-Oscillator) tt_um_ring_divider (Divided Ring Oscillator) tt_um_2048_vga_game (2048 sliding tile puzzle game (VGA)) tt_um_morningjava_r2r_from_matt (Bucket Brigade) tt_um_ephrenm_tsal (TSAL_TT) tt_um_kapilan_alarm (Alarm Clock) tt_um_stochastic_addmultiply_CL123abc (Stochastic Multiplier, Adder and Self-Multiplier) tt_um_wokwi_407760296956596225 (tt08-octal-alu) tt_um_dlfloatmac (DL float MAC) tt_um_wakki_0123_Raw_Transistors (Raw_Transistors) tt_um_faramire_rotary_ring_wrapper (Rotary Encoder WS2812B Control) tt_um_devstdin_LDO_OSC (LDO BG IREF OSC) tt_um_frequency_counter (Frequency Counter SSD1306 OLED) tt_um_rom_test (TT08 SKY130 ROM 'YOLO' Test) tt_um_i2c_peripheral_stevej (i2c peripherals: leading zero count and fnv-1a hash) tt_um_yuri_panchul_schoolriscv_cpu_with_fibonacci_program (schoolRISCV CPU with Fibonacci program) tt_um_yuri_panchul_adder_with_flow_control (Adder with Flow Control) tt_um_brailliance (Brailliance) tt_um_nyan (nyan) tt_um_MichaelBell_mandelbrot (VGA Mandelbrot) tt_um_ssp_opamp (2-stage Opamp Designs) tt_um_fountaincoder_top_ad (pulse_add) tt_um_edwintorok (Rounding error) tt_um_mac (MAC) tt_um_dpmu (DPMU) tt_um_JAC_EE_segdecode (7 Segment Decode) tt_um_wokwi_408118380088342529 (Traffic-light-sequence) tt_um_shiftreg_test (TT08 SKY130 Shift Register 'YOLO' Test) tt_um_yuri_panchul_sea_battle_vga_game (Sea Battle) tt_um_benpayne_ps2_decoder (PS2 Decoder) tt_um_meriac_play_tune (Super Mario Tune on A Piezo Speaker) tt_um_comm_ic_bhavuk (Comm_IC) tt_um_daosvik_aesinvsbox (AES Inverse S-box) tt_um_wokwi_408216451206371329 (Logic Test) tt_um_micro_tiles_container (Micro tile container) tt_um_cattuto_sr_latch (TT08 - experiments with latch-based shift registers) tt_um_rejunity_vga_test01 (VGA Drop (audio/visual demo)) tt_um_silice (Warp) tt_um_wokwi_408231820749720577 (Abacus Lock) tt_um_jayjaywong12 (mulmul) tt_um_emmyxu_obstacle_detection (Obstacle Detection) tt_um_neural_navigators (Neural Net ASIC) tt_um_a1k0n_nyancat (VGA Nyan Cat) tt_um_rebeccargb_styler (Styler) tt_um_resfuzzy (resfuzzy) tt_um_cejmu (CEJMU Beers and Adders) tt_um_16_mic_beamformer_arghunter (16 Mic Beamformer) tt_um_pdm_pitch_filter_arghunter (PDM Pitch Filter) tt_um_pdm_correlator_arghunter (PDM Correlator) tt_um_ddc_arghunter (DDC) tt_um_i2s_to_pwm_arghunter (I2S to PWM ) tt_um_georgboecherer_vco (Analog Voltage Controlled Oscillator) tt_um_supermic_arghunter (Supermic ) tt_um_dmtd_arghunter (DMTD ) tt_um_htfab_bouncy_capsule (Bouncy Capsule) tt_um_samuelm_pwm_generator (PWM generator) tt_um_mattvenn_analog_ring_osc (Ring Oscillators) tt_um_toivoh_demo_deluxe (Sequential Shadows Deluxe [TT08 demo competition]) tt_um_vga_clock (VGA clock) tt_um_z2a_rgb_mixer (RGB Mixer demo) tt_um_faramire_stopwatch (Simple Stopwatch) tt_um_micro_tiles_container_group2 (Micro tile container (group 2)) tt_um_johshoff_metaballs (Metaballs) tt_um_top (Flame demo) tt_um_NicklausThompson_SkyKing (SkyKing Demo) tt_um_Electom_cla_4bits (4-bit CLA) tt_um_vga_cbtest (Generate VGA output for Color Blindness Test) tt_um_zoom_zoom (Zoom Zoom) tt_um_dpmunit (DPM_Unit) tt_um_clock_divider_arghunter (Clock Divider ) tt_um_dlmiles_poc_fskmodem_hdlctrx (FSK Modem +HDLC +UART (PoC)) tt_um_emilian_muxpga (TinyFPGA resubmit for TT08) tt_um_pyamnihc_dummy_counter (Dummy Counter) tt_um_whynot (Why not?) tt_um_sudana_ota5t_1 (5-T OTA) tt_um_dlmiles_tt08_poc_uart (UART) tt_um_dendraws_donut (donut) tt_um_wokwi_408237988946759681 (Counter) tt_um_tmkong_rgb_mixer (RGB Mixer) Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available