# T2R2 Tokyo Tech Research Repository

#### Article / Book Information

| Field | Value |
|----------|-------|
| Title | A Study of Bluetooth Low Energy Transceiver Using Ultra-Low-Power Fractional-N Digital PLL Based on Digital-to-Time Converter |
| Author | Hanli Liu |
| Citation | Degree: Doctor (Academic),<br>Conferring organization: Tokyo Institute of Technology,<br>Report number: 甲第11024号,<br>Conferred date: 2018/12/31,<br>Degree type: Course doctor,<br>Examiners: Kenichi Okada, Shigetaka Takagi, Jiro Hirokawa, Kei Sakaguchi, Hiroyuki Ito, Tetsuya Iizuka |
| Type | Doctoral Thesis |

# A Study of Bluetooth Low Energy Transceiver Using Ultra-Low-Power Fractional-N Digital PLL Based on Digital-to-Time Converter

by

#### Hanli Liu

A Ph.D. dissertation submitted in partial satisfaction of the requirements for the degree of

#### **Doctor of Philosophy**

in

#### **Department of Physical Electronics**

in the

#### **Graduate School of Science and Engineering**

of

#### **Tokyo Institute of Technology**

Supervised by Prof. Kenichi Okada

Autumn 2018

To my family,

### Acknowledgment

This thesis stands on the generous help and support that I have received over the two years of my master's study and the three years of my Ph.D. study. I would like to take this chance to acknowledge the people who have helped me throughout my five-year integrated doctoral course at Tokyo Institute of Technology.

First of all, I would like to thank Prof. Kenichi Okada and Prof. Akira Matsuzawa for giving me the wonderful chance to become a member of the Matsuzawa and Okada Laboratory at Tokyo Institute of Technology.
I would like to give sincere thanks to my advisor, Prof. Kenichi Okada, for his valuable advice on both my research and my life in Japan. Throughout these five years, he always found time in his extremely tight schedule to give me advice and encouragement. Without his guidance and trust, this work could not have been finished.

I would also like to thank Prof. Akira Matsuzawa for kindly inviting me to his house every year, which gave me many wonderful experiences of life in Japan. In our conversations, he always shared brilliant ideas with me and broadened my vision.

I would like to thank Assist. Prof. Masaya Miyahara for his valuable advice and technical support. I would also like to thank Yoshino Kasuga and Makiko Tsunashima for their kind help with applications and documents. Their advice and explanations were always very helpful.

I would also like to show my gratitude to the Ph.D. committee members, Prof. Shigetaka Takagi, Prof. Jiro Hirokawa, Prof. Kei Sakaguchi, Prof. Hiroyuki Ito, and Prof. Tetsuya Iizuka, for taking time out of their busy schedules to examine my dissertation. Special thanks go to Prof. Pietro Andreani of Lund University, who spent his precious time attending my examination.

I would especially like to express my gratitude to my seniors, Teerachot Siriburanon, Aravind Tharayil Narayanan, Deng Wei, and Rui Wu, for all their kind help and the rich knowledge that they shared with me.

I would like to thank my colleagues at the Toshiba Corporate Research and Development Center during my three-month internship. Special thanks go to my manager, Akihide Sai, and my advisor, Hidenori Okuni, who gave me a great deal of hands-on experience and technical support for my assigned project.

I would like to thank my group members, Zheng Sun, Dexian Tang, and Hongye Huang, for their supporting work and discussions. We worked as a group towards the same goal, which made the research far less boring and frustrating.
Without their help, this work could certainly not have been done.

Finally, I would like to thank my girlfriend, Yin Ge, for sharing these five wonderful years with me in Japan. Without her, I might not have continued my Ph.D. study after graduating with my master's degree. I also thank my pet cat, Pai Gu, who is always able to make me smile, no matter how good or bad a day I am having. And I would like to express my deepest gratitude to my parents, who have always given me love and support. They gave me life and a good education. They taught me the spirit of never surrendering, and they also taught me how to enjoy life.

#### **Abstract**

This thesis presents an ultra-low-power wireless transceiver for the Bluetooth Low-Energy standard. To achieve ultra-low-power operation together with a low sensitivity level and high blocker immunity, architecture considerations and key building blocks are discussed. A wide loop-bandwidth fractional-N DPLL plays a central role in the presented transceiver: it serves as a frequency modulator for the transmitter and as a local oscillator, an analog-to-digital converter, and a frequency and phase synchronizer for the receiver. To obtain better jitter and spur performance from the DPLL while maintaining low-power operation, techniques such as the isolated constant-slope digital-to-time converter and TDC gain calibration are also discussed in this thesis.
# **Contents**

- Acknowledgment
- Abstract
- 1 Introduction
  - 1.1 Internet-of-Things and Its Available Wireless Standards
  - 1.2 Bluetooth Low-Energy and its Applications
    - 1.2.1 Beacon Mode of Bluetooth Low-Energy
    - 1.2.2 Mesh Network of Bluetooth Low-Energy
  - 1.3 Challenges for BLE Transceiver Design
    - 1.3.1 Transmitter Design and its Challenges
    - 1.3.2 Receiver Design and its Challenges
  - 1.4 Overview of the Thesis
- 2 Fractional-N Digital PLLs for Wireless Communication
  - 2.1 Analog-Type Fractional-N PLLs
  - 2.2 Types of Fractional-N DPLLs
  - 2.3 Building Blocks of Fractional-N DPLL
    - 2.3.1 Time-to-Digital Converter (TDC)
    - 2.3.2 Digital-to-Time Converter (DTC)
    - 2.3.3 LC-Oscillator and Oscillator Phase Noise
- 3 Sub-mW Digital PLL Using Digital-to-Time Converter
  - 3.1 Proposed DPLL System Architecture
  - 3.2 Proposed Isolated Constant-Slope DTC
    - 3.2.1 Concept of Operations
    - 3.2.2 Nonlinear Sources and Circuit Implementations
    - 3.2.3 Simulation Results
  - 3.3 Circuit Implementation
    - 3.3.1 Path-select TDC and TDC Gain Calibration
    - 3.3.2 Reference Doubler and Duty Cycle Calibration
    - 3.3.3 Digital Controlled Oscillator
    - 3.3.4 Coarse PLL Loop
  - 3.4 Measurement Results
  - 3.5 Fractional-N DPLL Towards $200\mu W$
  - 3.6 Conclusion
- 4 Bluetooth Low Energy Transceiver Using Digital PLL
  - 4.1 DPLL-Centric Receiver
    - 4.1.1 DPLL-based ADC with Dynamic Range Enhancement
    - 4.1.2 Wide Loop-Bandwidth Fractional-N DPLL
    - 4.1.3 Hybrid-loop RX with Phase and Frequency Recovery Loop
  - 4.2 Building Blocks of The BLE Transceiver
    - 4.2.1 Receiver Front-End Design
    - 4.2.2 Single-Point Polar-TX
  - 4.3 Measurement Results
  - 4.4 BLE Transceiver Towards 5.0
  - 4.5 Conclusion
- 5 Conclusion and Future Directions
  - 5.1 Conclusion
  - 5.2 Future Direction
    - 5.2.1 Fractional-N DPLL
    - 5.2.2 Bluetooth Low-Energy Transceiver
- A Publication List
  - A.1 Journal Papers
  - A.2 International Conferences and Workshops
  - A.3 Domestic Conferences and Workshops
  - A.4 Co-Author
    - A.4.1 Conferences
    - A.4.2 Journal Papers
    - A.4.3 Domestic Conferences and Workshops
    - A.4.4 Books

# **List of Figures**

- 1.1 The concept of the Internet of Things
- 1.2 Various types of communication that BLE supports
- 1.3 Power consumption breakdown of a Texas Instruments CC26XX BLE device during a connection event
- 1.4 The basic receiver architecture
- 1.5 Wanted and unwanted signals down-converted by a mixer and LO
- 1.6 (a) Conventional data digitization process by the full-range TDC (b) Phase domain diagram
- 1.7 Battery life of the state-of-the-art BLE RX and our target
- 2.1 Topology of CPPLL
- 2.2 Phase-domain fractional-*N* DPLL
- 2.3 Divider-based fractional-N ADPLL
- 2.4 DTC-based fractional-N DPLL implementation
- 2.5 Delay line based flash TDC
- 2.6 Vernier-chain TDC
- 2.7 Linear passive interpolated TDC
- 2.8 Charge domain TDC using SAR ADC
- 2.9 Single-slope TDC
- 2.10 GRO TDC
- 2.11 Concept of DTC
- 2.12 Concept of DTC
- 2.13 Constant slope DTC vs variable slope DTC
- 2.14 Constant slope DTC
- 2.15 Continuous-time comparator with a variable $V_{\text{TH}}$ control
- 2.16 Three types of LC-VCO topology: (a) NMOS type LC-VCO; (b) CMOS type LC-VCO; (c) PMOS type LC-VCO
- 2.17 LC-VCO model
- 2.18 Output waveforms of an ideal and a real oscillator
- 2.19 Output spectrum of (a) ideal and (b) real oscillator
- 2.20 Phase noise spectrum of the LC oscillators
- 3.1 (a) DPLL with the full-range TDC to perform fractional-N operation (b) DPLL with the full-range DTC to perform fractional-N operation, which reduces total power consumption
- 3.2 Detailed block diagram of the proposed ULP DPLL with the proposed 10b isolated constant-slope DTC
- 3.3 (a) DSM-based fractional controller (b) Phase models of MMDIV and DTC
- 3.4 System level analysis of 1st-order and 2nd-order DSM-based fractional controller for low-power DPLL
- 3.5 Variable-slope DTC
- 3.6 Phase noise estimation of the DPLL in the phase domain (without fractional spurs) and the time-domain phase noise simulation at 2442MHz using 52MHz reference (with fractional spurs)
- 3.7 (a) Conventional constant-slope DTC (b) Operations of the conventional constant-slope DTC
- 3.8 (a) Proposed isolated constant-slope DTC (b) Concept operation of the proposed DTC
- 3.9 INL caused by the $V_{\text{TH}}$ offset that is induced by a noisy supply
- 3.10 Conceptual operation diagrams of the proposed 10b isolated constant-slope DTC (a) $C_L$ is isolated from the DAC during DAC operation (Pre-charge step) (b) Charge in $C_C$ is shorted to ground, which sets a new $V_{ST}$ at node B (Set step) (c) Constant slope with the new $V_{ST}$ is compared in the inverter (Compare step)
- 3.11 Detailed circuit of (a) isolated constant-slope DTC and (b) its timing chart
- 3.12 Post-layout INL simulations of the proposed DTC with (a) Corner conditions (b) Supply voltage variations (c) Temperature variations
- 3.13 The Monte-Carlo simulations of the proposed DTC INL
- 3.14 Simulated deterministic jitter power w/ and w/o auto-zero switch when noisy supplies with different frequencies are present, and the deterministic jitter power suppression w/ auto-zero switch
- 3.15 Path-select TDC
- 3.16 (a) Conventional TA calibration (b) TA time-offset induced gain error
- 3.17 Proposed TA gain-and-offset calibration technique (a) TA time-offset calibration (b) TA gain calibration
- 3.18 Proposed TDC gain calibration for minimizing PLL output jitter variation and PLL loop-bandwidth variation
- 3.19 Schematic of the DCO, buffer and MMDIV
- 3.20 Schematic of the DCO, buffer and MMDIV
- 3.21 Always-on coarse PLL with a dead zone of ±64ps which consumes almost zero power after phase lock
- 3.22 Simulated lock transient of the proposed coarse-DPLL
- 3.23 Measurement result of the 4bit TDC at 52MS/s
- 3.24 Measurement result of the proposed DTC at 52MS/s
- 3.25 (a) Measurement result of the proposed DPLL w/o reference doubler (b) Measurement result of the proposed DPLL w/ reference doubler
- 3.26 (a) Measurement result of the proposed DPLL w/o reference doubler with in-band fractional spur (b) Measurement result of the proposed DPLL w/ reference doubler with in-band fractional spur
- 3.27 Measurement result of the fractional spurs vs spur frequencies
- 3.28 (a) Measurement of the TA gain calibration under voltage variations (b) Measurement of the TA gain calibration under temperature variations
- 3.29 Measured lock transient from an initial frequency error of 13MHz
- 3.30 Measured power breakdown of the proposed fractional-N DPLL
- 3.31 FOM comparison with the state-of-the-art fractional DPLLs under 5mW
- 3.32 Chip micrograph
- 3.33 Conventional constant slope DTC
- 4.1 (a) Low-IF receiver architecture (b) Conventional analog phase tracking RX (c) Conventional digital phase tracking RX
- 4.2 (a) Conventional hybrid-loop RX with the DPLL-based ADC (b) Proposed hybrid-loop RX with the dynamic-range enhanced DPLL-based ADC
- 4.3 (a) Concept of conventional open-loop DPLL-based ADC (b) Conversion diagrams
- 4.4 (a) Proposed closed-loop DPLL-based ADC with improved varactor linearity (b) Conversion diagrams
- 4.5 The discrete-time model of the proposed DPLL-based ADC
- 4.6 Plots of the STF, NTF and the attenuation factor with a DPLL bandwidth of 5MHz
- 4.7 Simulated required varactor linear range vs DPLL bandwidth
- 4.8 Schematic of the DAC feedback path
- 4.9 Test bench of the V2F conversion gain
- 4.10 Simulated linearity of (a) V2F conversion w/o DAC feedback (b) V2F conversion w/ ideal DAC feedback (c) non-ideal DAC (d) V2F conversion w/ non-ideal DAC feedback
- 4.11 [caption garbled in extraction] ... domain diagram
- 4.12 [caption garbled in extraction] ... diagram
- 4.13 [caption partly lost in extraction] ... proposed loop-latency reduction (c) Timing chart of the loop-latency reduction
- 4.14 Phase noise simulations of 5MHz-BW DPLL with different loop latencies
- 4.15 Proposed BLE RX baseband with DPLL-based ADC and phase/frequency synchronization loop
- 4.16 (a) Simulated result w/o frequency and phase synchronization loop (b) Simulated results w/ frequency and phase synchronization loop
- 4.17 Proposed DPLL-centric BLE Transceiver
- 4.18 RX front-end implementation with entire 1V-supply
- 4.19 (a) Low power consumption source degenerated LNA with stacked gm-cell (b) LC-Oscillator
- 4.20 (a) Small area and highly balanced stacked balun (b) EM simulation of stacked balun
- 4.21 Block diagrams of single-point polar TX
- 4.22 Measured DPLL phase-noise at 2441.75MHz with TX/RX off
- 4.23 Measured stability of the RX when the large in-band blockers and the desired signal are fed to the RX
- 4.24 Measurement result of the ADC SNDR
- 4.25 (a) Measurement result of the PGA output w/o phase and frequency synchronization loop (b) Measurement result of the PGA output w/ phase and frequency synchronization loop
- 4.26 Measured demodulator with CDR function
- 4.27 Measured BER with phase and frequency synchronization loop when a carrier frequency offset is present in the TX signal
- 4.28 (a) Measurement result of the RX ACR with and without DAC feedback loop (b) Measurement result of the out-of-band blocker tolerance
- 4.29 (a) Measurement result of the TX spectrum mask (b) Measured eye diagram of the single-point polar transmitter
- 4.30 Settling time of DPLL at 0-dBm PA output when DPLL and PA start up simultaneously
- 4.31 Measured power consumption of each building block
- 4.32 Chip photo of BLE TRX
- 4.33 The present TRX RF I/O solution
- 4.34 Harmonics from the present BLE TX
- 4.35 RF-FE with integrated matching network and antenna switch
- 5.1 Low-power DPLL-centric receiver architecture
- 5.2 FOM comparison with the state-of-the-art fractional DPLLs under $5 \text{mW}$
- 5.3 (a) Proposed closed-loop DPLL-based ADC with DC offset at DAC output (b) Conversion diagrams

# **List of Tables**

- 1.1 Transmitter Characteristics
- 1.2 Receiver Characteristics
- 3.1 Comparison Table of The State-of-The-Art DTCs
- 3.2 Comparison Table of The State-of-The-Art fractional-N DPLLs
- 4.1 Comparison Table of The State-of-The-Art BLE 4.0 TR/RX
- 4.2 Major Differences from BLE 4.2
- 5.1 Comparison Table of The State-of-The-Art BLE 4.0 TR/RX

## **Chapter 1**

#### Introduction

# 1.1 Internet-of-Things and Its Available Wireless Standards

The Internet of Things (IoT) covers a huge range of industries and application scenarios. Fig. 1.1 shows several application cases. Machines such as vehicles can be connected over the air and "talk" to each other to ensure safe driving. Devices on or inside the human body can help monitor both the mental and the physical state of a person and share those data with a doctor or family members. Everyday devices such as smartphones, cameras, TVs, and lights will also be connected and cooperate, so that application scenarios can be defined that greatly reduce the amount of manual setting by humans. For example, multiple types of sensors can be placed inside a house: ambient light sensors and proximity sensors automatically turn on the light and adjust its level by sensing whether people are in the room and how bright the ambient light is. Humidity sensors, temperature sensors, and air quality sensors control the air conditioner and the air purifier to regulate the room temperature and remove harmful particles from the air inside the house.
Through these interconnected "smart" devices, the house itself can become intelligent and more comfortable to live in. Finally, city infrastructure will also form a huge network; for example, devices in beacon mode can form a very accurate navigation network that guides people even indoors. All of the above-mentioned wireless networks will benefit greatly from the high-speed wireless internet enabled by the 5G cellular network. In the near future, small networks such as the wireless network in a house will be connected to micro, pico, or even femto gateway devices such as a Wi-Fi router, and those gateway devices will in turn be connected to the internet through 5G base stations, eventually forming a massive network of things. Huge amounts of data can be processed locally or online. Artificial intelligence (AI) can access those data and help people manage most things in daily life, at work, *etc.* A revolutionary era is approaching if IoT and related technologies are fully deployed.

Figure 1.1: The concept of the Internet of Things.

After many years of research and development by industry and academic institutes, many wireless standards have been proposed and refined year after year to realize the IoT era. Different standards are used for different cases in the complicated radio environment. Some of the popular wireless standards for IoT applications are listed here:

**Bluetooth Classic:** A standard that operates at 2.4GHz, in the industrial, scientific and medical (ISM) frequency band, which is globally unlicensed but still regulated. It uses frequency hopping to avoid interference. It specifies a data rate of up to 3Mbps with a maximum communication range of around 100m.

**Bluetooth Low-Energy:** A standard that operates in the 2.4GHz ISM band and uses frequency hopping.
The latest version of the standard, 5.0, specifies data rates of 125kbps, 1Mbps and 2Mbps, and a minimum range of 200m outdoors and about 40m indoors. Other features include the beacon mode and ultra-low energy consumption.

**IEEE 802.15.4:** A standard that specifies low-rate wireless personal area networks (LR-WPANs). It is the basis for the ZigBee, WirelessHART, MiWi and ISA100.11a specifications.

**Zigbee:** A standard that operates in the 2.4GHz ISM band with a 250kbps data rate. The maximum number of nodes in a Zigbee network is 1024. The communication range is approximately 200m.

**WirelessHART:** A standard that provides a robust wireless protocol for the full range of process measurement, control, and asset management applications.

**DigiMesh:** A standard that specifies a proprietary peer-to-peer networking topology for wireless end-point connectivity.

**NFC:** It operates at a center frequency of 13.56MHz using inductively coupled devices, with a data rate of up to 424kbps and a range of a few centimeters.

**ANT:** It is used for wireless sensor networks operating in the 2.4GHz ISM band. It establishes rules for data representation, signaling, co-existence, error detection, and authentication.

**EnOcean:** It is an energy-harvesting wireless technology. It works in the 902MHz, 868MHz, or 315MHz frequency band. The transmission range is 30m indoors and 300m outdoors.

**Wi-Fi:** It operates in the 2.4GHz, 3.6GHz and 4.9/5.0GHz bands with data rates from several Mbps to several Gbps. The common range is up to 100m, and it can be extended.

**RFID:** It operates in the 120-150kHz (LF), 13.56MHz (HF), 433MHz (UHF), 865-868MHz (Europe, UHF), 902-928MHz (North America, UHF), 2450-5800MHz (microwave), and 3.1-10GHz (microwave) frequency bands, with a range from 10cm to 200m. Usage examples include road tolls, building access, *etc*.

**NB-IoT:** It is standardized by the 3GPP. It is based on the existing 3G/4G LTE networks and uses a very narrow data bandwidth.
It features a low data rate and a very long range.

#### 1.2 Bluetooth Low-Energy and its Applications

After decades of evolution since 1994, *Bluetooth*<sup>®</sup> technology has become one of the most popular wireless standards for short- and middle-range communication. Together with Wi-Fi, it is adopted in almost all kinds of wireless communication. The best-known application is its integration in every smartphone, where it replaces conventional wired connections to earphones, keyboards, *etc*. The Institute of Electrical and Electronics Engineers (IEEE) standardized Bluetooth as IEEE 802.15.1, but the IEEE no longer maintains it. A special interest group (SIG) now oversees the development of the standard and protects the trademarks. A manufacturer or developer must satisfy the Bluetooth SIG standards before releasing a product to the market.

The standard defines two modes for different purposes. The classic mode (Bluetooth Classic) targets a high data rate, and the low-energy mode (Bluetooth Low-Energy, BLE) focuses on ultra-low-power (ULP) operation. BLE is the more popular of the two for IoT applications because its very low energy consumption extends battery life, while its moderate data rate still meets most communication requirements. The core specification has already been updated to version 5.0 (BLE 5.0) [1], which adds more features for IoT applications than the previous version 4.2 (BLE 4.0) [2]. This thesis is based on BLE 4.0 because the work was started before BLE 5.0 was released in 2017. The physical layer specifications are given in detail in [1, 2]. BLE devices support four different roles and behave differently depending on which role is selected:

- **Role 1**: It acts as an advertiser that is connectable and can operate as a slave in a connection. An example is a thermometer sensor.
- **Role 2**: It acts as a master device that scans for advertisers and can initiate connections. It can maintain many connections simultaneously. Examples are a computer or a smartphone.
- **Role 3**: It acts as a broadcaster, i.e. a non-connectable advertiser. Good examples are a tag for asset tracking or a pet ID tag.
- **Role 4**: It acts as an observer that scans for advertisements but cannot initiate connections by itself. An example is a display that receives temperature data and shows it.

The first two roles are connectable and the last two are non-connectable. This variety of roles gives good support for different short-range, medium/low-data-rate application scenarios. It is another reason why BLE has become one of the most popular wireless technologies for IoT applications.

#### 1.2.1 Beacon Mode of Bluetooth Low-Energy

In wireless technology, beacon mode is the concept of a battery-driven device that keeps broadcasting small pieces of information to surrounding or passing devices. These small pieces of information may include: 1. ambient data, such as temperature, air pressure, humidity, *etc.*; 2. micro-location data, for uses such as asset tracking, retail, *etc.*; 3. orientation data, such as acceleration, rotation, speed, *etc.* Usually the transmitted data is static, but it may also be dynamic, depending on the application. With BLE, beacons can run for years without a battery change. BLE is an ideal solution for beacon mode, not only because of its low-power operation but also because the BLE ecosystem is already deployed in most smartphones and other BLE-embedded devices. Beacon mode will therefore be an important technology for IoT, and BLE is a suitable standard to support it.

#### 1.2.2 Mesh Network of Bluetooth Low-Energy

As shown in Fig.
1.2, BLE supports a variety of wireless communication types. It supports point-to-point communication, such as audio and video transmission and reception. It also supports broadcast communication, such as beacon advertising. Recently, with BLE 5.0, the Bluetooth SIG added another feature to the BLE standard: the mesh network. It is expected to be the future communication topology for IoT applications. The mesh network originates from the concept of massive network connection. It is a network topology in which a device (called a "node") transmits its own data and at the same time serves as a relay for other nodes. Routing is used to find the most efficient data path. If a hardware failure occurs, many other routes are available to continue the communication, which helps to ensure connectivity and data security.

#### 1.3 Challenges for BLE Transceiver Design

Fig. 1.3 shows an example of a BLE transient power profile from a commercial CC26XX SoC from TI. When the receiver is not active, the power is dominated by the leakage/sleep power. When the desired signal is detected, the receiver is woken up by off-chip triggers, and some pre-processing procedures run before the desired signal is transmitted and received. After the pre-processing, the RX and TX start to work, receiving and transmitting data in a sequence decided by the application. The power consumption is dominated by the TX and RX active power if multiple transmitting and receiving steps are required. After the receiving and transmitting steps are finished, the TRX does some post-processing and enters sleep mode to save power. A typical coin battery is compact, but the energy it contains is limited by its size. For example, an SR44 coin battery has a capacity of only about 150mAh.
Hence, the TX and RX active power should be minimized for the BLE applications. Figure 1.2: Variety types of communications that BLE supports. Figure 1.3: Power consumption of the breakdown of a Texas Instruments CC26XX BLE device during a connection event. #### 1.3.1 Transmitter Design and its Challenges Table 1.1 shows the specifications used to evaluate the BLE transmitter (TX). 40 channels with 2MHz spacing is assigned while the data rate is only 1Mbps with a 0.5 modulation | Channels | | K=0~39 | | |------------------------------------|---------------------------------------|-------------------------|--| | | | (2402MHz+K*2MHz) | | | Modulation Sc | hama | GFSK | | | Wiodulation 50 | Modulation Scheme | | | | Eraguanay Day | ziation | 250kHz | | | Frequency Dev | /lation | (Minimum Value >185kHz) | | | Crosshala Data | | 1Mbps | | | Symbole K | Symbole Rate | | | | Transmit Power | | -20dBm to 10dBm | | | In-band Spur Emission <sup>a</sup> | $ M^b-N^c =2MHz$ | <-20dBm | | | in-band Sput Ellission | M <sup>b</sup> -N <sup>c</sup> ≥3MHz | <-30dBm | | | Harmonic Emission | 2nd Harmonic | <-41dBm | | | Trainfonc Emission | 3rd Harmonic | <-41dBm | | | Center Frequency Drift | Max. Value | ±50kHz | | | Center Prequency Difft | Rate | 400Hz/μs | | Table 1.1: Transmitter Characteristics index. Most of the signal energy concentrates between -500kHz to +500kHz. This large channel spacings between each channel can greatly relax the near channel interference which relax the power consumption for the receiver (RX) design. A center frequency drift tolerance is 50kHz which maximumly relaxes the local oscillator (LO) specifications. Even a free running oscillator could be adopted in the design by this relaxed condition. However, in order to reduce the influence to other receiving devices such as Wi-Fi RX, the second harmonics should be greatly suppressed. 
A -41dBm of the 2nd harmonic of the transmitting frequency should be satisfied, which brings challenges when TX delivers a 10dBm signal. The harmonic suppression ratio should be over 51dBc to satisfy the FCC regulation. #### 1.3.2 Receiver Design and its Challenges Table 1.2 lists the most common requirements for the BLE RX design. A minimum sensitivity of -70dBm is required for the TX. However, most of the commercial applications required a less than -90dBm sensitivity. The improved sensitivity not only improves the receiving range but also help reduce the transmitting power level. The reduced power level from TX can greatly reduce its influence on other surrounded receivers. The max- <sup>&</sup>lt;sup>a</sup> An adjacent channel power is specified for channels at least 2 MHz from the carrier. Power is integrated in 1MHz bandwidth. <sup>&</sup>lt;sup>b</sup> Center frequency. <sup>&</sup>lt;sup>c</sup> Adjacent channel frequency. | Sensitivity | | <-70dBm | |-----------------------------------------|------------------|-------------| | Max. Input I | Max. Input Power | | | PER <sup>a</sup> (BE | R) | 30.8%(0.1%) | | | 0MHz | -21dB | | Adjacent Channel Rejection <sup>b</sup> | 1MHz | -15dB | | | 2MHz | 17dB | | | ≥3MHz | 27dB | | | 30MHz-2000MHz | -30dBm | | Blocker Power <sup>b</sup> | 2000MHz-2400MHz | -35dBm | | BIOCKEI FOWEI | 2500MHz-3000MHz | -35dBm | | | 3000MHz-12.75GHz | -30dBm | Table 1.2: Receiver Characteristics <sup>&</sup>lt;sup>b</sup> The desired signal is used for the measurements. Figure 1.4: The basic receiver architecture imum input power tolerance is required as -10dBm which results a minimum dynamic range requirement of -70dBm-(-10dBm)=60dB for the entire receiver. The blocker performances are separated as two kinds: the in-band blocker tolerance/the adjacent channel rejection (ACR) and the out-band blocker tolerance. They are measured using a desired signal ( $S_{\rm desire}$ ) as the transmitting signal from TX. 
The blocker signals ( $S_{\rm blocker}$ ) are specified at different offset frequencies from $S_{\rm desire}$ . $S_{\rm desire}$ is specified as 3 dB higher than the specified sensitivity level of -70 dBm, i.e., -67 dBm. $S_{\rm desire}$ and $S_{\rm blocker}$ are both specified as 1 Mbps GFSK signals with a modulation index of 0.5 and a BT of 0.5. The modulating data for $S_{\rm desire}$ is a PRBS9 code, while PRBS15 is used for $S_{\rm blocker}$ .

<sup>a</sup> Packet error rate.

From this table, the design challenges can be explained by roughly deriving the electrical specifications for the RX design. The basic architecture of a modern RX is shown in Fig. 1.4. A radio frequency (RF) front-end (FE) follows the antenna and amplifies the desired signal $S_{\rm desire}$ with minimum added noise while suppressing the interference signal $S_{\rm blocker}$ . It also acts as a frequency translator, using an LO to shift the center frequency of $S_{\rm desire}$ to a lower frequency in order to relax the requirements of the following stages. An analog-to-digital converter (ADC) can be placed after the RF-FE to convert the analog signal into a digital signal. The digital modem then post-processes the baseband signal and decodes it into 0s and 1s, which digital computers can interpret as any kind of information.

The RX sensitivity is mainly decided by the noise from the RF-FE and the signal-to-noise ratio ( $SNR_{\rm modem}$ ) required by the digital modem to satisfy the specified bit error rate (BER). For an FSK baseband signal such as BLE, 12 dB is a reasonable estimate of $SNR_{\rm modem}$ . For a -95 dBm sensitivity, the required noise figure of the RF-FE is:

$$NF = \text{Sensitivity} - 10\log_{10}(kT \cdot B) - SNR_{\text{modem}} = 7\,\text{dB}$$
 (1.1)

where B is the bandwidth of the baseband signal.
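As a quick numerical check of Eq. (1.1), the following minimal sketch evaluates the noise figure budget. The temperature of 290 K and the bandwidth B = 1 MHz are assumptions made here for illustration (the text does not state them explicitly); the function names are hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant k

def thermal_noise_dbm(bandwidth_hz, temperature_k=290.0):
    """Thermal noise power kTB expressed in dBm (290 K is an assumption)."""
    noise_watts = BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz
    return 10.0 * math.log10(noise_watts / 1e-3)

def required_noise_figure_db(sensitivity_dbm, bandwidth_hz, snr_modem_db):
    """Eq. (1.1): NF = sensitivity - 10*log10(kTB) - SNR_modem."""
    return sensitivity_dbm - thermal_noise_dbm(bandwidth_hz) - snr_modem_db

# -95 dBm sensitivity and 12 dB modem SNR come from the text; 1 MHz is assumed.
nf_db = required_noise_figure_db(sensitivity_dbm=-95.0,
                                 bandwidth_hz=1e6,
                                 snr_modem_db=12.0)
print(f"Required RF-FE noise figure: {nf_db:.1f} dB")
```

With these assumptions the thermal noise floor is about -114 dBm, so the budget reproduces the roughly 7 dB noise figure quoted in Eq. (1.1).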
This puts a stringent noise requirement on the amplifier design in the RF-FE, which will burn a significant amount of current to keep its noise contribution low on top of the antenna noise (kT noise).

Another challenge comes from the large input power of -10 dBm. Because the conversion gain from the input $I_{RF}$ to the output $V_{BB}$ cannot be perfectly linear, as shown in Fig. 1.4, a large input signal causes desensitization of the RX gain and produces intermodulation products such as 3rd-order intermodulation distortion (IMD3). The intercept point (IP) is used to evaluate this performance. The 3rd-order IP can be computed as:

$$IIP3 = -10\,\text{dBm} + SNR_{\text{modem}}/2 = -4\,\text{dBm}$$
 (1.2)

The larger the IIP3 is, the larger the power budget required to maintain the linearity. These noise and linearity requirements can be integrated into one specification, the RX dynamic range. Here a dynamic range of -10 dBm - (-95 dBm) = 85 dB is required for the RX. Assume the conversion gain of the RF-FE is $G_{RF}$ and the SNDR of the ADC is $SNR_{ADC}$ . We then obtain:

$$SNR_{ADC} + G_{RF} - NF - SNR_{modem} = 85\,\text{dB}$$

$$\Rightarrow SNR_{ADC} + G_{RF} = 104\,\text{dB}$$
 (1.3)

Such a high dynamic range requires either a high gain in the RF-FE or a high resolution in the ADC. If a high $G_{RF}$ is designed, not only does the increased gain itself require more current, but the linearity is also hard to maintain with a limited power budget, as shown in Eq. (1.2). If the effective resolution of the ADC is enlarged instead, the power also increases significantly.

Figure 1.5: Wanted and unwanted signals down-converted by a mixer and LO.

Another challenge for the RX is the blocker immunity, especially the ACR. Basically, analog filters are embedded to filter out the additional energy from the blockers. If a linear RF-FE and an ideal ADC are assumed, the suppression required for the ACR at 3 MHz offset is at least 39 dB, which means a 3rd-order low-pass filter (LPF) is required for a 500 kHz bandwidth.
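The filter-order estimate can be checked with a short sketch. It assumes a Butterworth (maximally flat) response, which the text does not specify, and takes the 39 dB requirement as the 27 dB ACR at 3 MHz offset from Table 1.2 plus the 12 dB $SNR_{\rm modem}$; the function names are hypothetical.

```python
import math

def butterworth_attenuation_db(order, f_hz, cutoff_hz):
    """Attenuation of an ideal Butterworth low-pass filter at frequency f_hz."""
    return 10.0 * math.log10(1.0 + (f_hz / cutoff_hz) ** (2 * order))

def minimum_order(required_db, f_hz, cutoff_hz, max_order=10):
    """Smallest filter order that meets the required attenuation at f_hz."""
    for order in range(1, max_order + 1):
        if butterworth_attenuation_db(order, f_hz, cutoff_hz) >= required_db:
            return order
    raise ValueError("requirement not met within max_order")

required_db = 27.0 + 12.0          # ACR at >= 3 MHz plus modem SNR
order = minimum_order(required_db, f_hz=3e6, cutoff_hz=500e3)

for n in (2, 3):
    att = butterworth_attenuation_db(n, 3e6, 500e3)
    print(f"order {n}: {att:.1f} dB at 3 MHz offset")
print(f"minimum order: {order}")
```

Under this assumption a 2nd-order filter gives only about 31 dB at 3 MHz, while a 3rd-order filter gives about 47 dB, which is consistent with the 3rd-order requirement stated above.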
Notice that any non-ideal effects from the RF-FE and ADC greatly increase the suppression required from the LPF, which increases the power consumption.

Last but not least is the nonideality of the LO. Fig. 1.5 shows the frequency translation process inside the RF-FE. A mixer acts as a multiplier that moves high-frequency signals to a lower frequency by using an LO signal. As shown in Fig. 1.6(a), if an ideal LO is used, the wanted signal $S_{\text{wanted}}$ is down-converted to an intermediate frequency ( $f_{\text{IF}}$ ) while an unwanted signal such as a blocker is down-converted to $f_{\text{unwanted}} - f_{\text{LO}} + f_{\text{IF}}$ . However, a real local oscillator, whether a free-running oscillator or a phase locked loop (PLL), produces noise around the ideal LO frequency, *i.e.*, phase noise. As shown in Fig. 1.6(b), part of the unwanted signal is then also down-converted to $f_{\text{IF}}$ , and this component $S_{\text{mx,noise}}$ cannot be distinguished from the wanted signal after conversion. This process is well known as reciprocal mixing. Because of this LO nonideality, the phase noise level and the spur energy of the LO must be well constrained at the blocker frequencies. For the ACR at 3 MHz, the phase noise at 3 MHz offset frequency is calculated as:

$$PN_{\text{ACR} \geqslant 3\text{MHz}} = S_{\text{desired}} - S_{\text{blocker}} - SNR_{\text{modem}} - 10\log_{10}(B) = -99\,\text{dBc/Hz}$$
 (1.4)

Figure 1.6: (a) Conventional data digitization process by the full-range TDC (b) Phase domain diagram.

The spurs from the LO, such as the fractional spur and the reference spur, also cause reciprocal mixing, which can degrade the SNR of the desired signal.
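As a numerical check of the phase noise requirement in Eq. (1.4), the sketch below applies the same budget to the ACR values of Table 1.2. The bandwidth B = 1 MHz is an assumption inferred from the -99 dBc/Hz result, and the 2 MHz figure is only an illustration of the same formula, not a number stated in the text.

```python
import math

def phase_noise_limit_dbc_hz(acr_db, snr_modem_db, bandwidth_hz):
    """Eq. (1.4) with S_desired - S_blocker = -acr_db (blocker above the signal)."""
    return -acr_db - snr_modem_db - 10.0 * math.log10(bandwidth_hz)

SNR_MODEM_DB = 12.0   # estimate used in the text
BANDWIDTH_HZ = 1e6    # assumed baseband bandwidth

pn_3mhz = phase_noise_limit_dbc_hz(27.0, SNR_MODEM_DB, BANDWIDTH_HZ)
pn_2mhz = phase_noise_limit_dbc_hz(17.0, SNR_MODEM_DB, BANDWIDTH_HZ)

print(f"phase noise limit at >= 3 MHz offset: {pn_3mhz:.0f} dBc/Hz")
print(f"phase noise limit at 2 MHz offset:    {pn_2mhz:.0f} dBc/Hz")
```

The stronger the tolerated blocker relative to the desired signal, the lower the LO phase noise must be at that offset, which is why the 3 MHz case is the tighter of the two.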
For offset frequencies above 3 MHz, the maximum spur level can be calculated as:

$$Spur_{ACR \ge 3MHz} = S_{desired} - S_{blocker} - SNR_{modem} = -39\,\text{dBc}$$
 (1.5)

In conclusion, the main challenge for the BLE RX comes from the limited power budget when high sensitivity and high blocker tolerance are required simultaneously. Fig. 1.7 shows the battery life of state-of-the-art RXs using a coin battery (SR44). A duty cycle of 600 $\mu$ s over 1 s (0.06%) is assumed for enabling and disabling the RX. Due to the small duty cycle, the leakage power of the RX in sleep mode should be taken into consideration. From the data sheet of TI's most recent BLE SoC series (CC26XX), the leakage power can be assumed to be around 100 nW in sleep mode. At this level the battery could sustain the receiver in sleep mode for 190 years, so the leakage is negligible for RX operation. If we assume that the RX requires zero start-up time, the RX active power dominates the RX system power. An active power of under 2.5 mW can keep the RX system working for 19 years.

Figure 1.7: Battery life of the state-of-the-art BLE RX and our target.

#### 1.4 Overview of the Thesis

The aim of this thesis is to investigate and achieve a fully integrated BLE transceiver using advanced CMOS technology toward future IoT technology. The thesis is organized as follows:

Chapter 1 begins with an overview of the wireless standards for IoT applications and points out the importance of the BLE standard. Chapter 1 also analyzes the design challenges for the BLE TRX design when considering the PHY specifications. The reciprocal mixing effect is explained in detail for the RX design, and the corresponding phase noise and spur requirements of the DPLL are calculated.

Chapter 2 introduces some fundamentals and essential features of both the analog-type fractional-N PLL and the digital-type fractional-N PLL. The focus of Chapter 2 is on the digital-type PLL toward advanced CMOS technology.
Different DPLL architectures are introduced to show the trade-offs between the power, jitter, locking time, etc. for each DPLL architecture. Then, various time-to-digital converter (TDC) and DTC architectures are introduced and reviewed for their power, jitter, and linearity trade-offs. Finally, the phase noise of the LC oscillator is introduced for understanding the noise to the phase noise conversion mechanism. Chapter 3 presents the proposed ULP fractional-N DPLL, which achieves sub-mW operation, a worst-case fractional spur of -56 dBc, and a jitter-power FOM of -246 dB. A 1st-order DSM-based fractional-division controller is discussed and analyzed for demonstrating its potential to support the sub-mW operation for a DPLL with a good jitter performance. A 10-b isolated constant-slope DTC is proposed to demonstrate the linearity and power efficiency improvements over the conventional DTC architectures. The gain-and-offset calibration of the time amplifier (TA) is introduced to help minimize the inband phase noise degradation by both TA gain error and the TA offset. Finally, the whole fractional-N DPLL design is carried out and measured with the proposed techniques. The comparison with the state-of-the-art fractional-N DPLL is summarized for demonstrating the merits of using the proposed techniques. Chapter 4 introduces a BLE transceiver that utilized the proposed DPLL above as a central component. The embedded low-power wide-bandwidth fractional-N DPLL performs as 1) DPLL-based analog-to-digital converter (ADC) for the RX; 2) local oscillator (LO) for the RX; 3) phase and frequency synchronizer for the RX; 4) direct frequency modulator for the TX. The multi-function of the DPLL minimizes the power consumption of the TRX. The wide-bandwidth DPLL supports the single-path demodulation method for reducing the conventional I/Q branches to only I channel in RX. 
Hence, the number of required analog baseband circuits is reduced by half, which helps reduce the RX power consumption significantly. Besides, the dynamic range of the DPLL-based ADC is enhanced by 18 dB thanks to the proposed DAC feedback structure, which substantially improves the sensitivity and blocker performance. Finally, the entire BLE TRX is introduced and evaluated; it achieves record-low power consumption when compared with other state-of-the-art BLE TXs/RXs.

Chapter 5 concludes the thesis and the presented studies. Finally, future work is discussed for further developing the research presented in this thesis.

### **Chapter 2**

# Fractional-N Digital PLLs for Wireless Communication

#### 2.1 Analog-Type Fractional-N PLLs

Due to the limited quality factor of the resonating circuits and the unpredictable control gain, a free-running oscillator alone cannot be used as a local oscillator frequency source or as a clock synthesizer. By using a negative feedback system and a high-quality-factor reference clock to control the free-running oscillator, a frequency synthesizer, i.e., a phase locked loop (PLL), can be realized. The analog PLL, compared with its more recent counterpart, the digital PLL (DPLL), still has better phase noise performance. The charge-pump PLL (CPPLL), one of the most popular analog PLLs, has been widely researched and developed. The topology of a CPPLL is shown in Fig. 2.1. The phase/frequency detector (PFD) compares the reference and feedback clocks and generates the frequency/phase error. The charge pump converts the frequency/phase error into a current that drives the loop filter, which produces the correction signal for the VCO. The feedback signal is divided by a frequency divider. If this divider operates at an integer value, the loop is called an integer-N PLL. If a fractional divider is applied in the feedback path, it is called a fractional-N PLL.
The number of poles at the origin of the open-loop transfer function decides the type of the control system. If the PLL system has only the one pole introduced by the oscillator, it is called a type-I PLL. Type-I systems respond quickly compared with higher-type PLLs and are very stable. However, they have worse phase noise performance. The type-II PLL shown in Fig. 2.1 has an additional pole at the origin introduced by the integrating capacitor C2. The static phase error goes to zero when the loop is locked, and the loop exhibits good phase noise performance.

Figure 2.1: Topology of CPPLL.

Figure 2.2: Phase-domain fractional-N DPLL.

#### 2.2 Types of Fractional-N DPLLs

The analog charge-pump PLL (CPPLL) has been the preferred PLL architecture for fractional-N frequency synthesis. A CPPLL can realize excellent jitter and spurious performance with small power consumption, *i.e.*, a high FOM. However, an analog PLL loop filter requires a large area and is very difficult to reconfigure. Furthermore, in advanced CMOS technology, digital circuits are preferred as replacements for analog counterparts such as filters, calibrations, *etc*. To overcome these shortcomings, the digital fractional-N PLL, which requires no large capacitors, has been investigated and studied. Thanks to its digitally intensive implementation, the loop characteristics are much easier to reconfigure. The digital PLL is also much easier to port from one process to another.

Figure 2.3: Divider-based fractional-*N* ADPLL.

Fig. 2.2 shows the first proposal of the fractional-N DPLL. It uses a counter and a time-to-digital converter (TDC) to count the phase values of the oscillator. A reference clock serves as a clean timer to read out the counted phases. The integer counter, which works at the oscillator frequency, counts every oscillator period. Its output can be represented as an integer phase.
However, if only the counter is implemented, there is always a residue phase within one oscillator period that cannot be extracted due to the limited resolution of the integer counter. Hence, a fractional counter with much finer resolution is required to cover one oscillator period and assist the integer counter. This fractional counter is the TDC. The integer phase and fractional phase of the oscillator within one reference period are quantized and synchronized by the reference clock. Because a frequency control word (FCW) is more commonly used than a phase control word (PCW) in frequency synthesizer applications, the integer phase and fractional phase are differentiated into integer frequency and fractional frequency information by a digital differentiator. The result is then compared with the desired FCW at the PLL input. The resulting error is the frequency error, and the phase error is produced by the integration inside the digital loop filter (DLF). A digitally controlled oscillator (DCO), whose oscillation frequency is controlled by digital codes, is implemented. This architecture is well known for its pure phase-domain operation, in which all the phase information of the DCO is derived by the counter and the TDC.

Figure 2.4: DTC-based fractional-N DPLL implementation.

Another well-known fractional-N DPLL architecture is shown in Fig. 2.3. This architecture is very similar to a CPPLL, using a delta-sigma modulator (DSM) and a multi-modulus divider (MMDIV) in the feedback path. The MMDIV is a frequency divider as well as a phase integrator. The FCW is the input of the DSM, which is clocked at the reference clock rate. The long-bit-width FCW is modulated into a short-bit-width FCW that has the same length as the MMDIV control code. The modulated output dithers the phase integration ratio of the MMDIV. On average, the MMDIV produces a phase integration ratio that equals the integration of the FCW every reference clock.
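The dithering described above can be illustrated with a minimal behavioral sketch of a 1st-order DSM, modeled here as a simple accumulator whose carry selects between two adjacent divide ratios. The FCW value of 92 + 3/8 and the function name are hypothetical choices for illustration, and the model ignores all circuit-level effects.

```python
from fractions import Fraction

def first_order_dsm(fcw, n_cycles):
    """Return (divide_ratio, residue) per reference cycle for a 1st-order DSM.

    The fractional part of the FCW is accumulated every cycle; a carry
    makes the divider use N+1 instead of N. The residue is the accumulated
    phase quantization error in units of one DCO period.
    """
    integer_part = fcw.numerator // fcw.denominator
    frac_part = fcw - integer_part
    acc = Fraction(0)
    sequence = []
    for _ in range(n_cycles):
        acc += frac_part
        carry = 1 if acc >= 1 else 0
        acc -= carry
        sequence.append((integer_part + carry, acc))
    return sequence

fcw = Fraction(92) + Fraction(3, 8)       # hypothetical FCW = 92.375
sequence = first_order_dsm(fcw, n_cycles=8)
ratios = [ratio for ratio, _ in sequence]
residues = [residue for _, residue in sequence]

print("divide ratios:", ratios)
print("average ratio:", float(Fraction(sum(ratios), len(ratios))))
print("max residue  :", float(max(residues)), "DCO periods")
```

Over the eight cycles the divider toggles between 92 and 93 so that the average equals the FCW, while the instantaneous residue always stays below one DCO period. That residue is the phase quantization error the phase detector has to handle in a divider-based loop.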
If the phase from the MMDIV is not equal to the reference phase, the TDC produces a corresponding phase error. The DLF filters the phase error and controls the DCO accordingly. The TDC plays a role similar to that of the phase frequency detector (PFD) and charge pump in a CPPLL. The MMDIV also translates the DCO frequency down to the reference rate, hence the frequency error is also detected at the TDC simultaneously. This architecture is known as the divider-based DPLL.

There are several differences in designing these two architectures. The phase-domain DPLL requires strictly synchronized operation of the counter and the TDC to count the DCO phase within one reference period, and the TDC range should be exactly one DCO period. Any mismatch between the TDC range and the DCO period causes significant fractional spurs in fractional mode. For the divider-based DPLL, the TDC range should cover at least one DCO period if a 1st-order DSM is applied, and a higher DSM order results in a much wider TDC range. However, the TDC does not need to be synchronized with any other circuit. In the phase-domain DPLL, the TDC and counter both work at the DCO frequency, while in the divider-based architecture the TDC input runs at the lower reference rate.

As mentioned above, for the divider-based fractional-N DPLL, the TDC requires a quantization range of at least one DCO period. As a phase calculator, its linearity is very important for achieving low fractional spurs. This leads to large power consumption in the TDC, which is undesirable in low-power applications such as BLE. Fundamentally, this large phase error is caused by the quantization error of the MMDIV, whose phase resolution is limited to one DCO period because it performs integer integration. If this error can be minimized, the phase error at the TDC input is reduced. Since the MMDIV is an integer phase integrator, a fractional integrator with much higher resolution is required.
A digital-to-time converter (DTC) is proposed to realize fractional time resolution, while a digital integrator is used to integrate the fractional phase, as shown in Fig. 2.4. As a result, the DTC reduces the quantization error from the MMDIV to a few DTC resolution steps, hence a narrow-range TDC can be implemented. This greatly cuts the TDC power consumption. In the extreme case, even a 1-bit TDC, i.e., a bang-bang phase detector (BBPD), can be used for fractional operation. However, as discussed in ref. [3], a BBPD suffers from a very limited quantizing range and an ill-defined gain. The limited quantizing range slows the convergence of the PLL through the well-known cycle-slip effect and also slows the settling of the DTC gain calibration (LMS). A multi-level TDC is desired for achieving faster convergence and a well-defined PD gain, which is preferable for wireless applications [3].

# 2.3 Building Blocks of Fractional-N DPLL

## **2.3.1** Time-to-Digital Converter(TDC)

As one of the most important building blocks in DPLL operation, the time-to-digital converter (TDC) converts the phase information of the DCO into digital codes with sufficient accuracy, *i.e.*, the effective resolution of the TDC. As for any data converter, the power consumption, the quantization noise, the linearity, and the full scale of the TDC are important factors that must be optimized to satisfy the system requirements. In recent years, several different types of TDCs have been developed to achieve high resolution and good linearity, *i.e.*, the flash type, the charge-based type, and the noise-shaping type.

#### Flash Type TDC

The flash-type TDC is the most commonly seen TDC because of its simplicity in design and because its resolution improves with advanced CMOS technology. Just as in a flash-type ADC, analog interpolation is required to generate multiple references to compare with.
The flash-type ADC uses a resistor string to interpolate a clean reference voltage, and the signal is compared with each interpolated voltage reference using a voltage comparator. The voltage drop across each resistor is the designed voltage resolution. Using the same mechanism, the flash-type TDC uses a CMOS buffer chain to interpolate a clean reference clock. It is also called the "delay line based TDC", as shown in Fig. 2.5. The reference clock $CK_1$ is interpolated by buffers with an intrinsic delay of $t_1$ to generate multiple clock edges. Then, $CK_2$ is compared with each edge by high-precision D flip-flops (DFFs) to derive the lead and lag relationship. The thermometer codes are converted into a binary output by a decoder. The buffer delay shrinks as the technology evolves, which improves the resolution. However, even with 7 nm technology, the finest achievable resolution is around 5 ps. For a typical high-performance fractional-N DPLL, sub-ps resolution is required for lower in-band phase noise. Furthermore, the buffer delay mismatch greatly degrades the INL performance if a large phase quantizing range is desired.

Figure 2.5: Delay line based flash TDC.

In order to improve the resolution, the vernier-chain TDC is proposed, as shown in Fig. 2.6. Compared with the delay-line based TDC, another buffer chain is inserted in the $CK_2$ path. By doing so, the minimum quantization step becomes $t_1 - t_2$ , which can be very small regardless of the technology. However, a large quantization range requires very long buffer chains in both paths, which degrades the linearity while consuming large power.

Another technique to interpolate the time is the so-called "Local Passive Interpolation (LPI) TDC" [4], as shown in Fig. 2.7. A resistor chain is inserted between the input and the output of the delay cell to acquire more phases from one delay unit.
The internal phases are more robust across PVT than those of the vernier-chain TDC. However, higher linearity requires smaller resistors, which greatly increases power dissipation. Other methods that increase the flash-type TDC resolution by using a time amplifier (TA) are reported in [3, 5, 6]. The TA amplifies the input time difference by its gain, which effectively increases the TDC time resolution. However, the TA suffers from gain and offset variation across PVT, which makes it less practical.

Figure 2.6: Vernier-chain TDC.

Figure 2.7: Linear passive interpolated TDC.

## **Charge-based Type TDC**

The charge-based TDC is another choice for realizing a high-performance TDC [7, 8]. One of the simplest ways is to convert the charge information into a digital quantity using an actual ADC [8], as shown in Fig. 2.8. A PFD and charge pumps are used to derive the input phase difference and convert the phase error into charge information. The SAR ADC that follows quantizes the charge on its capacitor into digital codes. The conversion gain is decided by the charge pump gain, and the resolution is decided by the charge pump current and the ADC resolution. The merits of this architecture are that the charge pump shares its integrating capacitor with the capacitor DAC (CDAC) inside the SAR ADC, and that its resolution can reach sub-ps with a sufficient quantization range. However, the two individual charge pumps required for the differential ADC inputs suffer from mismatch. Furthermore, the up-down current mismatch is also a critical factor that degrades the linearity.

Figure 2.8: Charge domain TDC using SAR ADC.

The single-slope TDC (SS-TDC) is proposed to realize high resolution and good linearity with lower power consumption [7], as shown in Fig. 2.9.
Instead of using two charge pumps, a single current source is used to generate both a slope $V_F$ that is proportional to the phase error $\Phi_F$ and the "amplifier" slope $V_{RAMP}$ , by using capacitors $C_R$ and $C_F$ with an N:1 ratio. When the current source starts charging $C_R$ , a start signal is generated and the value of a counter running at $f_{CNT}$ is recorded as CNT(N). When $V_{RAMP}$ reaches the voltage level of $V_F$ , a stop signal is generated by a comparator and the counter value CNT(N+1) is recorded. Hence, the time difference $\Delta T_{ERR}$ can be computed using the following equation:

$$\Delta T_{\text{ERR}} = \left(CNT(N+1) - CNT(N)\right) \cdot \frac{1}{f_{\text{CNT}}} \cdot \frac{1}{N}$$
 (2.1)

This method shares the current source between the two conversion steps, which cancels the nonlinearity of the charging process. It achieves high linearity with good power efficiency. However, the resolution depends on how fast the counter runs, which potentially limits the maximum achievable resolution at a high reference clock frequency.

## **Noise-Shaping TDC**

As is widely known, the noise-shaping technique can improve the effective resolution by shaping the noise out of the spectrum of interest. Just like a noise-shaping ADC that pushes the low-frequency quantization noise power to high frequencies, the noise-shaping TDC can push the jitter power to higher frequencies. The best-known noise-shaping TDC architecture was proposed in [9] and is known as the gated-ring-oscillator (GRO) TDC. As shown in Fig. 2.10, when the *Enable* signal is high, the counter counts the phases of the oscillator. The counted data is sampled at the falling edge of *Enable*, and the counter is then reset. The phase information at the end of *Enable* is preserved by the gating inverters even though the counter is reset. Consequently, the next count starts from the previously stored residue phase, which results in continuous phase counting over time.

Figure 2.9: Single-slope TDC.
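The residue-preserving behavior of the GRO TDC can be sketched with an idealized behavioral model. It assumes a perfectly held phase between measurements (no leakage, charge sharing, or dead zone) and uses integer femtoseconds to avoid rounding; the 5 ps phase delay, the interval values, and the function name are hypothetical.

```python
def gro_tdc(intervals_fs, phase_delay_fs):
    """Idealized gated-ring-oscillator TDC.

    Each Enable pulse advances the oscillator by its duration. The counter
    reports the number of whole phase steps and is then reset, while the
    unfinished fraction of a step (the residue) is held for the next pulse.
    """
    residue_fs = 0
    codes = []
    for interval_fs in intervals_fs:
        total_fs = residue_fs + interval_fs
        codes.append(total_fs // phase_delay_fs)   # counted phase steps
        residue_fs = total_fs % phase_delay_fs     # phase held while gated
    return codes, residue_fs

PHASE_DELAY_FS = 5000                         # hypothetical 5 ps per phase step
intervals = [12340, 9870, 15010, 11000, 7777]  # hypothetical Enable widths in fs

codes, final_residue = gro_tdc(intervals, PHASE_DELAY_FS)
measured_total = sum(codes) * PHASE_DELAY_FS
accumulated_error = sum(intervals) - measured_total

print("codes:", codes)
print("accumulated error (fs):", accumulated_error)
```

Because the residue carries over, the error of each code is the difference between the previous and the current residue. The accumulated error therefore never exceeds one phase step no matter how many measurements are taken, which is the first-order shaping property in this simplified model.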
The differentiator that follows the counter shapes the quantization noise from the oscillator and counter in a first-order manner. However, the charge sharing issue and the current leakage from the *Enable* path greatly affect the TDC linearity. Furthermore, the GRO TDC also suffers from a dead-zone issue when the *Enable* pulse is very short.

Figure 2.10: GRO TDC.

## **2.3.2** Digital-to-Time Converter(DTC)

As discussed above, a digital-to-time converter (DTC) is used to reduce the dynamic range required of the TDC, so that a TDC with a narrower dynamic range can be used. It improves the TDC resolution and linearity while achieving high energy efficiency in the phase quantization process. As with the TDC, the power consumption, linearity, resolution, and noise are important in DTC design.

In general, a DTC is a circuit that generates delays instead of quantizing phase errors. It can be designed much more efficiently than a TDC. To get a delayed rising edge, a ramp with a slew rate of $SR_{Ramp}$ and a comparator with a threshold of $V_{TH}$ are required, as shown in Fig. 2.11. The delay can be written as:

$$t_{\text{delay}} = \frac{V_{\text{TH}} - V_{\text{Ref,p}}}{SR_{\text{Ramp}}} \tag{2.2}$$

Figure 2.11: Concept of DTC.

From Eq. (2.2), the delay can be varied by changing the comparator threshold $V_{\rm TH}$ , the starting voltage of the ramp $V_{\rm Ref,p}$ , or the slew rate of the ramp $SR_{\rm Ramp}$ . Each option has different design trade-offs, which are discussed in the following subsections.

## Variable Slew Rate (Variable-Slope) DTC

The most commonly seen DTC [10], shown in Fig. 2.12, uses a so-called variable-slope method to acquire different delays. A current source generates the ramp signal, triggered by the input. A digitally controlled capacitor bank is used to vary the slew rate:

Figure 2.12: Concept of DTC.

$$SR_{\text{Ramp}} = \frac{I_{\text{Current}}}{C_{\text{bank}}} \tag{2.3}$$

We can write the achievable delay range $(DR_{DTC})$ as:

$$DR_{\rm DTC} = V_{\rm TH} / \frac{I_{\rm Current}}{C_{\rm bank}} = V_{\rm TH} \cdot \frac{C_{\rm bank}}{I_{\rm Current}}$$
 (2.4)

If we assume the current is constant while charging and discharging $C_{\text{bank}}$ , we can write the DTC power consumption as:

$$P_{\rm DTC} = V_{\rm DD} \cdot I_{\rm Current} \cdot f \cdot V_{\rm DD} / \frac{I_{\rm Current}}{C_{\rm bank}} = V_{\rm DD}^2 \cdot C_{\rm bank} \cdot f$$
 (2.5)

where f is the operating frequency of the DTC. To double the delay range while keeping the same power consumption, $C_{\text{bank}}$ cannot be changed, as shown by Eq. (2.5). Hence, $I_{\text{Current}}$ must be halved, as shown by Eq. (2.4). The variance of the timing jitter of the DTC can be analyzed as in [11]. We can write the jitter variance of the DTC as:

$$\begin{aligned}
\sigma_{\text{DTC}}^{2} &= \frac{4kT\gamma_{\text{Current}}DR_{\text{DTC}}}{I_{\text{Current}}(V_{\text{DD}} - V_{\text{TH}})} + \frac{kTC_{\text{bank}}}{I_{\text{Current}}^{2}} \\
&= \frac{kTC_{\text{bank}}}{I_{\text{Current}}^{2}} \cdot \left(1 + \frac{4\gamma_{\text{Current}}}{V_{\text{DD}}/V_{\text{TH}} - 1}\right)
\end{aligned}$$
 (2.6)

where k is the Boltzmann constant and T is the temperature. Eq. (2.6) shows that if the current is reduced by half, the rms jitter doubles (the variance quadruples). This is the well-known trade-off between delay range and jitter in delay elements. To reduce the jitter of the DTC, the current and the load capacitor bank should be increased simultaneously while keeping their ratio the same.

Figure 2.13: Constant slope DTC vs variable slope DTC.

## Variable Starting Voltage (Constant-Slope) DTC

The variable-slope architecture is simple to design but brings a major problem. As shown in Fig. 2.13(a), the comparator outputs an edge of reversed polarity when its input crosses the threshold voltage.
Internally, the comparator is an open-loop amplifier that has a very high gain around its reference voltage. Any small voltage difference between IN and $V_{\rm TH}$ is amplified and the output saturates. If the gain were infinite, then no matter what the input slew rate is, the first-stage output would drop to zero instantaneously when the input crosses the threshold, and the second stage would produce the corresponding rising edge. In practice, however, the gain cannot be infinite. With finite gain, the slew rate at the first-stage output is finite and varies with the input slew rate. As a result, the generated delay depends on the input slew rate. This is a code-dependent quantity which is very difficult to remove, and it is the major source of nonlinearity in the variable-slope DTC. However, as shown in Fig. 2.13(b), if we vary the starting voltage while using the same ramp for every comparison, the nonlinearity due to the comparator imperfection can be fundamentally removed. This is the so-called constant-slope DTC. As Fig. 2.11 shows, the starting voltage can also be changed to generate code-dependent delays. This method achieves excellent linearity compared with conventional variable-slope DTCs [12].

Figure 2.14: Constant slope DTC.

Figure 2.15: Continuous-time comparator with a variable $V_{\text{TH}}$ control.

## Variable Threshold (Ramp-Division) DTC

From Eq. (2.2), the last option is to vary the threshold of the comparator. The ramp still has a constant slew rate during the comparison, which mitigates the nonlinearity induced by the finite gain. Fig. 2.15 shows a continuous-time comparator design. One of the comparator inputs is connected to a DAC that generates a code-dependent voltage $V_{\text{TH}}$ , which sets the threshold of the comparator. The other input is connected to the constant ramp. When the ramp reaches $V_{\text{TH}}$ , a rising edge is produced.
However, in practice, the common-mode voltage of the comparator also varies with the variable $V_{\rm TH}$ . The common-mode voltage greatly affects the intrinsic delay of the comparator. This intrinsic delay can vary by more than several ps, which greatly degrades the linearity. It is even worse than the finite-gain effect of the comparator.

## 2.3.3 LC-Oscillator and Oscillator Phase Noise

The frequency synthesizer is one of the key building blocks in TRX systems. It contains two frequency sources running at different frequencies. The long-term-stable reference source uses a crystal oscillator and usually runs at a lower frequency (several tens of MHz). It provides a clean reference clock for the synthesizer circuit. The short-term-stable oscillator is used for generating frequencies from several hundred MHz to multiple GHz. For this high-frequency output signal source, the phase noise is important since it contributes to the final jitter at the synthesizer output. The output frequency should also be controllable, considering multi-channel selection and PVT variation. There are two types of oscillators, *i.e.*, the ring oscillator and the LC-VCO (inductor-capacitor resonator based voltage controlled oscillator). The ring oscillator has worse phase noise and worse supply sensitivity than the LC-VCO, so it is not very suitable for generating the carrier frequency in wireless communication. The following content focuses on the phase noise performance of the LC-VCO.

Figure 2.16: Three types of LC-VCO topology: (a) NMOS type LC-VCO; (b) CMOS type LC-VCO; (c) PMOS type LC-VCO.

Figure 2.17: LC-VCO model

Fig. 2.16 shows three basic types of LC-VCO widely used in wireless communication systems. In this section, we mainly focus on these topologies owing to their simplicity of design and optimization, robustness, and good phase noise performance.
The NMOS type LC-VCO uses two NMOS transistors as a cross-coupled pair to maintain the oscillation, while the CMOS and PMOS type LC-VCOs use complementary cross-coupled pairs and a PMOS cross-coupled pair, respectively. Each of them has its own pros and cons.

First, we discuss the start-up issue of these three topologies. Every LC-VCO can be modeled as a negative feedback system as shown in Fig. 2.17. Assume that the inductor and the capacitor in the LC tank are ideal, so that their losses are lumped into an equivalent conductance $g_{\text{tank}}$. The active part of the circuit (the cross-coupled pair) generates a current pulse every oscillation cycle, as shown in Fig. 2.17, to replenish the energy lost inside the LC tank, so that the oscillation is maintained. We model this behavior of the active devices as a negative conductance $-g_{\text{active}}$, since they deliver energy to the tank during operation. $P_{\text{bias}}$ is the power provided by the current bias device, which determines the total power dissipation of the circuit. $P_{\text{active}}$ is the power dissipated by the active cross-coupled pair transistors alone. $P_{\text{delivery}}$ is the power delivered by the active devices. The ratio $P_{\text{delivery}}/P_{\text{bias}}$ is the power efficiency of the active devices. The small-signal loop gain of the oscillator must be at least $\beta_{\min}$, which can be expressed as:

$$\beta_{\min} = \frac{g_{\text{active,max}}}{g_{\text{tank}}} \tag{2.7}$$

Mathematically, the start-up condition is satisfied with $\beta = 1$, which means that the energy produced by the active devices just compensates the energy loss of the LC tank. However, to keep a safety margin in real engineering, a minimum $\beta$, *i.e.*, $\beta_{\min}$, is required ($\beta_{\min} = 3$).

Figure 2.18: Output waveforms of an ideal and a real oscillator.
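The start-up condition of Eq. (2.7) can be checked numerically. The following is a minimal sketch, assuming the usual high-Q parallel approximation $g_{\text{tank}} = 1/(Q_{\text{tank}}\,\omega_0 L)$ for the tank loss; the tank values (2.4GHz, 2nH, Q of 10) and all function names are hypothetical and are not taken from the design in this thesis.

```python
import math


def tank_conductance(f0_hz, inductance_h, q_tank):
    """Equivalent parallel loss conductance of an LC tank.

    Assumption: high-Q parallel model, g_tank = 1 / (Q * w0 * L).
    """
    omega0 = 2.0 * math.pi * f0_hz
    return 1.0 / (q_tank * omega0 * inductance_h)


def loop_gain(g_active_max, g_tank):
    """Small-signal loop gain beta = g_active,max / g_tank, as in Eq. (2.7)."""
    return g_active_max / g_tank


def required_g_active(g_tank, beta_min=3.0):
    """Negative conductance the active pair must provide for a given margin."""
    return beta_min * g_tank


# Hypothetical tank: 2.4 GHz, 2 nH inductor, loaded Q of 10.
g_tank = tank_conductance(2.4e9, 2e-9, 10.0)
g_needed = required_g_active(g_tank, beta_min=3.0)
print(f"g_tank = {g_tank * 1e3:.3f} mS, required g_active = {g_needed * 1e3:.3f} mS")
```

With $\beta_{\min} = 3$, the active pair has to provide three times the tank loss conductance, which is the margin the text asks for.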
Second, we discuss the power consumption and general phase noise performance of the LC-VCO. The first question is: what is phase noise in an oscillator? To answer this question, let us look at the time-domain output waveform of the oscillator shown in Fig. 2.18. The ideal oscillator output is a perfect sinusoid:

$$x_{\text{ideal}}(t) = A\cos(\omega_c t) \tag{2.8}$$

In contrast, the zero crossings of a real oscillator output deviate from the ideal zero-crossing points, as shown in Fig. 2.18, mainly because of the circuit noise inside the oscillator. This deviation affects the instantaneous frequency of the oscillator. From the phase-domain point of view, the oscillator behaves like an integrator of the oscillation frequency, so these frequency variations become a phase error with respect to the phase of an ideal sine wave. Since this phase perturbation is unwanted, we call it "phase noise". We express the real oscillator waveform as:

$$x_{\text{real}}(t) = A\cos(\omega_c t + \Phi_n(t)) \tag{2.9}$$

where $\Phi_n(t)$ is called the phase noise. We transform the time-domain waveform into the frequency domain as shown in Fig. 2.19. To quantify the phase noise, we consider a 1-Hz bandwidth of the spectrum at an offset of $\delta_f$, measure the power inside this bandwidth, and then normalize the result to the power of the carrier at $f_c$. The unit is dBc/Hz, which expresses the power spectral density of the signal skirt.

Figure 2.19: Output spectrum of (a) ideal and (b) real oscillator.

Figure 2.20: Phase noise spectrum of the LC oscillators.

Generally speaking, the limit of the phase noise depends on the quality factor of the LC tank ($Q_{tank}$) and the output signal power, as described by Leeson's famous phase noise
equation [13]:

$$L(f_m) = 10\log\left[\frac{1}{2}\left(\left(\frac{f_0}{2Q_{\text{tank}}f_m}\right)^2 + 1\right)\left(\frac{f_c}{f_m} + 1\right)\left(\frac{FkT}{P_s}\right)\right] \tag{2.10}$$

In this equation, $f_0$ is the output frequency, $Q_{tank}$ is the loaded tank Q, $f_m$ is the offset from the output frequency, $f_c$ is the flicker noise corner frequency, $F$ is the noise factor of the active part, $k$ is Boltzmann's constant, $T$ is the absolute temperature and $P_s$ is the oscillator output power at the desired frequency. This equation gives the designer insight into the phase noise of the oscillator and the direction for optimization. Leeson's model also leads to the well-known phase noise plot for the LC-VCO shown in Fig. 2.20, which has three regions, *i.e.*, the $\frac{1}{f^3}$, $\frac{1}{f^2}$ and $\frac{1}{f^0}$ phase noise regions. However, this analysis is based on a linear time-invariant (LTI) model, which cannot accurately predict the behavior of a nonlinear oscillator. Many papers have discussed analysis methods based on a linear time-variant (LTV) model [14–19]. One of the most famous methods is the impulse sensitivity function (ISF) method described in Thomas H. Lee's paper [14]. This method describes the periodic transfer function of the nonlinear oscillator, whose value varies with time within one period. Noise injected at different times and at different nodes of the oscillator is multiplied by a different transfer function before it adds to the output phase noise. Using this theory, the three regions can be explained clearly. The $\frac{1}{f^3}$ region is mainly due to the up-conversion of the flicker noise of the active devices through the DC term of the ISF, and the corner frequency of this region is determined by the DC value and the rms value of the ISF. The $\frac{1}{f^2}$ region is mainly determined by the up-conversion of the thermal noise of the active devices. Finally, the $\frac{1}{f^0}$ region is the noise floor of the oscillator phase noise.
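As a quick numerical illustration of Eq. (2.10), the sketch below evaluates Leeson's expression at a few offsets. The oscillator parameters (2.4GHz carrier, loaded Q of 10, noise factor of 2, 1mW signal power, 100kHz flicker corner) are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def leeson_phase_noise(fm, f0, q_tank, f_corner, noise_factor, p_sig, temp_k=300.0):
    """Phase noise L(fm) in dBc/Hz following the form of Eq. (2.10)."""
    resonator = (f0 / (2.0 * q_tank * fm)) ** 2 + 1.0   # tank shaping, 1/f^2 term
    flicker = f_corner / fm + 1.0                       # adds the 1/f^3 region
    floor = noise_factor * BOLTZMANN * temp_k / p_sig   # FkT/Ps
    return 10.0 * math.log10(0.5 * resonator * flicker * floor)


# Hypothetical oscillator parameters, for illustration only.
for offset in (1e4, 1e5, 1e6, 1e7):
    pn = leeson_phase_noise(offset, f0=2.4e9, q_tank=10.0, f_corner=1e5,
                            noise_factor=2.0, p_sig=1e-3)
    print(f"offset {offset:>10.0f} Hz : {pn:7.1f} dBc/Hz")
```

Well above the flicker corner and well below $f_0/(2Q_{\text{tank}})$, the result falls by about 20dB per decade of offset, which corresponds to the $\frac{1}{f^2}$ region of Fig. 2.20.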
# **Chapter 3**

# Sub-mW Digital PLL Using Digital-to-Time Converter

The Internet of Things (IoT) shows great potential for enhancing the communication capabilities of millions of people around the world. It enables us to communicate with personal devices, nearby sensor nodes, machines and even city infrastructure. The integrated wireless transceiver (TRX) is the key to realizing such wireless connections. Ultra-low-power (ULP) TRXs will be key elements in a variety of short-range wireless standards, e.g., BLE, Zigbee, WPAN/WBAN and Wi-Fi networks. The radio-frequency phase-locked loop (RF-PLL), one of the most important elements in a TRX, consumes a significant amount of power [20] because of the phase noise and spur requirements. Hence, a reduction in the PLL power greatly lowers the power consumption of the ULP TRX. The digital PLL (DPLL) [3, 21–28], which takes advantage of the scaling of CMOS technology, is more promising than its analog counterpart in advanced CMOS processes. It shrinks the required chip area while providing easily accessible analog/digital inputs/outputs (IO), which can be used for digital-intensive calibration and modulation.

While the benefits of the DPLL are obvious, there are still barriers to realizing a sub-mW fractional-N DPLL. In the initial DPLL proposal [27], shown in Fig. 3.1(a), a full-range time-to-digital converter (TDC) and a counter (CNT) are used as the fractional and integer phase quantizers, which measure the phase difference between the DCO and REF. The TDC is required to cover at least one DCO cycle. The power consumption increases as the TDC quantization range is enlarged while good linearity and resolution are maintained. This fundamental trade-off makes it very difficult to realize low-power operation with good jitter and spur performance [22, 27]. In Fig. 3.1(b), instead of using a full-range TDC, a full-range digital-to-time converter (DTC) is placed in the REF path.
It is controlled by the fractional part of FCW, i.e., $FCW_{frac}$, and produces a reference with a fractional phase, REF<sub>frac</sub>, at the TDC input. This minimizes the phase error between REF<sub>frac</sub> and DCO [3, 23–26]. It basically mimics the operation of the fractional-N analog sub-sampling PLL [29–31]. This phase-prediction mechanism helps shrink the TDC range to only several DTC LSBs. As a result, even a bang-bang phase detector (BBPD) [26] can be used to achieve fractional-N operation. In contrast to the TDC, which quantizes a time difference, the DTC generates variable delays. Owing to this, the DTC consumes much less power than the TDC for the same linearity and resolution. With the help of a low-power DTC, a sub-mW DPLL was realized for the first time in [23].

Figure 3.1: (a) DPLL with the full-range TDC to perform fractional-*N* operation (b) DPLL with the full-range DTC to perform fractional-*N* operation which reduces total power consumption.

However, the DTC also suffers from poor linearity and resolution under a limited power budget. The INL of the DTC generates fractional spurs due to the periodic phase modulation. In [23], a DPLL consuming 860µW is realized with a worst fractional spur of -37dBc and a 1.71ps rms jitter. Such a spur level could degrade the transmitter (TX) EVM, the receiver (RX) sensitivity and the RX blocker tolerance. In [24], a DTC phase dithering technique is used to scramble the INL periodicity, which spreads the spur power into a white spectrum. The dithering reduces the fractional spurs, but it degrades the in-band phase noise. As a result, a 1.98ps rms jitter is achieved with a power of 670µW. Because the DTC linearity is the greatest contributor to fractional spurs, a highly linear DTC with small power consumption is in high demand. A constant-slope charging method is proposed in [12] to fundamentally improve the DTC linearity.
This method mitigates the nonlinearity arising from the inverter-based comparator. However, the integrated digital-to-analog converter (DAC) consumes a significant amount of power. Another issue of the conventional DTC is that the $V_{\rm TH}$ of the inverter-based comparator is directly affected by supply variation, which greatly degrades the linearity. To keep the operation linear, the comparison should be independent of $V_{\rm TH}$. The TDC resolution is also important for minimizing the jitter of the DPLL. A time amplifier (TA) [3] can be used to improve the TDC resolution. However, a narrow-range TDC can only quantize a limited phase difference, which significantly slows down the phase locking process [3, 26]. The lock-up time of the DPLL is also critical for frequency hopping applications such as BLE, and hence it needs to be minimized.

The DPLL presented in this work uses a delta-sigma modulator (DSM) and a multi-modulus divider (MMDIV) in the feedback path to realize fractional-N operation. A DTC is used to cancel the quantization noise produced by the MMDIV [26]. The analysis in this work reveals that a first-order DSM working in conjunction with a highly linear DTC is capable of realizing a low-jitter fractional-N PLL with low power consumption, and thus a high FOM. An isolated constant-slope DTC is proposed, which provides high linearity with low power consumption. While the pre-charge and compare steps are combined in the conventional constant-slope DTC [12], in the proposed isolated constant-slope DTC these two steps are isolated in order to maintain high linearity in a noisy supply environment and to lower the power consumption. A TA-based TDC [3] is adopted to achieve a high TDC resolution and improve the in-band phase noise. To speed up the phase locking process, an always-on coarse DPLL is proposed.
The DPLL achieves fast locking, while the coarse DPLL consumes almost zero power after phase lock is achieved. The proposed fractional-N DPLL achieves a 535fs jitter and an in-band fractional spur of -56dBc with only 0.98mW power, thanks to the proposed DTC. It is also capable of operating at 0.65mW while achieving a 1.00ps jitter and a -50dBc spur.

Figure 3.2: Detailed block diagram of the proposed ULP DPLL with the proposed 10b isolated constant-slope DTC.

# 3.1 Proposed DPLL System Architecture

To realize low power, low jitter and low spurs simultaneously in a fractional-*N* DPLL, the selection of the architecture and the specifications of each building block are crucial in system-level design. Fig. 3.1(b) shows the most common way to realize a sub-mW fractional-*N* DPLL [23, 24], which basically mimics the analog sub-sampling operation [29–32]. To save power, the CNT is completely shut down after phase lock [24], which is equivalent to turning off the FLL in the analog sub-sampling PLL. This can lead the PLL to lock to a wrong frequency if a large frequency or phase disturbance occurs during operation. Another issue of [23, 24] is that the poor resolution of the TDC degrades the in-band phase noise and hence the jitter performance of the DPLL. Furthermore, the DTC linearity limits the spur performance, and the special technique needed to deal with it [24] consumes power and sacrifices jitter performance.

A detailed architecture-level block diagram of the proposed DPLL is shown in Fig. 3.2. To improve the in-band phase noise, a 4b 2ps-resolution TA-TDC [3] and a reference doubler are implemented. The duty cycle issue of the doubler [33] is calibrated using the method proposed in [34]. A proposed 10b isolated constant-slope DTC supports the narrow-range TDC operation.

Figure 3.3: (a) DSM-based fractional controller. (b) Phase models of MMDIV and DTC.
Theoretically, with the assistance of the fine-resolution DTC, the range of the TDC can be reduced to 1 bit. In practice, however, a 1-bit TDC or BBPD causes other issues. One issue is that it significantly degrades the locking speed of the DPLL. Another important issue is that it slows down the convergence of the DTC calibration, as discussed in [3]. A multi-modulus divider (MMDIV) is used to perform the phase accumulation and frequency division simultaneously. The frequency and phase of the DCO are monitored by the MMDIV, and hence any phase or frequency change is reflected at the TDC input. Both the MMDIV and the DTC are controlled by the DSM-based controller [26].

Fig. 3.3 shows the implementations and the mathematical models of the DSM-based fractional controller, MMDIV and DTC. In Fig. 3.3(a), $FCW_{\text{int+frac}}$ is separated into a 7b MSB part ($FCW_{\text{int}}$) and a 16b LSB part ($FCW_{\text{frac}}$). $FCW_{\text{frac}}$ is modulated by a 1st-order DSM that produces a one-bit output $F_{\text{DSM}}$, and $F_{\text{DSM}}$ is added to $FCW_{\text{int}}$. As shown in Fig. 3.3(b), the phase-domain model of the MMDIV behaves as an integrator. $FCW_{\text{int}} + F_{\text{DSM}}$ is accumulated by the MMDIV into a 7b phase control code $PCW_{\text{int}}$. Then, $PCW_{\text{int}}$ is multiplied by $2\pi$ inside the MMDIV, where $2\pi$ is normalized to one DCO cycle, which produces the integer phase $\phi_{\text{int}} = 2\pi \cdot PCW_{\text{int}}$. The fractional frequency error (DSM quantization error) $Q_{\text{frac}}$ is extracted by subtracting $F_{\text{DSM}}$ from $FCW_{\text{frac}}$ as shown in Fig. 3.3(a), and it is then accumulated into the fractional phase error $PCW_{\text{frac}}$ by a digital accumulator. It is multiplied by a $2\pi$ phase at the DTC to obtain the fractional phase $\phi_{\text{frac}} = 2\pi \cdot PCW_{\text{frac}}$.

Figure 3.4: System level analysis of 1st-order and 2nd-order DSM-based fractional controller for low-power DPLL.
Finally, $\phi_{\text{int}}$ and $\phi_{\text{frac}}$ are summed at the DTC and produce the desired fractional feedback phase $\phi_{\text{FB}} = \phi_{\text{int}} + \phi_{\text{frac}} = 2\pi \cdot PCW_{\text{int+frac}}$. To realize fractional-N phase and frequency synthesis, the reference (REF) phase $\phi_{\text{REF}}$ should be exactly equal to $2\pi \cdot PCW_{\text{int+frac}}$. After phase lock, the frequency is automatically locked to $FCW_{\text{int+frac}} \cdot f_{\text{REF}}$.

Compared with the 2nd-order DSM used in [26, 31], a 1st-order DSM-based architecture allows a much simpler digital implementation, which directly saves power and area. Furthermore, a higher-order DSM-based controller increases the required dynamic range of the DTC [26, 31] and potentially increases the jitter contribution. On the other hand, a higher-order DSM randomizes the DTC control code, which spreads the spur energy caused by the DTC INL into a white spectrum that can be filtered by the PLL loop bandwidth. To discuss the power, jitter and spur trade-offs for different DSM orders, we compared the performance of the 1st-order DSM and the 2nd-order DSM. As the DSM order increases, $PCW_{int}$ jumps by a larger amount between successive codes, which increases the required DTC delay range. The 1st-order DSM only requires a DTC with a $2\pi$ range, while the 2nd-order DSM doubles the required DTC range to $4\pi$. The most commonly used DTC architecture, shown in Fig. 3.5, uses a variable-slope method for delay generation.

Figure 3.5: Variable-slope DTC.

As discussed in Section 2.3.2, doubling the delay range $DR_{DTC}$ halves $I_{NMOS}$. Eq. (2.6) shows that halving $I_{NMOS}$ also doubles the jitter contribution. This is a trade-off between the delay range and the jitter of a delay element such as a DTC. For a low-power design, the jitter of each component should be optimized in consideration of the power budget.
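The behavior of the DSM-based fractional controller of Fig. 3.3 can be sketched with a short behavioral model. This is only an illustration under stated assumptions: the 1st-order DSM is modeled as a 16b overflowing accumulator whose carry is $F_{\text{DSM}}$, the 7b wrap-around of $PCW_{\text{int}}$ is omitted, and the function names are hypothetical. The 2442MHz output and 52MHz reference are the values used for the simulation in this chapter.

```python
FRAC_BITS = 16
FRAC_MOD = 1 << FRAC_BITS  # one DCO cycle (2*pi) expressed in 16b LSBs


def split_fcw(f_out_hz, f_ref_hz):
    """Split the frequency control word into integer and 16b fractional parts."""
    fcw = f_out_hz / f_ref_hz
    fcw_int = int(fcw)
    fcw_frac = round((fcw - fcw_int) * FRAC_MOD)
    return fcw_int, fcw_frac


def run_controller(fcw_int, fcw_frac, n_cycles):
    """Return (division ratio, PCW_int, PCW_frac) for each reference cycle."""
    dsm_acc = 0    # state of the 1st-order DSM (assumed accumulator-type)
    pcw_int = 0    # integer phase accumulated by the MMDIV (7b wrap omitted)
    pcw_frac = 0   # accumulated DSM quantization error, sent to the DTC
    history = []
    for _ in range(n_cycles):
        dsm_acc += fcw_frac
        f_dsm = 1 if dsm_acc >= FRAC_MOD else 0   # one-bit DSM output
        dsm_acc -= f_dsm * FRAC_MOD
        pcw_int += fcw_int + f_dsm                # MMDIV acts as an integrator
        q_frac = fcw_frac - f_dsm * FRAC_MOD      # Q_frac = FCW_frac - F_DSM
        pcw_frac += q_frac                        # digital accumulator
        history.append((fcw_int + f_dsm, pcw_int, pcw_frac))
    return history


fcw_int, fcw_frac = split_fcw(2442e6, 52e6)
history = run_controller(fcw_int, fcw_frac, FRAC_MOD)
peak = max(h[2] for h in history)
print(f"FCW_int = {fcw_int}, FCW_frac = {fcw_frac}")
print(f"peak PCW_frac = {peak / FRAC_MOD:.4f} DCO cycles")
```

In this model the accumulated error $PCW_{\text{frac}}$ never leaves the interval of one DCO cycle, which is consistent with the statement that the 1st-order DSM only requires a $2\pi$ DTC range.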
From a system-level estimation and a rough transistor-level simulation, a worst-case rms jitter of 0.7ps is expected at the DTC output for the 1st-order DSM, while the rms jitter becomes 1.4ps with the 2nd-order DSM. All blocks shown in Fig. 3.2 are modeled with pre-determined parameters, and only the DTC and the order of the DSM are changed. The DTC INL is modeled as a sinusoidal shape with a look-up table. We sweep the relative INL of both DTCs and record the rms jitter and the worst spurs (in-band fractional spurs) of the DPLL, as shown in Fig. 3.4. The DPLL loop bandwidth is optimized to 600kHz at a 52MHz reference, and the first in-band fractional spur is located at 200kHz. At a very small INL, the 1st-order DSM case shows an rms jitter that is better by around 190fs. This is expected because the periodic jitter is not dominant in the output jitter, and the doubled DTC jitter contributes more to the output jitter in the 2nd-order DSM case. However, when the INL is increased beyond 0.4%, the periodic jitter caused by the spurs becomes dominant, and the output jitter in the 1st-order DSM case becomes worse than in the 2nd-order DSM case. To achieve a better FOM, the rms jitter should be kept as low as possible, which motivates the use of the 1st-order DSM. The INL specification of the DTC is from 0.05% to 0.4%, which contributes a 2.8dB improvement in FOM for 1mW operation of the DPLL. The difference in FOM increases if the DTC jitter becomes larger, which is a common situation in a noisy SoC environment.

To estimate the phase noise, both phase-domain and time-domain methods are used. As shown in Fig. 3.6, there are four major noise contributors, *i.e.*, TA noise, TDC quantization noise, DTC noise and VCO noise. The well-known reference noise is included in the TDC quantization noise.
The red line shows the output phase noise of the fractional-*N* DPLL. The black line shows the time-domain simulation result of the proposed fractional-*N* DPLL, which operates at 2442MHz using a 52MHz reference clock. The fractional spur is expected at around 2MHz, and its harmonics can be observed in the figure as well. The time-domain simulation matches the phase-domain simulation, which shows that the latter is a good estimation of the overall output phase noise. The overall power consumption is estimated as 1mW at a 52MHz reference according to the post-layout simulation.

Figure 3.6: Phase noise estimation of the DPLL in the phase domain (without fractional spurs) and the time-domain phase noise simulation at 2442MHz using 52MHz reference (with fractional spurs).

To find the FOM limit of this architecture, we assume that the jitter stays at the 430fs of Fig. 3.6 and calculate the minimum power consumption of the major components that contribute jitter. Eq. (2.4), Eq. (2.5) and Eq. (2.6) give a minimum power of $30\mu$W when the linearity is not considered. For the oscillator, a FOM of -195dBc results in a minimum power of around $60\mu$W when the start-up condition is not considered. With a much finer DTC resolution, the TDC range can be further narrowed to 1 bit if the locking speed is not considered, which reduces the TDC power to approximately $0\mu$W. The power of the digital circuits can be considered to be $0\mu$W with advanced technology. A power of $100\mu$W is assigned to the doubler, as in the simulation. The total power is then only $180\mu$W, and the theoretical limit of the FOM is -255dB.

# 3.2 Proposed Isolated Constant-Slope DTC

## 3.2.1 Concept of Operations

Since the DTC linearity greatly influences the jitter and spur performance of the proposed DPLL using a 1st-order DSM, the DTC design becomes more challenging than the TDC design under a limited power budget.
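The FOM values quoted for the DPLL, including the theoretical limit estimated at the end of Section 3.1, can be reproduced with a few lines. This sketch assumes the commonly used jitter-power figure of merit, $\text{FOM} = 10\log_{10}\left[(\sigma_t/1\text{s})^2 \cdot (P/1\text{mW})\right]$; the definition is not restated in this section, so it is an assumption here, as is the function name.

```python
import math


def pll_fom_db(rms_jitter_s, power_w):
    """Jitter-power FOM in dB (assumed definition):
    10*log10((sigma_t / 1 s)^2 * (P / 1 mW))."""
    return 10.0 * math.log10((rms_jitter_s / 1.0) ** 2 * (power_w / 1e-3))


# Jitter and power pairs quoted in the text.
cases = {
    "theoretical limit (430 fs, 180 uW)": (430e-15, 180e-6),
    "proposed DPLL (535 fs, 0.98 mW)": (535e-15, 0.98e-3),
    "low-power mode (1.00 ps, 0.65 mW)": (1.00e-12, 0.65e-3),
}
for name, (jitter, power) in cases.items():
    print(f"{name}: FOM = {pll_fom_db(jitter, power):.1f} dB")
```

Under this definition, a 430fs jitter at $180\mu$W evaluates to approximately -255dB, which agrees with the limit stated in Section 3.1.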
A constant-slope charging method is proposed in [12] to mitigate the inverter-induced nonlinearity. It demonstrates a fundamental improvement in the linearity of the delay generation over the conventional variable-slope method [10, 26, 35], which generates delays by using slopes with a variable slew rate (SR) at the input of the inverter-based comparator. In the original constant-slope DTC, the digitally controlled delays are obtained by varying the starting voltage $V_{\rm ST}$ of the slopes generated by a fixed current source. $V_{\rm ST}$ is set by pre-charging the load capacitor $C_{\rm L}$ with a digital-to-analog converter (DAC) before the input signal triggers the current source. Since all charging slopes cross the inverter threshold $V_{\text{TH,inv}}$ with the same slew rate (SR), the inverter-induced nonlinearity is mitigated [12]. However, to obtain the desired jitter and delay range, the ratio of the charging current to the load capacitance must be kept the same while their absolute values are increased, as explained by Eq. (2.4) and Eq. (2.6). Because of the large load capacitance $C_L$, a significant amount of energy $E_{\text{PreChg}} = C_{\text{L}} \cdot V_{\text{ST}}^2/2$ is consumed to set $V_{ST}$. Furthermore, because the charging current cannot be fully turned on instantaneously, different values of $V_{\rm ST}$ cause different start-up behavior in a practical current source, as explained in [12]. A higher $V_{ST}$ significantly degrades the INL of the DTC. The original constant-slope DTC consumes almost 1mW in the DAC for $V_{ST}$ settling at a 55MHz clock and uses a 1.2V supply for the current source in order to achieve an INL of 0.15% [12].

Fig. 3.8 shows the proposed 10b DTC using the constant-slope method [21]. Instead of varying the $V_{\rm ST}$ of the constant slopes, which can distort the current source, a ramp-division architecture is adopted, as shown in Fig.
3.8(b), in which the $V_{\rm TH}(n)$ of the comparator is shifted. The comparator outputs the corresponding edge at $t_{\rm n}$ and produces a delay of $t_{\rm n}-t_0$. Because the same ramp generated by the current source is always used, the linearity degradation from the current source is mitigated and a higher supply voltage is not required. Furthermore, nearly $600 \, \text{mV}/1 \, \text{V} = 60\%$ of the ramp can be used, compared with only $200 \, \text{mV}/1.2 \, \text{V} \approx 17\%$ in [12]. $V_{\rm TH}(n)$ is shifted by isolating the pre-charge step from the compare step with a series capacitor $C_{\rm C}$ and a DAC. $C_{\rm C}$ is small enough not to lengthen the pre-charge time on both sides of $C_{\rm C}$, which are driven by the DAC and $\Phi_3$. The reduced pre-charge capacitance minimizes the DAC power consumption even in high-speed operation.

Another important issue arises from the comparator, which is essentially just a simple inverter in most state-of-the-art DTCs [10, 12, 26, 35]. As shown in Fig. 3.9, other circuits on the same supply line cause ripples because of the off-chip supply environment, such as the series off-chip inductors and resistors. These ripples remain because of the limited area for decoupling capacitors and the limited power budget for on-chip regulators in IoT applications, and they couple strongly to the threshold of the inverter-based comparator. The threshold variation greatly affects the linearity and degrades the DPLL jitter performance when the 1st-order DSM architecture is used. In the proposed architecture, this issue is solved by the auto-zero switch $\Phi_3$, which cancels the inverter $V_{\text{TH,inv}}$ offset in every conversion and hence greatly improves the INL.

Figure 3.7: (a) Conventional constant-slope DTC (b) operations of the conventional constant-slope DTC.

Fig. 3.10 shows the conceptual operation of the proposed DTC. In the pre-charge step shown in Fig.
3.10(a), node A and node B of $C_{\rm C}$ discharge from the saturated voltage of the previous slope. Node A discharges to the desired DAC voltage and node B discharges to $V_{\rm TH,inv}$. Note that the pre-charge step is actually a discharge process, which draws no extra power from the DAC. The pre-charge speed is limited by $R_{\rm DAC}$ and $C_{\rm C}$. Because $C_{\rm C}$ is small, $R_{\rm DAC}$ can be chosen to be large to minimize the power consumption of the DAC. In the set step shown in Fig. 3.10(b), $\Phi_1$ and $\Phi_4$ are closed to short node A to 0V. Node B drops by the same amount of voltage, which results in a new $V_{\rm ST} = V_{\rm TH,inv} - V_{\rm DAC}$ at the inverter input. The duration of the set step is minimized so that it does not degrade the operation speed of the proposed DTC and so that the charge leakage through $\Phi_3$ is minimized. In the final step shown in Fig. 3.10(c), $\Phi_1$ is closed and $\Phi_5$ is triggered by the input rising edge. The current source starts to charge $C_{\rm load}$, producing a ramp at node A from 0V to VDD, and node B copies the ramp of node A while starting from the $V_{\rm ST}$ generated in the set step. The rising ramp reaches the decision point of the inverter-based comparator and produces the variable delay.

Figure 3.8: (a) Proposed isolated constant-slope DTC (b) concept operation of the proposed DTC.

Figure 3.9: $V_{\text{TH}}$ offset caused INL which induced by noisy supply.

If we consider a noisy supply environment as shown in Fig. 3.9, a noise voltage $V_{\rm n,Supply}$ is present on the supply line. The digitally controlled $V_{\rm TH}(n)$ can be written as:

$$\begin{aligned} V_{\text{TH}}(n) &= V_{\text{TH,inv}}(t_{\text{N+1}}) - (V_{\text{TH,inv}}(t_{\text{N}}) - V_{\text{DAC}}(n)) \\ &= \alpha(f) \cdot V_{\text{n,Supply}} + V_{\text{DAC}}(n) \end{aligned} \tag{3.1}$$

where $\alpha(f)$ is a frequency-dependent factor with a value of less than 1, which also depends on the difference between $t_{N+1}$ and $t_N$.
Ideally, if $t_{N+1} = t_N$, $\alpha(f)$ is 0. The variable delay for each control code is:

$$t_{\rm n} = V_{\rm TH}(n) \cdot \frac{C_{\rm L}}{I_{\rm current}} = (\alpha(f) \cdot V_{\rm n, Supply} + V_{\rm DAC}(n)) \cdot \frac{C_{\rm L}}{I_{\rm current}} \tag{3.2}$$

Eq. (3.2) shows that, in the ideal condition, the delay is no longer determined by the inverter threshold thanks to the auto-zero function.

## 3.2.2 Nonlinear Sources and Circuit Implementations

Since the linearity of the DTC affects both the rms jitter and the fractional spurs of the proposed DPLL, the linearity degradation from the nonlinear sources should be minimized. The detailed implementation of the DTC core is shown in Fig. 3.11(a). A cascode current source is adopted to improve the current source linearity. As shown in Fig. 3.8(b), the slope is interpolated by $V_{\text{TH}}(n)$, so any nonlinearity in the slope transfers to the DTC INL. Long-channel devices are chosen for $M_{\text{N1}}$ and $M_{\text{N2}}$ to minimize this error. Since the used portion of the slope shown in Fig. 3.8(b) contributes most of the nonlinearity of the proposed DTC, any improvement in the current source linearity directly improves the DTC linearity.

Figure 3.10: Conceptual operation diagrams of proposed 10b isolated constant-slope DTC (a) $C_L$ is isolated from DAC during DAC operation (Pre-charge step) (b) Charge in $C_C$ is shorted to ground which set new $V_{ST}$ at node B (Set step) (c) Constant slope with new $V_{ST}$ is compared in inverter (Compare step).

Another major source of nonlinearity is the junction capacitances $C_{Par1}$ and $C_{Par2}$ at node X and node B, to which all transistors connected to these nodes contribute. The voltage-dependent $C_{Par1}(V_X)$ is negligible if $C_L$ is sufficiently large, and it then does not degrade the INL. As for $C_{Par2}$, it forms a capacitive voltage divider in series with $C_C$.
In other words, the slope at node B in the compare step does not follow the slope at node A exactly. The waveform distortion at node B degrades the INL of the DTC if $C_{\rm C}$ is not properly sized. A large $C_{\rm C}$ is desired to minimize the effect of $C_{\rm Par2}$. However, a large $C_{\rm C}$ increases the settling time and the power consumption of the DAC, which limits the maximum operation frequency. $C_{\rm C}$ is optimized to 100fF, which leaves sufficient margin to cover the PVT variations of the above effects. Furthermore, node A always drops from $V_{\rm DAC}(n)$ to 0V, and hence $C_{\rm Par2}$ is a function of $V_{\rm DAC}(n)$, *i.e.*, $C_{\rm Par2}(V_{\rm DAC}(n))$. This dependency limits the maximum output range of the DAC. In this design, an optimized range of 350mV is chosen for better DTC linearity, considering both the current source linearity and the effect of $C_{\rm Par2}$.

One more major source of nonlinearity is the leakage current $I_{\text{leak}}$ of the auto-zero switch $\Phi_3$ during the set step. $\Phi_3$ is opened to hold the $V_{\text{ST}}$ value before the slope arrives. However, because of the limited off-resistance $r_{\text{off}}$ of the CMOS switch, current leaks from the inverter's supply through node C to node B. It charges $C_{\text{Par2}}$ and $C_{\text{C}}$ simultaneously and causes a nonlinear error voltage at node B. This error depends strongly on $V_{\text{DAC}}(n)$ as well as on the duration of the set step and compare step ($t_{\text{N+1}} - t_{\text{N}}$). To minimize this error, two switches are placed in series to increase the effective $r_{\text{off}}$, and $t_{\text{N+1}} - t_{\text{N}}$ is minimized to 1/10 of the DTC period. The shortened $t_{\text{N+1}} - t_{\text{N}}$ also contributes to the supply noise suppression. Last but not least is the nonlinearity from the 10b resistor-DAC (RDAC).
A 5b binary code is used for the LSBs to save chip area, while a 5b thermometer code is used for the MSBs to maintain good linearity. The mismatch of the resistors and the non-ideal reference voltage degrade the linearity of the RDAC, which directly transfers to the DTC INL. Other minor nonlinear sources, such as charge sharing of the CMOS switches, can be minimized by proper sizing of the transistors in simulation.

The detailed timing chart of the proposed DTC is shown in Fig. 3.11(b). While the auto-zero switch is closed in the pre-charge step, node B is set to the threshold of the first inverter, $V_{\rm TH1}$, which is around 500mV. Node C is shorted to node B and therefore also sits at around 500mV. If the threshold of the second inverter, $V_{\rm TH2}$, were also around 500mV, the DTC output OUT would toggle between zero and one because of noise. To maintain robust operation, an LVT inverter with $V_{\rm TH}$ = 300mV is placed at the output of the DTC.

## 3.2.3 Simulation Results

The simulated results of the proposed DTC are shown in Fig. 3.12. In the typical-typical corner with a 1.0V VDD and a temperature of $25^{\circ}$C, it achieves a 10b range of 560ps with a 550fs resolution. The peak INL is 0.05% (200fs) at 52MS/s with $140\mu W$ power.

Figure 3.11: Detailed circuit of (a) isolated constant-slope DTC and (b) its timing chart.
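The relation between the control code and the delay in Eq. (3.2) can be illustrated with a small behavioral model. This is a sketch under stated assumptions: the 350mV DAC range is assumed to map onto the full 560ps simulated delay range, which fixes the ramp slope $I_{\rm current}/C_{\rm L}$, and the residual supply-noise factor $\alpha$ and the function name are hypothetical.

```python
DTC_BITS = 10
DELAY_RANGE_S = 560e-12   # simulated 10b delay range quoted in the text
V_DAC_RANGE_V = 0.350     # DAC output range quoted in the text

# Assumption: the full DAC range spans the full delay range, so the ramp
# slope I_current / C_L follows from these two numbers.
SLOPE_V_PER_S = V_DAC_RANGE_V / DELAY_RANGE_S


def dtc_delay(code, alpha=0.0, v_noise=0.0):
    """Delay t_n from Eq. (3.2): (alpha*V_n,Supply + V_DAC(n)) * C_L / I_current."""
    v_dac = V_DAC_RANGE_V * code / (1 << DTC_BITS)
    v_th = alpha * v_noise + v_dac
    return v_th / SLOPE_V_PER_S


lsb = dtc_delay(1) - dtc_delay(0)
# Hypothetical case: 20 mV of supply noise with a residual factor alpha = 0.1.
noise_error = dtc_delay(512, alpha=0.1, v_noise=0.02) - dtc_delay(512)
print(f"LSB delay = {lsb * 1e15:.1f} fs")
print(f"delay error from residual supply noise = {noise_error * 1e12:.2f} ps")
```

The LSB delay of this model is 560ps/1024, which is consistent with the 550fs resolution quoted above, and the error term shows why a small $\alpha(f)$ matters for linearity.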
| | This work | [12] | [35] | [26] | [10] |
|---|---|---|---|---|---|
| Architecture | Isolated constant slope | Constant slope | Variable slope | Variable slope | Variable slope |
| Technology | 65nm | 65nm | 65nm | 65nm | 28nm |
| Delay range | 593ps | 189ps | 186ps | 338ps | 563ps |
| Resolution | 580fs | 185fs | 4700fs | 330fs | 550fs |
| INL | 870fs (0.15%) | 328fs (0.17%) | 1900fs (1%) | 3000fs (1%) | 990fs (0.18%) |
| Worst jitter | 630fs | 210fs | 300fs | 400fs | 250fs |
| Supply rejection | Yes | No | No | No | No |
| Power (mW) | 0.14 @52MHz | 0.8+1.0 @55MHz | 0.22 @48MHz | 2.2 @40MHz | 0.5 @40MHz |

Table 3.1: Comparison Table of The State-of-The-Art DTCs

Other corner and temperature conditions are also simulated, as shown in Fig. 3.12(a) and Fig. 3.12(c). The worst case is observed with a 0.9V supply voltage, as shown in Fig. 3.12(b), because of the linearity degradation of the current source. Post-layout Monte-Carlo simulations are shown in Fig. 3.13. These simulations indicate a peak INL of +450fs.

To evaluate the effect of supply noise, the deterministic jitter variance $\sigma^2$ with and without the auto-zero offset switch is shown in Fig. 3.14. In the post-layout simulation, $20\text{mV}_{pp}$ sine waves of different frequencies are applied to the supply of the DTC core. The jitter variance, as well as the corresponding suppression in dB, is recorded with and without the auto-zero function. The simulation results show that the suppression is larger when the supply noise frequency is lower. This matches the analysis of Eq. (3.1): the lower the noise frequency, the smaller the difference $V_{\text{TH,inv}}(t_{\text{N+1}}) - V_{\text{TH,inv}}(t_{\text{N}})$. The simulated jitter of the DTC is around 560fs.
The current source contributes 82% of the jitter, while the other noise sources, such as the DAC noise, the switching noise, and the inverter noise, contribute around 18% in total.

The proposed isolated constant-slope DTC can be used in many applications besides low-power fractional-N DPLLs. For example, it can be used in ultra-low-jitter, low-spur fractional-N DPLLs. It can also be applied to time-domain ADCs and digital LDOs, which require highly linear digitally controlled delay units.

Figure 3.12: Post-layout INL simulations of the proposed DTC with (a) Corner conditions (b) Supply voltage variations (c) Temperature variations.

Figure 3.13: The Monte-Carlo simulations of the proposed DTC INL.

Figure 3.14: Simulated deterministic jitter power w/ and w/o auto-zero switch when noisy supply with different frequencies are presented, and the deterministic jitter power suppression w/ auto zero switch.

# 3.3 Circuit Implementation

## 3.3.1 Path-select TDC and TDC Gain Calibration

A path-select TA-TDC is implemented, as shown in Fig. 4.12. A BBPD derives the sign of the phase error after the TA, and a path-select logic switches the up and down paths of the TA output based on the BBPD result. If $T_{ON}$ leads $T_{OP}$, the up-down switch is transparent to both signals. If $T_{ON}$ lags $T_{OP}$, the up-down switch swaps the two signal paths to avoid outputting all zeros. The up-down switch thus ensures that both the lead and the lag conditions between $T_{ON}$ and $T_{OP}$ are quantized by a single 3b coarse TDC with 16ps resolution. At the output of the TDC, the quantizer output is combined with the BBPD output into a 4b code. Compared with [3], which adopts two 3b quantizers for the same purpose, the path-select technique saves almost half of the power and area.

Figure 3.15: Path-select TDC.

Note that, in order to switch the up and down signals properly, the BBPD must output the selection signal before $T_{OP}$ and $T_{ON}$ arrive at the up-down switch.
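Setting this timing constraint aside for a moment, the path-select quantization described above can be sketched behaviorally. This is a minimal model under stated assumptions, not the implemented circuit: edge times are taken at the TA output in picoseconds, the coarse quantizer floors and saturates, and the 4b word is assumed to be sign-magnitude (BBPD bit followed by the 3b magnitude). The function name and the output encoding are hypothetical.

```python
# Behavioral sketch of the path-select TDC (BBPD decision time is ignored).
# Assumptions: times are TA output edges in ps, the quantizer floors and
# saturates, and the 4b word is sign-magnitude (BBPD bit + 3b magnitude).
COARSE_LSB_PS = 16.0   # coarse TDC resolution from the text
COARSE_BITS = 3


def path_select_tdc(t_op_ps: float, t_on_ps: float) -> int:
    """Return a 4b code for the time difference between T_OP and T_ON."""
    on_leads = t_on_ps <= t_op_ps            # BBPD decision
    if on_leads:
        start, stop = t_on_ps, t_op_ps       # up-down switch is transparent
    else:
        start, stop = t_op_ps, t_on_ps       # up-down switch swaps the paths
    magnitude = min(int((stop - start) // COARSE_LSB_PS), 2 ** COARSE_BITS - 1)
    sign_bit = 0 if on_leads else 1
    return (sign_bit << COARSE_BITS) | magnitude


for t_op, t_on in [(40.0, 0.0), (0.0, 40.0), (1000.0, 0.0)]:
    print(t_op, t_on, format(path_select_tdc(t_op, t_on), "04b"))
```

Because the magnitude is always measured from the earlier edge to the later one, a single 3b quantizer serves both polarities, which is where the saving over a two-quantizer arrangement comes from.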
In practice, however, the BBPD takes $\Delta t_{BBPD}$ to derive the selection signal. Two extra delays of $\Delta t_{SEL}$ are therefore added in front of the two inputs of the up-down switch. If $\Delta t_{SEL}$ is longer than $\Delta t_{BBPD}$, $T_{OP}$ and $T_{ON}$ are switched properly. However, the path mismatch introduced by the two extra delays brings a time error at code 0 and degrades the INL. The TA can minimize this time error through its gain. To mitigate the issue completely, a constant offset code is added at the TDC output, so that after phase lock the TDC does not use the codes around 0 and avoids the potential linearity degradation. The value of this constant offset is determined by the jitter of the DPLL itself. A post-layout simulation of the path-select TDC shows a peak INL of +50fs/-220fs with a resolution of around 2.1ps/LSB.

As is widely known, the TA gain $G_{TA}$ is very sensitive to PVT variation [5, 6, 36]. $G_{TA}$ affects the PLL phase noise in two ways: through the loop bandwidth, and through the in-band phase noise set by the quantization noise. As shown in Fig. 4.12, the loop bandwidth can be compensated by LMS calibration [37]. However, the in-band phase noise depends on the effective resolution $t_{res}$ of the TDC [8], which in this design is the ratio between the resolution of the coarse TDC (a buffer delay) and $G_{TA}$. Hence, the in-band phase noise is not improved by the LMS calibration of the TDC gain. In post-layout simulations of this work, the buffer delay varies from 16.3ps to 16.9ps over a temperature range of -40°C to 100°C while $G_{\rm TA}$ varies from 9.7 to 6.2; hence, $t_{\rm res}$ varies from 1.7ps/LSB to 2.7ps/LSB. In addition, the buffer delay varies from 17.1ps to 16.1ps when the supply varies from 0.9V to 1.1V while $G_{\rm TA}$ varies from 9.0 to 7.0, so $t_{\rm res}$ varies from 1.9ps/LSB to 2.3ps/LSB.
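The quoted $t_{\rm res}$ values follow directly from the ratio of buffer delay to TA gain. The short sketch below reproduces them from the post-layout numbers in the text; pairing the first delay of each sweep with the first gain is an assumption about how the ranges are reported, and the function name is illustrative.

```python
# Effective TDC resolution t_res = (coarse buffer delay) / (TA gain).
# The (delay, gain) pairs are the post-layout values quoted in the text;
# pairing the first delay with the first gain of each sweep is an assumption.
def effective_resolution_ps(buffer_delay_ps: float, ta_gain: float) -> float:
    return buffer_delay_ps / ta_gain


SWEEP = {
    "T = -40 C": (16.3, 9.7),
    "T = 100 C": (16.9, 6.2),
    "VDD = 0.9 V": (17.1, 9.0),
    "VDD = 1.1 V": (16.1, 7.0),
}

for label, (delay_ps, gain) in SWEEP.items():
    t_res = effective_resolution_ps(delay_ps, gain)
    print(f"{label}: t_res = {t_res:.1f} ps/LSB")
```

Rounded to one decimal, the four cases give 1.7, 2.7, 1.9, and 2.3 ps/LSB, matching the ranges stated above and showing that the spread is dominated by the gain rather than by the buffer delay.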
If $G_{\rm TA}$ is calibrated in both cases, $t_{\rm res}$ is stabilized at around 2.1ps/LSB across the temperature and supply variations. However, the TA gain calibration is not effective against process corners: after calibration, $t_{\rm res}$ still ranges from 1.8ps in the FF corner to 2.5ps in the SS corner.

Figure 3.16: (a) Conventional TA calibration (b) TA time-offset induced gain error.

The conventional TA gain calibration [6] inserts a delay of $\Delta \tau$ at the input and then measures the delay at the output. However, the time offset $\epsilon_{TA}$ induced by process mismatch causes an extra gain error of $G_{ERR} = \epsilon_{TA}/\Delta \tau$. In a low-power TA design, $\epsilon_{TA}$ can be as large as $\pm 140$ ps in a Monte-Carlo simulation. Because of the limited linear amplification range of the TA, $\Delta \tau$ cannot be made too large. Hence, a $\Delta \tau$ of 27ps results in a $G_{ERR}$ of over $\pm 65\%$ when the worst-case time offset is present. The conventional offset calibration [36] uses a replica TA to compute the offset time, but it introduces area overhead and mismatch between the original TA and the replica TA.

In Fig. 3.17, a gain-and-offset calibration is proposed. In the first step, shown in Fig. 3.17(a), $\epsilon_{TA}$ is calibrated by applying a zero time delay at the input. The TA output should also show zero delay if $\epsilon_{TA} = 0$. If $\epsilon_{TA} \neq 0$, the BBPD detects the error and adjusts the 5b capacitor bank attached to the TA output. After the offset calibration, the gain calibration begins, as shown in Fig. 3.17(b). With the proposed two-step calibration, the resulting gain error, referred to below as $G_{\text{ERR,jitter}}$, is reduced from 64.5% to 6.25% in simulation. Although the TA gain calibration affects the linear range of the TA, a sufficient margin is designed in to ensure good linearity of the TDC within its quantization range.

Figure 3.17: Proposed TA gain-and-offset calibration technique (a) TA time-offset calibration (b) TA gain calibration.

Figure 3.18: Proposed TDC gain calibration for minimizing PLL output jitter variation and PLL loop-bandwidth variation.

Since $G_{\text{ERR,jitter}}$ is reduced by the proposed calibration, the jitter of the DPLL does not suffer from the $G_{\text{TA}}$ variation. A complete TDC calibration engine is shown in Fig. 3.18. The gain error $G_{\text{ERR,jitter}}$ is first calibrated in the foreground because it is mainly induced by process mismatch. Then, the loop gain error $G_{\text{ERR,loop}}$ is calibrated through LMS running in the background.

### 3.3.2 Reference Doubler and Duty Cycle Calibration

To improve the in-band phase noise, a reference doubler is implemented, as shown in Fig. 3.19. The duty-cycle issue of the doubler [33] is calibrated using the method proposed in [34]. The rising edge of the input reference is delayed by one-fourth of a reference cycle, and the PD detects the phase error of each edge. Digital logic adjusts the delay of each edge until the edges are aligned. After duty-cycle calibration, the input and the output of DL1 differ in time by exactly one-fourth of a reference cycle. The XOR of the input and the output of DL1 generates a frequency-doubled reference.

Figure 3.19: Schematic of the DCO, buffer and MMDIV.

### 3.3.3 Digital Controlled Oscillator

Fig.
3.20 shows the implementation of the digitally controlled oscillator (DCO), the DCO buffer, and the MMDIV. To keep the current consumption low while maintaining robust oscillation, a current-reuse CMOS-type architecture is implemented with a 0.8V VDD. A 2.5nH inductor with a Q of around 20 is designed with EM verification. In post-layout simulation, the DCO consumes $260\mu$W to achieve a phase noise of -115dBc/Hz at 1MHz offset at an oscillation frequency of 2.44GHz. A 4b coarse switched-capacitor bank and a 6b medium bank are implemented to cover a range of around 800MHz. A 7b fine bank is designed for fine frequency and phase locking with a resolution of around 80kHz. To further improve the fine-bank resolution, a 3b 1st-order DSM dithers the LSB of the fine bank; the high-frequency dithering clock is taken directly from the middle stage of the MMDIV to save the power of an extra high-frequency divider. The DCO buffer is a biased inverter with a current source, and the bias is controlled to improve the buffer current efficiency. Post-layout simulation shows a buffer power of $30\mu$W and an MMDIV power of $98\mu$W with a 0.8V supply.

Figure 3.20: Schematic of the DCO, buffer and MMDIV.

### 3.3.4 Coarse PLL Loop

With a narrow-range TDC, the phase-lock time is generally very long when a large frequency error is present. For a DPLL with a bang-bang PD [26], a lock time of 1ms is reported for a large frequency step. A lock time of around $40\mu$s is required for a TDC with a 16ps range even when no frequency error is present [3]. In DPLLs using a narrow-range TDC [23, 24, 38], the frequency-locked loop is shut down to save further power, which leaves the loop vulnerable to large frequency and phase jumps.

In our design, to lock the frequency of the DPLL, a 4b coarse bank of the DCO with a resolution of 50MHz/LSB is controlled by an automatic frequency control (AFC) function. The remaining frequency error is covered by a medium bank with a resolution of 1.25MHz/LSB and a fine bank with a resolution of 7kHz/LSB thanks to the DSM dithering. However, even when the frequency is locked very close to the desired frequency, a large phase error may still be present at the narrow-range TDC. The narrow-range TDC then saturates and the open-loop gain drops to zero, which slows the convergence of the main PLL. In our design, an always-on coarse-DPLL, shown in Fig. 3.21, works simultaneously with the main PLL loop. A dead-zone logic is inserted after the phase/frequency detector (PFD) and produces an enable signal for a counter running at the DCO frequency. When the magnitude of the phase error is larger than the dead zone of 64ps, the counter is triggered by the EN signal, whose length equals the phase error. In this condition the 4b narrow-range TDC is saturated by the large phase error and the main PLL is idle. When the phase error is small enough to fall within the quantization range of the narrow-range TDC, the main loop takes over and locks the residual phase error. After phase lock, the EN signal stays low and the counter stops. The coarse loop filter is automatically disabled, and gating logic minimizes the digital power consumption of the loop filter. Since this loop is never turned off, the DPLL does not suffer from sudden large frequency and phase jumps. The simulated power consumption of the coarse-DPLL is $5\mu$W after phase lock.

Figure 3.21: Always-on coarse PLL with a dead zone of $\pm 64$ ps which consumes almost zero power after phase locked.

Figure 3.22: Simulated lock transient of the proposed coarse-DPLL.

Figure 3.23: Measurement result of the 4bit TDC at 52MS/s.

Figure 3.24: Measurement result of the proposed DTC at 52MS/s.

The transient simulation result of the coarse-DPLL is shown in Fig. 3.22.
With an assumed frequency error of 13MHz before phase lock, the coarse-DPLL takes only $3\mu$s to assist the frequency and phase locking of the main PLL.

### 3.4 Measurement Results

The proposed fractional-N DPLL prototype was fabricated in a 65nm CMOS process. The chip photograph of the fractional-N DPLL is shown in Fig. 3.32. The proposed 10b isolated constant-slope DTC and the path-select TDC were also fabricated in a 65nm CMOS process as individual test circuits for INL measurement.

In Fig. 3.23, the path-select TDC realizes 4b quantization with a 2.15ps resolution at 52MS/s. The peak INL is around 0.65ps thanks to the reduced TDC range enabled by the DTC. The linear operation of the TA and the TA gain also help to reduce the linearity degradation coming from the coarse quantizer.

A DTC with sub-ps resolution is not easy to measure because of the finite sampling frequency of the oscilloscope. A frequency-domain measurement method was introduced in [39]. Fig. 3.24 shows the measurement results of the proposed 10b DTC. The DTC achieves a 580fs time resolution with a peak INL of 0.87ps. This corresponds to an effective resolution of 9.4b in terms of linearity, and the DTC consumes only $140\mu$W at 52MS/s. A detailed comparison of the proposed DTC with state-of-the-art DTCs is given in Table 3.1. Among these DTC architectures, the proposed DTC achieves the best linearity while consuming the lowest power.

Figure 3.25: (a) Measurement result of the proposed DPLL w/o reference doubler (b) Measurement result of the proposed DPLL w/ reference doubler.

Figure 3.26: (a) Measurement result of the proposed DPLL w/o reference doubler with in-band fractional spur (b) Measurement result of the proposed DPLL w/ reference doubler with in-band fractional spur.

Figure 3.27: Measurement result of the fractional spurs vs spur frequencies.

The fractional-N DPLL is measured at one of the BLE channels, 2442MHz, as shown in Fig. 3.26.
To save power, the reference doubler is bypassed by the MUX logic, and the 26MHz reference is used directly for the DTC, the TDC, and the digital circuits. The phase noise plot is shown in Fig. 3.26(a); an integrated jitter of 1.00ps from 10kHz to 10MHz is achieved. The measured fractional spurs, obtained by sweeping the FCW, are shown in Fig. 3.27. A worst-case spur of -50dBc is achieved without the reference doubler. The power consumption is extremely low for the achieved spur and jitter performance, which makes the DPLL suitable for BLE applications.

To boost the effective resolution of the TDC by increasing the sampling frequency, the reference doubler is enabled. Fig. 3.26(b) shows the phase noise plot at the same BLE channel of 2442MHz. An integrated jitter of 535fs is achieved. The integrated phase noise from 10kHz to 10MHz is -44dBc, as required for IEEE802.11b/g/n applications. When an FCW with a small fractional part, 47.000112, is used, the measured worst integrated jitter is 590fs. However, for the target applications, such as BLE and Wi-Fi, such a small fractional number is not required. With the reference doubler, a worst fractional spur of -56dBc is measured, as shown in Fig. 3.27.

Figure 3.28: (a) Measurement of the TA gain calibration under voltage variations (b) Measurement of the TA gain calibration under temperature variations.

Figure 3.29: Measured lock transient from an initial frequency error of 13MHz.

Figure 3.30: Measured power break down of the proposed fractional-N DPLL.

The detailed power breakdown of each building block in the signal path of the proposed DPLL is shown in Fig. 3.30. The supply voltage of the DTC and the TDC is 1.0V to maintain good linearity. A 0.8V supply is assigned to the DCO and the digital parts, which include the DCO buffer, the MMDIV, and the synthesized digital circuits, in order to keep the power consumption low. At a sampling rate of 26MS/s, the DTC and the TDC consume $98\mu W$ and $80\mu W$, respectively. The DCO bias is optimized for very low power operation at $285\mu W$.
The digital parts consume $190\mu W$, with $20\mu W$ for calibration. The total power is 0.65mW for the jitter performance in Fig. 3.26(a). At a sampling rate of 52MS/s, the DTC and the TDC consume $142\mu W$ and $140\mu W$, respectively. The reference doubler consumes an additional $112\mu W$ from a 1.0V supply to double 26MHz to 52MHz. The DCO bias is optimized for better out-of-band phase noise. The digital parts consume $283\mu W$, with $40\mu W$ for calibration. The total power is 0.98mW for the jitter performance in Fig. 3.26(b).

To demonstrate the effectiveness of the proposed TDC gain calibration, the in-band phase noise is measured under voltage and temperature variations, as shown in Fig. 3.28. The DPLL is measured in integer-*N* mode with a wide loop bandwidth, where the in-band phase noise is determined purely by the TDC resolution. When the supply varies from 0.9V to 1.1V, as shown in Fig. 3.28(a), the in-band phase noise at 500kHz offset degrades by around 2.5dB without calibration. With the proposed gain calibration, it varies by only 0.5dB. When the temperature increases from -40°C to 80°C, as shown in Fig. 3.28(b), the in-band phase noise at 500kHz offset varies from -112dBc/Hz to -108dBc/Hz without the calibration scheme, and from -111dBc/Hz to -109dBc/Hz with it.

Fig. 3.29 shows the measured phase-locking transient of the DPLL. A 13MHz frequency error is applied through the FCW of the DPLL, which is over twice the entire frequency coverage of the DCO fine bank. With the help of the proposed coarse PLL, the DPLL settles to within 54kHz of the target frequency in a measured lock time of 4.2μs. This fast phase convergence makes the DPLL suitable for frequency-hopping applications [40] such as BLE.
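Several of the headline numbers reported in this section can be cross-checked with a few lines of arithmetic. The sketch below makes three assumptions that the text does not state explicitly: that the $190\mu W$ digital figure already includes the calibration power, that the 9.4b effective resolution is obtained as $\log_2(\text{range}/\text{peak INL})$ using the 593ps delay range from Table 3.1, and that the jitter-power figure of merit is defined as $10\log_{10}[(\sigma_t/1\,\text{s})^2 \cdot P/1\,\text{mW}]$. All function names are illustrative.

```python
import math

# Power breakdown at 26 MS/s in uW, as quoted in the text. Assumption: the
# 190 uW digital figure already includes the 20 uW spent on calibration.
POWER_26MSPS_UW = {"DTC": 98, "TDC": 80, "DCO": 285, "digital": 190}


def effective_bits(delay_range_ps: float, peak_inl_ps: float) -> float:
    """Linearity-limited resolution: log2(full-scale range / peak INL)."""
    return math.log2(delay_range_ps / peak_inl_ps)


def jitter_power_fom_db(jitter_s: float, power_w: float) -> float:
    """Assumed FOM definition: 10*log10((jitter / 1 s)^2 * (power / 1 mW))."""
    return 10 * math.log10(jitter_s ** 2 * power_w / 1e-3)


total_uw = sum(POWER_26MSPS_UW.values())
print(f"total power at 26 MS/s: {total_uw} uW")
print(f"DTC effective resolution: {effective_bits(593, 0.87):.1f} b")
print(f"FOM w/o doubler: {jitter_power_fom_db(1.00e-12, 0.65e-3):.1f} dB")
print(f"FOM w/  doubler: {jitter_power_fom_db(535e-15, 0.98e-3):.1f} dB")
```

Under these assumptions the block sum is $653\mu W$, in line with the stated 0.65mW, and the effective resolution evaluates to about 9.4b.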
Frequency hopping forces the LMS calibration of the DTC gain to re-converge for the newly synthesized frequency. In simulation, the LMS calibration takes less than 15μs to converge to a 0.3% gain error after a 13MHz frequency jump. Even if the LMS has not yet converged to its final value, the PLL still locks to the target frequency without any issue, although the fractional spur is degraded during settling.

A detailed performance comparison with state-of-the-art fractional-*N* DPLLs is shown in Table 3.2. Fig. 5.2 compares the FOM when only fractional-*N* DPLLs consuming under 5mW are included. The proposed DPLL achieves a 10dB better FOM than the conventional sub-mW DPLLs.

| Reference | This work (w/ doubler) | This work (w/o doubler) | [22] | [23] | [24] | [25] | [3] | [26] |
|---|---|---|---|---|---|---|---|---|
| Technology | 65nm | 65nm | 28nm | 40nm | 40nm | 28nm | 65nm | 130nm |
| Architecture | Isolated constant-slope DTC + 4bit TDC | Isolated constant-slope DTC + 4bit TDC | Full range TDC | VS*-DTC +TDC | VS-DTC +TDC | VS-DTC TDC | VS-DTC +TDC | VS-DTC +BBPD |
| Ref. frequency (MHz) | 26 | 26 | 40 | 32 | N.A. | 40 | 50 | 40 |
| Frequency (GHz) | 2.0-2.8 | 2.0-2.8 | 2.05-2.55 | 2.1-2.7 | 1.8-2.5 | 2.7-4.33 | 4.4-5.2 | 2.9-4.0 |
| Integrated jitter (ps) | 0.53 | 1.00 | 0.86 | 1.71 | 1.98 | 0.16 | 0.49 | 0.56 |
| In-band fractional spur (dBc) | -56 | -50 | N.A. | -37 | -56 | -54 | -51.5 | -42 |
| Power (mW) | 0.98 | 0.65 | 1.6 | 0.86 | 0.67 | 8.2 | 3.7 | 4.5 |
| Ref. spur (dBc) | -72 | -68 | -78 | -70 | -62 | -78 | -69 | -72 |
| FOM (dB) | -246 | -242 | -239.3 | -236 | -236 | -246.8 | -240.5 | -241.3 |
| Active area (mm²) | 0.23 | 0.23 | 0.33 | 0.2 | 0.18 | N.A. | 0.22 | 0.22 |

Table 3.2: Comparison Table of The State-of-The-Art fractional-N DPLLs

<sup>\*</sup>Variable slope.

Figure 3.31: FOM comparison with the state-of-the-art fractional DPLLs under 5mW.

### 3.5 Fractional-N DPLL Towards $200\mu W$

As shown in Fig. 3.30 and Fig. 4.31, the power consumption of the low-power fractional-N DPLL is still a significant portion of the whole TRX power consumption, and further reduction is needed.

Figure 3.32: Chip micrograph.

In the power breakdown of Fig. 3.30, the DTC still consumes a significant amount of power because of the aforementioned trade-offs among delay range, power, and jitter. Most of this power is consumed by the current source for slope generation [12], as shown in Fig. 3.33. However, after the comparison the current continues to charge the load capacitor, which consumes power. This part of the generated slope is not required for delay generation at all, so the redundant charging wastes energy. The power consumption can be reduced by cutting off the current source once the comparison has finished. According to calculation, this method cuts 33% of the overall power consumption.

The digital circuits also account for a large share of the power. The most energy-consuming part is the MMDIV because of the high-frequency operation of its first two stages. If the MMDIV can be removed, about $100\mu$W can be saved. A sub-sampling PLL architecture could be considered, in which the DCO edge is sampled directly at the TDC input. However, losing the MMDIV means that the frequency and phase locking process is affected by the lack of phase and frequency translation. Any large frequency or phase change causes slow locking of a PLL without an MMDIV. This makes the design of the frequency- and phase-assist blocks that speed up locking more challenging. To reach a power consumption of $200\mu$W, the reference clock frequency should also be lowered to further reduce the power of the TDC, the DTC, and the digital circuits.
A potential solution is to combine the fractional-N sub-sampling DPLL and the fractional-N sampling DPLL. The sampling mode can be used to assist phase locking. After the sampling DPLL locks the phase, the loop can be switched automatically to the sub-sampling mode, which removes the power consumption of the MMDIV.

Figure 3.33: Conventional constant slope DTC.

Last but not least is cutting the power consumption of the oscillator. The lowest achievable power is determined by the inductor quality factor and the gm that can be realized by the active devices. If the quality factor of the passive components is too low, a small current cannot guarantee the start-up condition. Increasing the gm requires a higher current. Reducing the supply voltage while increasing the transistor size is the most obvious way to increase the device gm. However, large transistors add parasitic capacitance, which lowers the maximum oscillation frequency. A gm-boosting technique may also be considered, although the transformer design deteriorates the quality factor further.

### 3.6 Conclusion

To realize a sub-mW fractional-*N* DPLL with low jitter and low spurs, a 1st-order DSM-based fractional controller working in conjunction with a highly linear DTC is introduced. The rms jitter is improved compared with using a higher-order DSM, but a DTC with high linearity is required. To realize a linear and energy-efficient DTC, an isolated constant-slope method is proposed. Thanks to its isolated operation, the proposed DTC can potentially work at a high sampling frequency with low power consumption while maintaining good linearity. Furthermore, the auto-zero offset switch mitigates part of the supply noise, which can improve the linearity in an SoC environment. The proposed fractional-*N* DPLL achieves good fractional spurs while
maintaining low jitter and low power, which demonstrates the linearity and power efficiency of the DTC. The gain calibration of the TA yields a steady in-band phase noise of the DPLL over temperature variations. The lock-time measurement confirms the effectiveness of the always-on coarse PLL in the feedback loop.

## **Chapter 4**

# Bluetooth Low Energy Transceiver Using Digital PLL

In a wireless world, the radio frequency (RF) transceiver (TRX) plays a major role in connecting devices over the air. Because the TRX consumes a significant amount of power in a wireless chip, ultra-low-power (ULP) operation is especially important in Internet-of-Things (IoT) applications. Bluetooth Low-Energy (BLE) is one of the most popular wireless standards for IoT applications. A BLE TRX must support a very long battery life, which means that its power consumption should be minimized as much as possible. In addition, a low receiver (RX) sensitivity level is needed in order to increase the communication range. The BLE RX should also tolerate strong interference so that it keeps working even in a crowded wireless environment.

Low-IF and zero-IF architectures [41–43] are among the most common architectures for modern narrow-band RXs, as shown in Fig. 4.1(a). They achieve excellent sensitivity and blocker tolerance by using both the I and Q channels to demodulate the Gaussian frequency-shift keying (GFSK) data. Thanks to the I/Q operation, they also correct the carrier frequency offset in a short period of time. This is very important for a TDD system, which usually has a short preamble. However, using both branches consumes a significant amount of power and area. Sliding-IF (SIF) is another popular architecture for low-power design, although it causes severe image problems [44–47]. Ref.
[48] proposed a hybrid-loop receiver to improve the blocker tolerance of the SIF phase-to-digital converter (SIF-PDC) architecture [44].

A PLL-based phase-tracking demodulator can be implemented to demodulate the 2-FSK signal, as shown in Fig. 4.1(b). A mixer is used as a phase detector to detect the phase/frequency variation of the RF input. The PLL loop filter after the phase detector can be reused as the low-pass filter of the receiver to reject noise and blockers. The control voltage from the loop filter follows the frequency variation of the input RF signal, which represents the desired baseband data. Finally, the 0/1 data can be extracted by analog processing. The analog demodulation saves area and power in the RX. However, because of the narrow-BW operation of the PLL loop, the dynamic range of the PLL-based demodulator is low, which significantly limits the sensitivity level. Also, the VCO phase tracks the input signal frequency, which results in poor phase noise and degrades the blocker performance through reciprocal mixing. Furthermore, to keep the PLL stable, the order of the loop filter cannot be high, which limits the blocker performance as well.

Figure 4.1: (a) Low-IF receiver architecture (b) Conventional analog phase tracking RX (c) Conventional digital phase tracking RX.

In Fig. 4.1(c), a digitally controlled oscillator (DCO)-based phase-tracking RX is proposed [49] to improve the power efficiency by adopting the single-path demodulation method, which is similar to the architecture shown in Fig. 4.1(b). This RX increases the digital portion of the receiver and takes advantage of digital processing in CMOS technology. However, its issues are similar to those of its analog version, as explained in [50].

Figure 4.2: (a) Conventional hybrid-loop RX with the DPLL-based ADC (b) Proposed hybrid-loop RX with the dynamic-range enhanced DPLL-based ADC.
In order to realize single-path downconversion demodulation, the GFSK constellation is transformed into a differential phase-shift keying (DPSK) constellation at the RX mixer output by shifting the RX local oscillator (LO) frequency by 250 kHz from the carrier frequency [48]. A digital PLL (DPLL) is used as the LO, and an analog-to-digital converter (ADC) digitizes the analog-baseband (ABB) data in the ADC path, as shown in Fig. 4.2(a). This saves the power consumption of the Q-channel and of two ADCs. However, the signal-to-noise-plus-distortion ratio (SNDR) of the ADC path suffers from the highly nonlinear varactor gain as well as from gain variation due to process, voltage, and temperature (PVT) variation. The RX sensitivity level and the interference tolerance are degraded by this SNDR degradation of the ADC path. The RX also suffers from an unknown carrier phase because only the I-channel is used for data demodulation, which decreases the signal-to-noise ratio (SNR) of the demodulated data. Furthermore, the ADC path consumes a lot of power in the time-to-digital converter (TDC) because of its linearity and resolution requirements.

In the present study, we address the above issues with the proposed techniques. These techniques are verified by a 2.3-mW BLE RX achieving a sensitivity of -94 dBm with all blocker requirements satisfied, and a 5.0-mW single-point direct frequency-modulation (DFM) TX with an FSK error of 1.89% at an output power of 0 dBm, in a 65-nm CMOS process. A digital-to-analog converter (DAC) feedback path is proposed in the DPLL-based ADC to mitigate the linearity degradation and the gain variation of the varactor. This greatly improves the dynamic range of the DPLL-based ADC, as shown in Fig. 4.2(b). As a result, the RX sensitivity level and the interference tolerance are improved. A digital-to-time converter (DTC)-assisted fractional-N DPLL is implemented.
Thanks to the reduced TDC range enabled by the DTC, the TDC achieves a fine resolution with low power consumption, which improves the in-band phase noise. The highly linear constant-slope DTC operation ensures good fractional spur performance. A 5-MHz bandwidth (BW) is realized by the proposed loop-latency reduction technique. The single-path demodulation is supported by a phase-and-frequency synchronization loop in the digital domain when a carrier frequency offset is present.

### 4.1 DPLL-Centric Receiver

### **4.1.1 DPLL-based ADC with Dynamic Range Enhancement**

The dynamic range of the DPLL-based ADC has considerable influence on the sensitivity level and the blocker tolerance of the hybrid-loop architecture, so it needs to be improved. The DPLL-based ADC uses an oscillator and a varactor as a voltage-to-frequency (V2F) converter, and the DPLL acts as a frequency and phase quantizer. Fig. 4.3(a) shows the conventional implementation of the DPLL-based ADC [48]. Fig. 4.3(b) illustrates the concept of the digitization process: the ABB data $V_{\rm ABB}$ modulates the varactor in the oscillator. If the oscillator were free running, $V_{\rm ABB}$ would produce a frequency disturbance of $K_{\rm VCO} \cdot V_{\rm ABB}$. However, owing to the negative feedback of the DPLL loop, the DPLL suppresses this disturbance and corrects it at the digital capacitor bank (PLL path). The compensating frequency $K_{\rm DCO} \cdot D_{\rm OUT}$ almost equals $K_{\rm VCO} \cdot V_{\rm ABB}$. Hence, $D_{\rm OUT}$ can be used as the ADC data. In addition, $K_{\rm DCO} \cdot D_{\rm OUT}$ cancels $K_{\rm VCO} \cdot V_{\rm ABB}$, which produces a stable oscillator output frequency of $f_{\rm OSC} = K_{\rm VCO} \cdot V_{\rm ABB} - K_{\rm DCO} \cdot D_{\rm OUT} + f_{\rm LO} \approx f_{\rm LO}$. Hence, $f_{\rm OSC}$ can also be used as the local oscillator (LO), as shown by the LO path in Fig. 4.2(b).
The relation between the amplitude of the input $V_{\rm ABB}$ and the output $D_{\rm OUT}$ is:

$$|D_{\text{OUT}}| = \left| \frac{K_{\text{VCO}}}{K_{\text{DCO}}} \cdot V_{\text{ABB}} \right| \tag{4.1}$$

The voltage-to-digital (V2D) conversion strongly depends on the varactor gain $K_{VCO}$ and the digital capacitor bank gain $K_{DCO}$ . This operation is conducted in an open-loop manner from an ADC viewpoint and easily suffers from the non-ideality of the loop components. One of the major problems comes from the non-linearity of the varactor at the $V_{ABB}$ input, as shown in Fig. 4.3(b). Because of the full-range output from a programmable gain amplifier (PGA), the varactor gain varies a lot as the input voltage changes. The conversion spurious-free dynamic range (SFDR) is degraded by the intermodulation distortion (IMD) and the harmonic distortion (HD) generated in the V2F conversion. Moreover, the DC voltage of $V_{ABB}$ also influences the SFDR, as the linearity becomes much worse at both ends of the input range. From Eq. (4.1), a larger $K_{VCO}$ is desired for achieving a better SNR of the V2D conversion. However, the larger $K_{VCO}$ is, the worse the linearity will be. Furthermore, the varactor gain variation due to PVT potentially degrades the SNR performance. As a result, the dynamic range of the DPLL-based ADC is greatly degraded by the open-loop operation.

For achieving better V2F linearity, a varactor array can be implemented with resistor-interpolated voltage biases [48]. Sixteen varactor banks are used to achieve a $K_{VCO}$ of 800 kHz/V, which consumes a large chip area and adds a large parasitic capacitance to the LC oscillator. In simulation, an SFDR of only 44 dB is achieved by this linearization technique, and the SFDR will become worse under PVT variation.

As shown in Fig. 4.4(a), the proposed DPLL-based ADC works in a closed-loop manner.
A DAC is connected to the output of the DPLL, and a pre-distortion signal $V_{\rm FB}$ is fed back to the varactor input. Then, $V_{\rm FB}$ is subtracted from $V_{\rm ABB}$ by a signal adder at the varactor input. Due to the negative feedback of the DPLL, the loop forces $V_{\rm FB} \approx V_{\rm ABB}$ . The voltage range of $V_{\rm ABB}$ is attenuated to $V_{\rm tune} = V_{\rm ABB} - V_{\rm FB}$ at the varactor input, as shown in Fig. 4.4(b). If the DPLL BW were very large, the large feedback gain of the DPLL would force $V_{\rm tune}$ to a DC value. Hence, the V2D conversion is not degraded by the varactor non-linearity. The DAC feedback path also acts as a PLL path and locks the oscillator phase to the reference for acquiring better phase noise. The digital capacitor bank path is used as a frequency-locked loop (FLL) to lock the frequency at the correct BLE channels and will be turned off after the frequency is locked. This ensures that the DC voltage of the DAC is always around 0.5 V [26].

Figure 4.3: (a) Concept of conventional open-loop DPLL-based ADC (b) Conversion diagrams.

Fig. 4.5 demonstrates the operation principles of the proposed DPLL-based ADC. The frequency of the DPLL is locked to the required LO frequency of $f_{\rm RX,LO}$ at the LO path. $V_{\rm ABB}$ is extracted from the $V_{\rm RF}$ signal by the downconversion mixer, a low-pass filter (LPF), and a PGA. $V_{\rm ABB}$ is input to the DPLL-based ADC at the ADC path and is digitized to $D_{\rm OUT}$ . In Fig. 4.5, two loops are presented for the downconversion process and the digitization process. The downconversion loop consists of a mixer, an LPF, a PGA, and an oscillator. The PGA output $V_{\rm ABB}$ controls the oscillator frequency through an input varactor. The oscillator frequency is also controlled by the negative feedback loop of the DPLL. The downconversion loop and the DPLL independently control the oscillator frequency to be synchronized with each input.
Conflicts will occur if both loops have comparable BW. Analysis in [48] shows that the DPLL with a wider BW than that of the LPF can properly stabilize two loops, *i.e.*, a stabilized $f_{\rm RX,LO}$ can be realized. However, excessively increasing the BW of the DPLL will decrease the stability of the DPLL due to the limited sampling frequency and the loop latency. This will cause a large peaking near the DPLL BW which degrades the phase noise.

Figure 4.4: (a) Proposed closed-loop DPLL-based ADC with improved varactor linearity (b) Conversion diagrams.

As shown in the discrete-time model of the proposed ADC path in Fig. 4.5, the varactor input voltage is $V_{\text{tune}}$ and the quantization noise of the DPLL quantizer is $Q_n$ . We have:

$$\frac{D_{\text{OUT}}}{V_{\text{ABB}}} = \frac{H_{\text{OL,DPLL}}(z)}{H_{\text{OL,DPLL}}(z) + 1} = \frac{T_{\text{REF}}^2 K_{\text{OSC}}(K_{\text{P}}(1 - z^{-1}) + K_{\text{I}})}{t_{\text{RES}} N(1 - z^{-1})^2 + T_{\text{REF}}^2 K_{\text{OSC}}(K_{\text{P}}(1 - z^{-1}) + K_{\text{I}})} \tag{4.2}$$

$$\frac{D_{\text{OUT}}}{Q_{\text{n}}} = \frac{1}{H_{\text{OL,DPLL}}(z) + 1} = \frac{t_{\text{RES}}N(1 - z^{-1})^{2}}{t_{\text{RES}}N(1 - z^{-1})^{2} + T_{\text{REF}}^{2}K_{\text{OSC}}(K_{\text{P}}(1 - z^{-1}) + K_{\text{I}})} \tag{4.3}$$

Figure 4.5: The discrete-time model of the proposed DPLL-based ADC.

$H_{\text{OL,DPLL}}(z)$ is the open-loop transfer function of the DPLL, N is the divide ratio of the frequency divider, and $t_{\text{RES}}$ is the time-resolution of the TDC. Eq. (4.2) is the signal transfer function (STF) of the proposed ADC with a low-pass characteristic that has the same BW as the DPLL, as shown in Fig. 4.6. In Eq. (4.2), the factor of $K_{\rm VCO}/K_{\rm DCO}$ is removed, as compared with that in Eq. (4.1). The varactor gain dependency for $D_{\rm OUT}$ is completely mitigated by this closed-loop operation. Eq.
(4.3) shows the noise transfer function (NTF) of the proposed ADC, which has a high-pass characteristic up to the DPLL BW. The NTF has a 2nd-order noise shaping around DC to provide more suppression of the quantization noise. We can also write the attenuation factor from the signal input $V_{\rm ABB}$ to the varactor input $V_{\rm tune}$ :

$$\frac{V_{\text{tune}}}{V_{\text{ABB}}} = \frac{1}{H_{\text{OL,DPLL}}(z) + 1} = \frac{t_{\text{RES}}N(1 - z^{-1})^2}{t_{\text{RES}}N(1 - z^{-1})^2 + T_{\text{REF}}^2K_{\text{OSC}}(K_{\text{P}}(1 - z^{-1}) + K_{\text{I}})} \tag{4.4}$$

Since the DPLL has a finite BW, $V_{\text{tune}}$ still has some amplitude instead of a DC value. As the signal frequency becomes higher, the attenuation will be smaller as shown in Fig. 4.6. Therefore, a wide-BW DPLL is preferred in order to help reduce the amplitude of $V_{\text{tune}}$ . This mitigates the varactor non-linearity and improves the SFDR performance of the ADC.

Figure 4.6: Plots of the STF, NTF and the attenuation factor with a DPLL bandwidth of 5 MHz.

Figure 4.7: Simulated required varactor linear range vs DPLL bandwidth.

The BW of the DPLL is decided by considering the required linear range of the varactor and the phase margin (PM) of the DPLL at a large BW. Fig. 4.7 shows the simulated results of the required linear range for the varactor. As explained by Eq. (4.4), a larger BW of the DPLL can help reduce the required linear range from the varactor. For $V_{\rm ABB}$ with a maximum peak-to-peak amplitude of 500 mV at 750 kHz, the attenuation of $V_{\rm ABB}$ is very weak for a DPLL BW of 1 MHz. This results in a linear range requirement of 280 mV for the varactor. The required range can be decreased to 80 mV at the DPLL BW of 5 MHz. Ideally, this range can be realized by a single varactor without the linearization techniques. However, as the DPLL BW keeps increasing, the PM will be degraded.
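The trend described by Eqs. (4.2) to (4.4) can be explored with a small frequency-domain sketch. Dividing numerator and denominator by $t_{\rm RES} N$ leaves only two normalized gains, called `alpha` and `rho` here. These names, the gain values, and the rule of thumb used to pick them (proportional gain close to the normalized loop bandwidth, integral gain set to `alpha**2 / 16`) are assumptions made for illustration and are not taken from the design.

```python
import numpy as np

F_REF = 52e6  # reference (sampling) frequency in Hz


def loop_responses(freq_hz, alpha, rho, f_ref=F_REF):
    """Evaluate the STF of Eq. (4.2) and the NTF / attenuation of Eqs. (4.3)-(4.4).

    alpha = T_REF^2 * K_OSC * K_P / (t_RES * N)  (normalized proportional gain)
    rho   = T_REF^2 * K_OSC * K_I / (t_RES * N)  (normalized integral gain)
    """
    z_inv = np.exp(-2j * np.pi * np.asarray(freq_hz, dtype=float) / f_ref)
    d = 1.0 - z_inv                    # (1 - z^-1)
    num = alpha * d + rho              # K_P (1 - z^-1) + K_I, normalized
    stf = num / (d**2 + num)
    ntf = d**2 / (d**2 + num)          # also V_tune / V_ABB
    return stf, ntf


def gains_for_bandwidth(bw_hz, f_ref=F_REF):
    """Hypothetical gain choice: alpha ~ 2*pi*BW/f_ref, rho = alpha^2 / 16."""
    alpha = 2 * np.pi * bw_hz / f_ref
    return alpha, alpha**2 / 16


f_sig = 750e3  # signal frequency used in the text
_, att_narrow = loop_responses(f_sig, *gains_for_bandwidth(1e6))
_, att_wide = loop_responses(f_sig, *gains_for_bandwidth(5e6))

# Residual swing at the varactor for a 500 mVpp input, in mV.
vtune_narrow_mv = 500 * abs(att_narrow)
vtune_wide_mv = 500 * abs(att_wide)
```

Under these assumed gains the residual swing at 750 kHz shrinks substantially when the bandwidth goes from 1 MHz to 5 MHz, which is the same qualitative trend as Fig. 4.7. The absolute numbers depend on the assumed gains and should not be read as design values.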
In the present study, the DPLL BW is designed to be 5 MHz and an estimated PM of 70° ensures the stability of the DPLL with sufficient margin. Another advantage of wide-BW operation is the attenuation of the adjacent interference outside the baseband signal BW. If the BW is narrow, the LO frequency $f_{RX,LO}$ may be pulled to the blocker frequency due to the large signal strength of $V_{\text{tune}}$ .

Fig. 4.8 shows a schematic of the proposed DAC feedback path, in which a resistor DAC (RDAC) is used to convert $D_{OUT}$ to the analog signal. An operational amplifier (OPA) is used to perform the linear addition at node X, and the common-mode voltage of $V_{\text{tune}}$ is set to $V_{\text{COM}}$ . The RDAC converts the varactor gain $K_{\rm VCO}$ into the digitally controlled gain $K_{\rm OSC}$ . A large $K_{\rm OSC}$ will increase the quantization noise of the oscillator [51], which worsens the phase noise of the DPLL. A reduced $K_{VCO}$ will help reduce the required number of RDAC bits. However, a smaller $K_{VCO}$ will result in a smaller frequency coverage. If the DPLL suffers from a large frequency drift, it will easily fail to lock. On the other hand, the quantization noise from the RDAC will also degrade the SNR of the DPLL-based ADC. From simulation results of the DPLL-based ADC, an 8-bit RDAC is enough for achieving an SNR of 48 dB. The 8-bit RDAC converts the optimized $K_{VCO}$ of 4 MHz/V into a digitally controlled gain of $K_{OSC}$ = 4 MHz/V × 2<sup>-8</sup> V/LSB ≈ 16 kHz/LSB for minimizing the phase noise degradation. The thermal noise from the RDAC is small enough and will not degrade the phase noise of the DPLL. Because the distortion resulting from the nonlinearity of the RDAC appears directly at the ADC output, a highly linear RDAC is desired to improve the SFDR of the ADC. For the process used in this work, an 8-bit RDAC with over 54-dB SFDR is achievable when delivering an output of 500 mV<sub>pp</sub>.
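The gain conversion quoted above is simple arithmetic and can be checked with a short sketch. The 1-V full-scale range is implied by the 2<sup>-8</sup> V/LSB figure in the text. The ideal quantization-SNR formula (6.02·n + 1.76 dB for a full-scale sine) is a textbook bound added here for comparison only, not a result from this work.

```python
def rdac_k_osc(k_vco_hz_per_v, n_bits, v_full_scale=1.0):
    """Digitally controlled oscillator gain (Hz/LSB) seen through the RDAC."""
    lsb_volts = v_full_scale / 2**n_bits
    return k_vco_hz_per_v * lsb_volts


def ideal_quantization_snr_db(n_bits):
    """Textbook SNR bound of an ideal n-bit quantizer with a full-scale sine."""
    return 6.02 * n_bits + 1.76


k_osc = rdac_k_osc(4e6, 8)                 # 4 MHz/V through an 8-bit RDAC
snr_bound_db = ideal_quantization_snr_db(8)
```

With these inputs `k_osc` evaluates to 15.625 kHz/LSB, which the text rounds to 16 kHz/LSB, and the ideal 8-bit bound is slightly under 50 dB, so the simulated 48 dB sits just below what an ideal 8-bit quantizer would allow.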
The DPLL BW is calibrated by the least mean square (LMS) algorithm in the background, as shown in Fig. 4.4(a). After calibration, $H_{OL,DPLL}(z)$ will be maintained constant, regardless of the $K_{OSC}$ variation, and the SNR of the V2D conversion will no longer suffer from the varactor gain variation caused by PVT in the conventional open-loop design.

The linearity improvement with the proposed DAC feedback is validated by carrying out IMD simulations of the V2F conversion gain on the non-ideal model shown in Fig. 4.9, the results of which are shown in Fig. 4.10. The non-linear DAC shown in Fig. 4.9 is modeled using curve fitting from post-layout simulations. It must be noted that the effects of noise are not included in the aforementioned model in order to ensure accurate characterization of the DAC non-linearity.

Figure 4.8: Schematic of the DAC feedback path.

Figure 4.9: Test bench of the V2F conversion gain.

For characterizing the non-linearity of the DAC, IMD simulations are carried out with two ideal sinusoidal test signals ( $V_{\rm TEST}$ ), each with an amplitude of 250 mV<sub>pp</sub>. Without the proposed DAC feedback (excluding the shaded DAC Feedback block in Fig. 4.9), the presence of a non-linear varactor with large input amplitude limits the linearity of the V2F conversion gain. This is evident from the IMD2 and IMD3 simulation results on the V2F conversion gain shown in Fig. 4.10(a). For evaluating the V2F conversion gain linearity with the proposed DAC feedback, the DPLL model is included as an ideal delay cell with a delay value based on system simulation. The simulation results based on an ideal DAC are shown in Fig. 4.10(b), which shows a significant improvement in V2F conversion gain linearity with 22 dB improvement in IMD2 and 30 dB improvement in IMD3 as compared to the system without DAC feedback. However, the DAC also contributes nonlinearity in the feedback path. The simulated non-linearity of the non-ideal DAC is presented in Fig.
4.10(c) and the simulation carried out using this non-ideal DAC feedback reveals that the V2F conversion gain linearity is limited by IMD2. However, the degradation in IMD2 while using the non-ideal DAC as compared to the ideal DAC is observed to be under 4 dB in the simulation results presented in Fig. 4.10(d), which is still 18 dB better as compared to the system without DAC feedback. Note that the linearity degradation from the DAC will not significantly degrade the SNDR performance of the DPLL-based ADC.

Figure 4.10: Simulated linearity of (a) V2F conversion w/o DAC feedback (b) V2F conversion w/ ideal DAC feedback (c) non-ideal DAC (d) V2F conversion w/ non-ideal DAC feedback.

To examine the DPLL quantizer more closely, a detailed phase-domain block diagram is shown in Fig. 4.11(a). Conventionally, only a TDC is used as the phase quantizer inside the DPLL [52]. The LC oscillator consists of a voltage-to-phase (V2P) portion and a digital-to-phase (D2P) portion. The V2P portion converts the input $V_{\rm ABB}$ to $\Phi_{\rm V2P}$ , and the D2P portion converts $D_{\rm OUT}$ to $\Phi_{\rm D2P}$ . $\Phi_{\rm D2P}$ is subtracted from $\Phi_{\rm V2P}$ , which produces a phase variation of $\Phi_{OSC}$ at the LC oscillator output. In order to realize fractional frequency synthesis at the BLE channels, the fractional controller is used to dither the multi-modulus divider (MMD) to generate the target fractional phase of $N.f \cdot 2\pi$ , where $2\pi$ represents one DCO period. This dither operation generates a large peak-to-peak quantization noise of $\Phi_{Q,DIV} = 2\pi$ , and introduces a large output phase error $\Phi_{TDC} = \Phi_{OSC} + \Phi_{Q,DIV}$ at the TDC input as shown in Fig. 4.11(b). Therefore, a TDC range of over $2\pi$ is required. Just like an ADC, the TDC will consume a significant amount of power due to the resolution and linearity requirements.
A poor TDC resolution will degrade the in-band phase noise, and the nonlinearity of the TDC will produce in-band fractional spurs. The adjacent channel rejection (ACR) performance will be degraded by the in-band phase noise and fractional spurs due to reciprocal mixing [53]. In this design, a wide BW is required for the DPLL, which requires a spur level of less than -40 dBc and a phase noise of less than -99 dBc/Hz at 3 MHz offset when considering the most stringent ACR performance at 3 MHz from system simulations. To maintain a sufficient design margin, a 5-MHz-BW DPLL with a worst-case fractional spur of -50 dBc and a -110 dBc/Hz in-band phase noise will require a resolution of 2.5 ps and a normalized integral nonlinearity (INL) of less than 0.5%, according to the system simulation. These requirements will easily cause a power consumption of more than 1 mW for the TDC alone [8]. However, from simulation results, an input signal ( $V_{ABB}$ ) with 500 mV<sub>pp</sub> will only produce a $\Phi_{OSC}$ with a maximum phase variation of $0.2\pi$ . A TDC with a range of $2\pi$ wastes most of its range and greatly degrades the power efficiency of the DPLL-based ADC. Since the TDC resolution and linearity are important for both the ADC operation and the fractional-N DPLL operation, they should be enhanced with little power overhead.

In the present study, a digital-to-time converter (DTC) is used to reduce the required TDC range [23, 24, 26, 54] as shown in Fig. 4.12(a), which helps improve the resolution and the linearity of the TDC without power overhead. The fractional controller will produce a pre-distorted phase signal that copies $\Phi_{Q,DIV}$ and will control the DTC to produce $\Phi_{DTC}$ . The DTC will add a quantization noise of $\Phi_{Q,DTC}$ to its output. As a result, the input at the TDC will be $\Phi_{TDC} = \Phi_{OSC} + \Phi_{Q,DIV} - \Phi_{DTC} + \Phi_{Q,DTC} = \Phi_{OSC} + \Phi_{Q,DTC}$ .
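The cancellation of the divider quantization noise by the DTC can be illustrated with a small behavioral sketch. It assumes a first-order accumulator as the fractional controller (the text does not specify its order), a DCO near 2.44 GHz, and a 1-ps DTC step. Phases are expressed in DCO periods, so one period corresponds to $2\pi$. All function names are hypothetical.

```python
import math


def divider_error(frac, n_cycles):
    """Edge-time error of a first-order fractional divider, in DCO periods.

    After k reference cycles the ideal edge sits at k*(N + frac) periods,
    while the dithered divider has counted k*N + floor(k*frac) periods.
    """
    return [math.floor(k * frac) - k * frac for k in range(1, n_cycles + 1)]


def dtc_cancel(errors, step):
    """Residual error after the DTC adds a quantized copy of the known error."""
    residual = []
    for err in errors:
        delay = round(-err / step) * step   # DTC code times DTC step
        residual.append(err + delay)
    return residual


# Assumed values: 1 ps DTC step and a 2.44 GHz DCO (period of about 410 ps).
DTC_STEP = 1e-12 * 2.44e9

raw = divider_error(frac=0.0137, n_cycles=1000)
res = dtc_cancel(raw, DTC_STEP)
raw_pp = max(raw) - min(raw)    # approaches one full DCO period (2*pi)
res_pp = max(res) - min(res)    # bounded by one DTC step
```

In this simplified model the peak-to-peak error presented to the TDC drops from nearly one DCO period to at most one DTC step, which is why the TDC range can be reduced to what the signal itself requires.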
Since $\Phi_{Q,DTC}$ is much less than $\Phi_{Q,DIV}$ , the TDC is only required to quantize a phase variation of $0.2\pi$ . To leave some safety margin, a TDC with a range of around $0.4\pi$ is designed with a 2.5 ps resolution. The power consumption of the TDC is 150 µW in post-layout simulation. The constant-slope charging method [12] is utilized to fundamentally improve the linearity of the DTC. The DTC achieves a normalized INL of 0.3% with 1 ps resolution in the post-layout simulation. The proposed DPLL-based ADC achieves a power consumption of around 1.0 mW, a simulated SNR of 48 dB, and an SFDR of 54 dB thanks to the DAC feedback path and the TDC resolution enhancement technique.

Figure 4.11: (a) Conventional data digitization process by the full-range TDC (b) Phase domain diagram.

### 4.1.2 Wide Loop-Bandwidth Fractional-N DPLL

The 5-MHz DPLL BW specified by the DPLL-based ADC brings various challenges in terms of the phase noise, the fractional spurs, and the power consumption. The standard 26-MHz reference would only be able to support around 2.6 MHz BW for a type-II PLL due to Gardner's limit [55]. A reference doubler is therefore adopted to double the 26-MHz reference to 52 MHz. A time amplifier is used to improve the coarse TDC resolution from 20 ps to 2.5 ps. An in-band phase noise of less than -110 dBc/Hz can be achieved with the 52-MHz reference.

Another factor that limits the wide-BW operation is the forward loop latency from the TDC input to the digital loop filter (DLF) output as shown in Fig. 4.13(a). The larger D in $z^{-D}$ is, the worse the stability at a wider BW will be. In [7], a latency of $3T_{REF}$ limits the maximum BW to around 4 MHz with a 50-MHz reference clock. In conventional work [56], the DLF is separated into two paths, *i.e.*, the proportional path ( $K_P$ path) and the integral path ( $K_I$ path), without digital summing at the DLF output. Both paths are fed into different varactors in the VCO through DACs.
This reduces the latency from the TDC inputs to the oscillator interfaces. However, the gains of the two paths vary with PVT, which requires two gain calibration units. Moreover, the lack of retiming in the proportional path produces glitches and hence worsens the phase noise.

Figure 4.12: (a) Proposed data digitization process assisted by DTC (b) Phase domain diagram.

In our proposed technique in Fig. 4.13(b), the 5-bit coarse quantizer output is retimed by its own output. As shown in Fig. 4.13(c), after the quantizer finishes the quantization and acquires the thermometer-coded data (Raw Data), the quantizer output clock $T_{\rm OP}$ is reused as the TDC clock (TDC CLK) to retime Raw Data at the TDC decoder. It produces the 5-bit TDC data with an aligned clock. The operation of the proposed TDC is like a bang-bang phase detector (BBPD), which produces only 0/1 data and has a very low latency. Overall, the proposed TDC has a latency of $\Delta T_{\rm TDC} + \Delta T_1 \approx 2$ ns, in which $\Delta T_1 > \Delta T_{\rm Dec}$ where $\Delta T_{\rm Dec}$ is the operation time of the decoder. To reduce the DLF latency, the TDC CLK is further reused as the DLF clock (DLF CLK) with a $\Delta T_2$ delay. If $\Delta T_2 > \Delta T_{\rm DLF}$ where $\Delta T_{\rm DLF}$ is the operation time of the DLF, the DCO code is still retimed by the same clock edge of the TDC output. As a result, the reduced loop latency $\Delta T_{\rm Latency}$ is $\Delta T_{\rm TDC} + \Delta T_1 + \Delta T_2 < 0.5 T_{\rm REF}$ . Because the data from both the TDC and the DLF are retimed by a clean edge, the glitches are removed and the phase noise will not be degraded.

Figure 4.13: (a) The forward-loop of DPLL with D· $T_{REF}$ latency (b) TDC with the proposed loop-latency reduction (c) Timing chart of the loop-latency reduction.

To demonstrate the effect of the loop latency reduction, different latencies are added to the forward PLL loop as shown in Fig. 4.14.
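The sensitivity of a wide-BW loop to forward latency can be sketched with a simplified phase-margin calculation. The model below reuses the normalized type-II open-loop form of Eq. (4.2) and multiplies it by an ideal delay of D reference cycles. The gains (`alpha`, `rho`) are hypothetical values chosen to give a unity-gain frequency of roughly 5 MHz at a 52-MHz reference. This is not the system-level simulation behind Fig. 4.14.

```python
import numpy as np


def phase_margin_deg(alpha, rho, latency_cycles, n_points=20000):
    """Phase margin of H(z) = (alpha*(1-z^-1) + rho) / (1-z^-1)^2 * z^-D.

    alpha, rho     : normalized proportional / integral gains (assumed values)
    latency_cycles : forward-loop latency D in reference cycles (may be fractional)
    """
    theta = np.linspace(1e-4, np.pi, n_points)      # omega * T_REF
    d = 1.0 - np.exp(-1j * theta)
    h_ol = (alpha * d + rho) / d**2 * np.exp(-1j * theta * latency_cycles)
    phase = np.unwrap(np.angle(h_ol))
    crossing = np.argmax(np.abs(h_ol) < 1.0)        # first point below unity gain
    return 180.0 + np.degrees(phase[crossing])


ALPHA = 2 * np.pi * 5e6 / 52e6    # roughly 5 MHz loop bandwidth (assumption)
RHO = ALPHA**2 / 16               # hypothetical integral gain

pm_half_cycle = phase_margin_deg(ALPHA, RHO, latency_cycles=0.5)
pm_three_cycles = phase_margin_deg(ALPHA, RHO, latency_cycles=3.0)
```

In this simplified model each additional reference cycle of latency costs roughly 35° of phase at the crossover frequency, so a three-cycle latency leaves little or no margin at this bandwidth, whereas a latency below half a cycle keeps the loop comfortably stable.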
With a $3T_{\rm REF}$ latency, large jitter peaking will appear adjacent to the corner frequency of the DPLL BW. With the proposed loop-latency reduction, the phase noise peaking is completely eliminated even at 5 MHz and the integrated phase noise is improved by more than 12 dB compared with the $3T_{\rm REF}$ case. An in-band phase noise of -110 dBc/Hz and a worst-case fractional spur of less than -50 dBc are achieved at 5 MHz BW with a power of 1.05 mW.

Figure 4.14: Phase noise simulations of 5MHz-BW DPLL with different loop latencies.

Figure 4.15: Proposed BLE RX baseband with DPLL-based ADC and phase/frequency synchronization loop.

Figure 4.16: (a) Simulated results w/o frequency and phase synchronization loop. (b) Simulated results w/ frequency and phase synchronization loop.

### 4.1.3 Hybrid-loop RX with Phase and Frequency Recovery Loop

As mentioned earlier, the single-path downconversion method [48] halves the energy consumption and the area of the ABB and ADC in the RX. However, the unknown carrier phase and frequency will degrade the SNR of the down-converted signal. If there is a constant-phase mismatch between the LO and $V_{\rm RF}$ , the amplitude of the down-converted ABB signal $V_{\rm ABB}$ will be degraded. $V_{\rm ABB}$ is digitized to $D_{\rm DBB}$ and is further processed by a DPSK decoder in the digital baseband (DBB) to acquire the 0/1 data. With the noise associated with the decoder in the DBB, the threshold will follow a Gaussian distribution instead of being a constant value. The reduced amplitude of $V_{\rm ABB}$ and the noise will significantly degrade the bit error rate (BER) of the RX.

Fig. 4.15 shows the proposed RX baseband, and a phase and frequency synchronization loop is implemented to improve the SNR of the down-converted signal. Fig. 4.16 shows the simulated results with and without the synchronization loop. The worst-case phase shift of $\pi/2$ is assumed in the I-channel signal as discussed in [48]. As shown in Fig.
4.16(a), without synchronization, the amplitude of the down-converted signal $V_{ABB}$ is greatly degraded. As a result, $D_{DBB}$ will be falsely decoded. With the synchronization, a timing error detector (TED) is placed after the FIR filter to detect the amplitude degradation. When the amplitude of $V_{ABB}$ is recovered to its maximum value, we have:

$$(x[n \cdot T_S] - x[(n-1) \cdot T_S]) \cdot (x[(n-0.5) \cdot T_S]) = 0 \tag{4.5}$$

where $T_S$ is the period of the 13-MHz sampling clock, and $x[(n-0.5) \cdot T_S]$ is the half-symbol point between the current symbol $x[n \cdot T_S]$ and the previous symbol $x[(n-1) \cdot T_S]$ , as shown in Fig. 4.16(b). The detected phase error is filtered, converted into a control code, and added to the DPLL frequency control word (FCW) to instantaneously change the DPLL phase by varying the output frequency. The $V_{ABB}$ amplitude is significantly recovered as shown in Fig. 4.16(b). A settling time of six data symbols is achieved in the simulation.

However, due to the long delay from $V_{RF}$ to the TED input in Fig. 4.15, which is dominated by the 4th-order LPF, the settling time of the phase and frequency recovery loop will be degraded if a large carrier frequency offset is present. When simulated with a carrier frequency offset of $\pm 100$ kHz, nearly 30 $\mu$s is required for proper settling. This settling time exceeds the 8-symbol preamble time required by the BLE specification. The settling time requirement can be satisfied by dynamically changing the BW of the LPF. When the received signal is detected, the LPF is set to a large BW to minimize the delay from $V_{RF}$ to the TED input for fast settling of the synchronization loop, at the cost of degraded ACR performance. After the synchronization loop is settled, the BW of the LPF is minimized. In the proposed architecture, there is a trade-off between the required preambles and the ACR performance.
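The error term on the left-hand side of Eq. (4.5) has the same form as a Gardner-style detector, and its behavior can be illustrated with a toy waveform. The alternating-symbol cosine used below and the function names are assumptions made for illustration only. The actual detector operates on the filtered DPSK baseband samples.

```python
import math


def ted_error(x_prev, x_mid, x_curr):
    """Left-hand side of Eq. (4.5): (x[n] - x[n-1]) * x[n-0.5]."""
    return (x_curr - x_prev) * x_mid


def toy_waveform(t_symbols, offset_symbols):
    """Hypothetical alternating +1/-1 symbol stream, band-limited to a cosine.

    t_symbols is time in symbol periods, and offset_symbols is the
    misalignment between the sampling instants and the waveform.
    """
    return math.cos(math.pi * (t_symbols + offset_symbols))


def error_at(offset_symbols, n=1):
    return ted_error(
        toy_waveform(n - 1.0, offset_symbols),
        toy_waveform(n - 0.5, offset_symbols),
        toy_waveform(n, offset_symbols),
    )


err_aligned = error_at(0.0)    # samples sit on the symbol peaks
err_late = error_at(+0.1)
err_early = error_at(-0.1)
```

For this toy waveform the error is zero when the samples are aligned with the symbol peaks and changes sign with the direction of the misalignment, which is the property the synchronization loop relies on when it drives the error toward zero through the FCW.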
In the present study, the LPF is optimized for better ACR performance. Another issue is that a large interference will cause additional noise in Eq. (4.5), which will degrade the BER performance. However, a higher-order LPF can be adopted to suppress this extra noise.

### 4.2 Building Blocks of The BLE Transceiver

Fig. 4.17 shows the proposed BLE TRX, which uses multiple loops for supporting the GFSK data modulation and demodulation. The proposed RX adopts the concept of single-channel demodulation by transferring the FSK constellation to a DPSK constellation [48]. The DPLL-based ADC is used as the LO source as well as the ADC to perform the digitization. The DAC feedback path is used to improve the dynamic range of the DPLL-based ADC. The synchronization path is utilized to synchronize the phase and frequency between the LO and the RX input (RX IN). The reference doubler and loop latency reduction techniques are used to support 5-MHz-BW operation of the DPLL. The coarse TDC and gated loop filter are used to increase the phase locking speed while saving energy after the phase is locked by the PLL path [21].

Figure 4.17: Proposed DPLL-centric BLE Transceiver.

Figure 4.18: RX front-end implementation with entire 1V-supply.

The digitized data $D_{\rm OUT}$ with a 52-MHz sampling rate will be decimated 4 times by a cascaded integrator-comb (CIC) decimation filter. The power can be reduced because of the multiplierless structure of the CIC filter. However, the magnitude response of the CIC filter droops in the passband region. Hence, a CIC-compensation filter is required to compensate for this attenuation in order to get a flat in-band response. The CIC-compensation filter has 27 taps, and the overall channel-select filter achieves a 1-MHz bandwidth with 10-dB stop-band attenuation. The filtered data will be further processed by a symbol timing recovery block [57], which recovers the symbol timing and sends the correct timing to the DPSK decoder.
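To make the multiplierless structure of the decimator concrete, here is a minimal behavioral sketch of a CIC decimate-by-4 filter operating on integer samples. The filter order of 3 and the function name are assumptions, since the order is not stated in the text.

```python
def cic_decimate(samples, rate=4, order=3):
    """Behavioral CIC decimator using only additions and subtractions.

    samples : iterable of integer input samples (e.g. D_OUT at 52 MHz)
    rate    : decimation ratio (4 gives a 13 MHz output rate)
    order   : number of integrator/comb stages (assumed, not from the text)
    """
    data = list(samples)

    # Integrator stages run at the input rate.
    for _ in range(order):
        acc = 0
        integrated = []
        for s in data:
            acc += s
            integrated.append(acc)
        data = integrated

    # Keep every `rate`-th sample.
    data = data[rate - 1::rate]

    # Comb stages (differential delay of one) run at the output rate.
    for _ in range(order):
        prev = 0
        combed = []
        for s in data:
            combed.append(s - prev)
            prev = s
        data = combed

    return data


# A constant input settles to the DC gain rate**order = 4**3 = 64.
step_response = cic_decimate([1] * 40)
```

The DC gain of `rate**order` has to be accounted for in the word length of the following stage. Python integers do not overflow, so this sketch does not model the register growth that a hardware implementation must budget for.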
The polar modulation path serves the frequency modulation in the TX. All those functions are completed by the low-power fractional-N DPLL acting as the central component in the TRX. Reusing the low-power DPLL cuts a significant amount of power and aggressively minimizes the TRX power consumption without sacrificing performance.

### **4.2.1** Receiver Front-End Design

Because the radio-frequency front-end and the analog front-end are the most power-hungry parts of a ULP transceiver, their power consumption should be minimized. As reported in [58, 59], various low-power front-end structures have been proposed for ULP transceivers. The low noise amplifier (LNA) is the most power-hungry component due to its noise, linearity and gain requirements. In [48], the differential LNA alone consumes 0.97 mW from a low supply voltage of 0.6 V. To improve the power efficiency of the LNA, a supply of 0.6 V is utilized instead of the 1.1 V used for other analog circuits. However, the LNA requires an additional DC-DC converter and a low-dropout (LDO) regulator to acquire a 0.6 V supply. In order to avoid using a different supply voltage while maintaining the power efficiency, a new LNA topology is highly demanded.

Figure 4.19: (a) Low power consumption source degenerated LNA with stacked gm-cell (b) LC-Oscillator.

Figure 4.20: (a) Small area and highly balanced stacked balun (b) EM simulation of stacked balun.

Fig. 4.18 shows the entire RX-FE implementation. To realize a significant power reduction while achieving the required performance without lowering the supply voltage, a new current-reused single-to-differential LNA is proposed, as shown in Fig. 4.19(a). With a fully on-chip matching network, a single-ended LNA with a stacked differential transconductance amplifier (gm-cell) is implemented. A balun is inserted between the LNA and the gm-cell in order to serve both as the inductive load of the LNA and as a single-to-differential converter.
The proposed topology can share the same supply voltage with other analog building blocks without the need for an additional low-voltage supply to maintain the current efficiency. The input signal is amplified in the voltage domain by the LNA and transformed into a differential signal by the passive balun. To save chip area, a stacked single-to-differential balun architecture is adopted [60]. As shown in Fig. 4.20(a), this balun is composed of three turns of primary windings in the top metal (M9) and four turns of secondary windings in M8. This stacked balun structure has a high coupling factor while occupying the same area as a single inductor. With the center tap of the secondary windings connected to ground, the single-ended input signal can be transformed into differential signals, which are directly connected to the stacked differential gm-cell inputs. In electromagnetic simulation, the phase imbalance between the differential ports is only 0.9° and the amplitude imbalance is less than 0.1 dB at the operating frequency of interest as shown in Fig. 4.20(b).

To ensure all transistors operate in the linear region, the biases and the transistor sizes of the gm-cell and the LNA are optimized in simulations. The total current flowing in the gm-cell and its bias condition determine the drain voltage (VDD<sub>LNA</sub>) for the LNA transistor. Consequently, the DC current is reused between the gm-cell and the source-degenerated LNA. To handle mismatch between the two branches in the gm-cell, a 10-pF capacitor is implemented at the LNA's VDD to realize an AC ground. Using a 1-V supply, the power consumption of this stacked structure is only 0.7 mW. With fully on-chip impedance matching, the minimum noise figure of this LNA with a stacked gm-cell is 4 dB. With an inverter-based gm-cell, the RF signal can be transformed into the current domain, which relaxes the linearity requirements for the mixers and the analog front-end.
A passive double-balanced mixer is implemented to avoid the flicker noise and the power overhead of an active voltage-domain mixer. A 4th-order LPF with a BW of 750 kHz is implemented for higher blocker rejection. The gain of the receiver chain can be controlled to allow different input levels as shown in Fig. 4.18. The switched-capacitor bank ( $C_{\text{BANK}}$ ) is used for accurately controlling the passband of the LNA, as shown in Fig. 4.19(a). A gain control technique [61] is used to digitally control the LNA gain. The measured gain of the LNA and gm-cell can be adjusted from 12 dB to 46 dB, and the PGA has a measured gain control range of 28 dB. The measured 1-dB compression point of the RX is -14.2/-22.0/-45.5 dBm and the measured in-band IIP3 of the RX is -3.5/-11.5/-32.5 dBm in the low/medium/high gain setting of the LNA. The out-of-band IIP3 (OBIIP3) of +2 dBm is measured by feeding two-tone signals at 2.5000 GHz and 2.5661 GHz to the RX input with an LO frequency of 2.434 GHz. The OBIIP2 of +58 dBm is measured by feeding two-tone signals at 2.5000 GHz and 2.5001 GHz to the RX input with an LO frequency of 2.434 GHz.

Figure 4.21: Block diagrams of single-point polar TX.

The detailed oscillator implementation is shown in Fig. 4.19(b). A CMOS-type LC architecture is utilized for low-power operation. Both the digital capacitor bank and the varactor bank are implemented for the DPLL-based ADC. The varactor bank consists of four identical varactor cells to provide a linear range of over 100 mV. The wide-BW DPLL operation relaxes the oscillator phase noise requirement. The simulated phase noise is -110 dBc/Hz at a 1-MHz offset with a power consumption of 0.21 mW. The tuning range of the oscillator is designed from 2.2 GHz to 2.6 GHz to cover the 80 MHz BLE band.
### 4.2.2 Single-Point Polar-TX

The DFM-TX [20, 62, 63] has drawn considerable research attention in BLE applications because of its simplicity compared with the Cartesian TX when performing FSK modulation. A wide-BW DPLL is capable of realizing a wider TX modulation BW. However, the DPLL requires additional power to increase its BW. In the present study, thanks to the low-power wide-BW DPLL with low spurs and good in-band phase noise proposed in Section 4.1.2, a single-point DFM TX with low power consumption can be realized. The pulling effect from the PA to the oscillator at PA start-up becomes severe if the oscillator and the PA work at the same frequency. This effect becomes dominant at a very large PA output power and will degrade the settling time of the DPLL. The wide-BW operation of the DPLL can help reduce the frequency settling time of the DPLL at the PA start-up.

Fig. 4.21 shows the single-point DFM TX design. A class-D PA [64] is implemented to improve the power efficiency, but it results in a large third-order harmonic at the PA output. Hence, an off-chip filter is used to suppress this harmonic. For test purposes, the 1-Mbps data is generated by a data pattern generator, which is not synchronized with the on-chip reference clock of 26 MHz. In order to avoid metastability, two D flip-flops (DFFs) working at 13 MHz are used to retime the 1-Mbps TX data. The encoder converts each 1-bit symbol into a 10-bit signed fixed-point number. The data is filtered by a digital GFSK filter with a BT of 0.5 and a modulation index of 0.5. The GFSK filter output $y$ is normalized using a constant gain of $\eta$ , which yields a modulation code of $c_{\text{mod}} = y \cdot \eta$ . The output is added to the FCW, which has a 6-bit integer part and an 18-bit fractional part. The fractional-N DPLL has a gain of $K_{\text{DPLL}}$ = 52 MHz/2<sup>18</sup> LSB ≈ 200 Hz/LSB at the FCW input.
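The FCW scaling quoted above can be checked with a few lines of arithmetic. The sketch assumes the usual relation in which the output frequency equals the FCW multiplied by the FCW gain, and it derives the peak modulation code from the stated 1-Mbps rate and modulation index of 0.5 (a peak deviation of 250 kHz). The function names and the example channel frequency are chosen for illustration.

```python
F_REF_HZ = 52e6      # doubled reference frequency
FRAC_BITS = 18       # fractional bits of the FCW
INT_BITS = 6         # integer bits of the FCW

K_DPLL = F_REF_HZ / 2**FRAC_BITS   # Hz per FCW LSB


def fcw_for_frequency(f_out_hz):
    """FCW (as an integer count of LSBs), assuming f_out = FCW * K_DPLL."""
    return round(f_out_hz / K_DPLL)


def peak_modulation_code(symbol_rate_hz=1e6, mod_index=0.5):
    """Peak frequency deviation h * R / 2, expressed in FCW LSBs."""
    peak_deviation_hz = mod_index * symbol_rate_hz / 2
    return round(peak_deviation_hz / K_DPLL)


fcw = fcw_for_frequency(2441.75e6)   # example carrier frequency
fcw_integer_part = fcw >> FRAC_BITS
c_mod_peak = peak_modulation_code()
```

With these inputs `K_DPLL` is about 198.4 Hz/LSB, the integer part of the FCW is 46 (which fits in 6 bits), and the peak modulation code is about 1260 LSB, so the modulation occupies only the lower fractional bits of the FCW.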
At the output of the PLL,

$$f_{\text{out}} = (FCW + c_{\text{mod}}) \cdot K_{\text{DPLL}} = f_{\text{LO}} + c_{\text{mod}} \cdot K_{\text{DPLL}}$$

### 4.3 Measurement Results

The prototype of the proposed BLE TRX is implemented in a 65-nm CMOS technology. The chip micrograph is shown in Fig. 4.32. The measured phase noise of the fractional-N DPLL is shown in Fig. 4.22. The DPLL achieves a phase noise of -110 dBc/Hz at a 1-MHz offset with a BW of around 5 MHz at a carrier frequency of 2441.75 MHz, and no significant jitter peaking is observed thanks to the loop-latency reduction technique. The worst-case in-band phase noise is -108 dBc/Hz at a 1-MHz offset at 80°C. The worst measured in-band fractional spur of the DPLL over all BLE channels is -51.7 dBc.

Figure 4.22: Measured DPLL phase-noise at 2441.75MHz with TX/RX off.

To validate the input power tolerance of the RX, BLE signals of different levels are applied at the LNA input port, as shown in Fig. 4.22. An input power of up to -10 dBm at 2442 MHz is applied at the LNA input to demonstrate the specified maximum input power. With the gain adaptation of the LNA and the PGA as well as the wide-BW DPLL operation, the PLL remains locked even with a -10 dBm input. The integrated phase noise degrades by around 1 dB at the desired input of -67 dBm.

The single-path downconversion RX is stabilized by the 4th-order LPF and the 5-MHz wide-BW DPLL loop, as explained in Section II-A, when large in-band blockers are present. The phase noise of the DPLL can serve as an indicator of the stability of the RX because it reflects the stability of the LO frequency. In Fig. 4.23, the desired signal of -67 dBm at 2442 MHz and different levels of in-band blockers at ±1 MHz/±2 MHz/±3 MHz are fed to the RX input at a fixed RX gain. To satisfy the ACR specification, the required levels of in-band blockers are -82/-50/-40 dBm at the adjacent frequencies of ±1 MHz/±2 MHz/±3 MHz. For the blocker at 1 MHz with -40 dBm, which is 42 dB higher than the BLE
specification, the LO frequency is still stable, as shown in Fig. 4.23(a). The blocker at -1 MHz, shown in Fig. 4.23(d), has the biggest impact on the RX system because it is suppressed less by the LPF owing to the 250-kHz shift of the RX LO frequency in the single-path downconversion demodulation method [48]. However, this -50-dBm blocker level is still much higher than the requirement of -82 dBm in the BLE specification. A -20-dBm blocker at ±2 MHz degrades the stability of the RX and generates noise at lower offset frequencies, as shown in Fig. 4.23(b) and Fig. 4.23(e). However, sufficient margin is left for the ACR specification. With the help of the 4th-order LPF, the blockers at ±3 MHz do not degrade the stability of the loop even at a power of -20 dBm, as shown in Fig. 4.23(c) and Fig. 4.23(f). A higher-order LPF can be adopted to achieve better RX stability and higher blocker tolerance at the cost of more power.

Figure 4.23: Measured stability of the RX when the large in-band blockers and the desired signal are fed to the RX.

To evaluate the dynamic range of the DPLL-based ADC, a pure sine wave at 250 kHz is applied. The ADC output is monitored using a 10-bit DAC to save pins, and an FFT is performed to calculate the SNDR. To verify the improvement of the proposed method, the DPLL-based ADC can be configured as either the conventional open-loop type with a digital capacitor control path or the proposed closed-loop type. In the conventional open-loop method, a maximum SNDR of 25 dB is achieved at an input of approximately -18 dBFS. As the input increases further, the varactor linearity worsens and the SFDR degrades dramatically. After the loop is closed through the DAC feedback path, the SNDR continues to increase, even beyond -18 dBFS, and reaches around 43 dB at an input of -6 dBFS. The SNDR starts to degrade beyond -6 dBFS due to the saturation of the TDC code and the linearity degradation of the varactor. The linearity improvement by the DAC feedback path enhances the dynamic range of the ADC by 18 dB, *i.e.*, an improvement of 3 effective bits. The dynamic range improvement directly improves the sensitivity and the in-band blocker tolerance.

Figure 4.24: Measurement result of the ADC SNDR.

The phase and frequency synchronization loop is evaluated by turning it on and off, as shown in Fig. 4.25. When there is no synchronization, even a very small phase and frequency error degrades the amplitude of the down-converted data at the PGA output, as shown in Fig. 4.25(a). The analog data cannot be distinguished at the digital baseband, and the decoded data is wrong, as indicated by the question marks. With the synchronization shown in Fig. 4.25(b), the amplitude of the data is recovered. When a carrier frequency offset is present, as shown in Fig. 4.27, the BER is measured at the desired input power of -67 dBm. The synchronization loop can cover a range of $\pm 100$ kHz while the BER requirement of 0.1% is still satisfied. If a large blocker of -40 dBm accompanies the desired -67 dBm signal, the synchronization loop is affected and the coverage decreases to $\pm 50$ kHz.

Figure 4.25: (a) Measurement result of the PGA output w/o phase and frequency synchronization loop (b) Measurement result of the PGA output w/ phase and frequency synchronization loop.

Figure 4.26: Measured demodulator with CDR function.

Figure 4.27: Measured BER with phase and frequency synchronization loop when the carrier frequency offset is presented in the TX signal.

The digital baseband is evaluated in Fig. 4.26. The PGA output, decoded data, and recovered data clock are measured using an oscilloscope. Due to the constellation transform from GFSK to DPSK, the TX data can be read out as shown in Fig. 4.26.
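The statement above that an 18-dB dynamic-range gain corresponds to about 3 effective bits for the DPLL-based ADC can be checked with the usual sine-wave rule of thumb, $\text{ENOB} = (\text{SNDR} - 1.76)/6.02$. The sketch below is only an illustration: the function name is hypothetical, and applying the rule directly to the peak SNDR values quoted in the text (25 dB open loop, 43 dB closed loop) is an assumption, since those peaks occur at different input levels.

```python
# Minimal sketch: translate the quoted peak SNDR values into effective bits.
# Assumption: the ideal-quantizer relation SNDR = 6.02 * ENOB + 1.76 dB is
# used as a rough conversion; it is not a result reported by the author.


def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits for a given SNDR in dB (sine-wave input)."""
    return (sndr_db - 1.76) / 6.02


if __name__ == "__main__":
    open_loop_db = 25.0     # peak SNDR, conventional open-loop mode
    closed_loop_db = 43.0   # peak SNDR, proposed closed-loop mode
    gain_bits = enob_from_sndr(closed_loop_db) - enob_from_sndr(open_loop_db)
    print(f"open-loop ENOB   = {enob_from_sndr(open_loop_db):.2f} bits")
    print(f"closed-loop ENOB = {enob_from_sndr(closed_loop_db):.2f} bits")
    print(f"improvement      = {gain_bits:.2f} bits")
```

The constant 1.76 dB cancels in the difference, so the improvement depends only on the 18-dB SNDR gap divided by 6.02 dB per bit.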
The symbol recovery circuit after the FIR filter extracts the correct sampling clock and provides the recovered clock to the DPSK decoder.

The sensitivity is measured by evaluating the BER performance. The data points of the recovered data and the recovered clock shown in Fig. 4.26 are exported from the oscilloscope. A total of 10,000 symbols are recorded for the data post-processing, which is performed in MATLAB. The BER is computed by comparison with PRBS9 data. A sensitivity of -94 dBm is achieved with the BER still below 0.1%. The blocker performance is measured by setting the desired signal to -67 dBm and applying different levels of blocker power. The maximum tolerable blocker level is the blocker power at which the BER exceeds 0.1%. The ACR, one of the most important specifications for a BLE RX, is shown in Fig. 4.28(a). To demonstrate the dynamic-range improvement of the DPLL-based ADC, the ACR is measured both with and without the DAC feedback path. Without the DAC feedback path, the ACR drops below the value specified in the BLE standard at -3 MHz. With the DAC feedback path, the ACR is improved by almost 9 dB at -3 MHz and all points satisfy the BLE standard with a sufficient margin. The out-of-band blocker performance, shown in Fig. 4.28(b), is measured using the same method as the ACR measurement. This performance is mainly limited by the out-of-band rejection of the matching network and the RX linearity.

Figure 4.28: (a) Measurement result of the RX ACR with and without DAC feedback loop (b) Measurement result of the out-of-band blocker tolerance.

The single-point DFM TX is measured using a vector signal analyzer. The spectrum at the PA output for the BLE channel at 2434 MHz is shown in Fig. 4.29(a). The eye pattern is shown in Fig. 4.29(b). The TX achieves a 1.89% FSK error. The measured worst-case GFSK modulation deviation for a 11110000 data pattern, *i.e.*, $\Delta f_1$, is $\pm 249$ kHz.
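The BER post-processing described earlier in this section (recovered symbols compared against PRBS9 data) can be sketched as follows. This is an illustrative reconstruction, not the MATLAB script used for the measurement: the function names are hypothetical, the generator uses the common $x^9 + x^5 + 1$ PRBS9 polynomial with an arbitrary non-zero seed, and the received stream is assumed to be already aligned with the reference, which a real capture would have to establish first.

```python
# Minimal sketch of a PRBS9-referenced BER calculation.
# Assumptions: hypothetical function names, x^9 + x^5 + 1 generator,
# arbitrary seed, and a received stream that is already bit-aligned.


def prbs9(n_bits: int, seed: int = 0x1FF) -> list:
    """Generate n_bits of a PRBS9 sequence from a 9-bit shift register."""
    state = seed & 0x1FF
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of the 9th and 5th register stages.
        new_bit = ((state >> 8) ^ (state >> 4)) & 1
        state = ((state << 1) | new_bit) & 0x1FF
        bits.append(new_bit)
    return bits


def bit_error_rate(received: list, reference: list) -> float:
    """Fraction of positions where received differs from reference."""
    if len(received) != len(reference):
        raise ValueError("sequences must have the same length")
    errors = sum(1 for r, e in zip(received, reference) if r != e)
    return errors / len(reference)


if __name__ == "__main__":
    reference = prbs9(10_000)
    received = list(reference)
    for idx in (10, 1234, 5000, 7777, 9999):   # inject five bit errors
        received[idx] ^= 1
    ber = bit_error_rate(received, reference)
    print(f"BER = {ber:.4%}, below 0.1% limit: {ber < 1e-3}")
```

With 10,000 recorded symbols, the 0.1% criterion corresponds to fewer than 10 bit errors, so a record of this length resolves the BER only coarsely near the limit.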
For a 10101010 data pattern, the measured worst-case GFSK modulation deviation, *i.e.*, $\Delta f_2$, is $\pm 215$ kHz. The measured HD2 and HD3 are -43.0 dBm and -41.4 dBm for a PA output of 0 dBm. The settling time of the DPLL is measured at a PA output of 0 dBm when the enable signals of the DPLL and the PA are turned on simultaneously. Due to the large DPLL BW, the mutual pulling effect of the oscillator and the PA is reduced compared with [42, 43, 46], and a settling time of less than 5 $\mu$s is achieved, as shown in Fig. 4.30.

Figure 4.29: (a) Measurement result of the TX spectrum mask (b) Measured eye diagram of the single-point polar transmitter.

Figure 4.30: Settling time of DPLL at 0-dBm PA output when DPLL and PA start up simultaneously.

The measured power consumption breakdowns of the RX and TX, including the DBB, are shown in Fig. 4.31. A power consumption of 2.6 mW is achieved for the RX at maximum gain, while 5.2 mW is consumed by the TX when delivering 0 dBm output power. A detailed comparison with the state-of-the-art BLE 4.0 TX/RX is shown in Table 4.1. The RX consumes less power while achieving better blocker performance.

Figure 4.31: Measured power consumptions of each building blocks.

### 4.4 BLE Transceiver Towards 5.0

As mentioned in Chapter 1.2, the current work is based on the 4.0 standard. The major differences from the 4.2 standard are shown in Table 4.2. Multiple data rates are added to support a long-range mode with 125/500 kbps data rates and a high-data-rate mode of 2 Mbps. To maintain a link range of over 200 m, the transmitter power is also increased from 10 dBm (10 mW) to 20 dBm (100 mW).
| | This Work | [48] | [42] | [46] | [45] | [43] | [50] |
|---|---|---|---|---|---|---|---|
| Technology | 65nm | 65nm | 28nm | 40nm | 40nm | 55nm | 40nm |
| Integration Level | RF + DPLL + DBB | RF + ADPLL + DBB | RF + ADPLL | RF + PLL + DBB + MCU SoC | RF + PLL + PMU | RF + PLL + DBB + PMU | RF + PLL + DBB |
| RX Sensitivity | -94dBm | -90dBm | -95dBm | -94dBm | -94.5dBm | -94.5dBm | -95dBm |
| RX ACR (@1MHz, @2MHz, @3MHz) | 1dB, 31dB, 36dB | N.A., 24dB, 29dB | N.A. | 4dB, 25dB, 35dB | 2dB, 32dB, N.A. | N.A. | N.A., 18dB, 30dB |
| Blocker Power (30~2000MHz, 2003~2399MHz, 2484~2997MHz, 3000~12750MHz) | -1dBm, -13dBm, -12dBm, 1dBm | -6dBm, -22dBm, -16dBm, 0dBm | -20dBm, -25dBm, -24dBm, -7dBm | -42dBm, -25dBm, -24dBm, N.A. | -18dBm, -28dBm, -28dBm, -13dBm | 4.5dBm, -9dBm, -9dBm, >9dBm | -1dBm, -15dBm, -17dBm, -8dBm |
| TX Architecture | Single-point polar | N.A. | 2-point polar | 2-point polar | Up conversion | Up conversion | 2-point polar |
| TX Modulation Error | 1.89% | N.A. | 2.67% | 4.8% | N.A. | N.A. | 2% |
| TX Output Power | -3dBm | N.A. | 0dBm | -2dBm | 0dBm | 0dBm | 1.8dBm |
| Supply Voltage | 1V | 0.6/1.1V | 0.5/1V | 1V | 1.1V | 0.9~3.3V | 0.8V |
| RX Power Consumption (DBB / Analog) | 0.3mW / 2.3mW | 0.5mW / 5.5mW | N.A. / 3.75mW | 0.4mW / 3.3mW | N.A. / 6.3mW | 11.2mW (combined) | 0.74mW / 2.3mW |
| TX Power Consumption (DBB / Analog) | 0.2mW / 2.9mW | N.A. / N.A. | N.A. / 4.7mW | 0.2mW / 4.2mW | N.A. / 7.7mW | 10.1mW (combined) | N.A. / 6.1mW |
| TRX Active Area | 1.64mm<sup>2</sup> | N.A. | 1.9mm<sup>2</sup> | 1.3mm<sup>2</sup> | 1.1mm<sup>2</sup> | 2.9mm<sup>2</sup> | 0.8mm<sup>2</sup> |

Table 4.1: Comparison Table of The State-of-The-Art BLE 4.0 TX/RX

Figure 4.32: Chip photo of BLE TRX.

| Parameter | Value |
|---|---|
| Output Power | -20dBm to 20dBm |
| Symbol Rate | 125kbps, 500kbps, 1Mbps, 2Mbps |
| Range | >200m in long range mode |

Table 4.2: Major Differences from BLE 4.2

A ten times larger output power makes the PA integration much more difficult than the TX design in Chapter 4.2.2. One of the most significant challenges is the harmonics that arise from the nonlinear operation of the power amplifier. A 20 dBm output requires at least -61 dBc suppression of the 2nd and 3rd harmonics at the PA side. Such a substantial rejection requires external components, such as capacitors and inductors outside the chip, which consume additional PCB area. Furthermore, the time that developers need to release their products is lengthened by the time spent on the external components. Hence, a single-chip solution is highly desirable.

Figure 4.33: The present TRX RF I/O solution.

Figure 4.34: Harmonics from the present BLE TX.

Figure 4.35: RF-FE with integrated matching network and antenna switch.

Fig. 4.33 shows the present implementation of the RF input/output (I/O) solution. Even though the LNA and PA are internally matched to a 50-ohm impedance for the external antenna, the antenna switch is not integrated. This means that either two antennas or an external antenna switch is required. The former situation is not preferred because of the huge size of the antenna at a lower frequency.
However, the external switch is also not preferred because it conflicts with the single-chip solution mentioned above. Hence, an internal TRX switch is required at the antenna port of the chip, and the matching networks should be realized at the same time, as shown in Fig. 4.35. Fig. 4.34 shows the harmonics at the present TX output. The second and third harmonics reach a worst-case power of -48 dBm when the TX delivers an output of around -6 dBm. If the PA output is increased further, the harmonics grow larger and can degrade nearby receivers.

Another challenge is to suppress the 2nd and 3rd harmonics further when considerable output power is delivered. Those harmonics significantly interfere with other receivers operating around the same frequency, such as Wi-Fi. The more suppression is achieved, the less interference results. Note that the matching network for the PA can act as a bandpass filter. However, its limited filter order produces only minimal suppression, which is not enough to satisfy the FCC regulations. One natural approach is to add one or two more bandpass filters at the PA output using inductors and capacitors. However, the additional area of large inductors significantly increases the cost of the chip. Fewer inductors should be included in the antenna switch, which makes the RF-FE design challenging.

### 4.5 Conclusion

A BLE TRX for IoT applications is demonstrated in a 65nm CMOS technology. A wide-BW fractional-N DPLL plays a central role in the BLE TRX, which maximally reduces the power consumption. A DPLL-based ADC with the dynamic-range enhancement technique is proposed, which greatly improves the sensitivity level and the interference tolerance. The proposed DPLL-based ADC can be utilized in narrow-band wireless applications. Loop-latency reduction and the reference doubler help to mitigate the jitter peaking at the 5-MHz BW of the DPLL using only a 26-MHz reference clock and improve the stability of the RX.
The phase and frequency synchronization loop assists the proper demodulation in the single-path downconversion receiver. For the single-point DFM TX, the wide BW of the DPLL improves the settling time of the DPLL at TX start-up.

# **Chapter 5**

# **Conclusion and Future Directions**

This thesis presented a newly proposed concept and design methodology for a low-power, high-sensitivity, and high-blocker-immunity BLE transceiver using an advanced sub-mW fractional-N DPLL in CMOS technology. Particular emphasis has been placed on investigating how to incorporate the fractional-N DPLL into the operation of the whole transceiver. The fractional-N DPLL simultaneously plays multiple roles: an ADC, a phase and frequency synchronizer, a local oscillator, and a frequency modulator. The digital-intensive DPLL takes advantage of deep sub-micron CMOS technology, which significantly cuts the chip area and brings the merits of intensive digital I/Os. These digital I/Os can be used for PVT calibrations as well as data converter outputs. In the DPLL design, a novel isolated constant-slope DTC is proposed to reduce the power consumption while improving the DTC linearity. The TDC range is assisted by the DTC, which significantly reduces the power of the high-resolution TDC. The power consumption of the transceiver is reduced significantly compared with previous state-of-the-art works. A summary of the techniques and results presented in this dissertation is given in the following sections.

### 5.1 Conclusion

This thesis presents techniques for realizing a BLE transceiver that can be used in IoT applications. The extremely low-power operation and the good blocker performance are achieved by the DPLL-centric receiver architecture and the single-point transmitter architecture. As shown in Fig.
5.1, the proposed receiver architecture lowers the power consumption in the following three ways: 1) A single-path demodulation receiving method with phase and frequency synchronization is utilized to cut almost half of the power consumption of the analog baseband in the conventional I/Q-based receiving method; 2) A DPLL-based ADC with DAC feedback is proposed to further relax the required high-dynamic-range ADC design, and the DPLL is also reused as the local oscillator; 3) A low-power wide-bandwidth DPLL is developed to improve the power efficiency of the entire system. The comparison with the state-of-the-art BLE transceivers is shown in Table 5.1. The proposed transceiver achieves the lowest power consumption while delivering excellent blocker performance and a good sensitivity level.

Figure 5.1: Low-power DPLL-centric receiver architecture.

Table 5.1: Comparison Table of The State-of-The-Art BLE 4.0 TX/RX

| | This Work | Toshiba | TSMC 2017 | IMEC 2015 | Renesas 2015 | Dialog 2015 |
|---|---|---|---|---|---|---|
| Technology | 65nm | 65nm | 28nm | 40nm | 40nm | 55nm |
| RX Sensitivity | -94dBm | -90dBm | -95dBm | -94dBm | -94.5dBm | -94.5dBm |
| RX ACR (@1MHz, @2MHz, @3MHz) | 1dB, 31dB, 36dB | N.A., 24dB, 29dB | N.A. | 4dB, 25dB, 35dB | 2dB, 32dB, N.A. | N.A. |
| Blocker Power (30~2000MHz, 2003~2399MHz, 2484~2997MHz, 3000~12750MHz) | -1dBm, -13dBm, -12dBm, 1dBm | -6dBm, -22dBm, -16dBm, 0dBm | -20dBm, -25dBm, -24dBm, -7dBm | -42dBm, -25dBm, -24dBm, N.A. | -18dBm, -28dBm, -28dBm, -13dBm | 4.5dBm, -9dBm, -9dBm, >9dBm |
| TX Modulation Error | 1.89% | N.A. | 2.67% | 4.8% | N.A. | N.A. |
| TX Output Power | -3dBm | N.A. | 0dBm | -2dBm | 0dBm | 0dBm |
| RX Power Consumption (DBB / Analog) | 0.3mW / 2.3mW | 0.5mW / 5.5mW | N.A. / 3.75mW | 0.4mW / 3.3mW | N.A. / 6.3mW | 11.2mW (combined) |
| TX Power Consumption (DBB / Analog) | 0.2mW / 2.9mW | N.A. / N.A. | N.A. / 4.7mW | 0.2mW / 4.2mW | N.A. / 7.7mW | 10.1mW (combined) |
| TRX Active Area | 1.64mm<sup>2</sup> | N.A. | 1.9mm<sup>2</sup> | 1.3mm<sup>2</sup> | 1.1mm<sup>2</sup> | 2.9mm<sup>2</sup> |

As a key building block of the proposed DPLL-centric BLE transceiver, the fractional-N DPLL also has its power consumption reduced by a variety of techniques proposed in this thesis: 1) A 1st-order DSM-based fractional controller is used to reduce the required DTC range, which minimizes the jitter contribution from the DTC to the DPLL; 2) An isolated constant-slope DTC is proposed to improve the power efficiency over the conventional constant-slope DTC while achieving excellent linearity; 3) TDC gain calibration is proposed to minimize the in-band phase noise variation due to voltage and temperature variations. As a result, a sub-mW DPLL is realized with an unprecedented FOM of -246dB. The worst-case in-band fractional spur is well below -56dBc. The achieved FOM is compared against power consumption in Fig. 5.2. It is the only DPLL that breaks the -240dB FOM barrier with a power consumption under 1 mW.

Figure 5.2: FOM comparison with the state-of-the-art fractional DPLLs under 5mW.
### **5.2** Future Direction

#### 5.2.1 Fractional-N DPLL

The 1st-order DSM-based architecture and the proposed isolated constant-slope DTC presented in this thesis are useful for improving the power efficiency of the fractional-N DPLL. In addition, the TA gain calibration technique is proposed to reduce the variation of the TDC gain. However, the techniques are not without shortcomings. As discussed in Chapter 3.1, a smaller DTC range means a smaller random jitter contribution and a smaller peak INL. However, due to the sawtooth operation of the 1st-order DSM, the DTC gain calibration cannot correctly converge to the optimal value if the frequency control word is too small. This is because the gain error of the DTC causes a prolonged sawtooth wave at a small frequency control word, which requires the bandwidth of the LMS loop to be very narrow, for example less than 100 Hz. The gain mismatch introduces INL into the DTC and significantly worsens the fractional spurs. Limiting the BW of the LMS calibration loop solves the convergence issue, but the convergence time becomes very long (>1 s). This is undesirable because the fractional spurs are degraded during the DTC gain calibration. A new and improved calibration scheme in place of the LMS algorithm is needed to solve the above issues.

The second issue comes from the TA gain calibration. For a TDC, the physical resolution is essential. The resolution of the TDC potentially sets the in-band phase noise of the DPLL if the quantization noise of the TDC is the dominant noise source. The proposed TA-TDC is very sensitive to gain variation caused by the PVT variation of the TA. The TDC resolution is also determined by the coarse quantizer resolution, which is equal to a buffer delay. A slow process corner leads to a higher TA gain and a larger buffer delay, while a fast corner leads to a smaller TA gain and a smaller buffer delay.
The effective resolution of the TA-TDC is the ratio between the buffer delay and the TA gain, and it is affected by both factors. Calibrating only the TA gain is therefore not sufficient under process variation because a large variation (around $\pm 35\%$) remains in the coarse-TDC resolution. Hence, a new TDC gain calibration method is required to accurately control the TDC gain if the TDC quantization noise dominates the in-band phase noise of the fractional-N DPLL.

### 5.2.2 Bluetooth Low-Energy Transceiver

The DPLL-centric BLE receiver was proposed to improve the interference performance while lowering the power consumption relative to the conventional I/Q receiver. The DAC feedback technique is proposed to enhance the SNDR of the DPLL-based ADC. The loop-latency reduction technique is used to widen the DPLL loop bandwidth. The phase and frequency synchronization loop is proposed to improve the SNR of the received signal in the mixed-signal domain through the DPLL. Those techniques help to achieve good sensitivity, good blocker performance, and good power efficiency. However, the proposed architecture and methods are not without issues, which need to be addressed by new techniques. These issues can serve as future directions for the BLE transceiver.

One of the issues is the potential DC offset in the DAC feedback path of the DPLL-based ADC, as shown in Fig. 5.3. This DC offset originates from the frequency locking process of the proposed DPLL. The proposed DPLL-based ADC quantizes the analog signal into a digital one. The FLL is used to lock the frequency of the DPLL. After the FLL locks the frequency of the DPLL to the desired value for the RX demodulation, it sets the DC voltage of the DAC output to 0.5 V.

Figure 5.3: (a) Proposed closed-loop DPLL-based ADC with DC offset at DAC output (b) Conversion diagrams.

If no signal is input to the varactor, $D_{OUT}$ will be 0.5 V DC voltage.
However, the DCO frequency will gradually drift due to PVT variations. For example, the frequency drifts at a rate of several kHz under temperature variation. The DPLL tracks this frequency drift and compensates for it in the PLL path through negative feedback. If a large frequency drift happens, the DC level of $V_{tune}$ drifts accordingly, causing a deviation from the linear region of the varactor conversion gain. This drift significantly distorts the A/D conversion because of the distorted $V_{tune}$ signal and the nonlinearity of the varactor gain, as shown in Fig. 5.3(b), and it corrupts the SFDR performance. Hence, new techniques to mitigate this issue are highly desirable.

Another issue comes from the frequency and phase synchronization loop. As discussed in Chapter 4.1.3, the long delay from the LPF to the FIR output considerably limits the convergence time of the phase and frequency synchronization loop. The worst-case convergence time can exceed $30\,\mu$s, which is much longer than the $8\,\mu$s specified by the standard (8-symbol preamble time). Hence, new techniques to improve the convergence time are highly desirable to fulfill the BLE standard.

# **Bibliography**

- [1] "Bluetooth Core Specification v5.0," Bluetooth Special Interest Group.
- [2] "Specification of the Bluetooth System v4.2," Bluetooth Special Interest Group.
- [3] A. Elkholy, T. Anand, W. S. Choi, A. Elshazly, and P. K. Hanumolu, "A 3.7 mW Low-Noise Wide-Bandwidth 4.5 GHz Digital Fractional-N PLL Using Time Amplifier-Based TDC," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 4, pp. 867–881, Apr. 2015.
- [4] S. Henzler, S. Koeppe, D. Lorenz, W. Kamp, R. Kuenemund, and D. Schmitt-Landsiedel, "A Local Passive Time Interpolation Concept for Variation-Tolerant High-Resolution Time-to-Digital Conversion," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 7, pp. 1666–1676, Jul. 2008.
- [5] M. Lee and A. A.
Abidi, "A 9 b, 1.25 ps Resolution Coarse-Fine Time-to-Digital Converter in 90 nm CMOS that Amplifies a Time Residue," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 4, pp. 769–777, Apr. 2008. - [6] S. K. Lee, Y. H. Seo, Y. Suh, H. J. Park, and J. Y. Sim, "A 1GHz ADPLL with a 1.25ps Minimum-Resolution Sub-Exponent TDC in 0.18μ CMOS," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Paper*, pp. 482–483, Feb. 2010. - [7] A. Sai, S. Kondo, T. T. Ta, H. Okuni, M. Furuta, and T. Itakura, "A 65nm CMOS ADPLL with 360μW 1.6ps-INL SS-ADC-based Period-Detection-Free TDC," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 336–337, Jan. 2016. - [8] Z. Xu, M. Miyahara, K. Okada, and A. Matsuzawa, "A 3.6 GHz Low-Noise Fractional-N Digital PLL Using SAR-ADC-Based TDC," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 10, pp. 2345–2356, Oct. 2016. - [9] M. Z. Straayer and M. H. Perrott, "A Multi-Path Gated Ring Oscillator TDC with First-Order Noise Shaping," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, Apr. 2009. [10] N. Markulic, K. Raczkowski, P. Wambacq, and J. Craninckx, "A 10-bit, 550-fs Step Digital-to-Time Converter in 28nm CMOS," *European Solid State Circuits Conference (ESSCIRC)*, pp. 79–82, Sep. 2014. - [11] A. A. Abidi, "Phase Noise and Jitter in CMOS Ring Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 8, pp. 1803–1816, Aug. 2006. - [12] J. Z. Ru, C. Palattella, P. Geraedts, E. Klumperink, and B. Nauta, "A High-Linearity Digital-to-Time Converter Technique: Constant-Slope Charging," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 6, pp. 1412–1423, Jun. 2015. - [13] D. B. Leeson, "A Simple Model of Feedback Oscillator Noise Spectrum," *Proc. IEEE.*, vol. 54, pp. 329–330, 1966. - [14] T. H. Lee and A. Hajimiri., "Oscillator Phase Noise: A Tutorial," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 3, pp. 326–336, Mar. 2000. - [15] A. Demir, A. Mehrotra, and J. 
Roychowdhury, "Phase Noise in Oscillators: a Unifying Theory and Numerical Methods for Characterization," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 47, no. 5, pp. 655–674, May 2000. - [16] A. Hajimiri and T. H. Lee, "Design Issues in CMOS Differential LC Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 5, pp. 717–724, May 1999. - [17] B. Razavi, "A Study of Phase Noise in CMOS Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 3, pp. 311–343, May 1996. - [18] J. J. Rael and A. A. Abidi., "A Study of Phase Noise in CMOS Oscillators," *Proceedings of IEEE Custom Integrated Circuits Conference*, pp. 569–572, May 2000. - [19] A. Hajimiri and T. H. Lee, "A General Theory of Phase Noise in Electrical Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 2, pp. 179–194, Feb. 1998. - [20] H. Liu, Z. Sun, D. Tang, H. Huang, T. Kaneko, W. Deng, R. Wu, K. Okada, and A. Matsuzawa, "An ADPLL-Centric Bluetooth Low-Energy Transceiver with 2.3mW Interference-Tolerant Hybrid-Loop Receiver and 2.9mW Single-Point Polar Transmitter in 65nm CMOS," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 444–445, Feb. 2018. - [21] H. Liu, D. Tang, Z. Sun, W. Deng, H. C. Ngo, K. Okada, and A. Matsuzawa, "A 0.98mW Fractional-N ADPLL Using 10b Isolated Constant-Slope DTC with FOM of -246dB for IoT Applications in 65nm CMOS," *Int. Solid-State Circuits Conf.* (ISSCC) Dig. Tech. Papers, pp. 246–248, Feb. 2018. [22] F. W. Kuo, S. Pourmousavian, T. Siriburanon, R. Chen, L. c. Cho, C. P. Jou, F. L. Hsueh, and R. B. Staszewski, "A 0.5V 1.6mW 2.4GHz Fractional-N All-Digital PLL for Bluetooth LE with PVT-insensitive TDC Using Switched-Capacitor Doubler in 28nm CMOS," *IEEE Symp. VLSI Circuits (VLSIC)*, pp. C178–C179, Jun. 2017. - [23] V. K. Chillara, Y. H. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot, and R. B. 
Staszewski, "An 860µW 2.1-to-2.7GHz All-Digital PLL-Based Frequency Modulator with a DTC-Assisted Snapshot TDC for WPAN (Bluetooth Smart and ZigBee) Applications," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 172–173, Feb. 2014. - [24] Y. He, Y. H. Liu, T. Kuramochi, J. van den Heuvel, B. Busze, N. Markulic, C. Bachmann, and K. Philips, "A 673μW 1.8-to-2.5GHz Dividerless Fractional-N Digital PLL with an Inherent Frequency-Capture Capability and a Phase-Dithering Spur Mitigation for IoT Applications," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Paper*, pp. 420–421, Feb. 2017. - [25] X. Gao, L. Tee, W. Wu, K. S. Lee, A. A. Paramanandam, A. Jha, N. Liu, E. Chan, and L. Lin, "A 28nm CMOS Digital Fractional-N PLL with -245.5dB FOM and a Frequency Tripler for 802.11abgn/ac Radio," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Paper*, pp. 1–3, Feb. 2015. - [26] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 2.9-4.0GHz Fractional-N Digital PLL With Bang-Bang Phase Detector and 560-fs<sub>rms</sub> Integrated Jitter at 4.5-mW Power," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 12, pp. 2745–2758, Dec. 2011. - [27] R. B. Staszewski, J. L. Wallberg, S. Rezeq, C.-M. Hung, O. E. Eliezer, S. K. Vemulapalli, C. Fernando, K. Maggio, R. Staszewski, N. Barton, M.-C. Lee, P. Cruise, M. Entezari, K. Muhammad, and D. Leipold, "All-Digital PLL and Transmitter for Mobile Phones," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 12, pp. 2469–2482, Dec. 2005. - [28] C. W. Yao, R. Ni, C. Lau, W. Wu, K. Godbole, Y. Zuo, S. Ko, N. S. Kim, S. Han, I. Jo, J. Lee, J. Han, D. Kwon, C. Kim, S. Kim, S. W. Son, and T. B. Cho, "A 14-nm 0.14-ps<sub>rms</sub> Fractional-*N* Digital PLL With a 0.2-ps Resolution ADC-Assisted Coarse/Fine-Conversion Chopping TDC and TDC Nonlinearity Calibration," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 12, pp. 3446–3457, Dec. 2017. - [29] A. T. Narayanan, M. Katsuragi, K. Kimura, S. 
Kondo, K. K. Tokgoz, K. Nakata, W. Deng, K. Okada, and A. Matsuzawa, "A Fractional-N Sub-Sampling PLL Using a Pipelined Phase-Interpolator With an FoM of -250dB," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 7, pp. 1630–1640, Jul. 2016.
- [30] W. S. Chang, P. C. Huang, and T. C. Lee, "A Fractional-N Divider-Less Phase-Locked Loop with a Subsampling Phase Detector," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 12, pp. 2964–2975, Dec. 2014.
- [31] K. Raczkowski, N. Markulic, B. Hershberg, and J. Craninckx, "A 9.2-12.7 GHz Wideband Fractional-N Subsampling PLL in 28 nm CMOS with 280fs RMS Jitter," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 5, pp. 1203–1213, May 2015.
- [32] X. Gao, E. A. M. Klumperink, M. Bohsali, and B. Nauta, "A Low Noise Sub-Sampling PLL in Which Divider Noise is Eliminated and PD/CP Noise is Not Multiplied by *N*<sup>2</sup>," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 12, pp. 3253–3263, Dec. 2009.
- [33] H. Huh, Y. Koo, K.-Y. Lee, Y. Ok, S. Lee, D. Kwon, J. Lee, J. Park, K. Lee, D.-K. Jeong, and W. Kim, "A CMOS Dual-Band Fractional-N Synthesizer with Reference Doubler and Compensated Charge Pump," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 100–516 Vol.1, Feb. 2004.
- [34] Y. L. Hsueh, L. C. Cho, C. H. Shen, Y. C. Tsai, T. C. Chueh, T. Y. Chang, J. L. Hsu, and J. H. C. Zhan, "A 0.29mm<sup>2</sup> Frequency Synthesizer in 40nm CMOS with 0.19ps<sub>rms</sub> Jitter and <-100dBc Reference Spur for 802.11ac," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 472–473, Feb. 2014.
- [35] N. Pavlovic and J. Bergervoet, "A 5.3GHz Digital-to-Time-Converter-Based Fractional-N All-Digital PLL," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 54–56, Feb. 2011.
- [36] Y. H. Seo, J. S. Kim, H. J. Park, and J. Y. Sim, "A 0.63ps Resolution, 11b Pipeline TDC in 0.13μm CMOS," *IEEE Symp. VLSI Circuits (VLSIC)*, pp. 152–153, Jun. 2011.
- [37] G. Marzin, S. Levantino, C.
Samori, and A. L. Lacaita, "A Background Calibration Technique to Control Bandwidth in Digital PLLs," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 54–55, Feb. 2014.
- [38] Y. H. Liu, J. V. D. Heuvel, T. Kuramochi, B. Busze, P. Mateman, V. K. Chillara, B. Wang, R. B. Staszewski, and K. Philips, "An Ultra-Low Power 1.7-2.7 GHz Fractional-N Sub-Sampling Digital Frequency Synthesizer and Modulator for IoT Applications in 40 nm CMOS," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 5, pp. 1094–1105, May 2017.
- [39] C. Palattella, E. A. M. Klumperink, J. Z. Ru, and B. Nauta, "A Sensitive Method to Measure the Integral Nonlinearity of a Digital-to-Time Converter Based on Phase Modulation," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 62, no. 8, pp. 741–745, Aug. 2015.
- [40] M. S. Yuan, C. C. Li, C. C. Liao, Y. T. Lin, C. H. Chang, and R. B. Staszewski, "A 0.45V Sub-mW All-Digital PLL in 16nm FinFET for Bluetooth Low-Energy (BLE) Modulation and Instantaneous Channel Hopping Using 32.768kHz Reference," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 448–450, Feb. 2018.
- [41] Y. H. Liu, X. Huang, M. Vidojkovic, A. Ba, P. Harpe, G. Dolmans, and H. de Groot, "A 1.9nJ/b 2.4GHz Multistandard (Bluetooth Low Energy/Zigbee/IEEE802.15.6) Transceiver for Personal/Body-Area Networks," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 446–447, Feb. 2013.
- [42] F. W. Kuo, S. B. Ferreira, H. N. R. Chen, L. C. Cho, C. P. Jou, F. L. Hsueh, I. Madadi, M. Tohidian, M. Shahmohammadi, M. Babaie, and R. B. Staszewski, "A Bluetooth Low-Energy Transceiver With 3.7-mW All-Digital Transmitter, 2.75-mW High-IF Discrete-Time Receiver, and TX/RX Switchable On-Chip Matching Network," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 4, pp. 1144–1162, Apr. 2017.
- [43] J. Prummel, M. Papamichail, J. Willms, R. Todi, W. Aartsen, W. Kruiskamp, J. Haanstra, E. Opbroek, S. Rievers, P.
Seesink, J. van Gorsel, H. Woering, and C. Smit, "A 10 mW Bluetooth Low-Energy Transceiver with On-Chip Matching," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 12, pp. 3077–3088, Dec. 2015.
- [44] Y. H. Liu, A. Ba, J. H. C. van den Heuvel, K. Philips, G. Dolmans, and H. de Groot, "A 1.2 nJ/bit 2.4 GHz Receiver With a Sliding-IF Phase-to-Digital Converter for Wireless Personal/Body Area Networks," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 12, pp. 3005–3017, Dec. 2014.
- [45] T. Sano, M. Mizokami, H. Matsui, K. Ueda, K. Shibata, K. Toyota, T. Saitou, H. Sato, K. Yahagi, and Y. Hayashi, "A 6.3mW BLE Transceiver Embedded RX Image-Rejection Filter and TX Harmonic-Suppression Filter Reusing On-Chip Matching Network," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 1–3, Feb. 2015.
- [46] Y. H. Liu, C. Bachmann, X. Wang, Y. Zhang, A. Ba, B. Busze, M. Ding, P. Harpe, G. J. van Schaik, G. Selimis, H. Giesen, J. Gloudemans, A. Sbai, L. Huang, H. Kato, G. Dolmans, K. Philips, and H. de Groot, "A 3.7mW-RX 4.4mW-TX Fully Integrated Bluetooth Low-Energy/IEEE802.15.4/proprietary SoC with an ADPLL-based Fast Frequency Offset Compensation in 40nm CMOS," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 236–237, Feb. 2015.
- [47] A. Wong, M. Dawkins, G. Devita, N. Kasparidis, A. Katsiamis, O. King, F. Lauria, J. Schiff, and A. Burdett, "A 1V 5mA Multimode IEEE 802.15.6/Bluetooth Low-Energy WBAN Transceiver for Biotelemetry Applications," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 300–302, Feb. 2012.
- [48] A. Sai, H. Okuni, T. T. Ta, S. Kondo, T. Tokairin, M. Furuta, and T. Itakura, "A 5.5 mW ADPLL-Based Receiver With a Hybrid Loop Interference Rejection for BLE Application in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 12, pp. 3125–3136, Dec. 2016.
- [49] Y. H. Liu, V. K. Purushothaman, C. Lu, J. Dijkhuis, R. B. Staszewski, C. Bachmann, and K.
Philips, "A 770pJ/b 0.85V 0.3mm<sup>2</sup> DCO-Based Phase-Tracking RX Featuring Direct Demodulation and Data-Aided Carrier Tracking for IoT Applications," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 408–409, Feb. 2017.
- [50] M. Ding, X. Wang, P. Zhang, Y. He, S. Traferro, K. Shibata, M. Song, H. Korpela, K. Ueda, Y. H. Liu, C. Bachmann, and K. Philips, "A 0.8V 0.8mm² Bluetooth 5/BLE Digital-Intensive Transceiver with a 2.3mW Phase-Tracking RX Utilizing a Hybrid Loop Filter for Interference Resilience in 40nm CMOS," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 446–448, Feb. 2018.
- [51] P. Madoglio, M. Zanuso, S. Levantino, C. Samori, and A. L. Lacaita, "Quantization Effects in All-Digital Phase-Locked Loops," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 54, no. 12, pp. 1120–1124, Dec. 2007.
- [52] C. W. Yao and A. N. Willson, "A 2.8-3.2GHz Fractional-N Digital PLL With ADC-Assisted TDC and Inductively Coupled Fine-Tuning DCO," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 3, pp. 698–710, Mar. 2013.
- [53] H. Darabi, *Radio Frequency Integrated Circuits and Systems*. Cambridge University Press, 2015.
- [54] Y. H. Liu, J. V. D. Heuvel, T. Kuramochi, B. Busze, P. Mateman, V. K. Chillara, B. Wang, R. B. Staszewski, and K. Philips, "An Ultra-Low Power 1.7-2.7 GHz Fractional-N Sub-Sampling Digital Frequency Synthesizer and Modulator for IoT Applications in 40 nm CMOS," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 5, pp. 1094–1105, May 2017.
- [55] F. Gardner, "Charge-Pump Phase-Lock Loops," *IEEE Transactions on Communications*, vol. 28, no. 11, pp. 1849–1858, Nov. 1980.
- [56] T. K. Kuan and S. I. Liu, "A Bang Bang Phase-Locked Loop Using Automatic Loop Gain Control and Loop Latency Reduction Techniques," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 4, pp. 821–831, Apr. 2016.
- [57] F.
Gardner, "A BPSK/QPSK Timing-Error Detector for Sampled Receivers," *IEEE Transactions on Communications*, vol. 34, no. 5, pp. 423–429, May 1986.
- [58] J. Masuch and M. Delgado-Restituto, "A 1.1mW-RX -81.4dBm Sensitivity CMOS Transceiver for Bluetooth Low Energy," *IEEE Transactions on Microwave Theory and Techniques*, vol. 61, no. 4, pp. 1660–1673, Apr. 2013.
- [59] Z. Lin, P. I. Mak, and R. P. Martins, "A 2.4 GHz ZigBee Receiver Exploiting an RF-to-BB-Current-Reuse Blixer + Hybrid Filter Topology in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 6, pp. 1333–1344, Jun. 2014.
- [60] S. Akhtar, R. Taylor, and P. Litmanen, "A High Magnetic Coupling, Low Loss, Stacked Balun in Digital 65nm CMOS," *IEEE Radio Frequency Integrated Circuits Symposium*, pp. 513–516, Jun. 2009.
- [61] F. Chen, S. Lin, X. Duo, and X. Sun, "A L-Band Gain Controllable CMOS LNA," *Asia Pacific Microwave Conference*, pp. 1124–1127, Dec. 2009.
- [62] X. Peng, J. Yin, P. I. Mak, W. H. Yu, and R. P. Martins, "A 2.4-GHz ZigBee Transmitter Using a Function-Reuse Class-F DCO-PA and an ADPLL Achieving 22.6% (14.5%) System Efficiency at 6-dBm (0-dBm) P<sub>out</sub>," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 6, pp. 1495–1508, Jun. 2017.
- [63] J. Yin, S. Yang, H. Yi, W. H. Yu, P. I. Mak, and R. P. Martins, "A 0.2V Energy-Harvesting BLE Transmitter with a Micropower Manager Achieving 25% System Efficiency at 0dBm Output and 5.2nW Sleep Power in 28nm CMOS," *Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, pp. 450–452, Feb. 2018.
- [64] W. Yu, X. Peng, P. Mak, and R. P. Martins, "A High-Voltage-Enabled Class-D Polar PA Using Interactive AM-AM Modulation, Dynamic Matching, and Power-Gating for Average PAE Enhancement," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 11, pp. 2844–2857, Nov. 2017.
# Appendix A

## **Publication List**

### A.1 Journal Papers

- Hanli Liu, Zheng Sun, Dexian Tang, Hongye Huang, Tohru Kaneko, Zhijie Chen, Wei Deng, Rui Wu, and Kenichi Okada, "A DPLL-Centric Bluetooth Low-Energy Transceiver with a 2.3-mW Interference-Tolerant Hybrid-Loop Receiver in 65nm CMOS," IEEE Journal of Solid-State Circuits (JSSC), Vol. 53, No. 12, Dec. 2018.
- Hanli Liu, Dexian Tang, Zheng Sun, Wei Deng, Huy Cu Ngo, and Kenichi Okada, "A Sub-mW Fractional-N ADPLL with FOM of -246dB for IoT Applications," IEEE Journal of Solid-State Circuits (JSSC), Vol. 53, No. 12, Dec. 2018.
- Hanli Liu, Teerachot Siriburanon, Kengo Nakata, Wei Deng, Ju Ho Son, Dae Young Lee, Kenichi Okada, and Akira Matsuzawa, "A 28-GHz Fractional-N Frequency Synthesizer with Reference and Frequency Doublers for 5G Cellular," IEICE Transactions on Electronics (Special Issue), Vol. E101-C, No. 4, Apr. 2018.

### A.2 International Conferences and Workshops

- Hanli Liu, Zheng Sun, Hongye Huang, Wei Deng, Teerachot Siriburanon, Jian Pang, Yun Wang, Rui Wu, Teruki Someya, Atsushi Shirane, Kenichi Okada, "A 265-μW Fractional-N Digital PLL with Seamless Automatic Switching Subsampling/Sampling Feedback Path and Duty-Cycled Frequency-Locked Loop in 65nm CMOS," IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, Feb. 2019.
- Hanli Liu, Dexian Tang, Zheng Sun, Wei Deng, Huy Cu Ngo, Kenichi Okada and Akira Matsuzawa, "A 0.98mW Fractional-N ADPLL Using 10b Isolated Constant-Slope DTC with FoM of -246dB for IoT Applications in 65nm CMOS," IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, pp. 246-248, Feb. 2018.
- Hanli Liu, Zheng Sun, Dexian Tang, Hongye Huang, Tohru Kaneko, Wei Deng, Rui Wu, Kenichi Okada and Akira Matsuzawa, "An ADPLL-Centric Bluetooth Low-Energy Transceiver with 2.3mW Interference-Tolerant Hybrid-Loop Receiver and 2.9mW Single-Point Polar Transmitter in 65nm CMOS," IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, pp. 246-248, Feb. 2018.
- Hanli Liu, Ning Li, Aravind Tharayil Narayanan, Teerachot Siriburanon, Takuichi Hirano, Kenichi Okada, Akira Matsuzawa, Takeshi Inoue, and Hitoshi Sakane, "A -194.0dBc/Hz FOM CMOS Tail-Filtering VCO using Helium-3 Ion Irradiation Technique," Proc. IEEE European Microwave Integrated Circuits Conference (EuMIC), London, UK, pp. 213-216, Oct. 2016.

### A.3 Domestic Conferences and Workshops

- **Hanli Liu**, 岡田 健一, "Loop Latency Compensation Technique for Wide Loop Bandwidth ADPLL", 電子情報通信学会 ソサイエティ大会 (於 金沢大学), C-12-28, Sep. 2018.
- Hanli Liu, Teerachot Siriburanon, 中田 憲吾, Wei Deng, 岡田 健一, 松澤 昭, "A 28GHz Fractional-N Frequency Synthesizer with Reference and Frequency Doublers for 5G New Radio", 電子情報通信学会 集積回路研究会 (於 石垣島), Dec. 2017.
- Hanli Liu, 岡田 健一, 松澤 昭, "A -242dB FoM 4.2-mW ADC-PLL Using Digital Sub-Sampling Architecture", STARCフォーラム (於 新横浜), Nov. 2015.
- Hanli Liu, Teerachot Siriburanon, Kenichi Okada, Akira Matsuzawa, "28GHz CMOS LC-VCO Using Frequency Doubling Technique," IEICE General Conference, Sendai, Japan, C-12-3513, Sep. 2015.

### A.4 Co-Author

#### A.4.1 Conferences

- Jian Pang, Zheng Li, Ryo Kubozoe, Xueting Luo, Rui Wu, Yun Wang, Dongwon You, Ashbir Aviat Fadila, Rattanan Saengchan, Takeshi Nakamura, Joshua Alvin, Daiki Matsumoto, Aravind Tharayil Narayanan, Bangan Liu, Junjun Qiu, Hanli Liu, Zheng Sun, Hongye Huang, Korkut Kaan Tokgoz, Keiichi Motoi, Naoki Oshima, Shinichi Hori, Kazuaki Kunihiro, Tomoya Kaneko, Atsushi Shirane, Kenichi Okada, "A 28GHz CMOS Phased-Array Beamformer Utilizing Neutralized Bi-Directional Technique Supporting Dual-Polarized MIMO for 5G NR," IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, Feb. 2019.
- Jian Pang, Korkut Kaan Tokgoz, Shotaro Maki, Zheng Li, Xueting Luo, Ibrahim Abdo, Seitarou Kawai, **Hanli Liu**, Bangan Liu, Makihiko Katsuragi, Kento Kimura, Atsushi Shirane, Kenichi Okada, "A 28.16-Gb/s Area-Efficient 60GHz CMOS Bi-Directional Transceiver for IEEE 802.11ay," IEEE Asian Solid-State Circuits Conference (A-SSCC), Tainan, Taiwan, Nov. 2018.
- Zheng Sun, Hanli Liu, Dexian Tang, Hongye Huang, Tohru Kaneko, Rui Wu, Wei Deng, Kenichi Okada, "A 0.85mm² BLE Transceiver with Embedded T/R Switch, 2.6mW Fully-Passive Harmonic Suppressed Transmitter and 2.3mW Hybrid-Loop Receiver," IEEE European Solid-State Circuits Conference (ESSCIRC), Dresden, Germany, Sep. 2018.
- Jian Pang, Rui Wu, Yun Wang, Masato Dome, Hisashi Kato, Hongye Huang, Aravind Tharayil Narayanan, Hanli Liu, Bangan Liu, Takeshi Nakamura, Takuya Fujimura, Masaru Kawabuchi, Ryo Kubozoe, Tsuyoshi Miura, Daiki Matsumoto, Naoki Oshima, Keiichi Motoi, Shinichi Hori, Kazuaki Kunihiro, Tomoya Kaneko, and Kenichi Okada, "A 28GHz CMOS Phased-Array Transceiver Using Gain-Invariant LO Phase Shifter with 0.1 Degree Beam-Steering Resolution for 5G New Radio," IEEE Radio Frequency Integrated Circuits Symposium (RFIC), Philadelphia, PA, June 2018.
- Bangan Liu, Huy Cu Ngo, Kengo Nakata, Wei Deng, Yuncheng Zhang, Junjun Qiu, Toru Yoshioka, Jun Emmei, Haosheng Zhang, Jian Pang, Aravind Tharayil Narayanan, Dongsheng Yang, Hanli Liu, Kenichi Okada, Akira Matsuzawa, "A 1.2 ps-Jitter Fully-Synthesizable Fully-Calibrated Fractional-N Injection-Locked PLL Using True Arbitrary Nonlinearity Calibration Technique," IEEE Custom Integrated Circuits Conference (CICC), San Diego, CA, pp. 1-4, Apr. 2018.
- Yun Wang, Bangan Liu, Hanli Liu, Aravind Tharayil Narayanan, Jian Pang, Ning Li, Toru Yoshioka, Yuki Terashima, Haosheng Zhang, Dexian Tang, Makihiko Katsuragi, Daeyoung Lee, Sungtae Choi, Rui Wu, Kenichi Okada, and Akira Matsuzawa, "A 100mW 3.0Gb/s Spectrum Efficient 60GHz Bi-Phase OOK CMOS Transceiver," IEEE Symposium on VLSI Circuits (VLSI Circuits), Kyoto, pp. 298-299, June 2017.
- Jian Pang, Shotaro Maki, Seitarou Kawai, Noriaki Nagashima, Yuuki Seo, Masato Dome, Hisashi Kato, Makihiko Katsuragi, Kento Kimura, Satoshi Kondo, Yuki Terashima, Hanli Liu, Teerachot Siriburanon, Aravind Tharayil Narayanan, Nurul Fajiri, Tohru Kaneko, Toru Yoshioka, Bangan Liu, Yun Wang, Rui Wu, Ning Li, Korkut Kaan Tokgoz, Masaya Miyahara, Kenichi Okada, Akira Matsuzawa, "A 128-QAM 60GHz Transceiver for IEEE802.11ay with Calibration of LO Feedthrough and I/Q Imbalance," IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, pp. 424-425, Feb. 2017.
- Teerachot Siriburanon, Hanli Liu, Kengo Nakata, Wei Deng, Ju Ho Son, Dae Young Lee, Kenichi Okada, Akira Matsuzawa, "A 28-GHz Fractional-N Frequency Synthesizer with Reference and Frequency Doublers for 5G Cellular," IEEE European Solid-State Circuits Conference (ESSCIRC), Graz, Austria, pp. 76-79, Sep. 2015.

#### A.4.2 Journal Papers

- Teerachot Siriburanon, Satoshi Kondo, Makihiko Katsuragi, Hanli Liu, Kento Kimura, Wei Deng, Kenichi Okada, Akira Matsuzawa, "A Low-Power Low-Noise mm-Wave Subsampling PLL Using Dual-Step-Mixing ILFD and Tail-Coupling Quadrature Injection-Locked Oscillator for IEEE 802.11ad," IEEE Journal of Solid-State Circuits (JSSC), vol. 51, no. 5, pp. 1246-1260, May 2016.

#### A.4.3 Domestic Conferences and Workshops

- Zheng Sun, **Hanli Liu**, Hongye Huang, 染谷 晃基, 白根 篤史, 岡田 健一, "A High Dynamic Range BLE Front-End with On-Chip Matching Network", 電子情報通信学会 ソサイエティ大会 (於 金沢大学), C-12-21, Sep. 2018.
- Hongye Huang, Zheng Sun, Hanli Liu, Rui Wu, 染谷 晃基, 白根 篤史, 岡田 健一, "A 2.6mW BLE Transmitter Front-End with Fully-Passive Harmonic Suppression", 電子情報通信学会 ソサイエティ大会 (於 金沢大学), C-12-22, Sep. 2018.
- Zheng Sun, **Hanli Liu**, Dexian Tang, Hongye Huang, 金子 徹, Wei Deng, Rui Wu, 白根 篤史, 岡田 健一, "An ADPLL-Centric Bluetooth Low-Energy Transceiver with 2.3mW Interference-Tolerant Hybrid-Loop Receiver in 65nm CMOS", 電子情報通信学会 LSIとシステムのワークショップ (於 東京大学), May 2018.
- Hongye Huang, **Hanli Liu**, Dexian Tang, Zheng Sun, Wei Deng, Huy Cu Ngo, 白根 篤史, 岡田 健一, "An Ultra-Low-Power Fractional-N All-Digital PLL Using 10-bit Isolated Constant-Slope Digital-to-Time Converter", 電子情報通信学会 LSIとシステムのワークショップ (於 東京大学), May 2018.
- Zheng Sun, Hanli Liu, Dexian Tang, Hongye Huang, 岡田 健一, 松澤 昭, "An ADPLL-Based High Interference Tolerant BLE Receiver with DAC Feedback Loop", 電子情報通信学会 総合大会 (於 東京電機大学), C-12-6, March 2018.
- Hongye Huang, Zheng Sun, Hanli Liu, Dexian Tang, 岡田 健一, 松澤 昭, "Current-Reuse LNA for Low Power 2.4-GHz Receivers", 電子情報通信学会 総合大会 (於 東京電機大学), C-12-7, March 2018.
- Van Tuan Pham, **Hanli Liu**, Haosheng Zhang, 岡田 健一, 松澤 昭, "A 0.65mW 4.6GHz VCO with Low Phase Noise for Chip Scale Atomic Clock", 電子情報通信学会 総合大会 (於 東京電機大学), C-12-27, March 2018.
- Dexian Tang, **Hanli Liu**, Zheng Sun, Hongye Huang, 岡田 健一, 松澤 昭, "An Isolated Constant-Slope Digital-to-Time Converter", 電子情報通信学会 総合大会 (於 東京電機大学), C-12-33, March 2018.
- Yun Wang, Bangan Liu, Hanli Liu, Aravind Tharayil Narayanan, Jian Pang, Ning Li, 吉岡 透, 寺島 友樹, Haosheng Zhang, Dexian Tang, 桂木 真希彦, Daeyoung Lee, Sungtae Choi, Rui Wu, 岡田 健一, 松澤 昭, "A 100mW 3.0Gb/s Spectrum Efficient 60GHz Bi-Phase OOK CMOS Transceiver" (招待講演), IEEE SSCS Japan Chapter VLSI Circuits報告会 (於 東京大学), June 2017.
- Jian Pang, 眞木 翔太郎, 河合 誠太郎, 永島 典明, 瀬尾 有輝, 桂木 真希彦, 木村 健将, 近藤 智史, Hanli Liu, Teerachot Siriburanon, 金子 徹, 宮原 正也, 岡田 健一, 松澤 昭, "A 128-QAM 60GHz CMOS Transceiver for IEEE802.11ay with Calibration of LO Feedthrough and I/Q Imbalance", 電子情報通信学会 LSIとシステムのワークショップ (於 東京大学), May 2017.
- Jian Pang, 眞木 翔太郎, 河合 誠太郎, 永島 典明, 瀬尾 有輝, 桂木 真希彦, 木村 健将, 近藤 智史, **Hanli Liu**, Teerachot Siriburanon, 岡田 健一, 松澤 昭, "IEEE802.11ayに向けたCMOSミリ波トランシーバーに関する研究", 電子情報通信学会 集積回路研究会 (於 岡山県立大学), Vol.ICD2016-138, pp. 113-118, March 2017.

#### A.4.4 Books

- Teerachot Siriburanon, **Hanli Liu**, Kenichi Okada, Akira Matsuzawa, Wei Deng, Satoshi Kondo, Makihiko Katsuragi, and Kento Kimura, "IoT and Low-Power Wireless: Circuits, Architectures, and Techniques," CRC Press, ISBN 9780815369714, July 2018.