Ace Your System Design Interview with 7 Must-Read Papers in 2023
This post presents seven must-read papers to help you understand the key concepts of system design and prepare for your interview.
They cover distributed file systems, structured storage systems, key-value stores, coordination services, and log-based data processing. Whether you're new to system design or a pro, these papers will give you the knowledge and skills you need to excel in your interview and career.
Let's get started.
1. The Google File System (GFS)
The Google File System (GFS) is a distributed file system developed by Google to store and manage large amounts of data across a cluster of machines.
This paper describes the design and implementation of GFS. GFS is designed to be highly available, scalable, and fault-tolerant. It addresses the challenges of storing and processing large amounts of data on clusters of many inexpensive commodity machines, where component failures are the norm rather than the exception.
GFS is based on a single-master architecture: one master manages the file system metadata, and multiple chunkservers store the data as fixed-size chunks. Clients ask the master where a chunk lives and then read or write the data directly on the chunkservers. The system is optimized for high sustained throughput on large, mostly sequential reads and appends rather than for low latency, which suits Google's data-intensive applications.
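As a rough illustration of how a client could work out which chunk holds a given byte, the sketch below maps a file offset to a chunk index, assuming the 64 MB fixed chunk size described in the GFS paper. The function name `locate` is hypothetical and is not part of any GFS API.

```python
from typing import Tuple

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB fixed chunk size, as described in the GFS paper


def locate(byte_offset: int) -> Tuple[int, int]:
    """Return (chunk index, offset inside that chunk) for a byte offset in a file."""
    if byte_offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(byte_offset, CHUNK_SIZE)


# A read at the 200 MB mark falls 8 MB into the fourth chunk (index 3).
chunk_index, offset_in_chunk = locate(200 * 1024 * 1024)
print(chunk_index, offset_in_chunk)
```

In the real system the client would then send the file name and chunk index to the master and cache the returned chunk locations; that lookup is omitted here.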
2. Bigtable: A Distributed Storage System for Structured Data
This paper describes the design and implementation of Bigtable, a distributed storage system used by Google to store and manage large amounts of structured data such as web pages, images, and other types of data. The paper describes how Bigtable was built to overcome the limitations of traditional relational databases and how it is optimized for high write throughput, low latency, and scalability.
Bigtable is a sparse, distributed, persistent multi-dimensional sorted map, indexed by row key, column key, and timestamp. The data is partitioned by row range into tablets; each tablet server serves many tablets, and the underlying data is stored in GFS. The paper describes the design choices that were made to achieve high performance, scalability, and reliability, including data partitioning, replication, and performance optimization. The paper also describes how Google products such as web indexing and Google Earth use Bigtable.
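To make the sorted-map idea concrete, here is a minimal in-memory sketch, assuming keys of the form (row, column, timestamp) as in the Bigtable paper. The class `SortedMap` and its methods are hypothetical names for illustration; real Bigtable persists data in SSTables and splits rows into tablets, none of which is modeled here.

```python
import bisect


class SortedMap:
    """Toy map keyed by (row, column, timestamp), kept in sorted key order."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # key -> value

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)  # negated so newer versions sort first
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get_latest(self, row, column):
        """Return the most recent value for a cell, or None if it is absent."""
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan_rows(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1


table = SortedMap()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
print(table.get_latest("com.cnn.www", "contents:"))
```

Because keys are kept sorted, a range scan over adjacent rows touches contiguous entries, which is the property that makes row-range partitioning into tablets natural.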
3. Dynamo: Amazon’s Highly Available Key-value Store
This paper describes the design and implementation of Dynamo, a highly available key-value storage system used by Amazon to provide low-latency data access for its e-commerce platform. The paper describes how Dynamo was built to overcome the limitations of traditional centralized systems and how it is optimized for high write throughput, low latency, and scalability.
Dynamo uses consistent hashing, a technique also found in distributed hash tables (DHTs), to partition data across a ring of nodes. Each key is replicated on several nodes, and any of them can handle read and write requests for that key. The paper describes the design choices that were made to achieve high performance, scalability, and reliability, including data partitioning, replication, and performance optimization. The system also handles node failures with techniques such as hinted handoff, so that data remains available even when a node fails.
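The partitioning scheme can be sketched as a consistent-hash ring. The example below is a simplified illustration, not Dynamo's implementation: it assumes MD5 for placing keys on the ring (the hash the paper mentions), a fixed number of virtual nodes per physical node, and hypothetical names (`Ring`, `preference_list`, the `node-*` labels).

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes=8):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each physical node owns several points (virtual nodes) on the ring.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def preference_list(self, key, n=3):
        """Return the first n distinct nodes found walking clockwise from the key."""
        start = bisect.bisect_right(self._hashes, _hash(key)) % len(self._points)
        result = []
        for step in range(len(self._points)):
            node = self._points[(start + step) % len(self._points)][1]
            if node not in result:
                result.append(node)
                if len(result) == n:
                    break
        return result


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.preference_list("cart:42"))
```

The useful property is that removing a node only reassigns the keys that node was responsible for; keys whose first clockwise point belongs to another node keep the same coordinator.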
4. Cassandra - A Decentralized Structured Storage System
This paper describes the design and implementation of Cassandra, a decentralized structured storage system used by companies such as Facebook, Twitter, and Netflix. It covers key concepts such as data partitioning, replication, and performance optimization.
5. The Chubby Lock Service for Loosely-Coupled Distributed Systems
This paper describes the design and implementation of Chubby, a highly available, distributed lock service used by Google to provide coordination between loosely-coupled distributed systems. The paper explains how Chubby provides a simple, highly available mechanism for distributed systems to coordinate access to shared resources and to store small amounts of data such as configuration and metadata. The design emphasizes availability and reliability over high performance. A Chubby cell consists of a small set of replicas (typically five) that use a consensus protocol to elect a master; the master serves client requests, and the replicas keep copies of the data. When the master fails, the remaining replicas elect a new one, so the service stays available.
This paper is considered a seminal work in the field of distributed systems, and it's a must-read for anyone interested in understanding how to design and build highly available and fault-tolerant distributed systems. The concepts and principles presented in this paper have been widely adopted and have influenced many other systems, such as ZooKeeper and etcd.
6. HDFS: Hadoop Distributed File System
HDFS is a distributed file system and was built to store unstructured data. It is designed to store huge files reliably and stream those files at high bandwidth to user applications.
7. The Log: What every software engineer should know about real-time data's unifying abstraction
This essay by Jay Kreps discusses the importance of the log data structure and its role in real-time data processing. It argues that logs provide a simple, unified abstraction for dealing with data that can be used to build fault-tolerant, scalable systems. It's a must-read for anyone interested in distributed systems and real-time data processing.
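A small sketch may help show why the log works as a unifying abstraction: if every replica applies the same records in the same order, all replicas end up in the same state. The `Log` class and `apply_record` function below are hypothetical names for illustration and assume records are simple (key, value) pairs.

```python
class Log:
    """Append-only sequence of records, each addressed by its offset."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        """Add a record at the end and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset=0):
        """Return all records from the given offset onward, in order."""
        return self._records[offset:]


def apply_record(state: dict, record) -> None:
    """Deterministically apply one (key, value) record to a state."""
    key, value = record
    state[key] = value


log = Log()
log.append(("x", 1))
log.append(("y", 2))
log.append(("x", 3))

# Two independent consumers replay the same log and converge on the same state.
replica_a, replica_b = {}, {}
for record in log.read():
    apply_record(replica_a, record)
for record in log.read():
    apply_record(replica_b, record)
print(replica_a == replica_b)
```

A consumer that remembers the last offset it processed can resume with `log.read(offset + 1)` after a restart, which is what makes the log useful for recovery as well as replication.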
➡ These research papers provide a comprehensive understanding of the key concepts and principles of system design, as well as practical tips for approaching problems and staying current with industry trends. By reading and understanding these papers, you will be well-prepared for your system design interview and have the knowledge and skills necessary to excel in your career.
Formal Methods in System Design
An International Journal
- Reports on formal methods for designing, implementing, and validating hardware and software systems.
- Publishes high quality, original papers spanning all aspects of formal methods.
- Aims to build a valuable collection of widely applicable formal methods.
- Serves as an international platform for the dissemination of research related to the application and development of formal methods.
- Invites papers describing original work in all aspects of formal methods as they relate to system design.
- Nir Piterman
Volume 61, Issue 1
Special Issue 'FM2021'
Parameter synthesis for Markov models: covering the parameter space
- Sebastian Junges
- Erika Ábrahám
- Matthias Volk
Bounded-memory runtime enforcement with probabilistic and performance analysis
- Saumya Shankar
- Ankit Pradhan
- Yliès Falcone
Preface for the formal methods in system design special issue on SYNT 2021
- Elizabeth Polgreen
- Guillermo Alberto Perez
Synbit: synthesizing bidirectional programs using unidirectional sketches
- Masaomi Yamaguchi
- Kazutaka Matsuda
Termination of triangular polynomial loops
- Marcel Hark
- Florian Frohn
- Jürgen Giesl
- ACM Digital Library
- EI Compendex
- Google Scholar
- Japanese Science and Technology Agency (JST)
- OCLC WorldCat Discovery Service
- Science Citation Index Expanded (SCIE)
- TD Net Discovery Service
- UGC-CARE List (India)
© Springer Science+Business Media, LLC, part of Springer Nature
Software Design and Architecture The once and future focus of software engineering
- Conference proceedings
- © 2021
Intelligent System Design
Proceedings of Intelligent System Design: INDIA 2019
- Suresh Chandra Satapathy, School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India
- Vikrant Bhateja, Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, India
- B. Janakiramaiah, Department of Computer Science and Engineering, PVP Siddhartha Institute of Technology, Vijayawada, India
- Yen-Wei Chen, College of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan
Presents research papers written by active researchers in the field of intelligent systems
Discusses the outcomes of INDIA 2019, organized by Lendi Institute of Engineering & Technology, India
Serves as a reference resource for practitioners and researchers in academia and industry
Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1171)
Table of contents (83 papers)
Acceptance of Technology in the Classroom: A Qualitative Analysis of Mathematics Teachers’ Perceptions
- Perienen Appavoo
Smart Agriculture Using IOT
- Bammidi Deepa, Chukka Anusha, P. Chaya Devi
CSII-TSBCC: Comparative Study of Identifying Issues of Task Scheduling of Big data in Cloud Computing
- Chetana Tukkoji, K. Seetharam, T. Srinivas Rao, G. Sandhya
De-Centralized Cloud Data Storage for Privacy-Preserving of Data Using Fog
- Gadu Srinivasa Rao, G. Himaja, V. S. V. S. Murthy
Multilayer Perceptron Back propagation Algorithm for Predicting Breast Cancer
- K. Satish Kumar, V. V. S. Sasank, K. S. Raghu Praveen, Y. Krishna Rao
IOT-Based Borewell Water-Level Detection and Auto-Control of Submersible Pumps
- Sujatha Karimisetty, Vaikunta Rao Rugada, Dadi Harshitha
Institute Examcell Automation with Mobile Application Interface
- Sujatha Karimisetty, Sujatha Thulam, Surendra Talari
Plasmonic Square Ring Resonator Based Band-Stop Filter Using MIM Waveguide
- P. Osman, P. V. Sridevi, K. V. S. N. Raju
Interactive and Assistive Gloves for Post-stroke Hand Rehabilitation
- Riya Vatsa, Suresh Chandra Satapathy
An Approach for Collaborative Data Publishing Using Self-adaptive Genetic Grey Wolf Optimizer
- T. Senthil Murugan, Yogesh R. Kulkarni
Review of Optical Devanagari Character Recognition Techniques
- Sukhjinder Singh, Naresh Kumar Garg
A Comprehensive Review on Deep Learning Based Lung Nodule Detection in Computed Tomography Images
- Mahender G. Nakrani, Ganesh S. Sable, Ulhas B. Shinde
ROS-Based Pedestrian Detection and Distance Estimation Algorithm Using Stereo Vision, Leddar and CNN
- Anjali Mukherjee, S. Adarsh, K. I. Ramachandran
An Enhanced Prospective Jaccard Similarity Measure (PJSM) to Calculate the User Similarity Score Set for E-Commerce Recommender System
- H. Mohana, M. Suriakala
Spliced Image Detection in 3D Lighting Environments Using Neural Networks
- V. Vinolin, M. Sucharitha
Two-Level Text Summarization Using Topic Modeling
- Dhannuri Saikumar, P. Subathra
A Robust Blind Oblivious Video Watermarking Scheme Using Undecimated Discrete Wavelet Transform
- K. Meenakshi, K. Swaraja, Padmavathi Kora, G. Karuna
Recognition of Botnet by Examining Link Failures in Cloud Network by Exhausting CANFES Classifier Approach
- S. Nagendra Prabhu, D. Shanthi Saravanan, V. Chandrasekar, S. Shanthi
Low Power, Less Leakage Operational Transconductance Amplifier (OTA) Circuit Using FinFET
- Maram Anantha Guptha, V. M. Senthil Kumar, T. Hari Prasad, Ravindrakumar Selvaraj
- Conference Proceedings
- INDIA Proceedings
- Data Mining and Data Warehousing
- Cloud Computing
- Image Processing
- Cognitive Computing
Suresh Chandra Satapathy
Suresh Chandra Satapathy is a Professor in the School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India. His research interests include machine learning, data mining, swarm intelligence studies and their applications to engineering. He has more than 140 publications to his credit in various reputed international journals and conference proceedings. He has edited many volumes for Springer AISC, LNEE, SIST and other series. He is a senior member of IEEE and a life member of the Computer Society of India.
Vikrant Bhateja is an Associate Professor in the Department of ECE at SRMGPC, Lucknow. His areas of research include digital image and video processing, computer vision, medical imaging, machine learning, pattern analysis and recognition. He has around 150 quality publications in various international journals and conference proceedings. He is associate editor of IJSE and IJACI. He has edited more than 22 volumes of conference proceedings with Springer Nature and is presently EiC of IGI Global: IJNCR journal.
Book Title : Intelligent System Design
Book Subtitle : Proceedings of Intelligent System Design: INDIA 2019
Editors : Suresh Chandra Satapathy, Vikrant Bhateja, B. Janakiramaiah, Yen-Wei Chen
Series Title : Advances in Intelligent Systems and Computing
DOI : https://doi.org/10.1007/978-981-15-5400-1
Publisher : Springer Singapore
eBook Packages : Intelligent Technologies and Robotics , Intelligent Technologies and Robotics (R0)
Copyright Information : Springer Nature Singapore Pte Ltd. 2021
Softcover ISBN : 978-981-15-5399-8 Published: 11 August 2020
eBook ISBN : 978-981-15-5400-1 Published: 10 August 2020
Series ISSN : 2194-5357
Series E-ISSN : 2194-5365
Edition Number : 1
Number of Pages : XX, 897
Number of Illustrations : 109 b/w illustrations, 322 illustrations in colour
Topics : Computational Intelligence , Engineering Design , Machine Learning , Communications Engineering, Networks
Systems Research at Harvard
Harvard John A. Paulson School of Engineering & Applied Sciences
Storage and file system design research.
We use experimentation and analysis to understand how systems interact, how the world is changing, and how file systems and storage systems should evolve.
Current work includes:
- Provenance-aware storage for scientific applications.
Our previous work (papers below) has aimed to answer the following questions [abstracts hyperlinked]:
The Self-Organizing Storage (SOS) project: how can storage systems tune themselves to their workload? [ ICAC'04 , FAST'03 , LISA'03 , FREENIX'03 ]
How can distributed file systems take advantage of zero-copy (RDMA) architectures? [ Kostas' thesis , Salimah's thesis , FAST'03 , USENIX'02 , BSDCon'02 ]
How should the performance of file systems be evaluated? [ Keith's thesis ]
How do soft updates and journaling differ in performance and semantics? [ USENIX'01 , USENIX'00 ]
How do the different FFS allocation algorithms compare? [ USENIX'96 ]
How do clustering and file system logging compare? [ USENIX'95 ]
This research is part of the SYRAH group.
- Professor Margo Seltzer
- Elaine Angelino
- David Holland
- Peter Macko
- Daniel Margo
- Nicholas Murphy
- Robin Smogor
- Kiran-Kumar Muniswamy-Reddy . PhD 2010, Foundations for Provenance-Aware Systems (Now at Amazon).
- Lex Stein . PhD 2007, Adaptive Parallel Computation for Heterogeneous Processors (Now at Microsoft Research Asia).
- Daniel J. Ellard . PhD 2004, Trace-Based Analyses and Optimizations for Network Storage Servers (Now at Network Appliance, Advanced Technology Group).
- Kostas Magoutis . PhD 2003, Exploiting Direct-Access Networking in Network Attached Storage Systems (Now at IBM Research, TJ Watson).
- Keith A. Smith . PhD 2000, Workload-Specific File System Benchmarks (Now at Network Appliance, Advanced Technology Group).
- Salimah Addetia . ME 2002, Caching in DAFS (Now at Network Appliance).
- Kostas Magoutis, Salimah Addetia, Alexandra Fedorova, Margo I. Seltzer. Making the Most out of Direct Access Network-Attached Storage , ( PDF ) In Proceedings of Second USENIX Conference on File and Storage Technologies (FAST'03), San Francisco, CA, March 31-April 2, 2003.
- Lex Stein, Michael J. Tucker, and Margo I. Seltzer. Building a Reliable Mutable File System on Peer-to-peer Storage , ( Postscript ) In Proceedings of the International Workshop on Reliable Peer-to-peer Distributed Systems, Osaka, Japan, October, 2002.
- Kostas Magoutis, Salimah Addetia, Alexandra Fedorova, Margo I. Seltzer, Jeffrey S. Chase, Andrew J. Gallatin, Richard Kisley, Rajiv G. Wickremesinghe, Eran Gabber. Structure and Performance of the Direct Access File System , ( PDF ) In Proceedings of 2002 USENIX Annual Technical Conference , Monterey, CA, June 9-14, 2002.
- Kostas Magoutis. Design and Implementation of a Direct Access File System (DAFS) Kernel Server for FreeBSD ( Postscript , PDF ), Appears in Proceedings of USENIX BSDCon 2002 Conference , San Francisco, CA, February 11-14, 2002.
- Kostas Magoutis. The Optimistic Direct Access File System: Design and Network Interface Support ( Postscript , PDF ) Appears in Proceedings of Workshop on Novel Uses of System Area Networks (Held in conjunction with HPCA-8), Cambridge, MA, February 2, 2002.
- Seltzer, M., Smith, K., Balakrishnan, H., Chang, J., McMains, S., and Padmanabhan, V. File System Logging versus Clustering: A Performance Comparison ( Postscript ), Appears in Proceedings of the 1995 USENIX Annual Technical Conference , January 1995, New Orleans, LA, pp. 249--264.
- Keith A. Smith and Margo Seltzer. File Layout and File System Performance ( Postscript ) [Overview] , Harvard Computer Science Technical Report TR-35-94.
- Salimah Addetia. CacheDAFS: User Level Client-side Caching for the Direct Access File System (DAFS) ( Abstract ), A thesis presented by Salimah Addetia to The School of Engineering and Applied Sciences in partial fulfillment of the requirements for the Master of Engineering in the subject of Computer Science, Harvard University, Cambridge, MA, 2001
SYRAH Research Projects
- CacheDAFS: User Level Client-side Caching for the Direct Access File System (DAFS)
- Exploiting Direct-Access Networking in Network Attached Storage Systems
- File classification in self-* storage systems
- Journaling versus Soft Updates
- Trace-Based Analyses and Optimizations for Network Storage Servers
- Workload-Specific File System Benchmarks: Abstract
Digital System Design: Recently Published Documents
Digital System Design for Quantum Error Correction Codes
Quantum computing is a computing technology that uses quantum mechanics to operate on data and information. Quantum information is transmitted over a quantum channel, which is sensitive to interaction with the environment. Quantum error correction is a hybrid of quantum mechanics and the classical theory of error-correcting codes, which is concerned with the fundamental problem of communication and information storage in the presence of noise. Interaction with the environment introduces errors into the qubits sent over the quantum channel. Hence, a quantum error correction code is needed to protect the qubit from errors that can be caused by decoherence and other quantum noise. In this paper, the digital system design of the quantum error correction code is discussed. Three designs used qubit codes, and nine-qubit codes were explained. The systems were designed and configured for encoding and decoding nine-qubit error correction codes. For comparison, a modified circuit is also designed by adding Hadamard gates.
Digital System Design for Traffic Light Controller System: A Systematic Approach
In this paper, we present the finite state machine (FSM), how to implement it in a hardware description language (HDL), and how to use it in a real application. First, the specification and requirements of the traffic light controller are stated. Then, the system architecture based on the FSM is constructed. Finally, the use of HDL and the test-bench simulation are described in detail. Keywords: Digital system design, System on chip, Finite State Machine, Digital Design Education, Smart Classroom.
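For readers who want to see the FSM idea before moving to HDL, the following is a small software model of a traffic light controller. It is only a sketch: the three states, their durations in clock ticks, and all names are assumptions chosen for illustration, not the specification used in the paper.

```python
from enum import Enum


class Light(Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"


# Hypothetical timing: number of clock ticks spent in each state.
DURATION = {Light.RED: 4, Light.GREEN: 3, Light.YELLOW: 1}
# Transition table: which state follows which.
NEXT = {Light.RED: Light.GREEN, Light.GREEN: Light.YELLOW, Light.YELLOW: Light.RED}


class TrafficLight:
    def __init__(self):
        self.state = Light.RED
        self.elapsed = 0  # ticks spent in the current state

    def tick(self) -> Light:
        """Advance one clock tick and return the state after the tick."""
        self.elapsed += 1
        if self.elapsed >= DURATION[self.state]:
            self.state = NEXT[self.state]
            self.elapsed = 0
        return self.state


controller = TrafficLight()
for _ in range(8):
    print(controller.tick().value)
```

The same structure (a state register, a tick counter, and a transition table) maps directly onto a clocked process in an HDL, with the test bench playing the role of the loop at the end.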
Linear Feedback Shift Register and its Applications in Digital System Design
In digital system design, the Linear Feedback Shift Register (LFSR) is the queen of logic functions, and design engineers can use an LFSR in either a hardware (HW) or a software (SW) implementation. In this paper, the LFSR is discussed in its HW implementation via a hardware description language. In addition, applications of the LFSR in pseudorandom number generators (PRNG), direct sequence spread spectrum (DSSS), and cyclic redundancy checks (CRC) are also given. Keywords: Digital system design, System on chip, ASIC digital design, Linear feedback shift register
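A software model makes the LFSR's behaviour easy to check. The sketch below is a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13, and 11 (feedback polynomial x^16 + x^14 + x^13 + x^11 + 1), a commonly cited maximal-length configuration. The tap choice and seed are assumptions for illustration; the paper's own register width and polynomial are not given in the abstract.

```python
def lfsr16_step(state: int) -> int:
    """Advance a 16-bit Fibonacci LFSR by one step (taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def lfsr16_period(seed: int = 0xACE1) -> int:
    """Count how many steps it takes for the register to return to the seed."""
    state = lfsr16_step(seed)
    steps = 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps


def lfsr16_bits(seed: int, count: int):
    """Yield `count` pseudorandom bits (the low bit of each successive state)."""
    state = seed
    for _ in range(count):
        state = lfsr16_step(state)
        yield state & 1


print(lfsr16_period())
print(list(lfsr16_bits(0xACE1, 16)))
```

A maximal-length 16-bit LFSR visits every nonzero state once before repeating, so the period should be 2^16 - 1; the all-zero state is excluded because the register would stay stuck at zero.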
Improvement on Basic Digital System Design using Problem-based and Outcome-based Learning
Robust and energy-efficient hardware: the case for asynchronous design.
The current technologies behind the design of semiconductor integrated circuits allow embedding billions of components in a single silicon die, enabling the construction of very complex circuits in a tiny space, dissipating little energy and producing huge amounts of useful computational work. However, the current levels of integration for electronic components in silicon and similar materials are not easily managed, as parameter variations grow steadily, making the design tasks increasingly challenging. Synchronous techniques have dominated the digital system design landscape for many decades, but their costs are increasingly hard to cope with. Asynchronous design and particularly quasi-delay insensitive design promises to deal with the same challenges more gracefully in current advanced nodes, and possibly irrevocably in future technology nodes. This article proposes a review of the state of the art in using asynchronous circuit design techniques to achieve energy-efficient and robust digital circuit and system design. In particular, the definition of a robust digital circuit comprises addressing several aspects to which a digital system design is expected to be robust, including: (1) voltage variations; (2) process variations; (3) temperature variations; (4) circuit aging. Besides addressing energy-efficiency and all the mentioned robustness aspects, this work also approaches some of the state-of-the-art tools available to deal with asynchronous design, and points to desirable research development to be conducted in these subjects in the future.
Fast IQ Amplitude Approximation Method for ASIC Digital System
In some modules of digital systems, such as Fast Fourier Transform (FFT), Discrete Fourier Transform (DFT), and IQ (in-phase and quadrature components) modulation/demodulation, the outputs are complex values of the form $I + jQ$, and the calculation of the magnitude $\sqrt{I^2 + Q^2}$ is required. On a software digital signal processing platform, the multiplication and square root operations are executed using the platform's math library; however, in Application Specific Integrated Circuit (ASIC) digital system design, implementing those operators via the Coordinate Rotation Digital Computer (CORDIC) algorithm requires considerable resources and delay. So, in this paper, we present a fast approximation method for the above problem which has a small delay but acceptable accuracy for ASIC digital system design. Keywords: ASIC, Digital system design, FFT, DFT, Fast amplitude approximation, Max-Min approximation.
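The keywords mention a Max-Min approximation. One widely known scheme of that kind is "alpha max plus beta min", which estimates the magnitude as alpha * max(|I|, |Q|) + beta * min(|I|, |Q|) and so avoids both multiplication of the inputs and the square root. The sketch below assumes that scheme; the abstract does not state which coefficients the paper uses, so the values here (alpha = 1, beta = 1/2, which need only a shift and an add in hardware) are illustrative.

```python
import math


def approx_magnitude(i: float, q: float, alpha: float = 1.0, beta: float = 0.5) -> float:
    """Estimate sqrt(i*i + q*q) as alpha*max(|i|, |q|) + beta*min(|i|, |q|)."""
    a, b = abs(i), abs(q)
    return alpha * max(a, b) + beta * min(a, b)


def worst_relative_error(alpha: float, beta: float, samples: int = 3600) -> float:
    """Sweep points on the unit circle and report the largest relative error."""
    worst = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        estimate = approx_magnitude(math.cos(theta), math.sin(theta), alpha, beta)
        worst = max(worst, abs(estimate - 1.0))  # the exact magnitude is 1
    return worst


print(approx_magnitude(3, 4), math.hypot(3, 4))
print(worst_relative_error(1.0, 0.5))
```

With alpha = 1 and beta = 1/2 the estimate never undershoots and overshoots by at most about 12 percent; tuned coefficients reduce the error at the cost of needing real multipliers.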
Digital System Design using FSMs
Digital system design.
The main objective of this chapter is to study and design various combinational circuits, such as Boolean expression verification, multiplexer and demultiplexer circuits, and code converter circuits, using LabVIEW tools. This chapter will make the reader more comfortable with learning the design of digital systems. It covers Boolean expression forms such as SOP and POS; combinational circuits such as adders (half adder and full adder) and subtractors (half subtractor and full subtractor); code converters such as binary to Gray, Gray to binary, BCD to Gray, and Gray to BCD; and sequential circuits with D flip-flops, all implemented in LabVIEW.
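The binary-to-Gray and Gray-to-binary converters mentioned above have compact software equivalents, which can be handy for checking a circuit's truth table. This is a plain Python sketch of the standard conversions, not the chapter's LabVIEW design; the function names are chosen for illustration.

```python
def binary_to_gray(n: int) -> int:
    """Each Gray bit is the XOR of a binary bit with the next higher binary bit."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Undo the conversion by XOR-ing together all right shifts of the Gray code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# Print the 3-bit table: consecutive Gray codes differ in exactly one bit.
for value in range(8):
    print(f"{value:03b} -> {binary_to_gray(value):03b}")
```

In hardware the first function is one XOR gate per bit, while the second is a chain of XOR gates running from the most significant bit downward.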
Research on Practice Teaching Mode of Digital System Design Based on the concept of “OBE-CDIO”
Project-based learning and evaluation in an online digital design course.
This paper reports an experience of an abrupt shift from traditional teaching to distance learning within a course on digital system design using programmable logic platforms. The course organization and evaluation model had to be modified on the fly due to the COVID-19 pandemic. The adopted teaching and assessment methodology puts a strong focus on the laboratory component, assigning a very significant weight to project-based evaluation. As the access to laboratory equipment was cut, all the previously accumulated experience had to be modified and adapted to new circumstances. The paper discusses teaching methods employed within the course and analyzes in detail a project-based evaluation centered on modeling of a simplified processor. The advantages and drawbacks of the reported teaching methods are pointed out. Possible design extensions are also suggested, which permit assigning the same core project to different students. We believe that the proposed project is a valuable instructional tool, in particular, for remote learning/assessment.
A list of papers on system design.
System Design Papers
- The UNIX Time-Sharing System [ paper ]
- The UNIX Time-sharing System - A Retrospective [ paper ]
- Program Design in the Unix environment [ paper ]
- Unix implementation [ paper ]
- A Design Methodology for Reliable Software Systems [ paper ]
- On the Criteria to be Used in Decomposing Systems into Modules [ paper ]
- End-to-end Arguments in System Design [ paper ]
- Hints for Computer System Design [ paper ]
- Out of the Tarpit [ paper ]
- The Emperor’s Old Clothes [ paper ]
- Robustness in Complex Systems [ paper ]
Computer Science > Computation and Language
Title: Construction of a Syntactic Analysis Map for Yi Shui School through Text Mining and Natural Language Processing Research
Abstract: Entity and relationship extraction is a crucial component in natural language processing tasks such as knowledge graph construction, question answering system design, and semantic analysis. Most of the information of the Yishui school of traditional Chinese Medicine (TCM) is stored in the form of unstructured classical Chinese text. The key information extraction of TCM texts plays an important role in mining and studying the academic schools of TCM. In order to solve these problems efficiently using artificial intelligence methods, this study constructs a word segmentation and entity relationship extraction model based on conditional random fields under the framework of natural language processing technology to identify and extract the entity relationship of traditional Chinese medicine texts, and uses the common weighting technology of TF-IDF information retrieval and data mining to extract important key entity information in different ancient books. The dependency syntactic parser based on neural network is used to analyze the grammatical relationship between entities in each ancient book article, and it is represented as a tree structure visualization, which lays the foundation for the next construction of the knowledge graph of Yishui school and the use of artificial intelligence methods to carry out the research of TCM academic schools.
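The TF-IDF weighting mentioned in the abstract can be illustrated with a few lines of code. This is a generic sketch of one common TF-IDF variant (term frequency normalized by document length, multiplied by the natural log of inverse document frequency); the study's exact formula and tokenization are not given in the abstract, and the English tokens below are made-up stand-ins for segmented classical Chinese text.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document for a list of token lists."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append(
            {
                term: (count / len(doc)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return scores


# Hypothetical tokenized documents.
books = [
    ["spleen", "stomach", "qi", "spleen"],
    ["stomach", "qi", "cold"],
    ["qi", "heat", "kidney"],
]
for book_scores in tf_idf(books):
    print(max(book_scores, key=book_scores.get))
```

A term that appears in every document (here "qi") scores zero, which is why TF-IDF surfaces terms that are characteristic of one book rather than common to all.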