cc 22(14): e1

Research Article

VTWM: An Incremental Data Extraction Model Based on Variable Time-Windows

Cite
BibTeX Plain Text
  • @ARTICLE{10.4108/eai.12-6-2020.166291,
        author={Weixing Jia and Yang Xu and Jie Liu and Guiling Wang},
        title={VTWM: An Incremental Data Extraction Model Based on Variable Time-Windows},
        journal={EAI Endorsed Transactions on Collaborative Computing},
        volume={4},
        number={14},
        publisher={EAI},
        journal_a={CC},
        year={2020},
        month={9},
        keywords={change data capture, incremental data extraction, timestamp, ETL},
        doi={10.4108/eai.12-6-2020.166291}
    }
    
  • Weixing Jia
    Yang Xu
    Jie Liu
    Guiling Wang
    Year: 2020
    VTWM: An Incremental Data Extraction Model Based on Variable Time-Windows
    CC
    EAI
    DOI: 10.4108/eai.12-6-2020.166291
Weixing Jia1, Yang Xu2, Jie Liu3, Guiling Wang1,*
  • 1: North China University of Technology
  • 2: Tianjin E-Hualu Information Technology Co.
  • 3: Beijing Yidian Wangju Technology Co.
*Contact email: wangguiling@ncut.edu.cn

Abstract

Continuously extracting and integrating changing data from heterogeneous systems with an appropriate data extraction model is key to data sharing and integration, and to building an incremental data warehouse for data analysis. The traditional data capture method based on timestamp changes is prone to anomalies during extraction, which cause extraction failures and reduce extraction efficiency. To address these problems, this paper improves the traditional timestamp-increment data capture model and proposes VTWM, an incremental data extraction model based on variable time-windows. Its core idea is to extract a small number of duplicate records and then remove the duplicates. The model reduces the influence of anomalies on data extraction, improves the reliability of traditional ETL data extraction processes, and improves data extraction efficiency.
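The idea of extracting a few duplicate records and then removing them can be illustrated with a minimal Python sketch. This is not the VTWM algorithm itself, whose details the abstract does not give. It only shows the general pattern of a timestamp-based extraction whose window reaches back past the last watermark, followed by deduplication. The function name `extract_increment`, the row fields `id` and `updated_at`, and the `overlap` parameter are all hypothetical, and how VTWM actually sizes its window is not assumed here.

```python
from datetime import datetime, timedelta


def extract_increment(source_rows, last_watermark, overlap, seen_versions):
    """Sketch of overlap-then-deduplicate extraction (hypothetical helper).

    source_rows    -- iterable of dicts with hypothetical keys "id" and "updated_at"
    last_watermark -- largest timestamp handled by the previous run
    overlap        -- timedelta by which the window reaches back; a caller could
                      vary it per run, which is the assumed "variable" part
    seen_versions  -- dict mapping id -> updated_at of the version already loaded
    """
    # Start the window before the watermark so that rows committed late
    # (timestamp older than the watermark) are still picked up.
    window_start = last_watermark - overlap
    candidates = [r for r in source_rows if r["updated_at"] > window_start]

    fresh = []
    for row in candidates:
        # Drop rows whose exact version was already loaded in an earlier run.
        if seen_versions.get(row["id"]) == row["updated_at"]:
            continue
        seen_versions[row["id"]] = row["updated_at"]
        fresh.append(row)

    # The watermark never moves backwards, even if all candidates were old.
    newest = max((r["updated_at"] for r in candidates), default=last_watermark)
    return fresh, max(last_watermark, newest)
```

In this sketch, a row that becomes visible only after a run whose watermark already passed its timestamp is still captured on the next run, at the cost of re-reading and discarding the rows inside the overlap.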

Keywords
change data capture, incremental data extraction, timestamp, ETL
Received
2020-08-27
Accepted
2020-09-06
Published
2020-09-09
Publisher
EAI
http://dx.doi.org/10.4108/eai.12-6-2020.166291

Copyright © 2020 Weixing Jia et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution license, which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.
