Architecting Near Real Time Data Integration Pipelines with PowerExchange and IICS Streaming

Authors

  • Srujana Parepalli, Senior Data Engineer, USA

DOI:

https://doi.org/10.15662/IJRAI.2019.0201004

Keywords:

Near-real-time data integration, change data capture, Informatica PowerExchange, IICS Streaming, log-based replication, enterprise data integration, cloud data pipelines, asynchronous data movement, transactional consistency, streaming ingestion

These keywords capture the technical and architectural focus of enterprise data integration practices as of February 2019, emphasizing continuous data movement, hybrid on-premises and cloud architectures, and reliable propagation of transactional changes across distributed systems.

Abstract

By February 2019, enterprises operating large-scale transactional platforms increasingly required data integration architectures that could deliver near-real-time data movement with strong reliability and minimal impact on source systems. Traditional batch-oriented extract, transform, load (ETL) processes, while still widely used, were insufficient for use cases involving operational analytics, customer-facing applications, regulatory reporting, and downstream system synchronization. Organizations sought integration solutions that could propagate data changes continuously while maintaining transactional integrity and operational stability. Change Data Capture (CDC) had become a well-established technique for meeting these requirements because it moves data incrementally, based on changes committed in source systems. Informatica PowerExchange was widely adopted in enterprise environments as a log-based CDC solution capable of capturing database changes with low latency and minimal overhead. At the same time, Informatica Intelligent Cloud Services (IICS) introduced streaming ingestion and processing capabilities that allowed CDC events to be transported and consumed in near real time within cloud-based integration pipelines. This paper examines near-real-time data integration architectures that combine PowerExchange for change capture with IICS Streaming for event transport and processing, as understood and implemented by early 2019. The discussion focuses on architectural patterns for integrating on-premises transactional systems with cloud-based targets, emphasizing reliability, ordering, scalability, and operational governance. Particular attention is given to how CDC streams were modeled, buffered, and consumed to support continuous data propagation without tightly coupling source and target systems. Finally, the paper synthesizes design considerations and operational trade-offs associated with integration pipelines based on PowerExchange and IICS Streaming, including latency management, failure recovery, schema evolution handling, and monitoring of end-to-end data flow health. The intent is to provide a practical and historically grounded view of near-real-time data integration patterns that enterprises were actively adopting in February 2019.
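
The ordering and failure-recovery concerns named in the abstract can be illustrated with a minimal sketch of a CDC consumer. It assumes each change event carries a monotonically increasing commit sequence number and that the target remembers the highest sequence it has applied, so that a batch replayed after a restart is skipped rather than applied twice. The names `ChangeEvent`, `TargetApplier`, `seq`, and `checkpoint` are hypothetical and chosen for illustration; this is not the PowerExchange or IICS Streaming API, and the in-memory dictionary stands in for a real target table.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One committed change (hypothetical event shape, not a vendor format)."""
    seq: int                    # assumed monotonically increasing commit sequence
    op: str                     # "insert", "update", or "delete"
    key: str                    # primary key of the changed row
    row: Optional[dict] = None  # row image after the change; None for deletes


class TargetApplier:
    """Applies change events to an in-memory target and skips replayed events."""

    def __init__(self) -> None:
        self.table: Dict[str, dict] = {}
        self.checkpoint = 0  # highest sequence number applied so far

    def apply(self, events: Iterable[ChangeEvent]) -> int:
        """Apply a batch in commit order; return how many events took effect."""
        applied = 0
        for ev in sorted(events, key=lambda e: e.seq):
            if ev.seq <= self.checkpoint:
                continue  # already applied before a restart; ignore the replay
            if ev.op == "delete":
                self.table.pop(ev.key, None)
            elif ev.op in ("insert", "update"):
                self.table[ev.key] = dict(ev.row or {})  # upsert the row image
            else:
                raise ValueError(f"unknown op: {ev.op}")
            self.checkpoint = ev.seq
            applied += 1
        return applied
```

Calling `apply` twice with the same batch changes the target only the first time, which is the property that lets a consumer restart from an earlier position without corrupting the target. A production pipeline would persist the checkpoint together with the target write; the sketch keeps both in memory for brevity.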

Published

2019-02-05

How to Cite

Architecting Near Real Time Data Integration Pipelines with PowerExchange and IICS Streaming. (2019). International Journal of Research and Applied Innovations, 2(1), 933-943. https://doi.org/10.15662/IJRAI.2019.0201004