dongwooshin

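Modern Data Pipeline Implementation with PAASUP DIP: Framework Adoption and ELT Monitoring Dashboard #4

Introduces how the PAASUP DIP development team's spark-batch framework raised development productivity by 50% and how an ELT monitoring dashboard was built. Code complexity dropped markedly compared with using the plain libraries, and a real-time operations monitoring system was implemented on Apache Superset.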

Modern Data Pipeline Implementation with PAASUP DIP: Subway User Statistics Analysis Project #3

Urban traffic pattern analysis project that processes over 10 years of Seoul subway ridership data using a modern data lake technology stack. It transformed 73,692 records with Delta Lake and Apache Superset to visualize rush-hour commute patterns and transfer-station hub effects.

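Modern Data Pipeline Implementation with PAASUP DIP: Subway User Statistics Analysis Project #3

Introduces an urban traffic pattern analysis project that processed more than 10 years of Seoul subway boarding and alighting data with a modern data lake technology stack. It transformed 73,692 records with Delta Lake and Apache Superset to visualize commuter rush hours and transfer-station hub effects.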

Modern Data Pipeline Implementation with PaasupDIP: Subway User Statistics Analysis Project #2

Subway data pipeline automation project that moves from a Jupyter Notebook development environment to an Airflow production environment. It strengthens security with Kubernetes Secrets and builds a schedule-driven workflow through SparkKubernetesOperator.

Modern Data Pipeline Implementation with PaasupDIP: Subway User Statistics Analysis Project #1

This post introduces a project that uses PAASUP DIP to implement a modern data pipeline. The pipeline processes and analyzes over 10 years of Seoul subway ridership data with Apache Spark, Delta Lake, and PostgreSQL, and uses Apache Superset dashboards for multi-dimensional data visualization.

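Modern Data Pipeline Implementation with PaasupDIP: Subway User Statistics Analysis Project #2

Introduces a subway data pipeline automation project that moved from a Jupyter Notebook development environment to an Airflow production environment. It strengthened security with Kubernetes Secrets and built a schedule-driven workflow through the Spark Kubernetes Operator.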

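Modern Data Pipeline Implementation with PaasupDIP: Subway User Statistics Analysis Project #1

Introduces a project that used PAASUP DIP (Data Intelligence Platform) with Spark, Delta Lake, Superset, and related tools to preprocess and visualize Seoul subway ridership statistics, deriving insights for transportation, commercial districts, and urban planning.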