A learning path for Apache Airflow based on first-hand experience, divided into three levels from the basics to expert, with checkpoints for assessing your knowledge at each level.
Data-engineer
Exploring the World of Python and Let's Learn Together
Introduction to Apache Airflow
Apache Airflow is a workflow management tool for data engineering and data pipeline work.
Getting Started with Apache Cassandra
Apache Cassandra is a popular distributed NoSQL database known for its scalability, fault tolerance, and high performance. Whether you are a beginner or an experienced developer, this blog post serves as a guide to getting started with Apache Cassandra. It covers the fundamental concepts and the installation process, and provides practical code examples that demonstrate key operations.
15 Useful extra columns for ETL jobs
Extra columns in your ETL jobs can provide valuable context for downstream data consumers, allowing them to better understand the source and quality of the data. By considering these extra columns, you can improve the quality and reliability of your data pipelines and make it easier for downstream consumers to extract value from your data.
Setting up a Spark cluster using Docker Compose
Compare code Pandas, PySpark, and Apache Hive
This comparison article provides an overview of data manipulation in three popular tools: Pandas, PySpark, and Apache Hive. By providing code examples and discussing the pros and cons of each approach, the article aims to help data engineers and data scientists choose the best tool for their specific use case.
What is Apache Livy?
An overview of Apache Livy: submitting Spark jobs via its REST API and pylivy, with source code included.
What is Python WebHDFS?
An overview of using Python in big data projects to send commands to HDFS through WebHDFS.