Sharing Materials
Presentations
2024 Dec [View]
This slide deck is based on the book Learn Git in a Month of Lunches (Rick Umali). Git is an essential tool for any developer, providing a robust system for version control and collaboration. While the initial setup may present a learning curve for beginners, Git becomes intuitive and efficient with regular use. Fundamental commands like commit, checkout, and merge are staples of daily workflows. Beyond these basics, advanced techniques enhance collaboration and streamline team projects. The deck also highlights two popular Git workflows, offering a foundation for development teams to establish effective collaboration practices.
2023 Feb [View]
In this talk, I first shared my go-to software setup for working and collaborating within a team, which makes it easier for MLEs, DEs, and PMs to deliver a satisfactory product. Second, I emphasized the importance of assessing progress at every stage of the development cycle and setting a clear objective before moving on to the next stage, with a focus on "Problem statement" and "Iterated model enhancement with gap-based logical reasoning".
2021 Nov [View]
This presentation provides an introduction to Kedro and its three primary commands:
- kedro new: Create a new project codebase.
- kedro run: Execute your data pipeline.
- kedro viz: Visualize your pipeline and compare experiments.
We'll also demonstrate the following features: optimizing performance by stacking pipelines and running them in parallel; tracking your experiments efficiently; and using layers and namespaces to simplify and organize your workflow.
2020 Dec [View]
Inspired by Andrew Ng's course Structuring Machine Learning Projects, I've compiled a practical mindset for building machine learning models.
While some points may be outdated given recent advances in large language models (LLMs), most of the concepts remain relevant.
This deck focuses on setting the right expectations for model performance, the fine-tuning cycle, and the best practices for closing the gap.
Key concepts include choosing a single number evaluation metric and setting constraints, using human-level performance as a reference point, orthogonalization, error analysis, and building a quick and iterative system.
By following these practices, you can effectively build and improve your machine learning model.
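As a rough illustration of "a single number evaluation metric and setting constraints", the sketch below picks the model with the best optimizing metric among those that satisfy a constraint. The model names, the F1 and latency figures, and the pick_model helper are all hypothetical and do not come from the deck.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    f1: float          # optimizing metric: higher is better
    latency_ms: float  # constraint metric: must stay under a limit


def pick_model(candidates: List[Candidate], max_latency_ms: float) -> Optional[Candidate]:
    """Return the highest-F1 candidate that meets the latency limit, or None."""
    feasible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.f1)


# Made-up numbers, for illustration only.
candidates = [
    Candidate("model_a", f1=0.90, latency_ms=80.0),
    Candidate("model_b", f1=0.92, latency_ms=95.0),
    Candidate("model_c", f1=0.95, latency_ms=1500.0),
]

best = pick_model(candidates, max_latency_ms=100.0)
print(best.name if best else "no model meets the constraint")
```

Here model_c has the highest F1 but is excluded by the latency limit, so the comparison comes down to one number among the remaining candidates.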
Disclaimer: Each book's content is restructured into a mind map to capture its general ideas. Unless stated otherwise, notes are mostly direct quotes from the books, kept as reference content. The mind maps are for personal use, with no commercial intent.
2024 Sep [View]
The book focuses on three core aspects: the dashboard development lifecycle, Excel's data querying, transformation, and visualization capabilities, and the use of purpose-driven visualization elements. It emphasizes the importance of clarifying dashboard requirements with stakeholders, iterative feedback, and understanding data details from the outset. The design section highlights breaking data into layers and using Excel's tools like PivotTables and charts, illustrated through case studies. Finally, the book categorizes charts into three types—Comparison, Composition, and Relationship—and discusses non-chart elements like formatting and sparklines for effective data visualization.
2023 Jun [View]
The book covers the practical aspects of an end-to-end time series development process well. It discusses best practices for EDA, feature engineering, modelling, and data storage, with valuable tips on the lookahead issue, plotting techniques, temporal characteristics in analysis, and more. The prose and mathematical explanations are not well written, but the content is best appreciated once you work on a time series forecasting use case.
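To make the lookahead issue concrete, here is a minimal sketch, assuming a simple rolling-mean feature: the value at time t is computed only from observations strictly before t, so the feature never leaks the value being forecast. The lagged_rolling_mean helper and the sales numbers are hypothetical and are not taken from the book.

```python
from typing import List, Optional


def lagged_rolling_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Rolling mean that, at index t, uses only values[t - window:t] (the past)."""
    features: List[Optional[float]] = []
    for t in range(len(values)):
        past = values[max(0, t - window):t]  # excludes values[t] itself
        features.append(sum(past) / len(past) if past else None)
    return features


# Made-up series, for illustration only.
sales = [10.0, 12.0, 14.0, 16.0, 18.0]
features = lagged_rolling_mean(sales, window=2)
print(features)
```

The first entry is None because no past observation exists yet. A rolling mean that included values[t] would look fine in backtesting but could not be computed at prediction time.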