Data Science, ML and Analytics Engineering

Pandas for Data Science

With this note, I am launching a series of articles for beginners in Data Science and Machine Learning. We’ll start by exploring Pandas. While there are many articles on Pandas available online, I want to focus on practical techniques for using Pandas in Data Science projects and model building.

Dataset: We will use the German Credit Risk dataset from Kaggle.

The dataset describes each credit applicant with the following attributes:

  • Age
  • Sex
  • Job
  • Housing
  • Saving accounts
  • Checking account
  • Credit amount
  • Duration
  • Purpose
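As a first taste of working with these columns, here is a minimal sketch of loading the data and taking a quick look at it. The rows below are made-up placeholders, not values from the real Kaggle file; only the column names come from the list above. The helper name `load_credit_data` is hypothetical, and in practice you would pass it the path of the CSV you downloaded.

```python
import io

import pandas as pd

# Placeholder rows invented for illustration (NOT taken from the real dataset).
# Only the column names match the attribute list in the article.
SAMPLE_CSV = """Age,Sex,Job,Housing,Saving accounts,Checking account,Credit amount,Duration,Purpose
35,male,2,own,little,moderate,2500,24,car
28,female,1,rent,,little,1200,12,education
52,male,3,free,rich,,8000,36,business
"""


def load_credit_data(source):
    """Read the credit data from a file path or a file-like object."""
    return pd.read_csv(source)


df = load_credit_data(io.StringIO(SAMPLE_CSV))

# Count missing values per column; empty CSV fields are read as NaN.
missing_per_column = df.isna().sum()

# Average credit amount for each housing type.
mean_amount_by_housing = df.groupby("Housing")["Credit amount"].mean()

print(df.dtypes)
print(missing_per_column)
print(mean_amount_by_housing)
```

Checking dtypes and missing values right after loading is a cheap way to spot columns that will need cleaning before model building.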

Read more

Simple steps to make your Python code better

Many of you keep your code in Git repositories. In this post I’ll show you how to make your Python code better.

I’ll use this repository as an example: https://github.com/Aykhan-sh/pandaseda

Fork it and try to make the code better.

Improving code readability

Improving the readability of your code is easy. We will use tools for code formatting, style checking and type checking.

We will add configuration files for flake8, mypy and black to the repository.

Before that, let’s install the tools:

pip install black flake8 mypy
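To make the roles of the three tools concrete, here is a small sketch of a function written the way they expect. The function `mean_credit_amount` is a hypothetical example and is not taken from the pandaseda repository. Roughly speaking, black takes care of layout (spacing, line length, double quotes), flake8 flags style problems such as unused imports, and mypy checks that the type annotations are consistent.

```python
from typing import Sequence


def mean_credit_amount(amounts: Sequence[float]) -> float:
    """Return the average credit amount, or 0.0 for an empty sequence."""
    # The annotations let mypy verify callers pass numbers and expect a float.
    if not amounts:
        return 0.0
    return sum(amounts) / len(amounts)


average = mean_credit_amount([1000.0, 2500.0, 4000.0])
print(average)
```

Once installed, each tool can be run from the repository root with `black .`, `flake8` and `mypy .` respectively.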

Read more

Python logging HOWTO: logging vs loguru

In this post we will try to choose a library for logging in Python. Logs help you record and understand what went wrong in your service. They also often carry informational messages such as parameters, quality metrics and model training progress. Here is an excerpt from a model training log:

[Figure: An example of a piece of model training log]
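For readers who want to see how such messages get produced, here is a minimal sketch using the standard library `logging` module only (loguru is not shown). The logger name `training_demo`, the helper names and the placeholder loss formula are assumptions made for illustration; no real model is trained.

```python
import io
import logging


def build_logger(stream):
    """Create an INFO-level logger that writes formatted records to `stream`."""
    logger = logging.getLogger("training_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s | %(name)s | %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False  # keep records out of the root logger
    return logger


def train(logger, epochs, learning_rate):
    """Pretend to train a model, logging parameters and progress."""
    logger.info("params: epochs=%d learning_rate=%s", epochs, learning_rate)
    for epoch in range(1, epochs + 1):
        loss = 1.0 / epoch  # placeholder value, not a real training loss
        logger.info("epoch %d/%d loss=%.3f", epoch, epochs, loss)
    logger.info("training finished")


buffer = io.StringIO()
train(build_logger(buffer), epochs=3, learning_rate=0.01)
log_lines = buffer.getvalue().splitlines()
print("\n".join(log_lines))
```

Writing to an in-memory buffer keeps the sketch self-contained; in a real service you would typically attach a handler that writes to stderr or a file instead.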

Read more