In statistics and probability theory we model natural phenomena and try to predict unforeseen circumstances: the goal is to make logical predictions that are consistent with our observed data. When we have a large amount of data and make predictions based purely on that data, without any prior knowledge, the strategy is frequentist statistics. It offers a wonderful set of tools, such as hypothesis testing and confidence intervals, for making judgements about situations.
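To make the confidence-interval idea concrete, here is a minimal sketch that computes a 95% interval for a sample mean from scratch. The data values are illustrative, not from any article; the normal approximation (z = 1.96) is an assumption that is only reasonable for moderate-to-large samples.

```python
import math

# Hypothetical sample of 20 measurements (illustrative data only)
data = [4.8, 5.1, 5.0, 4.9, 5.2, 5.3, 4.7, 5.0, 5.1, 4.9,
        5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.3, 4.8, 5.0, 5.1]

n = len(data)
mean = sum(data) / n
# Sample variance with Bessel's correction (divide by n - 1)
var = sum((x - mean) ** 2 for x in data) / (n - 1)
se = math.sqrt(var / n)  # standard error of the mean

# 95% confidence interval using the normal approximation (z = 1.96);
# for n = 20 a t critical value (~2.09) would give a slightly wider interval.
z = 1.96
ci_low, ci_high = mean - z * se, mean + z * se
print(f"mean = {mean:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

A frequentist reading of this interval: if we repeated the experiment many times, about 95% of the intervals built this way would cover the true mean.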

Bayesian statistics lets us incorporate prior knowledge along with observed data when making inferences. Of course, when our prior knowledge is non-existent, we…


A model's evaluation measure should fit its purpose: one has to choose the right metric for the problem at hand.

In today's world we have an abundance of data being generated by various systems. These data are the footprints of those systems' performance and character. We would like to predict or monitor the data-generating systems using machine learning (ML) models. These models have their background in statistics and probability theory, which provide a solid theoretical framework for learning patterns from data. But wait, how do we know we are learning the 'right' or 'correct' pattern? So, when making an assessment between various model hypotheses and their trained instantiations, how do you know which one is performing the…
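A tiny sketch of why the choice of metric matters: on imbalanced data, a trivial "model" that always predicts the majority class can score high accuracy while being useless. The labels below are illustrative, not from any real system.

```python
# Illustrative imbalanced labels: 5% positives, and a degenerate "model"
# that always predicts the majority (negative) class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

# Accuracy: fraction of predictions that match the true label
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the positive class: of the true positives, how many were found?
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
recall = tp / (tp + fn)

print(accuracy)  # high accuracy...
print(recall)    # ...yet every positive case is missed
```

Here accuracy is 0.95 while recall is 0.0, which is why a metric like recall, precision, or F1 can be the right choice when the rare class is what matters.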


Introduction

As the name suggests, synthetic data is artificially created data. It is not real data that has been observed; rather, it mimics some real data-generating process, and it should possess statistical patterns similar to the real data. For example, the mean and mode(s) of the real and synthetic data should be in a comparable range. But wait, why should we even think of creating new data? Aren't we in the age of the data deluge, creating and storing enormous amounts of data every day?

Yes, we are creating data, but still data…
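The simplest illustration of the idea above is a parametric generator: fit summary statistics to the "real" data, then sample new points from a distribution with those statistics. Everything here is illustrative (the "real" data is itself simulated); real synthetic-data methods are far more sophisticated.

```python
import random
import statistics

random.seed(42)

# Hypothetical "real" data: 1,000 draws we pretend were observed.
real = [random.gauss(10.0, 2.0) for _ in range(1000)]

# Simplest parametric generator: fit the mean and standard deviation
# of the real data, then sample a synthetic dataset from that fit.
mu, sigma = statistics.mean(real), statistics.stdev(real)
synthetic = [random.gauss(mu, sigma) for _ in range(1000)]

# The synthetic sample should track the real one's summary statistics.
print(statistics.mean(real), statistics.mean(synthetic))
print(statistics.stdev(real), statistics.stdev(synthetic))
```

The point is the check at the end: the synthetic data is "good" here exactly to the extent that its summary statistics stay in a comparable range to the real data's.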


A graph is a versatile data structure for representing relational information, with mathematical properties that let us analyse relationships in elegant ways. Graph data is usually non-Euclidean in nature. Graphs solve connectivity problems and support inference on related questions; however, they are not well suited to common machine learning tasks, for reasons such as the lack of a natural ordering among nodes. Moreover, exact graph-based algorithms generally have exponential time complexity. So we have to devise methods to represent graph data in Euclidean space so that we…
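One classic way to place graph nodes in Euclidean space is a spectral embedding: take the low-frequency eigenvectors of the graph Laplacian as coordinates. A minimal sketch on a toy graph (two triangles joined by a bridge edge; the graph and dimensions are illustrative assumptions, not from any article):

```python
import numpy as np

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Unnormalised graph Laplacian L = D - A
D = np.diag(A.sum(axis=1))
L = D - A

# eigh returns eigenvalues in ascending order; skip the constant
# eigenvector (eigenvalue 0) and keep the next two as 2-D coordinates.
vals, vecs = np.linalg.eigh(L)
embedding = vecs[:, 1:3]  # one 2-D Euclidean point per node

print(embedding.shape)
```

Nodes in the same triangle end up on the same side of the second eigenvector (the Fiedler vector), so the community structure of the graph survives the move into Euclidean space, where standard ML tools apply.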

Mithun Ghosh

Building intelligent systems and powering prosperity around the world
