algorithms

Sliding window

This iterative technique is useful for problems that ask for the longest or shortest contiguous subsequence (a substring or subarray) that satisfies a given criterion in a string or an array. We maintain two pointers, left and right, both initially located at the first index.
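
As a minimal sketch of the idea, consider one classic instance: the length of the longest substring with no repeated characters. The function name `longest_unique_substring` and the choice of problem are illustrative assumptions, not taken from the post itself.

```python
def longest_unique_substring(s: str) -> int:
    """Length of the longest substring of s with no repeated characters."""
    last_seen = {}  # character -> index of its most recent occurrence
    left = 0        # left edge of the current window
    best = 0
    for right, ch in enumerate(s):
        # If ch already occurs inside the window, move left just past it
        # so the window [left, right] contains only distinct characters.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```

Both pointers only move forward, so the scan takes time linear in the length of the input.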

Recursion

A recursive algorithm is one that calls itself on a smaller part of the input. To set up a correct recursion and avoid infinite recursion, one must take the following into account:

  • set up the base case
  • recursive calls should lead towards the base case
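
The two points above can be seen in a small sketch. The factorial function is used here purely as an illustration; it is an assumed example, not one drawn from the post.

```python
def factorial(n: int) -> int:
    """Compute n! recursively for a non-negative integer n."""
    if n < 0:
        # Guard: a negative input would never reach the base case.
        raise ValueError("n must be non-negative")
    if n == 0:
        # Base case: stops the recursion.
        return 1
    # Recursive call on n - 1, which moves one step closer to the base case.
    return n * factorial(n - 1)
```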

Binary Trees

A binary tree is a tree data structure in which every node has at most two children. Each node falls into one of three categories: the root, which has no parent; a leaf, which has no children; or an inner node, which has a parent and at least one child. A node can be defined as follows:
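
One possible definition, sketched in Python. The class name `TreeNode`, its field names, and the helper `is_leaf` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    """A binary tree node: a value plus optional left and right children."""
    val: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def is_leaf(node: TreeNode) -> bool:
    """A leaf is a node with no children."""
    return node.left is None and node.right is None


# A small tree: 1 is the root, 2 and 4 are leaves, 3 is an inner node.
root = TreeNode(1, TreeNode(2), TreeNode(3, TreeNode(4)))
```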

Arrays

One of the most basic things one can do with an array is to have a pointer, or a set of pointers, traverse it in some fashion. The number of pointers to keep track of depends on the question at hand. Let’s look at a few sample problems to understand this concept:
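
As one hedged illustration of the pointer idea (the problem and the name `pair_with_sum` are assumptions, not necessarily among the post's own samples), two pointers moving inward from both ends can find a pair with a given sum in a sorted array:

```python
from typing import List, Optional, Tuple


def pair_with_sum(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Return indices (i, j), i < j, with nums[i] + nums[j] == target.

    Assumes nums is sorted in non-decreasing order; returns None if no pair exists.
    """
    left, right = 0, len(nums) - 1
    while left < right:
        total = nums[left] + nums[right]
        if total == target:
            return left, right
        if total < target:
            left += 1    # need a larger sum: advance the left pointer
        else:
            right -= 1   # need a smaller sum: retreat the right pointer
    return None
```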

statistics

Absorbing Markov Chains

A Markov chain containing absorbing states is known as an absorbing Markov chain. So what is an absorbing state? In simple words, if you end up in an absorbing state you can’t go anywhere else; you are stuck there for all eternity. In other words, the probability of transition from an absorbing state $i$ to any other state, including the non-absorbing (transient) states, is 0.
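
In terms of a transition matrix, a state $i$ is absorbing exactly when the diagonal entry $P_{ii}$ equals 1. A minimal sketch, assuming the matrix is given as a list of rows; the function name `absorbing_states` and the example matrix are made up for illustration:

```python
import math
from typing import List


def absorbing_states(P: List[List[float]]) -> List[int]:
    """Indices i with P[i][i] == 1, i.e. states that can never be left."""
    return [i for i, row in enumerate(P) if math.isclose(row[i], 1.0)]


# Illustrative 3-state chain: states 0 and 2 are absorbing,
# state 1 is transient and moves to either neighbour with probability 0.5.
P = [
    [1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]
```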

Logistic Regression

One of the most common use cases in supervised machine learning is classification: given a data point, assign it one of the available labels.

Poisson Process

Let’s imagine rain falling. One obvious parameter describing this process is the rate: whether it’s drizzling or pouring! Let’s now focus on a tiny patch of land and assume that the rate is constant; we will call it $\lambda$. We can describe rain as a Poisson process.
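
For a Poisson process with constant rate $\lambda$, the number of arrivals (here, raindrops on the patch) in an interval of length $t$ follows a Poisson distribution with mean $\lambda t$. The sketch below evaluates that probability mass function; the function name `poisson_pmf` and the parameter `t` are assumptions introduced for illustration.

```python
import math


def poisson_pmf(k: int, lam: float, t: float = 1.0) -> float:
    """P(N = k) for the number of arrivals N in an interval of length t,
    given a Poisson process with constant rate lam."""
    mean = lam * t
    return mean ** k * math.exp(-mean) / math.factorial(k)
```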

PDF of a dependent variable

"The Calculus required continuity, and continuity was supposed to require the infinitely little; but nobody could discover what the infinitely little might be."

     -- Bertrand Russell in Mysticism and Logic and Other Essays, The Floating Press, 1 August 2010, p.100

ML

ML Design

An ML design task can broadly be broken down into the following components, in order of priority:

  • Defining success metrics
  • Data pipelines
  • Backend model design
  • Online model design
  • Model evaluation
  • Model monitoring
  • Serving

Defining success metrics

IMDB movie reviews

The goal of this notebook is to understand some details of Hugging Face’s datasets and transformers libraries, and to serve as a reference point for fine-tuning an LLM for a classification task.

sql

More SQL

Let’s say we have the following table Users, where user_id is the column with unique values for this table. The goal is to find, for each user, the smaller of the two values col_a and col_b.

+-------------+----------+
| Column Name | Type     |
+-------------+----------+
| user_id     | int      |
| col_a       | int      |
| col_b       | int      |
+-------------+----------+

To do this, we can use the LEAST function (and its counterpart GREATEST):

SELECT
  user_id,
  LEAST(col_a, col_b) AS min_val
FROM Users
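
One way to try this locally is with Python's built-in sqlite3 module. Note the substitution: SQLite has no LEAST function, so the sketch below swaps in SQLite's multi-argument scalar min(), which plays the same role for two non-NULL integers. The sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with the Users table from the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (user_id INTEGER PRIMARY KEY, col_a INTEGER, col_b INTEGER)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(1, 5, 3), (2, 2, 8), (3, 4, 4)],
)
# SQLite's scalar min(a, b) stands in for LEAST(a, b) here.
rows = conn.execute(
    "SELECT user_id, min(col_a, col_b) AS min_val FROM Users ORDER BY user_id"
).fetchall()
conn.close()
```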

SQL Patterns

NOTE: When solving SQL questions, always make sure that all the required components listed below are taken care of:

arrays

Sliding window

This iterative technique is useful for problems that ask for the longest or shortest contiguous subsequence (a substring or subarray) that satisfies a given criterion in a string or an array. We maintain two pointers, left and right, both initially located at the first index.

data_structures
